Quote, originally posted by: Cuchulainn
GPU and CUDA are useful for embarrassingly parallel problems such as MC/MLMC and so on. That is well known. What is less well known, maybe, is that it is less efficient for non-SPMD problems. To take an example, tests on 2-factor PDE/ADE problems can have a speedup that depends on the sizes of NX, NY and NT. For small values the speedup can be 50, while for larger values it can converge to a value like 3 (or 1.3). In this regard, OpenMP is more suitable for PDE/ADE.

Quote, originally posted by: outrun
That's strange, I would expect the GPU to be much faster for PDEs. Where did you get that low 1.3 value? How many cores was that on, and did you use GPU-optimized linear algebra libs?

Quote, originally posted by: Cuchulainn
The results will be made public domain soon. Not strange at all, it's what I expected. And AFAIK no one has done it; otherwise they would be shouting it from the rooftops. PDEs in finance are too small. It's nothing to do with linear algebra (that's not the issue); PDE is intrinsically sequential (hint: compute the serial fraction of a typical FD scheme. Just take explicit Euler with NT = 10000, NX = 50: not a matrix in sight, but no point parallelizing. QED). I would not want to spend too much time on PDE and GPU. Maybe combining OpenMP and GPU is an option. As mentioned, OpenMP for ADE gives a speedup of 5 on an 8-core machine without effort!!

Quote, originally posted by: outrun
But how can you speed up with OpenMP sequential code that can't be sped up on a GPU? There *has* to be a parallel element in order to have a case for OpenMP... and if so, then a GPU probably has some hardware extras that regular CPUs scattered across multiple machines just don't have (like fast shared memory)... post the paper when it's done! Ok?

To answer this question you first have to know how ADE works. A back-of-envelope too
Last edited by Cuchulainn on September 30th, 2015, 10:00 pm, edited 1 time in total.