
Re: Kooql

Posted by Mark joshi-2 on Aug 09, 2010; 1:43am
URL: http://quantlib.414.s1.nabble.com/Kooql-tp7085p7086.html

Dear Kakhkhor,

All tests were done with floats (single precision) on the GPU -- these
seem to be more than accurate enough.
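
As a rough way to sanity-check the single-precision claim without a GPU, the sketch below prices a plain European call by Monte Carlo twice, once in double precision and once with every intermediate value rounded to IEEE single precision. It is only an emulation under stated assumptions: the Black-Scholes parameters, the path count and the helper names (`to_f32`, `mc_call`) are all hypothetical, and it says nothing about the actual Kooderive tests.

```python
import math
import random
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def mc_call(n_paths, single, seed=42):
    """Monte Carlo price and standard error of a European call.

    If `single` is True, every intermediate result is rounded to float32
    to mimic single-precision arithmetic (an emulation, not real GPU code).
    """
    rng = random.Random(seed)
    # Hypothetical market data, chosen only for illustration.
    s0, strike, r, sigma, t = 100.0, 100.0, 0.05, 0.2, 1.0
    rnd = to_f32 if single else (lambda v: v)

    drift = rnd((r - 0.5 * sigma * sigma) * t)
    vol = rnd(sigma * math.sqrt(t))
    total = 0.0
    total_sq = 0.0
    for _ in range(n_paths):
        z = rnd(rng.gauss(0.0, 1.0))
        s_t = rnd(s0 * rnd(math.exp(rnd(drift + rnd(vol * z)))))
        payoff = rnd(max(s_t - strike, 0.0))
        total = rnd(total + payoff)      # accumulation also in float32
        total_sq += payoff * payoff      # kept in double, used for the error bar only
    mean = total / n_paths
    variance = total_sq / n_paths - mean * mean
    discount = math.exp(-r * t)
    return discount * mean, discount * math.sqrt(variance / n_paths)


price_double, stderr_double = mc_call(20_000, single=False)
price_single, stderr_single = mc_call(20_000, single=True)
print("double:", price_double, "+/-", stderr_double)
print("single:", price_single, "+/-", stderr_single)
print("difference:", abs(price_double - price_single))
```

Both runs use the same random draws, so the printed difference isolates the rounding effect; in this toy setting it is far smaller than the Monte Carlo standard error.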

I have not yet addressed the issues with LS (Longstaff-Schwartz) on the GPU!

The project is incremental. Phase 1 was to do the paths on the GPU and
the rest on the CPU, hence I had to transfer the paths to the CPU. The
main advantage of this approach is that you get all the QuantLib
functionality.

Phase 2 will be to price on the GPU and will not require path transfer.
I am hoping that the total time for this will be less than 10 seconds!
I am about to test the code for this. The main trickiness was in how to
define the product in a sufficiently generic way that it was not
necessary to do a huge amount of recoding for every new product. But I
think I now have a reasonable solution.
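
Purely as an illustration of what a generic product definition could look like, and not a description of the actual Kooderive design, the sketch below treats a product as a function from (step, rates at that step) to (cash flow, terminated flag), so that one pricing loop serves every product. The `Product` type, the two example products and the `price` function are hypothetical names invented for this example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: given the step index and the rates observed at that
# step, a product returns (cash_flow, terminated).
Product = Callable[[int, Sequence[float]], Tuple[float, bool]]


def make_caplet_strip(strike: float, accrual: float) -> Product:
    """Pays accrual * max(first rate - strike, 0) at every step."""
    def product(step: int, rates: Sequence[float]) -> Tuple[float, bool]:
        return accrual * max(rates[0] - strike, 0.0), False
    return product


def make_trigger_note(coupon: float, trigger: float) -> Product:
    """Pays a fixed coupon each step; stops once the first rate hits the trigger."""
    def product(step: int, rates: Sequence[float]) -> Tuple[float, bool]:
        return coupon, rates[0] >= trigger
    return product


def price(product: Product,
          paths: List[List[Sequence[float]]],
          discounts: Sequence[float]) -> float:
    """Average discounted cash flows over all paths.

    paths[p][s] holds the rates at step s of path p; the loop itself knows
    nothing about the product beyond the interface above.
    """
    total = 0.0
    for path in paths:
        for step, rates in enumerate(path):
            cash_flow, terminated = product(step, rates)
            total += discounts[step] * cash_flow
            if terminated:
                break
    return total / len(paths)


# Two tiny hand-made paths with one rate per step (illustrative data only).
paths = [[[0.03], [0.05]], [[0.06], [0.02]]]
discounts = [1.0, 0.5]
caplet_value = price(make_caplet_strip(strike=0.04, accrual=1.0), paths, discounts)
trigger_value = price(make_trigger_note(coupon=1.0, trigger=0.055), paths, discounts)
print(caplet_value, trigger_value)
```

Adding a new product here means writing one small function rather than touching the pricing loop. On a GPU the same separation would presumably be expressed with templates or data tables rather than Python closures.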

Phase 3 will be to look at LS on the GPU! I haven't addressed the
problems of parallel regression yet. Very happy to discuss, however,
either here, or one-to-one.
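
To give the discussion something concrete, here is one standard way a least-squares regression can be split across parallel workers; it is a generic sketch, not something taken from Kooderive. The normal-equation matrices X^T X and X^T y are plain sums over paths, so each chunk of paths (think of one thread block) can accumulate its own partial sums, which are then added together before solving a small dense system. The quadratic basis, the chunk size and all function names are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

K = 3  # number of basis functions: 1, x, x^2 (an illustrative choice)


def basis(x: float) -> List[float]:
    return [1.0, x, x * x]


def partial_sums(xs: Sequence[float], ys: Sequence[float]) -> Tuple[List[List[float]], List[float]]:
    """X^T X and X^T y for one chunk of paths; chunks are independent."""
    xtx = [[0.0] * K for _ in range(K)]
    xty = [0.0] * K
    for x, y in zip(xs, ys):
        b = basis(x)
        for i in range(K):
            xty[i] += b[i] * y
            for j in range(K):
                xtx[i][j] += b[i] * b[j]
    return xtx, xty


def combine(parts) -> Tuple[List[List[float]], List[float]]:
    """Reduction step: add the per-chunk sums together."""
    xtx = [[0.0] * K for _ in range(K)]
    xty = [0.0] * K
    for part_xtx, part_xty in parts:
        for i in range(K):
            xty[i] += part_xty[i]
            for j in range(K):
                xtx[i][j] += part_xtx[i][j]
    return xtx, xty


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def regress_in_chunks(xs: Sequence[float], ys: Sequence[float], chunk: int) -> List[float]:
    parts = [partial_sums(xs[i:i + chunk], ys[i:i + chunk])
             for i in range(0, len(xs), chunk)]
    xtx, xty = combine(parts)
    return solve(xtx, xty)


# Noise-free test data from a known quadratic, so the fit should recover it.
xs = [0.1 * i for i in range(50)]
ys = [2.0 - 3.0 * x + 0.5 * x * x for x in xs]
coeffs = regress_in_chunks(xs, ys, chunk=8)
print(coeffs)
```

Only the K x K matrix and the K-vector need to be combined and solved, so the amount of data exchanged does not grow with the number of paths. One caveat: normal equations can be poorly conditioned, which matters more in single precision than in double.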

regards

Mark






///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
Dear Mark,
Dear QuantLib users,
The Kooderive project looks very interesting. I am curious about the
specs of the GPU/CPU the tests were run on. What kind of floating point
numbers were used, double or single?

Currently I am working on a multicore implementation of the LMM too. One
problem with many-factor models is that the Longstaff-Schwartz method
requires a huge amount of memory to store paths. I am considering using
floats to save space and to get a better speedup with SSE2, but I am not
sure about the numerical error. Do you think that single precision
accuracy is enough in most cases?
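
The memory concern can be made concrete with a back-of-the-envelope calculation. The sketch below assumes, hypothetically, that every path stores `n_rates` values at each of `n_steps` steps; the three sizes are made-up illustrative numbers, not figures from the Kooderive tests.

```python
import struct


def path_storage_bytes(n_paths: int, n_steps: int, n_rates: int, fmt: str) -> int:
    """Bytes needed to keep every rate at every step of every path."""
    return n_paths * n_steps * n_rates * struct.calcsize(fmt)


# Hypothetical problem size, chosen only for illustration.
n_paths, n_steps, n_rates = 100_000, 40, 40

single_bytes = path_storage_bytes(n_paths, n_steps, n_rates, "<f")  # 4-byte float
double_bytes = path_storage_bytes(n_paths, n_steps, n_rates, "<d")  # 8-byte double

print(f"single: {single_bytes / 2**20:.0f} MiB")
print(f"double: {double_bytes / 2**20:.0f} MiB")
```

Whatever the actual sizes, switching from double to single halves the storage, and an SSE2 register holds four floats against two doubles.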

How would Longstaff-Schwartz be implemented on the GPU? Is there any
scalable implementation of the regression part?

Also, why does the CUDA evolver need to transfer paths to the CPU? Can't
the GPU price all paths and simply return the results?

With the CUDA evolver, 6 seconds were used for transferring paths. Does
that mean that without the path transfer the total time could be reduced
from 32 to 26 seconds?

Regards,
Kakhkhor Abdijalilov.




--


Assoc Prof Mark Joshi
Centre for Actuarial Studies
University of Melbourne
My website is www.markjoshi.com

QuantLib-users mailing list
https://lists.sourceforge.net/lists/listinfo/quantlib-users