Re: Kooql

Posted by Mark joshi-2 on Sep 02, 2010; 12:57am
URL: http://quantlib.414.s1.nabble.com/Kooql-tp7085p7087.html

Dear All,

Update on Kooql:

OK, I've made more progress.

The cash-flow generation and discounting are now done on the GPU. I
have also got the code to work with 2 GPUs. Current timings:

1 million paths
32 rates
32 steps
5 factors

Time to compute the price:
0.7 seconds with 2 GPUs
1 second with 1 GPU

Rough time with the QuantLib market-model code: 170 seconds, so the
speed-up is roughly 240x (170/0.7) with two GPUs and 170x (170/1) with one.

The cash-flow generation is templatized on the product, so it is fairly
generic; a rough sketch of the pattern follows.
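
For readers who have not seen the pattern, here is a minimal sketch of
product-templatized cash-flow generation and pathwise discounting in CUDA.
The names (CappedFloater, payoff, discountedCashFlowsKernel) and the data
layout are purely illustrative and are not Kooql's actual interface:

#include <cuda_runtime.h>

// Illustrative product: a capped floating coupon. Any product that exposes
// payoff(rates, step) can be dropped into the kernel without changing it.
struct CappedFloater
{
    float strike;
    float cap;

    __device__ float payoff(const float* rates, int step) const
    {
        float coupon = rates[step] - strike;
        return fminf(fmaxf(coupon, 0.0f), cap);
    }
};

// One thread per path: generate the product's cash flows along the path
// and discount each one with the path's stored discount factors.
template <class Product>
__global__ void discountedCashFlowsKernel(const float* rates,      // paths x steps forward rates
                                          const float* discounts,  // paths x steps discount factors
                                          float* pathValues,       // one discounted value per path
                                          int paths, int steps,
                                          Product product)
{
    int path = blockIdx.x * blockDim.x + threadIdx.x;
    if (path >= paths)
        return;

    const float* pathRates = rates + path * steps;
    const float* pathDiscounts = discounts + path * steps;

    float value = 0.0f;
    for (int step = 0; step < steps; ++step)
        value += product.payoff(pathRates, step) * pathDiscounts[step];

    pathValues[path] = value;
}

Adding a new product then only means writing a new payoff struct and
instantiating the kernel template for it, e.g.
discountedCashFlowsKernel<CappedFloater>; the Monte Carlo price is just the
average of pathValues.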

The hardware is one Quadro FX5800 and one Tesla C1060. (Thank you, NVIDIA!)
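
For what it is worth, the simplest way to use both cards for a run like this
is to split the paths between them, one host thread per device. A generic
sketch of that pattern is below (not Kooql's actual code; pricePathsOnDevice
is a hypothetical helper that runs the whole generate/price pipeline for a
block of paths on the currently selected device):

#include <cuda_runtime.h>

#include <algorithm>
#include <thread>
#include <vector>

// Hypothetical helper: evolves the rates, generates and discounts the cash
// flows for 'paths' paths on the currently selected device, and returns
// the sum of the discounted path values.
double pricePathsOnDevice(int paths, int seedOffset);

double priceOnTwoGpus(int totalPaths)
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);
    int devices = std::min(deviceCount, 2);

    std::vector<double> partialSums(devices, 0.0);
    std::vector<std::thread> workers;

    int pathsPerDevice = totalPaths / devices;
    for (int d = 0; d < devices; ++d)
    {
        // Give any leftover paths to the last device.
        int paths = (d == devices - 1) ? totalPaths - d * pathsPerDevice
                                       : pathsPerDevice;
        workers.emplace_back([=, &partialSums]
        {
            cudaSetDevice(d);  // bind this host thread to device d
            partialSums[d] = pricePathsOnDevice(paths, d * pathsPerDevice);
        });
    }
    for (std::thread& w : workers)
        w.join();

    double sum = 0.0;
    for (double s : partialSums)
        sum += s;
    return sum / totalPaths;  // Monte Carlo estimate of the price
}

An even split is the simplest choice; with two different cards a weighted
split (or pulling blocks of paths from a shared queue) balances the load
better.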

regards

Mark


On 9 August 2010 11:43, Mark joshi <[hidden email]> wrote:

> Dear Kakhkhor,
>
> All tests were done with floats on the GPU -- these seem to be more
> than accurate enough.
>
> I have not yet addressed the issues with LS on the GPU!
>
> The project is incremental. Phase 1 was to do the paths on the GPU and
> the rest on the CPU, hence the need to transfer the paths to the CPU.
> The main advantage of this approach is that you get all the QuantLib
> functionality.
>
> Phase 2 will be to price on the GPU and will not require path
> transfer. I am hoping that
> the total time for this will be less than 10 seconds! I am about to
> test the code for this.
> The main trickiness was in how to define the product in a sufficiently
> generic way that
> it was not necessary to do a huge amount of recoding for every new
> product. But I think
> I now have a reasonable solution.
>
> Phase 3 will be to look at LS on the GPU! I haven't addressed the problems of
> parallel regression yet. Very happy to discuss, however, either here,
> or one-to-one.
>
> regards
>
> Mark
>
>
>
>
>
>
> ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
> Dear Mark,
> Dear QuantLib users,
> The Kooderive project looks very interesting. I am curious about the
> specs of the GPU/CPU the tests were run on. What kind of floating-point
> numbers were used, double or single?
>
> Currently I am working on a multicore implementation of the LMM too. One
> problem with many-factor models is that the Longstaff-Schwartz method
> requires a huge amount of memory to store paths. I am considering using
> floats to save space and for a better speed-up with SSE2, but I am not
> sure about the numerical error. Do you think that single-precision
> accuracy is enough in most cases?
>
> How would Longstaff-Schwartz be implemented on the GPU? Is there any
> scalable implementation of the regression part?
>
> Also, why does the CUDA evolver need to transfer the paths to the CPU?
> Can't the GPU price all the paths and simply return the results?
>
> With the CUDA evolver, 6 seconds were used for transferring the paths.
> Does that mean that without transferring the paths the total time can be
> reduced from 32 to 26 seconds?
>
> Regards,
> Kakhkhor Abdijalilov.
>
>
>
>
> --
>
>
> Assoc Prof Mark Joshi
> Centre for Actuarial Studies
> University of Melbourne
> My website is www.markjoshi.com
>



--


Assoc Prof Mark Joshi
Centre for Actuarial Studies
University of Melbourne
My website is www.markjoshi.com
