
Re: Openmp work on mcarlo : Dynamic Creator MT

Posted by cheng li on Oct 27, 2014; 3:44am
URL: http://quantlib.414.s1.nabble.com/Re-Openmp-work-on-mcarlo-Dynamic-Creator-MT-tp15832p15997.html

Hi Peter,

It works great on Windows.

I tried 9999999999 samples:

Original MT: 35.63
Dynamic MT: 37.03

I also tried 100000, 100000000 and 1000000000 samples.

The results are similar and the elapsed time grows linearly.

I tested with VC++ 2012. VC++ 2010 should behave the same in my opinion. I will get back to you when the VC++ 2010 test has finished.

Regards,
Cheng

-----Original Message-----
From: Peter Caspers [mailto:[hidden email]]
Sent: 27 October 2014 3:49
To: cheng li
Cc: QuantLib Mailing Lists
Subject: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT

Hi,

I think I could further improve the performance of the precomputed twisters (i.e. the ones constructed as

MersenneTwisterCustomRng<Mtdesc19937_5> mt(42);

). Now they seem to be just as fast as the original one (I only tested on Linux). The PR is updated.

Cheng, would you maybe like to double check ?

Thanks a lot
Peter

On 23 September 2014 03:50, cheng li <[hidden email]> wrote:

> Hi Peter,
>
> On my side the performance is also improved. The slowdown is now around 2.5x. Thanks for your help.
>
> Regards,
> Cheng
>
> -----Original Message-----
> From: Peter Caspers [mailto:[hidden email]]
> Sent: 22 September 2014 16:05
> To: cheng li
> Cc: QuantLib Mailing Lists
> Subject: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT
>
> yes, please. The slowdown on Windows on my office computer is around 1.6 now.
> best regards
> Peter
>
> On 22 September 2014 03:48, cheng li <[hidden email]> wrote:
>> Hi Peter,
>>
>> Thanks for your effort. I'll definitely have a try:)
>>
>> Regards,
>> Cheng
>>
>> -----Original Message-----
>> From: Peter Caspers [mailto:[hidden email]]
>> Sent: 21 September 2014 23:11
>> To: cheng.li
>> Cc: QuantLib Mailing Lists
>> Subject: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT
>>
>> Hi Cheng,
>>
>> I switched to a template class for precomputed twisters, which is
>> faster by a factor of 2 (450ms instead of 870ms). This can be
>> instantiated with
>>
>> MersenneTwisterCustomRng<Mtdesc19937_5> mt(42);
>>
>> with 5 replaceable by 0 to 7 as before. The other is only needed now if you want to create a mt during runtime.
>>
>> The pull request is updated accordingly.
>>
>> Best regards
>> Peter
>>
>>
>>
>>
>> On 21 September 2014 08:11, cheng.li <[hidden email]> wrote:
>>> Hi Peter,
>>>
>>> Thanks for your hard work. I think our results are consistent.
>>>
>>> Regards,
>>> Cheng
>>>
>>> -----Original Message-----
>>> From: Peter Caspers [mailto:[hidden email]]
>>> Sent: 21 September 2014 0:33
>>> To: cheng li
>>> Cc: QuantLib Mailing Lists
>>> Subject: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT
>>>
>>> Hi Cheng,
>>>
>>> sorry, this was my fault, I messed up the timings, because I did not use consistent optimizer flags when compiling the library and the test program.
>>>
>>> Actually on Windows (same machine on which I run Ubuntu, which
>>> doesn't really matter, because my computer in office gives very
>>> similar timings) I get for 1E8 random numbers generated (with O2)
>>>
>>> 400ms / 1100ms
>>>
>>> for the original ql mt / dynamic creator mt. The ql mt is just as
>>> fast as the boost mt implementation by the way. On Ubuntu with gcc
>>> 4.8.1 and O3 I get
>>>
>>> 290ms / 870ms
>>>
>>> and with O2 a close value, for the creator mt 910ms. Also it makes no difference if I use gcc 4.9.1 or clang 3.6.0.
>>>
>>> If I directly call the original C routine without using the wrapper object, I get 720ms.
>>>
>>> If I use the original library and a C example, both compiled with O3 (the configuration in which the library is shipped, since it has a hardcoded makefile), I get 730ms.
>>>
>>> This means, the wrapper introduces a slow down by 20% which seems not too bad.
>>>
>>> Otherwise the dcmt is slower by a factor of around 2-3 compared to the original mt in all cases. Since this is already the case with the original library, I wouldn't try to do anything about it at the moment.
>>>
>>> What is your opinion on this ?
>>>
>>> Peter
>>>
>>> I compared different platforms again, but now on the _same_ machine - Original MT / Dynamic Creator MT (generation of 1E8 numbers, single threaded, with O2 (MSVC) and O3 (gcc, clang)). I also checked the boost implementation mt19937, which is very close to the ql original mt in all cases.
>>>
>>> Windows / MSVC 2010  => 400 ms / 1100 ms
>>> Ubuntu / gcc 4.9.1   => 1200 ms / 1050 ms
>>> Ubuntu / gcc 4.8.1   => 1180 ms / 1040 ms
>>> Ubuntu / clang 3.6.0 => 1340 ms / 1150 ms
>>>
>>> clang
>>> 290
>>> 720
>>> 870
>>>
>>> (c 730)
>>>
>>> so it looks like MSVC does a specific optimization for the QL and boost mt19937, which does not apply on the other platforms and not to the dynamic creator mt.
>>>
>>> At the moment I still don't know what it is.
>>>
>>> On 18 September 2014 03:33, cheng li <[hidden email]> wrote:
>>>> Let me try your statement once I have time.
>>>>
>>>> Regards,
>>>> Cheng
>>>>
>>>> -----Original Message-----
>>>> From: cheng li [mailto:[hidden email]]
>>>> Sent: 18 September 2014 9:18
>>>> To: 'Peter Caspers'
>>>> Cc: 'QuantLib Mailing Lists'
>>>> Subject: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT
>>>>
>>>> Hi Peter,
>>>>
>>>> I used gcc 4.8.2.
>>>>
>>>> My result with O3 optimization is still not good. The new MT shows
>>>> similar performance (about a 3~4X slowdown).
>>>>
>>>> I used the following statement to turn on O3 optimization before
>>>> running ./configure for QuantLib:
>>>>
>>>> export CXXFLAGS="-g -O3"
>>>>
>>>> Am I right?
>>>>
>>>> Regards,
>>>> Cheng
>>>>
>>>> -----Original Message-----
>>>> From: Peter Caspers [mailto:[hidden email]]
>>>> Sent: 18 September 2014 0:36
>>>> To: cheng li
>>>> Cc: QuantLib Mailing Lists
>>>> Subject: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT
>>>>
>>>> with gcc 4.9.1 and O2 the new mt is a bit slower than the original one (but only by a factor of 1.1).
>>>> I have to add both -frename-registers, -finline-functions to -O2 to get the speed up back I mentioned before.
>>>>
>>>> Which compiler do you use on Ubuntu ?
>>>>
>>>> Peter
>>>>
>>>>
>>>>
>>>> On 17 September 2014 03:26, cheng li <[hidden email]> wrote:
>>>>> Thanks Peter. I tested on Ubuntu as well: about 3~4X slower with -O2 optimization.
>>>>>
>>>>> I'll try -O3 on my machine also with Ubuntu.
>>>>>
>>>>> Regards,
>>>>> Cheng
>>>>>
>>>>> -----Original Message-----
>>>>> From: Peter Caspers [mailto:[hidden email]]
>>>>> Sent: 17 September 2014 0:32
>>>>> To: Cheng Li; QuantLib Mailing Lists
>>>>> Subject: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT
>>>>>
>>>>> Hi Cheng,
>>>>>
>>>>> indeed with msvc I get a slow down with a factor of ~2.8x. As I said, under gcc it is a speed up ~ 0.8x (with -O3).
>>>>>
>>>>> Does anyone have an idea where the different behaviour under gcc /
>>>>> linux and msvc might come from (and how to improve the msvc side
>>>>> if possible) ?
>>>>>
>>>>> Kind regards
>>>>> Peter
>>>>>
>>>>>
>>>>>
>>>>> On 13 September 2014 08:27, Cheng Li <[hidden email]> wrote:
>>>>>> Thanks Peter.
>>>>>>
>>>>>> Regards,
>>>>>> Cheng
>>>>>>
>>>>>> Sent from my iPad
>>>>>>
>>>>>>> On 13 September 2014, at 13:29, Peter Caspers <[hidden email]> wrote:
>>>>>>>
>>>>>>> I will have a look on Monday (I have a Windows machine at work)
>>>>>>> and see how it works there
>>>>>>>
>>>>>>> Thanks
>>>>>>> Peter
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>>> On 13 September 2014, at 04:41, Cheng Li <[hidden email]> wrote:
>>>>>>>>
>>>>>>>> I am on Win7 x64bit, using vs 2012 with quantlib 1.4 boost 1.55
>>>>>>>> under release mode
>>>>>>>>
>>>>>>>> Sent from my iPad
>>>>>>>>
>>>>>>>>> On 13 September 2014, at 0:08, Peter Caspers <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>> Hi Cheng,
>>>>>>>>>
>>>>>>>>> no, I get better timings with the dcmt implementation, e.g. for
>>>>>>>>> 1E8 numbers
>>>>>>>>>
>>>>>>>>> dcmt 0.982s
>>>>>>>>> quantlib 1.159s
>>>>>>>>>
>>>>>>>>> on my computer. Can you post your platform and compiler
>>>>>>>>> settings, so that I can try to reproduce ?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Peter
>>>>>>>>>
>>>>>>>>>> On 12 September 2014 05:29, cheng li <[hidden email]> wrote:
>>>>>>>>>> Hi Peter,
>>>>>>>>>>
>>>>>>>>>> I have used your dcmt wrapper and tested it with the following
>>>>>>>>>> code. It seems dcmt in a single thread is 4X slower than the
>>>>>>>>>> QL original MT. Is this consistent with your results?
>>>>>>>>>>
>>>>>>>>>> #include <ql/quantlib.hpp>
>>>>>>>>>> #include <boost/timer.hpp>
>>>>>>>>>> #include <iostream>
>>>>>>>>>>
>>>>>>>>>> using namespace QuantLib;
>>>>>>>>>> using namespace std;
>>>>>>>>>>
>>>>>>>>>> int main() {
>>>>>>>>>>
>>>>>>>>>>      // Size is QuantLib's unsigned size type; it matches the loop counters
>>>>>>>>>>      Size samples;
>>>>>>>>>>      cin >> samples;
>>>>>>>>>>      boost::timer myTimer;
>>>>>>>>>>
>>>>>>>>>>      // time the original QuantLib Mersenne Twister
>>>>>>>>>>      MersenneTwisterUniformRng originalMT;
>>>>>>>>>>      for(Size i=0; i<samples; ++i)
>>>>>>>>>>              originalMT.next();
>>>>>>>>>>
>>>>>>>>>>      cout << myTimer.elapsed() << endl;
>>>>>>>>>>
>>>>>>>>>>      myTimer.restart();
>>>>>>>>>>
>>>>>>>>>>      // time the dcmt wrapper with precomputed instance 5 and seed 1
>>>>>>>>>>      MersenneTwisterDynamicRng mt( mtdesc_0_8_19937[5] , 1);
>>>>>>>>>>
>>>>>>>>>>      for(Size i=0; i<samples; ++i) {
>>>>>>>>>>              mt.next();
>>>>>>>>>>      }
>>>>>>>>>>
>>>>>>>>>>      cout << myTimer.elapsed() << endl;
>>>>>>>>>>
>>>>>>>>>>      // wait for input so the console window stays open
>>>>>>>>>>      int n;
>>>>>>>>>>      std::cin>>n;
>>>>>>>>>>      return 0;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Cheng
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Peter Caspers [mailto:[hidden email]]
>>>>>>>>>> Sent: 6 September 2014 20:48
>>>>>>>>>> To: Joseph Wang
>>>>>>>>>> Cc: QuantLib Mailing Lists
>>>>>>>>>> Subject: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT
>>>>>>>>>>
>>>>>>>>>> Hi Joseph, all,
>>>>>>>>>>
>>>>>>>>>> I added a wrapper for the dcmt library (Dynamic Creator of
>>>>>>>>>> Mersenne Twisters).
>>>>>>>>>>
>>>>>>>>>> https://github.com/lballabio/quantlib/pull/132
>>>>>>>>>>
>>>>>>>>>> I guess this is a useful building block for multithreaded monte carlo.
>>>>>>>>>> Since for bigger p the dynamic creation takes a long time (it
>>>>>>>>>> feels more like mining than computing ...), I precomputed 8
>>>>>>>>>> independent instances (i.e. for use in at most 8 parallel threads),
>>>>>>>>>> for the "standard" value p = 19937 and word size 32, which one can
>>>>>>>>>> instantiate with
>>>>>>>>>>
>>>>>>>>>> MersenneTwisterDynamicRng mt( mtdesc_0_8_19937[i] , seed_i );
>>>>>>>>>>
>>>>>>>>>> for i = 0, ... , 7.
>>>>>>>>>>
>>>>>>>>>> In addition the speed of random number generation seems a bit
>>>>>>>>>> faster in the dcmt library than with the original ql twister.
>>>>>>>>>> I observe running times scaled by a factor of 0.8 when generating 1E8 numbers.
>>>>>>>>>>
>>>>>>>>>> All this is of course experimental and not well tested, so
>>>>>>>>>> any feedback and experiences are very welcome. I'd be very
>>>>>>>>>> interested in your opinion on the dcmt library and applications in parallel monte carlo.
>>>>>>>>>>
>>>>>>>>>> Peter
>>>>>>>>>>
>>>>>>>>>>> On 20 October 2013 16:01, Joseph Wang <[hidden email]> wrote:
>>>>>>>>>>> I've done some more parallelization with openmp and quantlib.
>>>>>>>>>>> I've uploaded the changes to the
>>>>>>>>>>> https://github.com/joequant/quantlib.  The branch openmp has some changes that I've issued a pull-request for.
>>>>>>>>>>> openmp-mcario has some changes that need some more work.
>>>>>>>>>>>
>>>>>>>>>>> I've gotten the MC to work by generating the paths in a
>>>>>>>>>>> critical section.
>>>>>>>>>>> Calculating the prices once I have the path is
>>>>>>>>>>> multithreaded, but right now I need to generate the paths in
>>>>>>>>>>> a single thread to make sure that the same sequence is generated.
>>>>>>>>>>>
>>>>>>>>>>> The big issue right now is that there is a race condition in
>>>>>>>>>>> the calculation of barrier options which is causing one
>>>>>>>>>>> regression test to fail.  The problem is that the random
>>>>>>>>>>> number generator is being called in BarrierPathPricer, and
>>>>>>>>>>> since that is run multithreaded, the sequence that is being
>>>>>>>>>>> pulled will change from run to run based on whether other paths have pulled random numbers already.
>>>>>>>>>>>
>>>>>>>>>>> I think that fixing this is going to need some code
>>>>>>>>>>> restructuring, but I'd like to get some thoughts as to how
>>>>>>>>>>> to do this.  Basically, the interface needs to be changed
>>>>>>>>>>> slightly so that the random numbers are drawn in a fixed
>>>>>>>>>>> order, and that might mean one call to get any additional
>>>>>>>>>>> random numbers in a pricer, which gets called in a critical section, and another to run the pricer with the random numbers.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> QuantLib-dev mailing list
>>>>>>>>>>> [hidden email]
>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/quantlib-dev
>>>>>>>>>>
>>>>>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>

