Re: Re: Re: OpenMP work on mcarlo: Dynamic Creator MT

Posted by cheng li on Sep 18, 2014; 1:33am
URL: http://quantlib.414.s1.nabble.com/Re-Openmp-work-on-mcarlo-Dynamic-Creator-MT-tp15832p15872.html

I will try the flags you mention once I have time.

Regards,
Cheng

-----Original Message-----
From: cheng li [mailto:[hidden email]]
Sent: 18 September 2014 9:18
To: 'Peter Caspers'
Cc: 'QuantLib Mailing Lists'
Subject: Re: Re: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT

Hi Peter,

I used gcc 4.8.2.

My results with -O3 optimization are still not good: the new MT performs about the same as before (roughly 3-4x slower than the original).

I used the following statement to turn on -O3 optimization before running ./configure for QuantLib:

export CXXFLAGS="-g -O3"

Is that right?

Regards,
Cheng
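
One way to check whether an optimization flag actually reached the compiler is to test a predefined macro. The sketch below assumes GCC or Clang, which define __OPTIMIZE__ at -O1 and above and __VERSION__ with the compiler version; the function names are made up for illustration. It only reports on the file it is compiled in, so for the library itself the compile lines printed by make remain the place to look.

```cpp
#include <string>

// Returns true when this translation unit was compiled with optimization.
// GCC and Clang define __OPTIMIZE__ at -O1 and above. Other compilers
// (MSVC, for example) do not define it, so false there means "unknown".
inline bool optimizationMacroSet() {
#ifdef __OPTIMIZE__
    return true;
#else
    return false;
#endif
}

// Returns the compiler's version string where the compiler provides one.
inline std::string compilerVersion() {
#ifdef __VERSION__
    return __VERSION__;
#else
    return "unknown";
#endif
}
```

Printing both values at the top of the benchmark's main() would show the compiler and whether the benchmark file was optimized, next to the timings.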

-----Original Message-----
From: Peter Caspers [mailto:[hidden email]]
Sent: 18 September 2014 0:36
To: cheng li
Cc: QuantLib Mailing Lists
Subject: Re: Re: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT

With gcc 4.9.1 and -O2 the new MT is a bit slower than the original one (but only by a factor of 1.1).
I have to add both -frename-registers and -finline-functions to -O2 to get back the speed-up I mentioned before.

Which compiler do you use on Ubuntu?

Peter
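
Since the timings in this thread change with compiler and flags, it may help to use a timing helper that keeps the result of every draw, so that no optimization level can discard part of the loop. The following is a minimal sketch, not the benchmark used in this thread: DrawTiming and timeDraws are hypothetical names, and any callable returning a number can be passed in.

```cpp
#include <chrono>
#include <cstddef>

struct DrawTiming {
    double seconds;   // wall-clock time spent in the loop
    double checksum;  // sum of all draws, keeps every draw observable
};

// Times n calls of rng() and folds each result into a checksum, so the
// compiler has to perform every draw whatever the optimization level.
template <class Rng>
DrawTiming timeDraws(Rng& rng, std::size_t n) {
    double checksum = 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i)
        checksum += static_cast<double>(rng());
    const auto stop = std::chrono::steady_clock::now();
    return { std::chrono::duration<double>(stop - start).count(), checksum };
}
```

With the standard library this could be called as `std::mt19937 gen(42); DrawTiming t = timeDraws(gen, 100000000);` after including <random>. A QuantLib generator could be passed through a small lambda that forwards to its next() method; that adaptation is an assumption here and not something taken from the pull request.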



On 17 September 2014 03:26, cheng li <[hidden email]> wrote:

> Thanks Peter. I tested on Ubuntu as well; it is about 3-4x slower with -O2 optimization.
>
> I'll also try -O3 on my Ubuntu machine.
>
> Regards,
> Cheng
>
> -----Original Message-----
> From: Peter Caspers [mailto:[hidden email]]
> Sent: 17 September 2014 0:32
> To: Cheng Li; QuantLib Mailing Lists
> Subject: Re: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT
>
> Hi Cheng,
>
> Indeed, with MSVC I get a slowdown by a factor of ~2.8x. As I said, under gcc the running time is scaled by a factor of ~0.8 (with -O3), i.e. a speed-up.
>
> Does anyone have an idea where the different behaviour under gcc /
> Linux and MSVC might come from (and how to improve the MSVC side, if
> possible)?
>
> Kind regards
> Peter
>
>
>
> On 13 September 2014 08:27, Cheng Li <[hidden email]> wrote:
>> Thanks Peter.
>>
>> Regards,
>> Cheng
>>
>> Sent from my iPad
>>
>>> On 13 September 2014, at 13:29, Peter Caspers <[hidden email]> wrote:
>>>
>>> I will have a look on Monday (I have a Windows machine at work)
>>> and see how it works there.
>>>
>>> Thanks
>>> Peter
>>>
>>> Sent from my iPhone
>>>
>>>> On 13 September 2014, at 04:41, Cheng Li <[hidden email]> wrote:
>>>>
>>>> I am on 64-bit Windows 7, using Visual Studio 2012 with QuantLib 1.4
>>>> and Boost 1.55, built in release mode.
>>>>
>>>> Sent from my iPad
>>>>
>>>>> On 13 September 2014, at 0:08, Peter Caspers <[hidden email]> wrote:
>>>>>
>>>>> Hi Cheng,
>>>>>
>>>>> no, I get better timings with the dcmt implementation, e.g. for
>>>>> 1E8 numbers
>>>>>
>>>>> dcmt 0.982s
>>>>> quantlib 1.159s
>>>>>
>>>>> on my computer. Can you post your platform and compiler settings,
>>>>> so that I can try to reproduce ?
>>>>>
>>>>> Thanks
>>>>> Peter
>>>>>
>>>>>> On 12 September 2014 05:29, cheng li <[hidden email]> wrote:
>>>>>> Hi Peter,
>>>>>>
>>>>>> I have used your dcmt wrapper and tested it with the following
>>>>>> code. It seems that dcmt in a single thread is 4x slower than the
>>>>>> original QL MT. Is this consistent with what you see?
>>>>>>
>>>>>> #include <ql/quantlib.hpp>
>>>>>> #include <boost/timer.hpp>
>>>>>> #include <iostream>
>>>>>>
>>>>>> using namespace QuantLib;
>>>>>> using namespace std;
>>>>>>
>>>>>> int main() {
>>>>>>
>>>>>>      // Number of random numbers to draw from each generator.
>>>>>>      // Size (unsigned) matches the type of the loop counters below.
>>>>>>      Size samples;
>>>>>>      cin >> samples;
>>>>>>      boost::timer myTimer;
>>>>>>
>>>>>>      // Time QuantLib's original Mersenne Twister.
>>>>>>      MersenneTwisterUniformRng originalMT;
>>>>>>      for(Size i=0; i<samples; ++i)
>>>>>>              originalMT.next();
>>>>>>
>>>>>>      cout << myTimer.elapsed() << endl;
>>>>>>
>>>>>>      myTimer.restart();
>>>>>>
>>>>>>      // Time the dcmt-based generator (precomputed instance 5, seed 1).
>>>>>>      MersenneTwisterDynamicRng mt( mtdesc_0_8_19937[5] , 1);
>>>>>>
>>>>>>      for(Size i=0; i<samples; ++i) {
>>>>>>              mt.next();
>>>>>>      }
>>>>>>
>>>>>>      cout << myTimer.elapsed() << endl;
>>>>>>
>>>>>>      // Wait for input so that the console window stays open.
>>>>>>      int n;
>>>>>>      cin >> n;
>>>>>>      return 0;
>>>>>> }
>>>>>>
>>>>>> Regards,
>>>>>> Cheng
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Peter Caspers [mailto:[hidden email]]
>>>>>> Sent: 6 September 2014 20:48
>>>>>> To: Joseph Wang
>>>>>> Cc: QuantLib Mailing Lists
>>>>>> Subject: Re: [Quantlib-dev] Openmp work on mcarlo : Dynamic Creator MT
>>>>>>
>>>>>> Hi Joseph, all,
>>>>>>
>>>>>> I added a wrapper for the dcmt library (Dynamic Creator of
>>>>>> Mersenne Twisters).
>>>>>>
>>>>>> https://github.com/lballabio/quantlib/pull/132
>>>>>>
>>>>>> I guess this is a useful building block for multithreaded monte carlo.
>>>>>> Since for bigger p the dynamic creation takes a long time (it
>>>>>> feels more like mining than computing ...), I precomputed 8 independent instances (i.e.
>>>>>> for use in at most 8 parallel threads), for the "standard" value
>>>>>> p = 19937 and word size 32, which one can instantiate with
>>>>>>
>>>>>> MersenneTwisterDynamicRng mt( mtdesc_0_8_19937[i] , seed_i );
>>>>>>
>>>>>> for i = 0, ... , 7.
>>>>>>
>>>>>> In addition the speed of random number generation seems a bit
>>>>>> faster in the dcmt library than with the original ql twister. I
>>>>>> observe running times scaled by a factor of 0.8 when generating 1E8 numbers.
>>>>>>
>>>>>> All this is of course experimental and not well tested, so any
>>>>>> feedback and experiences are very welcome. I'd be very interested
>>>>>> in your opinion on the dcmt library and applications in parallel monte carlo.
>>>>>>
>>>>>> Peter
>>>>>>
>>>>>>> On 20 October 2013 16:01, Joseph Wang <[hidden email]> wrote:
>>>>>>> I've done some more parallelization with openmp and quantlib.
>>>>>>> I've uploaded the changes to the
>>>>>>> https://github.com/joequant/quantlib.  The branch openmp has some changes that I've issued a pull-request for.
>>>>>>> openmp-mcario has some changes that need some more work.
>>>>>>>
>>>>>>> I've gotten the MC to work by generating the paths in a critical
>>>>>>> section.
>>>>>>> Calculating the prices once I have the path is multithreaded,
>>>>>>> but right now I need to generate the paths in a single thread to
>>>>>>> make sure that the same sequence is generated.
>>>>>>>
>>>>>>> The big issue right now is that there is a race condition in the
>>>>>>> calculation of barrier options which is causing one regression
>>>>>>> test to fail.  The problem is that the random number generator
>>>>>>> is being called in BarrierPathPricer, and since that is run
>>>>>>> multithreaded, the sequence that is being pulled will change from
>>>>>>> run to run based on whether other paths have pulled random numbers already.
>>>>>>>
>>>>>>> I think that fixing this is going to need some code
>>>>>>> restructuring, but I'd like to get some thoughts as to how to do
>>>>>>> this.  Basically, the interface needs to be changed slightly so
>>>>>>> that the random numbers are drawn in a fixed order, and that
>>>>>>> might mean one call to get any additional random numbers in a
>>>>>>> pricer, which gets called in a critical section, and another to run the pricer with the random numbers.
>>>>>>>
>>>>>>>
>>>>>>> QuantLib-dev mailing list
>>>>>>> [hidden email]
>>>>>>> https://lists.sourceforge.net/lists/listinfo/quantlib-dev
>>>>>>
>
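
The oldest message in this thread describes a race that appears when several threads pull numbers from one shared generator, and the dcmt wrapper is proposed as a way to give every thread its own generator. The sketch below illustrates that idea only; it is not code from either pull request. perStreamMean is a hypothetical name, and std::mt19937 with one seed per stream stands in for the precomputed dcmt instances. Distinct seeds alone do not give the independence that distinct dcmt parameter sets are meant to give.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Averages nStreams * pathsPerStream uniform draws; both must be positive.
// Each stream owns its generator and writes only to its own slot, so no
// generator is shared between threads and the result does not depend on
// how the threads are scheduled.
inline double perStreamMean(int nStreams, std::size_t pathsPerStream,
                            unsigned baseSeed) {
    std::vector<double> partial(nStreams, 0.0);

    // If OpenMP is not enabled the pragma is ignored and the loop runs
    // serially, producing the same numbers.
    #pragma omp parallel for
    for (int s = 0; s < nStreams; ++s) {
        // Stand-in for a per-stream dcmt instance (assumption, see above).
        std::mt19937 gen(baseSeed + static_cast<unsigned>(s));
        std::uniform_real_distribution<double> u(0.0, 1.0);
        double sum = 0.0;
        for (std::size_t i = 0; i < pathsPerStream; ++i)
            sum += u(gen);
        partial[s] = sum;
    }

    // Combine in a fixed order so that rounding is reproducible too.
    double total = 0.0;
    for (int s = 0; s < nStreams; ++s)
        total += partial[s];
    return total / (static_cast<double>(nStreams) * pathsPerStream);
}
```

Two calls with the same arguments return the same value regardless of the number of threads. The numbers drawn differ from those of a single generator used in one thread, so regression values computed with the single-threaded sequence would not carry over unchanged.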


