incremental statistics

incremental statistics

Peter Caspers-4
Hi,

here

https://github.com/lballabio/quantlib/blob/master/QuantLib/ql/math/statistics/incrementalstatistics.cpp#L56

is a check for a negative variance estimate. Indeed, I sometimes observe
this happening due to numerical issues. However, if this is the only way
the exception can be thrown, couldn't we just omit the check and return
the value, or max(v, 0.0) if you prefer?

In applications it is somewhat unexpected to get an exception when
just asking for a variance estimate on valid data.

Thank you
Peter

------------------------------------------------------------------------------
_______________________________________________
QuantLib-dev mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quantlib-dev

Re: incremental statistics

Peter Caspers-4
... another idea would be to remove the core from
IncrementalStatistics and replace it with Boost accumulators (they have
been available since Boost 1.36, so that should be OK), just leaving the
interface in place. Shall I do that?
Peter

On 20 August 2015 at 17:56, Peter Caspers <[hidden email]> wrote:

> Hi,
>
> here
>
> https://github.com/lballabio/quantlib/blob/master/QuantLib/ql/math/statistics/incrementalstatistics.cpp#L56
>
> is a check for a negative variance estimate. Indeed, I sometimes observe
> this happening due to numerical issues. However, if this is the only way
> the exception can be thrown, couldn't we just omit the check and return
> the value, or max(v, 0.0) if you prefer?
>
> In applications it is somewhat unexpected to get an exception when
> just asking for a variance estimate on valid data.
>
> Thank you
> Peter


Re: incremental statistics

Luigi Ballabio
Do they have the same behavior? (That is, keep the statistics but discard the data?) If so, yes, it would probably make the code simpler.

Luigi

On Thu, Aug 27, 2015 at 9:17 AM Peter Caspers <[hidden email]> wrote:
... another idea would be to remove the core from
IncrementalStatistics and replace it with Boost accumulators (they have
been available since Boost 1.36, so that should be OK), just leaving the
interface in place. Shall I do that?
Peter


Re: incremental statistics

Peter Caspers-4
In general this will depend on the specific accumulator, but I would
assume so for the ones we need here. The documentation isn't overly
explicit about that; the only hint seems to be

"This works, but some accumulators are not cheap to copy. For example,
the tail and tail_variate<> accumulators must store a std::vector<>, so
copying these accumulators involves a dynamic allocation."

I will test the memory usage though, to be sure.

Peter


On 27 August 2015 at 09:58, Luigi Ballabio <[hidden email]> wrote:

> Do they have the same behavior? (That is, keep the statistics but discard
> the data?) If so, yes, it would probably make the code simpler.
>
> Luigi

Re: incremental statistics

Luigi Ballabio
Ok, thanks. (The reason I asked is that the small memory footprint is the very reason IncrementalStatistics is there in the first place. Otherwise, one would just use GeneralStatistics instead.)


On Thu, Aug 27, 2015 at 10:45 AM Peter Caspers <[hidden email]> wrote:
In general this will depend on the specific accumulator, but I would
assume so for the ones we need here. The documentation isn't overly
explicit about that; the only hint seems to be

"This works, but some accumulators are not cheap to copy. For example,
the tail and tail_variate<> accumulators must store a std::vector<>, so
copying these accumulators involves a dynamic allocation."

I will test the memory usage though, to be sure.

Peter



Re: incremental statistics

Ferdinando M. Ametrano-2
I'm all in favor of replacing the implementation of the QL statistics classes with Boost, then deprecating the QL interfaces and switching to Boost altogether. We might just have some finance-related risk measures to keep. It has always been a dream pet project of mine... if only I had time...

On Thu, Aug 27, 2015 at 10:48 AM, Luigi Ballabio <[hidden email]> wrote:
Ok, thanks. (The reason I asked is that the small memory footprint is the very reason IncrementalStatistics is there in the first place. Otherwise, one would just use GeneralStatistics instead.)



Re: incremental statistics

Peter Caspers-4
In reply to this post by Luigi Ballabio
Alright, I plugged the Boost accumulators into the QL class:

Memory footprint:
I added the first 10,000 integers to the original class (and extracted
mean, variance, skewness and kurtosis, to be sure later on that Boost is
actually doing something) and recorded the heap usage with Massif. The
peak memory usage is 6 KB. In comparison, for GeneralStatistics I
get 390 KB. With the boostified version it is again 6 KB, so the
footprint looks the same for both implementations.

Performance:
Doing the same with the first 1,000,000 integers, the original class
runs in 25 ms, while the new one takes 35 ms, so it is a bit slower. I
guess this is acceptable though.

Numerical stability:
This seems to be better in Boost, in particular for the variance
estimate. Generating 500,000 random numbers drawn from a normal
distribution with mean 1E8 and variance 1E-4, I get variance estimates
of 9.9918E-5 in Boost and 0 in QL. With mean 1E7 and variance
1E-6 it is 9.9918E-7 in Boost and 0.6719 in QL. And with mean 1E8 and
variance 1E-2, Boost says 0.00999 and QL 348.0.

To me it looks as if Boost wins on balance, mostly because of the
better numerical stability. I will send a PR after adding a few test
cases that check the new implementation against the old one, and some
cases that demonstrate the better stability.

Peter


On 27 August 2015 at 10:48, Luigi Ballabio <[hidden email]> wrote:

> Ok, thanks. (The reason I asked is that the small memory footprint is the
> very reason IncrementalStatistics is there in the first place. Otherwise,
> one would just use GeneralStatistics instead.)

Re: incremental statistics

Peter Caspers-4
In reply to this post by Ferdinando M. Ametrano-2
Yes sure, I kept the interface as is and just replaced the
implementation for now. Since the member variables are protected this
is not 100% backward compatible because classes might derive from
IncrementalStatistics using these variables. But this is not done in
QuantLib itself, so I would say we just ignore this.

On 28 August 2015 at 20:42, Ferdinando M. Ametrano
<[hidden email]> wrote:

> I'm all in favor of replacing the implementation of the QL statistics
> classes with Boost, then deprecating the QL interfaces and switching to
> Boost altogether. We might just have some finance-related risk measures
> to keep. It has always been a dream pet project of mine... if only I had
> time...

Re: incremental statistics

Luigi Ballabio
I suppose there's a Boost accumulator that can replace GeneralStatistics, too? (That is, one that stores and can return the whole set of samples?)

As for switching: I'm all for it, but the switch itself is going to need some thinking if we're to let the two interfaces coexist for a release or two. MonteCarloModel<MC, RNG, Statistics> and MonteCarloModel<MC, RNG, Boost.Whatever> should both work. I guess we could use traits or enable_if underneath...
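
One way the coexistence could work is a small dispatch helper selected with enable_if. This is only a sketch under assumptions: the trait has_add, the function addSample, and the two stand-in types QlLikeStats and BoostLikeAcc are hypothetical, and the real MonteCarloModel is not shown.

```cpp
#include <type_traits>
#include <utility>

// Trait: true if S has a member function callable as s.add(double).
template <class S, class = void>
struct has_add : std::false_type {};

template <class S>
struct has_add<S, decltype(void(std::declval<S&>().add(0.0)))>
    : std::true_type {};

// Chosen for QuantLib-style statistics: s.add(x).
template <class S>
typename std::enable_if<has_add<S>::value>::type
addSample(S& s, double x) {
    s.add(x);
}

// Chosen for accumulator-style objects: s(x).
template <class S>
typename std::enable_if<!has_add<S>::value>::type
addSample(S& s, double x) {
    s(x);
}

// Hypothetical stand-ins for the two interface styles.
struct QlLikeStats {
    double sum = 0.0;
    void add(double x) { sum += x; }
};

struct BoostLikeAcc {
    double sum = 0.0;
    void operator()(double x) { sum += x; }
};
```

Client code would then call addSample(stats, value) and never needs to know which of the two styles it was instantiated with.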

(The second leg of this would be to give our random-number generators the same interface as the ones in Boost and the C++11 standard so that we can eventually replace those, too. Performance would be more of a sensitive issue in this case, though...)
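
As a sketch of what "the same interface as the C++11 standard" means in practice: the standard distributions only need result_type, min(), max() and operator() from a generator. The adapter below supplies those around a generator that exposes a nextInt32() method. LegacyRng is a made-up toy generator for illustration and StdEngineAdapter is a hypothetical name; neither is part of QuantLib.

```cpp
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical legacy-style generator; the LCG is for illustration only
// and is not suitable for real simulations.
struct LegacyRng {
    std::uint32_t state;
    explicit LegacyRng(std::uint32_t seed) : state(seed) {}
    std::uint32_t nextInt32() {
        state = 1664525u * state + 1013904223u;
        return state;
    }
};

// Adapter exposing the members the C++11 distributions require.
template <class Rng>
class StdEngineAdapter {
  public:
    typedef std::uint32_t result_type;

    explicit StdEngineAdapter(Rng rng) : rng_(rng) {}

    static constexpr result_type min() { return 0; }
    static constexpr result_type max() {
        return std::numeric_limits<result_type>::max();
    }

    result_type operator()() { return rng_.nextInt32(); }

  private:
    Rng rng_;
};
```

With this in place, an object of type StdEngineAdapter<LegacyRng> can be passed to std::uniform_real_distribution or std::normal_distribution like any standard engine. The extra indirection is one place where the performance concern mentioned above would need measuring.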

Luigi


On Fri, Aug 28, 2015 at 9:30 PM Peter Caspers <[hidden email]> wrote:
Yes sure, I kept the interface as is and just replaced the
implementation for now. Since the member variables are protected this
is not 100% backward compatible because classes might derive from
IncrementalStatistics using these variables. But this is not done in
QuantLib itself, so I would say we just ignore this.


Re: incremental statistics

Ferdinando M. Ametrano-2

Statistics, random numbers, and distributions! I might even suggest that stats and distributions should be boostified at the same time.

Wouldn't this be a good student assignment? Too bad Google Summer of Code is just over...

On Aug 29, 2015 2:34 PM, "Luigi Ballabio" <[hidden email]> wrote:
I suppose there's a Boost accumulator that can replace GeneralStatistics, too? (That is, one that stores and can return the whole set of samples?)

As for switching: I'm all for it, but the switch itself is going to need some thinking if we're to let the two interfaces coexist for a release or two. MonteCarloModel<MC, RNG, Statistics> and MonteCarloModel<MC, RNG, Boost.Whatever> should both work. I guess we could use traits or enable_if underneath...

(The second leg of this would be to give our random-number generators the same interface as the ones in Boost and the C++11 standard so that we can eventually replace those, too. Performance would be more of a sensitive issue in this case, though...)

Luigi



Re: incremental statistics

Peter Caspers-4
You can implement your own accumulators (I am not aware of one that works
like GeneralStatistics), but I am not sure how returning the data
would fit into the general concept. Maybe in the end it is better to keep
the two tasks separate: storing the data, sorting it, returning values
etc. would be the responsibility of STL containers, and the statistical
computations that of accumulators?

Also, we have to keep in mind that the standard Boost accumulators
return different flavors of estimates compared to QuantLib
(variance, skewness, kurtosis, error estimate). The right way would
probably be to provide our own (Boost-based) accumulators for those,
either as an extension provided within QuantLib or as a direct
contribution to Boost. I just placed appropriate factors in the
wrapper for now.
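
To illustrate what such a factor looks like for the variance: an estimate that divides by n can be turned into one that divides by n - 1 by multiplying with n / (n - 1). The function names below are hypothetical, and the error estimate formula (square root of the sample variance over n) is stated as an assumption about the convention wanted, not as a quote from either library.

```cpp
#include <cmath>
#include <cstddef>

// Convert a population variance (divides by n) into a sample variance
// (divides by n - 1). Requires n >= 2.
inline double sampleVarianceFromPopulation(double populationVariance,
                                           std::size_t n) {
    return populationVariance * static_cast<double>(n) /
           static_cast<double>(n - 1);
}

// Standard error of the mean, assuming it is defined as
// sqrt(sample variance / n). Requires n >= 1.
inline double errorEstimate(double sampleVariance, std::size_t n) {
    return std::sqrt(sampleVariance / static_cast<double>(n));
}
```

Skewness and kurtosis need analogous, but more involved, bias-correction factors, which are not shown here.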

Peter



On 29 August 2015 at 14:52, Ferdinando M. Ametrano
<[hidden email]> wrote:

> Statistics, random numbers, and distributions! I might even suggest that
> stats and distributions should be boostified at the same time.
>
> Wouldn't this be a good student assignment? Too bad Google Summer of Code
> is just over...
>
> On Aug 29, 2015 2:34 PM, "Luigi Ballabio" <[hidden email]> wrote:
>>
>> I suppose there's a Boost accumulator that can replace GeneralStatistics,
>> too? (That is, one that stores and can return the whole set of samples?)
>>
>> As for switching: I'm all for it, but the switch itself is going to need
>> some thinking if we're to let the two interfaces coexist for a release or
>> two. MonteCarloModel<MC, RNG, Statistics> and MonteCarloModel<MC, RNG,
>> Boost.Whatever> should both work. I guess we could use traits or enable_if
>> underneath...
>>
>> (The second leg of this would be to give our random-number generators the
>> same interface as the ones in Boost and the C++11 standard so that we can
>> eventually replace those, too. Performance would be more of a sensitive
>> issue in this case, though...)
>>
>> Luigi
>>
>>
>> On Fri, Aug 28, 2015 at 9:30 PM Peter Caspers <[hidden email]> wrote:
>>>
>>> Yes sure, I kept the interface as is and just replaced the
>>> implementation for now. Since the member variables are protected this
>>> is not 100% backward compatible because classes might derive from
>>> IncrementalStatistics using these variables. But this is not done in
>>> QuantLib itself, so I would say we just ignore this.
>>>
>>> On 28 August 2015 at 20:42, Ferdinando M. Ametrano <[hidden email]> wrote:
>>> > I'm all in favor of replacing the implementation of QL statistics classes
>>> > with boost. Then deprecate QL interfaces and switch to boost altogether. We
>>> > might just have some finance related risk measures to keep. It has always
>>> > been a dream pet project of mine... if I only had time...
>>> >
>>> > On Thu, Aug 27, 2015 at 10:48 AM, Luigi Ballabio <[hidden email]> wrote:
>>> >>
>>> >> Ok, thanks. (The reason I asked is that the small memory footprint is the
>>> >> very reason IncrementalStatistics is there in the first place. Otherwise,
>>> >> one would just use GeneralStatistics instead.)
>>> >>
>>> >>
>>> >> On Thu, Aug 27, 2015 at 10:45 AM Peter Caspers <[hidden email]> wrote:
>>> >>>
>>> >>> This will in general depend on the specific accumulator, but I would
>>> >>> assume so for the ones we need here. The documentation isn't overly
>>> >>> explicit about that, the only hint seems to be
>>> >>>
>>> >>> "This works, but some accumulators are not cheap to copy. For example,
>>> >>> the tail and tail_variate<> accumulators must store a std::vector<>, so
>>> >>> copying these accumulators involves a dynamic allocation."
>>> >>>
>>> >>> I will test the memory usage though, to be sure.
>>> >>>
>>> >>> Peter
>>> >>>
>>> >>>
>>> >>> On 27 August 2015 at 09:58, Luigi Ballabio <[hidden email]> wrote:
>>> >>> > Do they have the same behavior? (That is, keep the statistics but discard
>>> >>> > the data?) If so, yes, it would probably make the code simpler.
>>> >>> >
>>> >>> > Luigi
>>> >>> >
>>> >>> > On Thu, Aug 27, 2015 at 9:17 AM Peter Caspers <[hidden email]> wrote:
>>> >>> >>
>>> >>> >> ... another idea would be to remove the core from
>>> >>> >> IncrementalStatistics and replace it with boost accumulators (they are
>>> >>> >> present since 1.36, so it should be ok), just leaving the interface in
>>> >>> >> place. Shall I do that ?
>>> >>> >> Peter
>>> >>> >>
>>> >>> >> On 20 August 2015 at 17:56, Peter Caspers <[hidden email]> wrote:
>>> >>> >> > Hi,
>>> >>> >> >
>>> >>> >> > here
>>> >>> >> >
>>> >>> >> > https://github.com/lballabio/quantlib/blob/master/QuantLib/ql/math/statistics/incrementalstatistics.cpp#L56
>>> >>> >> >
>>> >>> >> > is a check for a negative variance estimation. Indeed I observe this
>>> >>> >> > happens due to numerical issues sometimes. However, if this is the
>>> >>> >> > only source for the exception to be thrown, couldn't we just omit the
>>> >>> >> > check and return the value or if you want max ( v, 0.0 ) ?
>>> >>> >> >
>>> >>> >> > In applications it is somewhat unexpected to get an exception when
>>> >>> >> > just asking for a variance estimation on valid data.
>>> >>> >> >
>>> >>> >> > Thank you
>>> >>> >> > Peter