
Re: incremental statistics

Posted by Peter Caspers-4 on Aug 28, 2015; 6:54pm
URL: http://quantlib.414.s1.nabble.com/incremental-statistics-tp16828p16860.html

Alright, I plugged the Boost accumulators into the QuantLib class:

Memory footprint:
I added the first 10,000 integers to the original class (and extracted
mean, variance, skewness and kurtosis, to be sure later on that Boost is
actually doing something) and recorded the heap usage with Massif. The
peak memory usage is 6 KB. For comparison, GeneralStatistics gives
390 KB. With the boostified version it is again 6 KB, so both
implementations of IncrementalStatistics have the same footprint.
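
To illustrate why the footprint differs so much, here is a minimal
sketch (not QuantLib or Boost code). StoringStats and RunningStats are
hypothetical names: the first keeps every (value, weight) pair, as a
class that retains the data has to, and the second keeps only two
running sums.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a statistics class that retains all samples.
struct StoringStats {
    std::vector<std::pair<double, double>> samples;  // (value, weight)

    void add(double x, double w = 1.0) { samples.emplace_back(x, w); }

    double mean() const {
        assert(!samples.empty());
        double weightSum = 0.0, weightedSum = 0.0;
        for (const auto& s : samples) {
            weightedSum += s.first * s.second;
            weightSum += s.second;
        }
        return weightedSum / weightSum;
    }

    // Bytes of sample data held; grows linearly with the sample count.
    std::size_t payloadBytes() const {
        return samples.size() * sizeof(std::pair<double, double>);
    }
};

// Hypothetical stand-in for an incremental class: the data is discarded
// and only the running sums needed for the statistic are kept.
struct RunningStats {
    double weightSum = 0.0;
    double weightedSum = 0.0;

    void add(double x, double w = 1.0) {
        weightedSum += x * w;
        weightSum += w;
    }

    double mean() const {
        assert(weightSum > 0.0);
        return weightedSum / weightSum;
    }

    // Fixed size, independent of how many samples were added.
    std::size_t payloadBytes() const { return sizeof(*this); }
};
```

Feeding the integers 1 to 10,000 into both gives the same mean, but the
first holds 10,000 pairs while the second stays at two doubles
regardless of the sample count.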

Performance:
Doing the same with the first 1,000,000 integers, the original class
takes 25 ms, while the new one takes 35 ms, so it is a bit slower. I
guess this is acceptable though.

Numerical stability:
Seems to be better in Boost, in particular for the variance estimate.
Generating 500,000 random numbers drawn from a normal distribution
with mean 1E8 and variance 1E-4, I get variance estimates of 9.9918E-5
in Boost and 0 in ql. With mean 1E7 and variance 1E-6 it is 9.9918E-7
in Boost and 0.6719 in ql. Or with mean 1E8 and variance 1E-2, Boost
says 0.00999 and ql 348.0.
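
The kind of failure seen above is typical of catastrophic cancellation
when the mean is large relative to the standard deviation. Below is a
minimal sketch, assuming a textbook one-pass sum-of-squares formula on
one side and Welford's update on the other. Both functions are
illustrative only; the thread does not say which formulas QuantLib or
Boost use internally, and nothing here is taken from their sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Textbook one-pass formula: var = (sum(x^2) - n * mean^2) / (n - 1).
// When |mean| is much larger than the standard deviation, the two terms
// are nearly equal and the subtraction cancels most significant digits.
double naiveVariance(const std::vector<double>& xs) {
    assert(xs.size() > 1);
    double sum = 0.0, sumSq = 0.0;
    for (double x : xs) {
        sum += x;
        sumSq += x * x;
    }
    const double n = static_cast<double>(xs.size());
    const double mean = sum / n;
    return (sumSq - n * mean * mean) / (n - 1.0);
}

// Welford's update: keeps a running mean and the running sum of squared
// deviations from it, so no large, nearly equal quantities are subtracted.
double welfordVariance(const std::vector<double>& xs) {
    assert(xs.size() > 1);
    double mean = 0.0, m2 = 0.0;
    std::size_t n = 0;
    for (double x : xs) {
        ++n;
        const double delta = x - mean;           // deviation from old mean
        mean += delta / static_cast<double>(n);  // updated running mean
        m2 += delta * (x - mean);                // uses old and new mean
    }
    return m2 / static_cast<double>(n - 1);
}
```

For the four samples 1e9 + 4, 1e9 + 7, 1e9 + 13 and 1e9 + 16 the exact
sample variance is 30. In double precision the Welford version
reproduces it, while the sum-of-squares version lands far away from 30
and can even come out negative, which is the situation the check
discussed at the start of this thread guards against.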

To me it looks as if Boost wins on balance, mostly because of the
better numerical stability. I will send a PR after adding a few test
cases that check the new implementation against the old one, and some
cases that demonstrate the better stability.

Peter


On 27 August 2015 at 10:48, Luigi Ballabio <[hidden email]> wrote:

> Ok, thanks. (The reason I asked is that the small memory footprint is the
> very reason IncrementalStatistics is there in the first place. Otherwise,
> one would just use GeneralStatistics instead.)
>
>
> On Thu, Aug 27, 2015 at 10:45 AM Peter Caspers <[hidden email]>
> wrote:
>>
>> This will in general depend on the specific accumulator, but I would
>> assume so for the ones we need here. The documentation isn't overly
>> explicit about that, the only hint seems to be
>>
>> "This works, but some accumulators are not cheap to copy. For example,
>> the tail and tail_variate<> accumulators must store a std::vector<>, so
>> copying these accumulators involves a dynamic allocation."
>>
>> I will test the memory usage though, to be sure.
>>
>> Peter
>>
>>
>> On 27 August 2015 at 09:58, Luigi Ballabio <[hidden email]>
>> wrote:
>> > Do they have the same behavior? (That is, keep the statistics but
>> > discard the data?) If so, yes, it would probably make the code simpler.
>> >
>> > Luigi
>> >
>> > On Thu, Aug 27, 2015 at 9:17 AM Peter Caspers <[hidden email]>
>> > wrote:
>> >>
>> >> ... another idea would be to remove the core from
>> >> IncrementalStatistics and replace it with boost accumulators (they are
>> >> present since 1.36, so it should be ok), just leaving the interface in
>> >> place. Shall I do that?
>> >> Peter
>> >>
>> >> On 20 August 2015 at 17:56, Peter Caspers <[hidden email]>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > here
>> >> >
>> >> > https://github.com/lballabio/quantlib/blob/master/QuantLib/ql/math/statistics/incrementalstatistics.cpp#L56
>> >> >
>> >> > is a check for a negative variance estimate. Indeed, I observe that
>> >> > this sometimes happens due to numerical issues. However, if this is
>> >> > the only reason for the exception to be thrown, couldn't we just omit
>> >> > the check and return the value or, if you prefer, max(v, 0.0)?
>> >> >
>> >> > In applications it is somewhat unexpected to get an exception when
>> >> > just asking for a variance estimate on valid data.
>> >> >
>> >> > Thank you
>> >> > Peter
>> >>
>> >>
>> >>
>> >> ------------------------------------------------------------------------------
>> >> _______________________________________________
>> >> QuantLib-dev mailing list
>> >> [hidden email]
>> >> https://lists.sourceforge.net/lists/listinfo/quantlib-dev
>> >
>> > --
>> >
>> > <http://leanpub.com/implementingquantlib/>
>> > <http://implementingquantlib.com>
>> > <http://twitter.com/lballabio>
