Hi,

here

https://github.com/lballabio/quantlib/blob/master/QuantLib/ql/math/statistics/incrementalstatistics.cpp#L56

is a check for a negative variance estimate. Indeed, I observe that this sometimes happens due to numerical issues. However, if this is the only source for the exception to be thrown, couldn't we just omit the check and return the value or, if you want, max(v, 0.0)?

In applications it is somewhat unexpected to get an exception when just asking for a variance estimate on valid data.

Thank you
Peter

------------------------------------------------------------------------------
_______________________________________________
QuantLib-dev mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quantlib-dev

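To make the failure mode concrete, here is a minimal sketch of how a one-pass variance estimate can come out slightly negative, and what the proposed clamp would look like. It assumes the estimate is built from running sums of x and x^2; the function names `naiveVariance` and `clampedVariance` are hypothetical and this is not the code at the link above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not QuantLib's actual code: a one-pass
// sample variance built from the running sums of x and x^2.
double naiveVariance(const std::vector<double>& xs) {
    assert(xs.size() > 1);
    double sum = 0.0, sumSq = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        sum += xs[i];
        sumSq += xs[i] * xs[i];
    }
    const double n = static_cast<double>(xs.size());
    const double mean = sum / n;
    // Two nearly equal large numbers are subtracted here, so rounding
    // error can leave a small negative result although the true
    // variance is never negative.
    return (sumSq / n - mean * mean) * n / (n - 1.0);
}

// The alternative proposed in the message: clamp instead of throwing.
double clampedVariance(const std::vector<double>& xs) {
    return std::max(naiveVariance(xs), 0.0);
}
```

Note that clamping only hides the sign; the estimate itself can still be far off when the mean is large compared with the spread of the data.
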
... another idea would be to remove the core from IncrementalStatistics and replace it with Boost accumulators (they have been available since 1.36, so it should be OK), just leaving the interface in place. Shall I do that?

Peter

On 20 August 2015 at 17:56, Peter Caspers <[hidden email]> wrote:
> Hi,
>
> here
>
> https://github.com/lballabio/quantlib/blob/master/QuantLib/ql/math/statistics/incrementalstatistics.cpp#L56
>
> is a check for a negative variance estimation. Indeed I observe this
> happens due to numerical issues sometimes. However, if this is the
> only source for the exception to be thrown, couldn't we just omit the
> check and return the value or if you want max ( v, 0.0 ) ?
>
> In applications it is somewhat unexpected to get an exception when
> just asking for a variance estimation on valid data.
>
> Thank you
> Peter

Do they have the same behavior? (That is, keep the statistics but discard the data?) If so, yes, it would probably make the code simpler.

Luigi

On Thu, Aug 27, 2015 at 9:17 AM Peter Caspers <[hidden email]> wrote:
> ... another idea would be to remove the core from

--
<http://leanpub.com/implementingquantlib/>

This will in general depend on the specific accumulator, but I would assume so for the ones we need here. The documentation isn't overly explicit about that; the only hint seems to be

"This works, but some accumulators are not cheap to copy. For example, the tail and tail_variate<> accumulators must store a std::vector<>, so copying these accumulators involves a dynamic allocation."

I will test the memory usage though, to be sure.

Peter

On 27 August 2015 at 09:58, Luigi Ballabio <[hidden email]> wrote:
> Do they have the same behavior? (That is, keep the statistics but discard
> the data?) If so, yes, it would probably make the code simpler.
>
> Luigi

Ok, thanks. (The reason I asked is that the small memory footprint is the very reason IncrementalStatistics is there in the first place. Otherwise, one would just use GeneralStatistics instead.)

On Thu, Aug 27, 2015 at 10:45 AM Peter Caspers <[hidden email]> wrote:
> This will in general depend on the specific accumulator, but I would

I'm all in favor of replacing the implementation of the QL statistics classes with Boost, then deprecating the QL interfaces and switching to Boost altogether. We might just have some finance-related risk measures to keep. It has always been a dream pet project of mine... if I only had time...

On Thu, Aug 27, 2015 at 10:48 AM, Luigi Ballabio <[hidden email]> wrote:

In reply to this post by Luigi Ballabio
Alright, I plugged the Boost accumulators into the QL class:

Memory footprint: I added the first 10,000 integers to the original class (and extracted mean, variance, skewness and kurtosis, to be sure later on that Boost is actually doing something) and recorded the heap usage with Massif. The peak memory usage is 6 KB. In comparison, for GeneralStatistics I get 390 KB. With the boostified version it is again 6 KB. This looks OK, i.e. the same for both implementations.

Performance: Doing the same with the first 1,000,000 integers, the original class runs in 25 ms, while the new one takes 35 ms, so it is a bit slower. I guess this is acceptable though.

Numerical stability: This seems to be better in Boost, in particular for the variance estimation. Generating 500,000 random numbers drawn from a normal distribution with mean 1E8 and variance 1E-4, I get variance estimates of 9.9918E-5 in Boost and 0 in QL. With mean 1E7 and variance 1E-6 it is 9.9918E-7 in Boost and 0.6719 in QL. Or with mean 1E8 and variance 1E-2, Boost says 0.00999 and QL 348.0.

To me it looks as if Boost wins on balance, mostly because of the better numerical stability. I will send a PR after adding a few test cases which test the new implementation against the old one, and some cases which demonstrate the better stability.

Peter

On 27 August 2015 at 10:48, Luigi Ballabio <[hidden email]> wrote:
> Ok, thanks. (The reason I asked is that the small memory footprint is the
> very reason IncrementalStatistics is there in the first place. Otherwise,
> one would just use GeneralStatistics instead.)

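For reference, a one-pass accumulator can stay accurate with large means while still storing only a handful of numbers. The sketch below uses Welford's update, a standard technique; it is a hypothetical stand-in (the class name `WelfordAccumulator` is made up here), and the thread does not say which formula Boost or the original QuantLib class actually uses internally.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in, neither the Boost nor the QuantLib code:
// Welford's one-pass update keeps only the sample count, the running
// mean and the running sum of squared deviations from the mean.
class WelfordAccumulator {
  public:
    WelfordAccumulator() : n_(0), mean_(0.0), m2_(0.0) {}

    void add(double x) {
        ++n_;
        const double delta = x - mean_;       // deviation from the old mean
        mean_ += delta / static_cast<double>(n_);
        m2_ += delta * (x - mean_);           // uses the updated mean
    }

    std::size_t samples() const { return n_; }

    double mean() const {
        assert(n_ > 0);
        return mean_;
    }

    // Sample variance (divide by n - 1).
    double variance() const {
        assert(n_ > 1);
        return m2_ / static_cast<double>(n_ - 1);
    }

  private:
    std::size_t n_;
    double mean_, m2_;
};
```

Because the update only ever works with deviations from the current mean, it never subtracts two large, nearly equal quantities, and the memory use does not grow with the number of samples.
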
In reply to this post by Ferdinando M. Ametrano-2
Yes sure, I kept the interface as is and just replaced the implementation for now. Since the member variables are protected, this is not 100% backward compatible, because classes might derive from IncrementalStatistics and use these variables. But this is not done in QuantLib itself, so I would say we just ignore this.

On 28 August 2015 at 20:42, Ferdinando M. Ametrano <[hidden email]> wrote:
> I'm all in favor of replacing the implementation of QL statistics classes
> with boost. Then deprecate QL interfaces and switch to boost altogether.We
> might just have some finance related risk measures to keep. It has always
> been a dream pet project of mine... if i only had time...

I suppose there's a Boost accumulator that can replace GeneralStatistics, too? (That is, one that stores and can return the whole set of samples?)

As for switching: I'm all for it, but the switch itself is going to need some thinking if we're to let the two interfaces coexist for a release or two. MonteCarloModel<MC, RNG, Statistics> and MonteCarloModel<MC, RNG, Boost.Whatever> should both work. I guess we could use traits or enable_if underneath...

(The second leg of this would be to give our random-number generators the same interface as the ones in Boost and the C++11 standard so that we can eventually replace those, too. Performance would be more of a sensitive issue in this case, though...)

Luigi

On Fri, Aug 28, 2015 at 9:30 PM Peter Caspers <[hidden email]> wrote:
> Yes sure, I kept the interface as is and just replaced the

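One possible shape for letting both interfaces coexist is a small dispatch helper that feeds a sample to whichever kind of statistics object it is given. The sketch below assumes C++11 and uses expression SFINAE via decltype, a close relative of enable_if; the types `QlStyleStats` and `CallableStats` and the helper `addSample` are hypothetical and only mimic the two calling conventions (an add() member versus a function-call operator).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical QuantLib-style statistics: samples go in through add().
struct QlStyleStats {
    std::size_t n;
    double sum;
    QlStyleStats() : n(0), sum(0.0) {}
    void add(double x) { ++n; sum += x; }
};

// Hypothetical Boost-style accumulator: samples go in through operator().
struct CallableStats {
    std::size_t n;
    double sum;
    CallableStats() : n(0), sum(0.0) {}
    void operator()(double x) { ++n; sum += x; }
};

// Viable only when s.add(x) is a valid expression. The int parameter
// makes this overload the preferred one when both are viable.
template <class S>
auto addSampleImpl(S& s, double x, int) -> decltype(s.add(x), void()) {
    s.add(x);
}

// Fallback, viable only when s(x) is a valid expression.
template <class S>
auto addSampleImpl(S& s, double x, long) -> decltype(s(x), void()) {
    s(x);
}

// Single entry point a model class could call for either kind of object.
template <class S>
void addSample(S& s, double x) {
    addSampleImpl(s, x, 0);
}
```

A model templated on the statistics type could then call addSample(stats, value) without knowing which convention the type follows. An explicit traits class would be an alternative that works without C++11.
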
Statistics, random numbers, and distributions! I might even suggest that stats and distributions should be boostified at the same time.

Wouldn't this be a good student assignment? Too bad Google Summer of Code is just over...

On Aug 29, 2015 2:34 PM, "Luigi Ballabio" <[hidden email]> wrote:

You can implement your own accumulators (I am not aware of one working like GeneralStatistics), but I am not sure how returning the data would fit into the general concept. Maybe it is better to keep the two tasks separate in the end: storing the data / sorting it / returning values ... etc. as the responsibility of STL containers, and the statistical computations as the responsibility of accumulators?

Also, we have to keep in mind that the standard Boost accumulators return different flavors of estimates compared to QuantLib (variance, skewness, kurtosis, error estimate). The right way would probably be to provide our own (Boost-based) accumulators for those, either as an extension provided within QuantLib or as a direct contribution to Boost. I just placed appropriate factors in the wrapper for now.

Peter

On 29 August 2015 at 14:52, Ferdinando M. Ametrano <[hidden email]> wrote:
> Statistics, random numbers, and distributions! I might even suggest that
> stats and distributions should be boostified at the same time.
>
> Wouldn't this be a good student assignement? Too bad Google summer of code
> is just over...

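As an illustration of what such factors can look like for the variance and the error estimate, here is a minimal sketch. It assumes that the accumulator reports the population variance (divide by n) while the QuantLib interface is expected to return the sample variance (divide by n - 1), and that the error estimate is the standard error of the mean. The thread does not spell out the actual factors, the function names are hypothetical, and skewness and kurtosis need their own corrections, which are not shown.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Rescale a population variance (divide by n) to a sample variance
// (divide by n - 1). Assumption: this is the kind of factor meant above.
double sampleVarianceFromPopulation(double populationVariance, std::size_t n) {
    assert(n > 1);
    const double N = static_cast<double>(n);
    return populationVariance * N / (N - 1.0);
}

// Standard error of the mean, computed from the sample variance.
double errorEstimate(double sampleVariance, std::size_t n) {
    assert(n > 0);
    return std::sqrt(sampleVariance / static_cast<double>(n));
}
```

For the four samples 1, 2, 3, 4 the population variance is 1.25, and the rescaled sample variance is 1.25 * 4 / 3 = 5/3.
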