Multiple linear regression/weighted regression


Boris Skorodumov
Dear QuantLib Users,

I am wondering whether QuantLib 0.9.7 can calculate multiple linear regression and weighted linear regression with several predictors.
I see there is <ql/math/linearleastsquaresregression.hpp>, which can fit something like y ~ f(x). Can I also fit something like y ~ b0 + b1*x1 + b2*x2 + ...? LinearLeastSquaresRegression seems to accept only one vector x and one vector y.

Also, for the case y ~ b0 + b1*x, the standard errors of the estimated coefficients differ from those reported by R, for example.

Here is an example:

 x          y
2.40    7.80
1.80    5.50
2.50    8.00
3.00    9.00
2.10    6.50
1.20    4.00
2.00    6.30
2.70    8.40
3.60    10.20

R will give:

           Estimate   Std Error
b0        0.9448     0.3654  
b1        2.6853     0.1487 


while LinearLeastSquaresRegression will give:

          Estimate   Std Error
b0        0.9448     1.2380  
b1        2.6853     0.5034 

I used the following code to test it:


#include <ql/quantlib.hpp>
#include <iostream>
#include <vector>

using namespace QuantLib;

int main()
{
    try
    {
        // sample data: nine (x, y) observations
        std::vector<double> x, y;
        x.push_back(2.4);
        x.push_back(1.8);
        x.push_back(2.5);
        x.push_back(3.0);
        x.push_back(2.1);
        x.push_back(1.2);
        x.push_back(2.0);
        x.push_back(2.7);
        x.push_back(3.6);

        y.push_back(7.8);
        y.push_back(5.5);
        y.push_back(8.0);
        y.push_back(9.0);
        y.push_back(6.5);
        y.push_back(4.0);
        y.push_back(6.3);
        y.push_back(8.4);
        y.push_back(10.2);

        // basis functions: 1 (intercept b0) and x (slope b1)
        std::vector<boost::function1<Real, Real> > v;
        v.push_back(constant<Real, Real>(1.0));
        v.push_back(identity<Real>());

        LinearLeastSquaresRegression<> m(x, y, v);

        // prints the error first, then the estimated coefficient
        for (Size i=0; i < v.size(); ++i)
            std::cout << m.error()[i] << "  " << m.a()[i] << std::endl;

        return 0;
    } catch (std::exception& e)
    {
        std::cout << e.what() << std::endl;
        return 1;
    } catch (...)
    {
        std::cout << "unknown error" << std::endl;
        return 1;
    }
}




Thank you in advance for suggestions,

Boris.

------------------------------------------------------------------------------
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
_______________________________________________
QuantLib-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quantlib-users

Re: Multiple linear regression/weighted regression

Klaus Spanderen-2
Hi Boris

On Wednesday 14 January 2009 19:20:58 Boris Skorodumov wrote:
> I see, there is a <ql/math/linearleastsquaresregression.hpp> which is
> capable to carry out something like y~f(x). I was wondering whether I can
> do something like y ~ b0 + b1*x1+b2*x2+.... LinearLeastSquaresRegression
> accepts only one vector x and one vector y.

You have to use objects of type LinearLeastSquaresRegression<Array>
or LinearLeastSquaresRegression<std::vector<Real> >. I've just uploaded a test
case for multi dimensional least squares regression and hope that illustrates
the usage. Please have a look at
http://quantlib.svn.sourceforge.net/viewvc/quantlib/trunk/QuantLib/test-suite/linearleastsquaresregression.cpp?revision=15855&view=markup
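
As a rough, library-independent illustration of the idea behind the multi-dimensional variant, the sketch below fits y ~ b0 + b1*x1 + b2*x2 by treating each observation as a vector and each regressor as a basis function of that vector. It solves the normal equations by Gaussian elimination, which is an assumption made for brevity and is not meant to describe QuantLib's implementation. The names `Point`, `Basis` and `fitLeastSquares` are hypothetical, and the code assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical names for illustration only; not part of QuantLib.
typedef std::vector<double> Point;                  // one observation (x1, x2, ...)
typedef std::function<double(const Point&)> Basis;  // one regressor f_j(x)

// Least-squares coefficients a_j for y ~ sum_j a_j * f_j(x),
// obtained from the normal equations (A^T A) a = A^T y.
std::vector<double> fitLeastSquares(const std::vector<Point>& x,
                                    const std::vector<double>& y,
                                    const std::vector<Basis>& basis) {
    assert(x.size() == y.size() && x.size() >= basis.size());
    const std::size_t n = x.size(), p = basis.size();

    // Augmented matrix [A^T A | A^T y], size p x (p+1).
    std::vector<std::vector<double> > m(p, std::vector<double>(p + 1, 0.0));
    for (std::size_t k = 0; k < n; ++k) {
        std::vector<double> row(p);
        for (std::size_t j = 0; j < p; ++j) row[j] = basis[j](x[k]);
        for (std::size_t i = 0; i < p; ++i) {
            for (std::size_t j = 0; j < p; ++j) m[i][j] += row[i] * row[j];
            m[i][p] += row[i] * y[k];
        }
    }

    // Gaussian elimination with partial pivoting.
    for (std::size_t c = 0; c < p; ++c) {
        std::size_t piv = c;
        for (std::size_t r = c + 1; r < p; ++r)
            if (std::fabs(m[r][c]) > std::fabs(m[piv][c])) piv = r;
        std::swap(m[c], m[piv]);
        assert(std::fabs(m[c][c]) > 1e-12);  // regressors must be independent
        for (std::size_t r = c + 1; r < p; ++r) {
            const double f = m[r][c] / m[c][c];
            for (std::size_t j = c; j <= p; ++j) m[r][j] -= f * m[c][j];
        }
    }

    // Back substitution on the upper-triangular system.
    std::vector<double> a(p);
    for (std::size_t i = p; i-- > 0;) {
        double s = m[i][p];
        for (std::size_t j = i + 1; j < p; ++j) s -= m[i][j] * a[j];
        a[i] = s / m[i][i];
    }
    return a;
}
```

To fit an intercept and two predictors, one would pass three basis functions: one returning 1.0, one returning the first component of the point, and one returning the second. The same pattern of "vector-valued x plus a list of functions of x" is what the templated regression class mentioned above expects.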

> Also for the case of y ~ b0 + b1*x, errors of estimated coefficients
> somewhat different from those in R for example.

IMO R "assumes a good fit": it takes the model y = b0 + b1*x + eps to be
correct and estimates the variance of eps from the residuals, whereas the
errors calculated by QuantLib follow the definition given in Numerical
Recipes §15.2 and equation 15.4.19 (but without individual weights or
\sigma_i for the design matrix of the fitting problem). You can get the
errors as calculated by R via

error(R) = \sqrt( \chi^2/(N-2)) * error(QL)

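The last lines of your example would then be

    // chiSq accumulates the reduced chi-square, i.e. the residual
    // variance estimate with N-2 degrees of freedom
    Real chiSq = 0.0;
    for (Size i=0; i < y.size(); ++i)
        chiSq += square<Real>()(y[i] - (m.a()[0] + m.a()[1]*x[i]))
                    /(x.size()-2);
    // rescale the QuantLib errors to match R's standard errors
    for (Size i=0; i < v.size(); ++i)
        std::cout << m.error()[i]*std::sqrt(chiSq)
                  << " " << m.a()[i] << std::endl;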



hope that helps
 Klaus
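
For readers who want to check the relation error(R) = \sqrt(\chi^2/(N-2)) * error(QL) without building QuantLib, here is a minimal standalone sketch for the one-predictor case y ~ b0 + b1*x. It uses the textbook closed-form expressions for simple linear regression. The struct `SimpleFit` and the function `fitLine` are hypothetical names, and the "unscaled" errors are only intended to mirror the unit-variance convention described above, not to reproduce QuantLib's code path.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not a QuantLib class.
struct SimpleFit {
    double b0, b1;                // intercept and slope
    double unscaled0, unscaled1;  // errors assuming unit variance per point
    double scaled0, scaled1;      // errors scaled by the residual std deviation
};

SimpleFit fitLine(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() > 2);
    const double n = static_cast<double>(x.size());

    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx += x[i];
        sy += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    const double d = sxx - sx * sx / n;  // sum of squared deviations of x

    SimpleFit f;
    f.b1 = (sxy - sx * sy / n) / d;
    f.b0 = (sy - f.b1 * sx) / n;

    // Square roots of the diagonal of (A^T A)^{-1} for the basis {1, x}.
    f.unscaled0 = std::sqrt(sxx / (n * d));
    f.unscaled1 = std::sqrt(1.0 / d);

    // Residual sum of squares and the variance estimate with N-2
    // degrees of freedom (two fitted coefficients).
    double rss = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double r = y[i] - (f.b0 + f.b1 * x[i]);
        rss += r * r;
    }
    const double s = std::sqrt(rss / (n - 2.0));

    f.scaled0 = s * f.unscaled0;
    f.scaled1 = s * f.unscaled1;
    return f;
}
```

Calling `fitLine` on the nine data points from the original post should give scaled errors that agree with the R column to the printed precision. With p fitted coefficients instead of two, the divisor N-2 becomes N-p, so for y ~ b0 + b1*x1 + b2*x2 it is N-3.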



Re: Multiple linear regression/weighted regression

Boris Skorodumov
Dear Klaus,

Sorry for the late reply, and thank you very much for the explanations.

> You have to use objects of type LinearLeastSquaresRegression<Array>
> or LinearLeastSquaresRegression<std::vector<Real> >. I've just uploaded a test
> case for multi dimensional least squares regression and hope that illustrates
> the usage. Please have a look at
> http://quantlib.svn.sourceforge.net/viewvc/quantlib/trunk/QuantLib/test-suite/linearleastsquaresregression.cpp?revision=15855&view=markup

I just tested it, and it works perfectly. I also ran a test for y ~ b0 + b1*x1 + b2*x2 + e, taking into account
your recommendation for calculating the standard errors the way R does, and the results match the R output.


Sincerely,
Boris.

On Thu, Jan 15, 2009 at 5:44 PM, Klaus Spanderen <[hidden email]> wrote:
> [...]

