g02daf
© Numerical Algorithms Group, 2002.
Purpose
G02DAF fits a general (multiple) linear regression model.
Synopsis
[rss,idf,b,se,cov,res,h,q,svd,irank,p,wk,ifail] = g02daf(x,isx,y<,wt,mean,...
tol,weight,ifail>)
Description
The general linear regression model is defined by:
    $y = X\beta + \epsilon$
where $y$ is a vector of $n$ observations on the dependent variable,
$X$ is an $n$ by $p$ matrix of the independent variables of
column rank $k$,
$\beta$ is a vector of length $p$ of unknown parameters,
and $\epsilon$ is a vector of length $n$ of unknown random errors
such that $\mathrm{var}(\epsilon) = V\sigma^2$, where $V$ is a known
diagonal matrix.
If $V = I$, the identity matrix, then least-squares estimation is
used. If $V \neq I$, then for a given weight matrix
$W \propto V^{-1}$, weighted least-squares estimation is used.
The least-squares estimates $\hat\beta$ of the parameters $\beta$
minimize $(y - X\beta)^T (y - X\beta)$, while the weighted
least-squares estimates minimize $(y - X\beta)^T W (y - X\beta)$.
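The two criteria above can be compared on a small example. The sketch below fits a straight line (an intercept plus one independent variable) by solving the 2 by 2 weighted normal equations directly. This is a textbook shortcut chosen for brevity, not the QR-based method that G02DAF uses, and the function name and data are illustrative assumptions.

```python
def weighted_line_fit(x, y, w=None):
    """Fit y = b0 + b1*x by minimising sum_i w_i*(y_i - b0 - b1*x_i)**2.

    Illustrative only: solves the normal equations in closed form,
    which is NOT how G02DAF computes its estimates.
    """
    n = len(y)
    if w is None:
        w = [1.0] * n                      # W = I: ordinary least squares
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx * swx            # determinant of X^T W X
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0, b1

x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 8.0]
b0, b1 = weighted_line_fit(x, y)                          # unweighted
b0w, b1w = weighted_line_fit(x, y, w=[1.0, 2.0, 3.0, 4.0])  # weighted
```

With these made-up data the larger weights on the later observations pull the fitted line towards them, so the weighted and unweighted estimates differ.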
G02DAF finds a QR decomposition of $X$ (or $W^{1/2}X$ in the
weighted case), i.e.,
    $X = QR^*$  (or $W^{1/2}X = QR^*$)
where $R^* = \begin{pmatrix} R \\ 0 \end{pmatrix}$, $R$ is a $p$ by
$p$ upper triangular matrix and $Q$ is an $n$ by $n$ orthogonal
matrix. If $R$ is of full rank, then $\hat\beta$ is the solution to:
    $R\hat\beta = c_1$
where $c = Q^T y$ (or $Q^T W^{1/2} y$) and $c_1$ is the first $p$
elements of $c$. If $R$ is not of full rank a solution is obtained
by means of a singular value decomposition (SVD) of $R$,
    $R = Q_* \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} P^T$,
where $D$ is a $k$ by $k$ diagonal matrix with non-zero diagonal
elements, $k$ being the rank of $R$, and $Q_*$ and $P$ are $p$ by
$p$ orthogonal matrices. This gives the solution
    $\hat\beta = P_1 D^{-1} Q_{*1}^T c_1$
$P_1$ being the first $k$ columns of $P$, i.e., $P = (P_1 \; P_0)$,
and $Q_{*1}$ being the first $k$ columns of $Q_*$.
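For the full-rank case, the QR route described above can be sketched in a few lines. The example below forms only the first p columns of Q by modified Gram-Schmidt, computes c_1 and back-substitutes in R beta = c_1. The orthogonalisation method, the function name and the exact-zero rank test are assumptions made for illustration. This section does not say how G02DAF forms Q, and a tolerance rather than an exact-zero test would normally be used to judge rank.

```python
def qr_least_squares(X, y):
    """Solve min ||y - X beta|| via a thin QR factorisation (full rank only)."""
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    Q = []                                   # first p orthonormal columns of Q
    R = [[0.0] * p for _ in range(p)]        # p by p upper triangular factor
    for j in range(p):
        v = cols[j][:]
        for k in range(j):                   # modified Gram-Schmidt step
            R[k][j] = sum(Q[k][i] * v[i] for i in range(n))
            v = [v[i] - R[k][j] * Q[k][i] for i in range(n)]
        R[j][j] = sum(t * t for t in v) ** 0.5
        if R[j][j] == 0.0:                   # crude check; a tolerance is usual
            raise ValueError("R is not of full rank; an SVD would be needed")
        Q.append([t / R[j][j] for t in v])
    # c_1: the first p elements of Q^T y.
    c1 = [sum(Q[j][i] * y[i] for i in range(n)) for j in range(p)]
    # Back substitution for the triangular system R beta = c_1.
    beta = [0.0] * p
    for j in range(p - 1, -1, -1):
        s = c1[j] - sum(R[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = s / R[j][j]
    return beta, R

X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]   # intercept column + x
y = [1.0, 3.0, 5.0, 8.0]
beta, R = qr_least_squares(X, y)
```

The rank-deficient case is deliberately left out: it needs the SVD of R described above, which is too long to reproduce in a short sketch.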
Details of the SVD are made available in the form of the matrix
$P^*$:
    $P^* = \begin{pmatrix} D^{-1} P_1^T \\ P_0^T \end{pmatrix}$.
When $R$ is not of full rank, the solution obtained is only one of
the possible solutions. Only certain linear combinations of the
parameters will have unique estimates; these are known as
estimable functions.
The fit of the model can be examined by considering the
residuals, $r_i = y_i - \hat{y}_i$, where $\hat{y} = X\hat\beta$ are
the fitted values. The fitted values can be written as $Hy$ for an
$n$ by $n$ matrix $H$. The $i$th diagonal element of $H$, $h_i$,
gives a measure of the influence of the $i$th values of the
independent variables on the fitted regression model. The values
$h_i$ are sometimes known as leverages.
Both $r_i$ and $h_i$ are provided by G02DAF.
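As a concrete illustration of residuals and leverages, the sketch below handles the special case of a straight-line model with an intercept. In that case the leverages have the well-known closed form h_i = 1/n + (x_i - xbar)^2 / Sxx. The function name and data are illustrative assumptions, and the closed form applies to this special case only.

```python
def residuals_and_leverages(x, y):
    """Residuals r_i and leverages h_i for the model y = b0 + b1*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    fitted = [b0 + b1 * xi for xi in x]              # y-hat = X beta-hat
    res = [yi - fi for yi, fi in zip(y, fitted)]     # r_i = y_i - y-hat_i
    # Diagonal of H for simple regression with an intercept.
    h = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]
    return res, h

res, h = residuals_and_leverages([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 8.0])
```

Two checks are useful here: the leverages sum to the number of fitted parameters (2 in this case), and the residuals sum to zero because the model contains an intercept.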
The output of G02DAF also includes $\hat\beta$, the residual sum of
squares and associated degrees of freedom, $(n-k)$, the standard
errors of the parameter estimates and the variance-covariance
matrix of the parameter estimates.
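How these summary quantities relate to one another can be shown with the standard unweighted, full-rank formulas: the error variance is estimated by rss/(n-k), the variance-covariance matrix by that estimate times the inverse of X^T X, and the standard errors are the square roots of its diagonal. The sketch below applies them to a straight-line model. The function name, the data and the nested-list layout of the covariance matrix are illustrative assumptions and do not describe the storage used by G02DAF.

```python
import math

def regression_summary(x, y):
    """rss, idf, covariance matrix and standard errors for y = b0 + b1*x."""
    n, k = len(x), 2                         # k = rank of X = [1, x]
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx                  # determinant of X^T X
    b1 = (n * sxy - sx * sy) / det
    b0 = (sy - b1 * sx) / n
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    idf = n - k                              # residual degrees of freedom
    s2 = rss / idf                           # estimate of sigma^2
    # s2 * (X^T X)^{-1}, written out for the 2 by 2 case.
    cov = [[s2 * sxx / det, -s2 * sx / det],
           [-s2 * sx / det, s2 * n / det]]
    se = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
    return rss, idf, cov, se

rss, idf, cov, se = regression_summary([0.0, 1.0, 2.0, 3.0],
                                       [1.0, 3.0, 5.0, 8.0])
```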
In many linear regression models the first term is taken as a
mean term or an intercept, i.e., $X_{i,1} = 1$, for $i = 1,2,\ldots,n$.
This is provided as an option. Also, only some of the possible
independent variables may be required in a model; a facility to
select the variables to be included in the model is provided.
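The effect of these two options on the design matrix can be pictured as follows. The helper below is hypothetical: it assumes that a positive entry in a selection vector marks a column as included and that the mean term is added as a leading column of ones. Neither convention is spelled out in this section, so both should be checked against the argument descriptions for isx and mean.

```python
def design_matrix(x, isx, mean=True):
    """Build the model matrix from the selected columns of x.

    Hypothetical helper: assumes isx[j] > 0 means "include column j"
    and that the mean term is a leading column of ones.
    """
    selected = [j for j, flag in enumerate(isx) if flag > 0]
    lead = [1.0] if mean else []
    return [lead + [row[j] for j in selected] for row in x]

data = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]
with_mean = design_matrix(data, [1, 0, 1])              # columns 1 and 3
without_mean = design_matrix(data, [1, 0, 1], mean=False)
```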
Details of the QR decomposition and, if used, the SVD are made
available. These allow an estimable function to be estimated and
tested using G02DNF.
Parameters
g02daf
Required Input Arguments:
    x       (:,:)   real
    isx     (:)     integer
    y       (:)     real
Optional Input Arguments:           <Default>
    wt      (:)     real            zeros(length(y),1)
    mean    (1)     string          'm'
    tol             real            1e-6
    weight  (1)     string          g02daf02(wt)
    ifail           integer         -1
Output Arguments:
    rss             real
    idf             integer
    b       (:)     real
    se      (:)     real
    cov     (:)     real
    res     (:)     real
    h       (:)     real
    q       (:,:)   real
    svd             logical
    irank           integer
    p       (:)     real
    wk      (:)     real
    ifail           integer