g13bef

g13bef © Numerical Algorithms Group, 2002.

Purpose

G13BEF Multivariate time series, estimation of multi-input model

Synopsis

[para,xxy,itc,sd,cm,s,d,ndf,res,sttf,nsttf,zsp,ifail] = g13bef(mr,mt,para,...
xxy<,kzsp,zsp,kfc,kpriv,kef,kzef,nit,ifail>)

Description

 
 3.1. The Multi-input Model
 
 The output series y , for t=1,2,...,n, is assumed to be the sum 
                    t                                           
 of (unobserved) components z    which are due respectively to the
                             i,t                               
 inputs x ,t, for i=1,2,...,m.
         i                   
 
 Thus y =z   +...+z   +n  where n  is the error, or output noise 
       t  1,t      m,t  t        t                              
 component.
 
 A typical component z  may be either:
                      t        
 
 (a)   A simple regression component, z =(omega)x  (here x  is 
                                       t         t        t   
       called a simple input) or
 
 (b)   A transfer function model component which allows for the 
       effect of lagged values of the variable, related to x  by
                                                            t 
       
       z =(delta) z   +(delta) z   +...+(delta) z   +(omega) x   
        t        1 t-1        2 t-2            p t-p        0 t-b
       -(omega) x     -...-(omega) x     .
               1 t-b-1            q t-b-q
 
 The noise n  is assumed to follow a (possibly seasonal) ARIMA 
            t                                                 
 model, i.e., may be represented in terms of an uncorrelated 
 series, a , by the hierarchy of equations:
          t                      
 
              d       D      
 (c)   (nabla) (nabla) n =c+w 
                      s t    t
 
 (d)   w  = (Phi) w   +(Phi) w     +...
        t        1 t-s      2 t-2*s    
 
             +(Phi) w     +e -(Theta) e   -(Theta) e     -...
                   P t-P*s  t        1 t-s        2 t-2*s  
 
             -(Theta) e 
                     Q t-Q*s
 
 (e)   e  = (phi) e   +(phi) e   +...
        t        1 t-1      2 t-2    
 
            +(phi) e   +a -(theta) a   -(theta) a   -...
                  p t-p  t        1 t-1        2 t-2    
 
            -(theta) a   .
                    q t-q
 
 
 Note: the orders p,q appearing in each of the transfer function 
 models and the ARIMA model are not necessarily the same; 
        d       D                                         
 (nabla) (nabla) n  is the result of applying non-seasonal 
                s t                                       
 differencing of order d and seasonal differencing of seasonality 
 s and order D to the series n , the differenced series is then of
                              t                                 
 length N=n-d-s*D; the constant term parameter c may optionally be
 held fixed at its initial value (usually, but not necessarily 
 zero) rather than being estimated.
 
 For the purpose of defining an estimation criterion it is assumed
 that the series a  is a sequence of independent Normal variates 
                  t                                             
                                   2                             
 having mean 0 and variance (sigma) . An allowance has to be made 
                                   a                             
 for the effects of unobserved data prior to the observation 
 period. For the noise component an allowance is always made using
 a form of backforecasting.
 
 For each transfer function input, the user has to decide what 
 values are to be assumed for the pre-period terms z ,z  ,...,z 
                                                    0  -1      1-p
 and x ,x  ,...,x      which are in theory necessary to re-create 
      0  -1      1-b-q                                           
 the component series z ,z ,...,z , during the estimation 
                       1  2      n                       
 procedure.
 
 The first choice is to assume that all these values are zero. In 
 this case in order to avoid undesirable transient distortion of 
 the early values z ,z ,..., the user is advised first to correct 
                   1  2                                          
 the input series x  by subtracting from all the terms a suitable 
                   t                                             
 constant to make the early values x ,x ,..., close to zero. The 
                                    1  2                        
             _                                                 
 series mean x is one possibility, but for a series with strong 
 trend, the constant might be simply x .
                                      1
 
 The second choice is to treat the unknown pre-period terms as 
 nuisance parameters and estimate them along with the other 
 parameters. This choice should be used with caution. For example,
 if p=1 and b=q=0, it is equivalent to fitting to the data a 
                                              t                  
 decaying geometric curve of the form A(delta) , for t=1,2,3,..., 
 along with the other inputs, this being the form of the 
 transient. If the output y  contains a strong trend of this form,
                           t                                 
 which is not otherwise represented in the model, it will have a 
 tendency to influence the estimate of (delta) away from the value
 appropriate to the transfer function model.
 
 In most applications the first choice should be adequate, with 
 the option possibly being used as a refinement at the end of the 
 modelling process. The number of nuisance parameters is then 
 max(p,b+q), with a corresponding loss of degrees of freedom in 
 the residuals. If the user aligns the input x  with the output by
                                              t                 
 using in its place the shifted series x   , then setting b=0 in 
                                        t-b                     
 the transfer function model, there is some improvement in 
 efficiency. On some occasions when the model contains two or more
 inputs, each with estimation of pre-period nuisance parameters, 
 these parameters may be co-linear and lead to failure of the 
 routine. The option must then be 'switched off' for one or more 
 inputs.
 
 3.2. The Estimation Criterion
 
 This is a measure of how well a proposed set of parameters in the
 transfer function and noise ARIMA models, matches the data. The 
 estimation routine searches for parameter values which minimize 
 this criterion. For a proposed set of parameter values it is 
 derived by calculating:
 
 (i)   the components z   ,z   ,...,z    as the responses to the 
                       1,t  2,t      m,t                        
       input series x   ,x   ...,x    using the equations (a) or 
                     1,t  2,t     m,t                           
       (b) above,
 
 (ii)  the discrepancy between the output and the sum of these 
       components, as the noise
                       n =y -(z   +z   +...+z   ),
                        t  t   1,t  2,t      m,t
 
 (iii) the residual series a  from n  by reversing the recursive 
                            t       t                           
       equations (c), (d) and (e) above.
 
 This last step again requires treatment of the effect of unknown 
 pre-period values of n  and other terms in the equations 
                       t                                 
 regenerating a  and leads to a criterion which is a sum of 
               t                                             
 squares function S, of the residuals a . It may be shown that  
                                       t           
 the finite algorithm presented there is equivalent to taking the 
 infinite set of past values n ,n  ,n  ,..., as (linear) nuisance 
                              0  -1  -2        
 parameters. There is no loss of degrees of freedom however, 
 because the sum of squares function S may be expressed as 
 including the corresponding set of past residuals --
 
                              n       
                              --     2
                           S= >     a .
                              --     t
                              -infty  
 
 The function D=S is the first of the three possible criteria, and
 is quite adequate for moderate to long series with no seasonal 
 parameters. The second is the exact likelihood criterion which 
 considers the past set n ,n  ,n   not as simple nuisance 
                         0  -1  -2                       
 parameters, but as unobserved random variables with known 
 distribution. Calculation of the likelihood of the observed set 
 n ,n ,...,n  requires theoretical integration over the range of 
  1  2      n                                                   
 the past set. Fortunately this yields a criterion of the form 
 D=M*S (whose minimization is equivalent to maximizing the exact 
 likelihood of the data), where S is exactly as before, and the 
 multiplier M is a function calculated from the ARIMA model 
 parameters. The value of M is always >= 1, and M tends to 1 for 
 any fixed parameter set as the sample size n tends to infty. 
 There is a moderate computational overhead in using this option, 
 but its use avoids appreciable bias in the ARIMA model parameters
 and yields a better conditioned estimation problem.
 
 The third criterion of marginal likelihood treats the 
 coefficients of the simple inputs in a manner analogous to that 
 given to the past set n ,n  ,n  ,.... These coefficients, 
                        0  -1  -2                         
 together with the constant term c used to represent the mean of 
 w , are in effect treated as random variables with highly 
  t                                                       
 dispersed distributions. This leads to the criterion D=M*S again,
 but with a different value of M which now depends on the simple 
 input series values x . In the presence of a moderate to large 
                      t                                        
 number of simple inputs, the marginal likelihood criterion can 
 counteract bias in the ARIMA model parameters which is caused by 
 estimation of the simple inputs. This is particularly important 
 in relatively short series.
 
 G13BEF can be used with input series present, to estimate a 
 univariate ARIMA model for the ouput alone. The marginal 
 likelihood criterion is then distinct from exact likelihood only 
 if a constant term is being estimated in the model, because this 
 is treated as an implicit simple input.
 
 3.3. The Estimation Procedure
 
 This is the minimization of the estimation criterion or objective
 function D (for deviance). The routine uses an extension of the 
 algorithm of Marquardt. The step size in the minimization is 
 inversely related to a parameter (alpha), which is increased or 
 decreased by a factor (beta) at successive iterations, depending 
 on the progress of the minimization. Convergence is deemed to 
 have occurred if the fractional reduction of D in successive 
 iterations is less than a value (gamma), while (alpha)<1.
 
 Certain model parameters (in fact all excluding the (omega)'s) 
 are subject to stability constraints which are checked throughout
 to within a specified tolerance multiple (delta) of machine 
 accuracy. Using the least-squares criterion, the minimization may
 halt prematurely when some parameters 'stick' at a constraint 
 boundary. This can happen particularly with short seasonal series
 (with a small number of whole seasons). It will not happen using 
 the exact likelihood criterion, although convergence to a point 
 on the boundary may sometimes be rather slow, because the 
 criterion function may be very flat in such a region. There is 
 also a smaller risk of a premature halt at a constraint boundary 
 when marginal likelihood is used.
 
 A positive, or zero number of iterations can be specified. In 
 either case, the value D of the objective function at iteration 
 zero is presented at the initial parameter values, except for 
 estimation of any pre-period terms for the input series, 
 backforecasts for the noise series, and the coefficients of any 
 simple inputs, and the constant term (unless this is held fixed).
 
 At any later iteration, the value of D is supplied after re-
 estimation of the backforecasts to their optimal values, 
 corresponding to the model parameters presented at that 
 iteration. This is not true for any pre-period terms for the 
 input series which, although they are updated from the previous 
 iteration, may not be precisely optimal for the parameter values 
 presented, unless convergence of those parameters has occurred. 
 However, in the case of marginal likelihood being specified, the 
 coefficients of the simple inputs and the constant term are also 
 re-estimated together with the backforecasts at each iteration, 
 to values which are optimal for the other parameter values 
 presented.
 
 3.4. Further Results
 
                                        S                   
 The residual variance is taken as erv= -- where df=N-(total 
                                        df                  
 number of parameters estimated), is the residual degrees of 
 freedom. The pre-period nuisance parameters for the input series 
 are included in the reduction of df, as is the constant if it is 
 estimated.
 
 The covariance matrix of the vector of model parameter estimates 
 is given by
 
                                   -1
                              erv*H 
 
 where H is the linearised least-squares matrix taken from the 
 final iteration of the algorithm of Marquardt. From this 
 expression are derived the vector of standard deviations, and the
 correlation matrix of parameter estimates. These are 
 approximations which are only valid asymptotically, and must be 
 treated with great caution when the parameter estimates are close
 to their constraint boundaries.
 
 The residual series a  is available upon completion of the 
                      t                                    
 iterations over the range t=1+d+s*D,...,n corresponding to the 
 differenced noise series w .
                           t
 
 Because of the algorithm used for backforecasting, these are only
 true residuals for t>=1+q+s*Q-p-s*P-d-s*D, provided this is 
 positive. Estimation of pre-period terms for the inputs will also
 tend to reduce the magnitude of the early residuals, sometimes 
 severely.
 
 The model component series z   ,...,z    and n  may optionally be
                             1,t      m,t      t                
 returned in place of the supplied series values, in order to 
 assess the effects of the various inputs on the output.

Parameters

g13bef

Required Input Arguments:

mr (7)                                integer
mt (4,:)                              integer
para (:)                              real
xxy (:,:)                             real

Optional Input Arguments:                       <Default>

kzsp                                  integer  0
zsp (4)                               real     zeros(4,1)
kfc                                   integer  1
kpriv                                 integer  0
kef                                   integer  3
kzef                                  integer  0
nit                                   integer  50
ifail                                 integer  -1

Output Arguments:

para (:)                              real
xxy (:,:)                             real
itc                                   integer
sd (:)                                real
cm (:,:)                              real
s                                     real
d                                     real
ndf                                   integer
res (:)                               real
sttf (:)                              real
nsttf                                 integer
zsp (4)                               real
ifail                                 integer