g01aff

g01aff © Numerical Algorithms Group, 2002.

Purpose

G01AFF Two-way contingency table analysis, with chi-square/Fisher's exact test

Synopsis

[nobs,pred,chis,p,npos,ndf,m1,n1,ifail] = g01aff(nobs<,num,ifail>)

Description

 
 The data consist of the frequencies for the two-way 
 classification, denoted by $n_{ij}$, for $i=1,2,\ldots,m$; $j=1,2,\ldots,n$, with
 $m,n>1$.
 
 A check is made to see whether any row or column of the matrix of
 frequencies consists entirely of zeros, and if so, the matrix of 
 frequencies is reduced by omitting that row or column. Suppose 
 the final size of the matrix is $m_1$ by $n_1$ ($m_1,n_1>1$), and let
 
 \[ R_i = \sum_{j=1}^{n_1} n_{ij}, \quad i=1,2,\ldots,m_1, \]
 
 be the total frequency for the ith row,
 
 \[ C_j = \sum_{i=1}^{m_1} n_{ij}, \quad j=1,2,\ldots,n_1, \]
 
 be the total frequency for the jth column, and
 
 \[ T = \sum_{i=1}^{m_1} R_i = \sum_{j=1}^{n_1} C_j \]
 
 be the total frequency.
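
 The reduction step and the totals defined above can be sketched in a
 few lines of Python. This is an illustration only, not the routine's
 own code: the function name reduce_table and the list-of-lists
 representation are assumptions made for the example.

```python
def reduce_table(nobs):
    """Drop all-zero rows and columns, then return the table and its totals.

    Hypothetical helper for illustration; not part of g01aff.
    """
    # Keep only rows that contain at least one nonzero frequency.
    rows = [row for row in nobs if any(row)]
    # Keep only columns that contain at least one nonzero frequency.
    keep = [j for j in range(len(nobs[0])) if any(row[j] for row in rows)]
    table = [[row[j] for j in keep] for row in rows]

    R = [sum(row) for row in table]        # row totals R_i
    C = [sum(col) for col in zip(*table)]  # column totals C_j
    T = sum(R)                             # total frequency T
    return table, R, C, T


# A 3 by 3 table whose second row and second column are entirely zero.
table, R, C, T = reduce_table([[3, 0, 5], [0, 0, 0], [4, 0, 8]])
print(table, R, C, T)
```

 Here the reduced table is 2 by 2, so m1 = n1 = 2 in the notation above.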
 
 There are two situations:
 
 (a)   If $m_1>2$ and/or $n_1>2$, or $m_1=n_1=2$ and $T>40$, then the matrix 
       of expected frequencies, denoted by $r_{ij}$, for $i=1,2,\ldots,m_1$; 
       $j=1,2,\ldots,n_1$, and the test statistic, $X^2$, are computed, 
       where
       \[ r_{ij} = R_i C_j / T, \quad i=1,2,\ldots,m_1;\ j=1,2,\ldots,n_1, \]
       and
       \[ X^2 = \sum_{i=1}^{m_1} \sum_{j=1}^{n_1} \frac{\left(|r_{ij}-n_{ij}|-Y\right)^2}{r_{ij}}, \]
       where
       \[ Y = \begin{cases} \tfrac{1}{2} & \text{if } m_1=n_1=2, \\ 0 & \text{otherwise,} \end{cases} \]
       is Yates' correction for continuity.
       
       Under the assumption that there is no association between 
       the two classifications, $X^2$ will have approximately a
       chi-square distribution with $(m_1-1)(n_1-1)$ degrees of freedom.
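
 As a rough illustration of situation (a), the sketch below evaluates
 the expected frequencies, the statistic with Yates' correction, and
 the degrees of freedom directly from the formulas. The function name
 chi_square is hypothetical. The variable names pred, chis and ndf are
 borrowed from the output argument list, but that these outputs hold
 exactly these quantities is an assumption. The table is assumed to be
 already reduced, with no zero row or column.

```python
def chi_square(table):
    """Expected frequencies, test statistic and degrees of freedom.

    Illustrative sketch of the formulas only; not the g01aff source.
    """
    m1, n1 = len(table), len(table[0])
    R = [sum(row) for row in table]        # row totals R_i
    C = [sum(col) for col in zip(*table)]  # column totals C_j
    T = sum(R)                             # total frequency

    # Yates' correction applies only to a 2 by 2 table.
    Y = 0.5 if (m1 == 2 and n1 == 2) else 0.0

    # Expected frequencies r_ij = R_i * C_j / T.
    pred = [[R[i] * C[j] / T for j in range(n1)] for i in range(m1)]

    # Statistic: sum over cells of (|r_ij - n_ij| - Y)^2 / r_ij.
    chis = sum(
        (abs(pred[i][j] - table[i][j]) - Y) ** 2 / pred[i][j]
        for i in range(m1)
        for j in range(n1)
    )
    ndf = (m1 - 1) * (n1 - 1)
    return pred, chis, ndf


# 2 by 3 table: no continuity correction, 2 degrees of freedom.
pred, chis, ndf = chi_square([[10, 20, 30], [20, 20, 20]])

# 2 by 2 table with T = 80 > 40: Yates' correction is applied.
pred2, chis2, ndf2 = chi_square([[30, 10], [10, 30]])
print(chis, ndf, chis2, ndf2)
```

 A tail probability for the statistic would then come from the
 chi-square distribution with ndf degrees of freedom, which this sketch
 does not compute.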
       
       An option exists which allows for further 'shrinkage' of 
       the matrix of frequencies in the case where $r_{ij}<1$ for the
       (i,j)th cell. If this is the case, then row i or column j 
       will be combined with the adjacent row or column with 
       smaller total. Row i is selected for combination if 
       $R_i m_1 \le C_j n_1$. This 'shrinking' process is continued until 
       $r_{ij} \ge 1$ for all cells (i,j).
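
 One possible reading of the shrinking process is sketched below. The
 text leaves several details open, so the following are assumptions of
 this example rather than documented behaviour: cells are scanned in
 row-major order, a tie between the two neighbours goes to the lower
 index, the merged frequencies are stored in the neighbour's position,
 and a table with a single row or column is merged along the other
 dimension. The function name shrink is hypothetical.

```python
def shrink(table):
    """Merge rows/columns until every expected frequency is at least 1.

    Hedged sketch of one interpretation; not the g01aff algorithm itself.
    """
    table = [list(row) for row in table]  # work on a copy
    while True:
        m1, n1 = len(table), len(table[0])
        R = [sum(row) for row in table]
        C = [sum(col) for col in zip(*table)]
        T = sum(R)

        # r_ij < 1 is equivalent to R_i * C_j < T (avoids division).
        small = [(i, j) for i in range(m1) for j in range(n1)
                 if R[i] * C[j] < T]
        if not small:
            return table
        i, j = small[0]  # assumption: first offending cell, row-major

        use_row = R[i] * m1 <= C[j] * n1
        if (use_row and m1 > 1) or n1 == 1:
            # Combine row i with the adjacent row of smaller total.
            k = min((k for k in (i - 1, i + 1) if 0 <= k < m1),
                    key=lambda k: R[k])
            table[k] = [a + b for a, b in zip(table[k], table[i])]
            del table[i]
        else:
            # Combine column j with the adjacent column of smaller total.
            k = min((k for k in (j - 1, j + 1) if 0 <= k < n1),
                    key=lambda k: C[k])
            for row in table:
                row[k] += row[j]
                del row[j]


# Cell (1,1) has expected frequency 2*46/197 < 1, and 2*3 <= 46*3,
# so the first row is merged into its only neighbour.
result = shrink([[1, 0, 1], [20, 30, 40], [25, 35, 45]])
print(result)
```

 Each pass removes one row or column, so the loop stops after at most
 m1 + n1 - 2 merges.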
 
 (b)   If $m_1=n_1=2$ and $T \le 40$, the probabilities to enable Fisher's 
       exact test to be made are computed.
       
       The matrix of frequencies may be rearranged so that $R_1$ is 
       the smallest marginal (i.e., column and row) total, and 
       $C_2 \ge C_1$. Under the assumption of no association between the 
       classifications, the probability of obtaining r entries in 
       cell (1,1) is computed where
       \[ P_{r+1} = \frac{R_1!\,R_2!\,C_1!\,C_2!}{T!\,r!\,(R_1-r)!\,(C_1-r)!\,(T-C_1-R_1+r)!}, \quad r=0,1,\ldots,R_1. \]
       The probability of obtaining the table of given frequencies
       is returned. A test of the assumption against some 
       alternative may then be made by summing the relevant values
       of $P_r$.
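
 The probabilities in situation (b) can be evaluated directly from the
 factorial formula, as in the sketch below. The function name
 fisher_probabilities is hypothetical, and the marginal totals are
 assumed to be already arranged so that R1 is the smallest marginal
 total and C2 >= C1, which keeps every factorial argument non-negative.
 Since R1 is then at most T/2 <= 20, there are at most 21 values, which
 appears consistent with the declared length of the output p.

```python
from math import factorial


def fisher_probabilities(R1, R2, C1, C2):
    """Return [P_1, ..., P_{R1+1}], where P_{r+1} is the probability
    of r entries in cell (1,1) given the marginal totals.

    Illustrative sketch of the formula only; not the g01aff source.
    """
    T = R1 + R2
    if C1 + C2 != T:
        raise ValueError("row and column totals must have the same sum")

    # Numerator is the same for every r.
    num = factorial(R1) * factorial(R2) * factorial(C1) * factorial(C2)

    probs = []
    for r in range(R1 + 1):
        den = (factorial(T) * factorial(r) * factorial(R1 - r)
               * factorial(C1 - r) * factorial(T - C1 - R1 + r))
        probs.append(num / den)  # exact integers, converted on division
    return probs


# Marginals R = (3, 5), C = (4, 4), so T = 8 and r runs over 0..3.
p = fisher_probabilities(3, 5, 4, 4)
print(p)

# One-sided test example: probability of 2 or more entries in cell (1,1).
upper_tail = sum(p[2:])
```

 The values sum to 1 over r = 0,...,R1, which gives a quick check on
 any implementation of the formula.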
 

Parameters

g01aff

Required Input Arguments:

nobs (:,:)                            integer

Optional Input Arguments:                       <Default>

num                                   integer  0
ifail                                 integer  -1

Output Arguments:

nobs (:,:)                            integer
pred (:,:)                            real
chis                                  real
p (21)                                real
npos                                  integer
ndf                                   integer
m1                                    integer
n1                                    integer
ifail                                 integer