Home > Numerical Examples
ML FITTING of MPH and HLP MODELS using MPH.FIT (version 3.0): NUMERICAL EXAMPLES
Author: Joseph B. Lang, Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA 52242 USA (September 27, 2002, last updated: April 15, 2005, February 7, 2007, May 22, 2009)
The numerical examples below were fitted using mph.fit, VERSION 3.0. Please read about the changes from version 2.0 in the header of the file mph.fit.
For another collection of more basic examples, click here.
Primary References:
Lang, J.B (2004). "Multinomial-Poisson Homogeneous Models for Contingency Tables," Annals of Statistics, 32, 340-383 .
Lang, J.B. (2005). "Homogeneous Linear Predictor Models for Contingency Tables," JASA, 100, 121-134
See also, other on-line documentation starting at www.stat.uiowa.edu/~jblang/mph.fitting/
Numerical Examples:
EXAMPLE 1. Mean Response Model (colds in children)
EXAMPLE 2. Dispersion Linear Trend Model (Danish marital status)
EXAMPLE 3. Marginal Homogeneity Model (eye grade)
EXAMPLE 4. Loglinear Models under Different Sampling Plans (bike helmet use)
EXAMPLE 5. Loglinear Models of Independence (bike helmet use)
EXAMPLE 6. Logit Models under Different Sampling Plans (bike helmet use)
EXAMPLE 7. Sparse Table Example (fine tuning the fitting algorithm)
EXAMPLE 8. Given Marginal Distributions and Odds Ratios, Compute Joint Distribution in a Two-Way Table.
EXAMPLE 9. Model Constraint: ROC Area Equal to a Constant
EXAMPLE 10. Marginal Homogeneity of Multidimensional Contingency Tables
EXAMPLE 11. Contingency Tables with Given Marginals.
EXAMPLE 12. Testing for (Conditional) Pairwise Independence
EXAMPLE 13. Kappa Regression using Direct and Indirect Smoothing Models.
EXAMPLE 14. Marginal Cumulative Probit Model (in Conjunction with Loglinear Association Model)
EXAMPLE 15. Marginal Cumulative Logit and Linear-by-Linear Loglinear Association Model (Generalized Loglinear and Indirect Smoothing Model Specification)
EXAMPLE 1. ML fit of mean-response models
Table 13.1 Colds in Children, source: Stokes, Davis, and Koch
(2000), "Categorical Data Analysis Using the SAS System."
Periods with Colds Sex Residence 0 1 2 Total ---- --------- ------------------- ----- F Rural 45 64 71 180 F Urban 80 104 116 300 M Rural 84 124 82 290 M Urban 106 117 87 310
Models fitted below:
Sampling Distribution: Product Multinomial
Conditional on row totals,
(yij1, yij2, yij3) <- (Yij1, Yij2, Yij3) indep ~ mult(nij, pij1, pij2, pij3), i, j = 1,2.
Here, n11 = 180, n12 = 300, n21 = 290, n22 = 310 and the table probabilities are interpreted as pijk = P(Periods = k|SEX=i, RES=j).
Systematic Components (Linear Predictor Models):
Saturated mean-response model:
Mij = b(0) + b(SEX)i + b(RES)j + b(SEX*RES)ij,
No-interaction mean-response model:
Mij = b(0) + b(SEX)i + b(RES)j.
Here, Mij = 0*pij1 + 1*pij2 + 2*pij3 = mean number of periods for SEX=i, RES=j.
These mean-response models have the generic form L(p) = Xb. These are examples of 0-order HLP models.
d <- scan(what=list(sex="",residence="",periods=0,count=0)) F R 0 45 F R 1 64 F R 2 71 F U 0 80 F U 1 104 F U 2 116 M R 0 84 M R 1 124 M R 2 82 M U 0 106 M U 1 117 M U 2 87 d <- data.frame(d) y <- d$count sex.residence <- paste(d$sex,d$residence) #[Strata are determined by (SEX, RESIDENCE) values] L.fct <- function(p) { p <- matrix(p,4,3,byrow=T) mean1 <- sum( 0:2*p[1,] ) mean2 <- sum( 0:2*p[2,] ) mean3 <- sum( 0:2*p[3,] ) mean4 <- sum( 0:2*p[4,] ) L <- matrix(c(mean1,mean2,mean3,mean4)) rownames(L) <- c("mean.FR","mean.FU","mean.MR","mean.MU") L } d0 <- subset(d,periods==0) # SATURATED MEAN-RESPONSE MODEL.... a1 <- mph.fit(y, link=L.fct,X=model.matrix(~sex*residence,d0),strata=sex.residence)
REMARK: Don't forget to input the strata variable, as the strata determine how the table probabilities are defined. |
mph.fit, version 3.0, 5/1/09 , running... Convergence criteria have been met. Time Elapsed: 0.03 seconds , Iterations: 1 mph.summary(a1) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 0 ): Gsq = 0 Pearson's Score Stat (df= 0 ): Xsq = 0 Generalized Wald Stat (df= 0 ): Wsq = 0 Adj Resids: 0 0 ... 0 0 , Number |Adj Resid| > 2: 0 SAMPLING PLAN INFORMATION... Number of strata: 4 Strata identifiers: F R, F U, M R, M U Strata with fixed sample sizes: all Observed strata sample sizes: 180, 300, 290, 310 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value (Intercept) 1.14444444 0.05885860 19.4439633 0.00000000 sexM -0.15134100 0.07374287 -2.0522796 0.04014250 residenceU -0.02444444 0.07479380 -0.3268245 0.74380065 sexM:residenceU -0.02994933 0.09779569 -0.3062438 0.75941899 OBS LINK ML LINK StdErr(L) LINK RESID mean.FR 1.1444444 1.1444444 0.05885860 0 mean.FU 1.1200000 1.1200000 0.04614952 0 mean.MR 0.9931034 0.9931034 0.04442608 0 mean.MU 0.9387097 0.9387097 0.04467893 0 CONVERGENCE INFORMATION... Original counts used. iterations = 1 , time elapsed = 0.03 norm.diff = 0 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 6.7408e-14 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09 # NO-INTERACTION MEAN-RESPONSE MODEL... b <- mph.fit(y, link=L.fct,X=model.matrix(~sex+residence,d0),strata=sex.residence) mph.summary(b,T,T) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 1 ): Gsq = 0.09383 (p = 0.75936 ) Pearson's Score Stat (df= 1 ): Xsq = 0.09386 (p = 0.75933 ) Generalized Wald Stat (df= 1 ): Wsq = 0.09379 (p = 0.75942 ) Adj Resids: -0.306 -0.306 ... 0.306 0.306 , Number |Adj Resid| > 2: 0
REMARK: The no-interaction mean-response model fits well, Xsq = 0.09386 (df=1, p-value=0.75933) and all the adjusted cell-specific residuals are between -0.306 and +0.306. |
SAMPLING PLAN INFORMATION... Number of strata: 4 Strata identifiers: F R, F U, M R, M U Strata with fixed sample sizes: all Observed strata sample sizes: 180, 300, 290, 310 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value (Intercept) 1.15526154 0.04695725 24.6024112 0.0000000000 sexM -0.16834269 0.04842903 -3.4760696 0.0005088201 residenceU -0.04194803 0.04817685 -0.8707093 0.3839129012 OBS LINK ML LINK StdErr(L) LINK RESID mean.FR 1.1444444 1.1552615 0.04695725 -0.3063624 mean.FU 1.1200000 1.1133135 0.04070944 0.3063624 mean.MR 0.9931034 0.9869188 0.03957190 0.3063624 mean.MU 0.9387097 0.9449708 0.03975182 -0.3063624 CELL-SPECIFIC STATISTICS... strata OBS FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS y1 F R 45 44.11273 4.991427 0.2450707 0.02773015 0.3063624 y2 F R 64 63.82746 6.393533 0.3545970 0.03551963 0.3063624 y3 F R 71 72.05981 5.589724 0.4003323 0.03105402 -0.3063624 y4 F U 80 80.94135 7.047109 0.2698045 0.02349036 -0.3063624 y5 F U 104 104.12325 8.235446 0.3470775 0.02745149 -0.3063624 y6 F U 116 114.93540 7.669822 0.3831180 0.02556607 0.3063624 y7 M R 84 84.90553 7.163143 0.2927777 0.02470049 -0.3063624 y8 M R 124 123.98247 8.424577 0.4275258 0.02905027 0.3063624 y9 M R 82 81.11200 7.072743 0.2796965 0.02438877 0.3063624 y10 M U 106 104.99696 7.662589 0.3386999 0.02471803 0.3063624 y11 M U 117 117.06512 8.533036 0.3776294 0.02752592 -0.3063624 y12 M U 87 87.93791 7.322569 0.2836707 0.02362119 -0.3063624 CONVERGENCE INFORMATION... Original counts used. iterations = 4 , time elapsed = 0.05 norm.diff = 2.15676e-07 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 9.74239e-09 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. MODEL INFORMATION... Linear Predictor Model Link Function L.fct (L.mean= FALSE ): function(m) { p <- m*c(1/Z%*%t(Z)%*%m) as.matrix(L.input.fct(p)) } <environment: 0x0157e07c> Link information as originally input, L.input.fct: function(p) { p <- matrix(p,4,3,byrow=T) m1 <- sum( 0:2*p[1,] ) m2 <- sum( 0:2*p[2,] ) m3 <- sum( 0:2*p[3,] ) m4 <- sum( 0:2*p[4,] ) L <- matrix(c(m1,m2,m3,m4)) rownames(L) <- c("mean.FR","mean.FU","mean.MR","mean.MU") L } Derivative of Transpose Link Function derLt.fct: [1] "Numerical derivatives used." Linear Predictor Model Design Matrix X: (Intercept) sexM residenceU 1 1 0 0 4 1 0 1 7 1 1 0 10 1 1 1 attr(,"assign") [1] 0 1 2 attr(,"contrasts") attr(,"contrasts")$sex [1] "contr.treatment" attr(,"contrasts")$residence [1] "contr.treatment" U = Orthogonal Complement of X: [,1] 1 -1.773473 4 1.773473 7 1.773473 10 -1.773473 Constraint Function h.fct (h.mean= FALSE ): function(m) { t(U)%*%L.fct(m) } <environment: 0x0157e07c> Constraint information as originally input, h.input.fct: NULL Derivative of Transpose Constraint Function derht.fct: [1] "Numerical derivatives used." Population (Strata) Matrix Z: F R F U M R M U 1 1 0 0 0 2 1 0 0 0 3 1 0 0 0 4 0 1 0 0 5 0 1 0 0 6 0 1 0 0 7 0 0 1 0 8 0 0 1 0 9 0 0 1 0 10 0 0 0 1 11 0 0 0 1 12 0 0 0 1 attr(,"assign") [1] 1 1 1 1 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment" Sampling Constraint Matrix ZF: F R F U M R M U 1 1 0 0 0 2 1 0 0 0 3 1 0 0 0 4 0 1 0 0 5 0 1 0 0 6 0 1 0 0 7 0 0 1 0 8 0 0 1 0 9 0 0 1 0 10 0 0 0 1 11 0 0 0 1 12 0 0 0 1 attr(,"assign") [1] 1 1 1 1 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment" Fixed Sample Sizes n: OBS F R 180 F U 300 M R 290 M U 310 strata: strata [1,] "F R" [2,] "F R" [3,] "F R" [4,] "F U" [5,] "F U" [6,] "F U" [7,] "M R" [8,] "M R" [9,] "M R" [10,] "M U" [11,] "M U" [12,] "M U" fixed.strata: [1] "all" FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 2. Dispersion Linear Trend Model.
Marital Status of Danes. Source: Lloyd, C. (1999), "Statistical Analysis of Categorical Data," p. 72.
Age Single Married Divorced Total 17-21 17 1 0 18 21-25 16 8 0 24 25-30 8 17 1 26 30-40 6 22 4 32 40-50 5 21 6 32 50-60 3 17 8 28 60-70 2 8 6 16 70+ 1 3 5 9
Candidate Models:
(yi1, yi2, yi3) <- (Yi1, Yi2, Yi3) indep ~ mult(ni, pi1, pi2, pi3), i=1,...,8.
Here, n1 = 18,..., n8 = 9, and the table probabilities are defined as pij = P(Marital=j|Age Stratum= i).
The table probabilities are modeled as:
Linear-in-Age Model: Gi = b(0) + b(AGE)*xi or Saturated Model: Gi = bi,
where Gi = 1 - (pi12 + pi22 + pi32) = Gini dispersion for age category i and x = (x1,...x8) = (19,23,27,35,45,55,65,80) is the vector of AGE scores [midpoints of the age intervals].
d <- scan(what=list(Age="",Marital = "",Freq=0)) 17-21 Single 17 17-21 Married 1 17-21 Divorced 0 21-25 Single 16 21-25 Married 8 21-25 Divorced 0 25-30 Single 8 25-30 Married 17 25-30 Divorced 1 30-40 Single 6 30-40 Married 22 30-40 Divorced 4 40-50 Single 5 40-50 Married 21 40-50 Divorced 6 50-60 Single 3 50-60 Married 17 50-60 Divorced 8 60-70 Single 2 60-70 Married 8 60-70 Divorced 6 70+ Single 1 70+ Married 3 70+ Divorced 5 d <- data.frame(d) y <- d$Freq strata <- d$Age L.fct <- function(p) { p <- matrix(p,8,3,byrow=T) L <- c() for (i in 1:8) { Gi <- 1- sum(p[i,]*p[i,]) L <- rbind(L, Gi) } rownames(L) <- c("G.17","G.21","G.25","G.30","G.40","G.50","G.60","G.70+") L } d1 <- subset(d, Marital=="Single") Agescore <- scan() 19 23 27 35 45 55 65 80 d1 <- data.frame(d1,Agescore) d1 Age Marital Freq Agescore 1 17-21 Single 17 19 2 21-25 Single 16 23 3 25-30 Single 8 27 4 30-40 Single 6 35 5 40-50 Single 5 45 6 50-60 Single 3 55 7 60-70 Single 2 65 8 70+ Single 1 80 a <- mph.fit(y,link=L.fct,X=model.matrix(~Agescore,d1),strata=d$Age)
REMARK: Don't forget to input the strata variable, as the strata determine how the table probabilities are defined. |
mph.summary(a,T) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 6 ): Gsq = 6.61522 (p = 0.3579 ) Pearson's Score Stat (df= 6 ): Xsq = 4.96987 (p = 0.54768 ) Generalized Wald Stat (df= 6 ): Wsq = 10.09688 (p = 0.12063 ) Adj Resids: -2.144 -2.144 ... 0.873 2.144 , Number |Adj Resid| > 2: 3 SAMPLING PLAN INFORMATION... Number of strata: 8 Strata identifiers: 17-21, 21-25, 25-30, 30-40, 40-50, 50-60, 60-70, 70+ Strata with fixed sample sizes: all Observed strata sample sizes: 18, 24, 26, 32, 32, 28, 16, 9 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value (Intercept) 0.331544223 0.072438461 4.576909 4.718973e-06 Agescore 0.003625855 0.001398731 2.592245 9.535181e-03 OBS LINK ML LINK StdErr(L) LINK RESID G.17 0.1049383 0.4004355 0.04914697 -2.7342474 G.21 0.4444444 0.4149389 0.04471273 0.4734820 G.25 0.4763314 0.4294423 0.04056645 0.6290227 G.30 0.4765625 0.4584491 0.03356001 0.2198400 G.40 0.5097656 0.4947077 0.02879638 0.1907121 G.50 0.5382653 0.5309662 0.03038880 0.1102546 G.60 0.5937500 0.5672248 0.03753687 0.4409538 G.70+ 0.5679012 0.6216126 0.05358163 -1.0999550 CELL-SPECIFIC STATISTICS... strata OBS FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS y1 17-21 17 1.342063e+01 7.917228e-01 7.455907e-01 4.398460e-02 2.143869e+00 y2 17-21 1 3.642503e+00 1.177341e+00 2.023613e-01 6.540785e-02 -2.143869e+00 y3 17-21 0 9.368645e-01 8.349467e-01 5.204803e-02 4.638593e-02 -2.143869e+00 y4 21-25 16 1.694951e+01 1.300864e+00 7.062294e-01 5.420266e-02 -5.237165e-01 y5 21-25 8 7.050495e+00 1.300864e+00 2.937706e-01 5.420266e-02 5.237165e-01 y6 21-25 0 1.015483e-38 1.007712e-19 4.231178e-40 4.198798e-21 -1.015483e-38 y7 25-30 8 6.835339e+00 1.495050e+00 2.628976e-01 5.750194e-02 6.956249e-01 y8 25-30 17 1.839519e+01 1.165229e+00 7.075074e-01 4.481652e-02 -6.956249e-01 y9 25-30 1 7.694700e-01 7.980423e-01 2.959500e-02 3.069394e-02 6.956249e-01 y10 30-40 6 5.696848e+00 1.693282e+00 1.780265e-01 5.291507e-02 2.249922e-01 y11 30-40 22 2.253683e+01 9.857207e-01 7.042760e-01 3.080377e-02 -2.249922e-01 y12 30-40 4 3.766318e+00 1.498098e+00 1.176974e-01 4.681558e-02 2.249922e-01 y13 40-50 5 4.768420e+00 1.627600e+00 1.490131e-01 5.086250e-02 1.951100e-01 y14 40-50 21 2.148669e+01 9.149380e-01 6.714590e-01 2.859181e-02 -1.951100e-01 y15 40-50 6 5.744893e+00 1.733194e+00 1.795279e-01 5.416232e-02 1.951100e-01 y16 50-60 3 2.894242e+00 1.305963e+00 1.033658e-01 4.664152e-02 1.121326e-01 y17 50-60 17 1.725375e+01 1.225208e+00 6.162052e-01 4.375743e-02 -1.121326e-01 y18 50-60 8 7.852012e+00 1.976947e+00 2.804290e-01 7.060526e-02 1.121326e-01 y19 60-70 2 1.607440e+00 8.987154e-01 1.004650e-01 5.616971e-02 4.913694e-01 y20 60-70 8 8.718400e+00 1.352845e+00 5.449000e-01 8.455284e-02 -4.913694e-01 y21 60-70 6 5.674160e+00 1.795040e+00 3.546350e-01 1.121900e-01 4.913694e-01 y22 70+ 1 1.577521e+00 9.291630e-01 1.752801e-01 1.032403e-01 -8.729608e-01 y23 70+ 3 3.157069e+00 1.420296e+00 3.507855e-01 1.578107e-01 -8.729608e-01 y24 70+ 5 4.265410e+00 1.239264e+00 4.739345e-01 1.376960e-01 8.729608e-01 CONVERGENCE INFORMATION... Original counts used. iterations = 65 , time elapsed = 0.89 norm.diff = 1.27049 = dist between last and second last iterates. Did NOT meet norm diff convergence criterion [1e-06]! norm.score = 6.42349e-08 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
REMARK: The fact that norm.diff does not converge to 0 usually means that the ML estimates do not exist owing to 0 counts; indeed, for this example, a look at the fitted values indicates that the iterate estimates approach 0 for the 6th cell. The resulting estimates can be regarded as "extended" ML estimates, which allow for fitted values equal to 0, the boundary value. For the saturated model fitted below, the print.iter=T option is used. We can monitor the iterations and see how the norm.diff criterion is not met. |
#SATURATED MODEL... a.sat <- mph.fit(y,link=L.fct,X=model.matrix(~Age,d1),print.iter=T, norm.score.conv=1e-9, strata=d$Age) mph.fit, version 3.0, 5/1/09 , running... iter= 1 norm.diff= 1.414214 norm.score= 0.005202601 iter= 2 norm.diff= 1.414214 norm.score= 0.00191393 iter= 3 norm.diff= 1.414214 norm.score= 0.0007040955 iter= 4 norm.diff= 1.414214 norm.score= 0.0002590222 iter= 5 norm.diff= 1.414214 norm.score= 9.528896e-05 iter= 6 norm.diff= 1.414214 norm.score= 3.505485e-05 iter= 7 norm.diff= 1.414214 norm.score= 1.289596e-05 iter= 8 norm.diff= 1.414214 norm.score= 4.744158e-06 iter= 9 norm.diff= 1.414214 norm.score= 1.745278e-06 iter= 10 norm.diff= 1.414214 norm.score= 6.42052e-07 iter= 11 norm.diff= 1.414214 norm.score= 2.361977e-07 iter= 12 norm.diff= 1.414214 norm.score= 8.689228e-08 iter= 13 norm.diff= 1.414214 norm.score= 3.196589e-08 iter= 14 norm.diff= 1.414214 norm.score= 1.175959e-08 iter= 15 norm.diff= 1.414214 norm.score= 4.326112e-09 iter= 16 norm.diff= 1.414214 norm.score= 1.591488e-09 iter= 17 norm.diff= 1.414214 norm.score= 5.854756e-10 iter= 18 norm.diff= 1.414214 norm.score= 2.153844e-10 iter= 19 norm.diff= 1.414214 norm.score= 7.92355e-11 iter= 20 norm.diff= 1.414214 norm.score= 2.914911e-11 iter= 21 norm.diff= 1.414214 norm.score= 1.072336e-11 iter= 22 norm.diff= 1.414214 norm.score= 3.944907e-12 iter= 23 norm.diff= 1.414214 norm.score= 1.451259e-12 iter= 24 norm.diff= 1.414214 norm.score= 5.339131e-13 iter= 25 norm.diff= 1.414214 norm.score= 1.964825e-13 iter= 26 norm.diff= 1.414214 norm.score= 7.24633e-14 Did NOT meet norm diff convergence criterion [1e-06]! Norm score convergence criterion [1e-09] was met. Time Elapsed: 0.06 seconds , Iterations: 26 mph.summary(a.sat) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 0 ): Gsq = 0 Pearson's Score Stat (df= 0 ): Xsq = 0 Generalized Wald Stat (df= 0 ): Wsq = 0 Adj Resids: 0 0 ... 0 0 , Number |Adj Resid| > 2: 0 SAMPLING PLAN INFORMATION... Number of strata: 8 Strata identifiers: 17-21, 21-25, 25-30, 30-40, 40-50, 50-60, 60-70, 70+ Strata with fixed sample sizes: all Observed strata sample sizes: 18, 24, 26, 32, 32, 28, 16, 9 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value (Intercept) 0.1049383 0.09598275 1.093303 2.742606e-01 Age21-25 0.3395062 0.11544659 2.940807 3.273580e-03 Age25-30 0.3713931 0.12049262 3.082289 2.054152e-03 Age30-40 0.3716242 0.12904010 2.879913 3.977852e-03 Age40-50 0.4048274 0.12569882 3.220614 1.279164e-03 Age50-60 0.4333270 0.11931348 3.631836 2.814116e-04 Age60-70 0.4888117 0.11346716 4.307958 1.647690e-05 Age70+ 0.4629630 0.13967541 3.314563 9.178648e-04 OBS LINK ML LINK StdErr(L) LINK RESID G.17 0.1049383 0.1049383 0.09598275 0 G.21 0.4444444 0.4444444 0.06415003 0 G.25 0.4763314 0.4763314 0.07284080 0 G.30 0.4765625 0.4765625 0.08624766 0 G.40 0.5097656 0.5097656 0.08116345 0 G.50 0.5382653 0.5382653 0.07087326 0 G.60 0.5937500 0.5937500 0.06051536 0 G.70+ 0.5679012 0.5679012 0.10147184 0 CONVERGENCE INFORMATION... Original counts used. iterations = 26 , time elapsed = 0.06 norm.diff = 1.41421 = dist between last and second last iterates. Did NOT meet norm diff convergence criterion [1e-06]! norm.score = 7.24633e-14 = norm of score at last iteration. Norm score convergence criterion [1e-09] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 3. ML Fit of Marginal Homogeneity Model
Display of unaided distance vision data from 30-39 year old FEMALE employees of United Kingdom Royal Ordnance factories during 1943-1946 (Kendall and Stuart 1961, "The Advanced Theory of Statistics, Vol. 2: Inference and Relationship," p564 and p 586. See also, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System," p. 443)
Left Grade Right Grade 1 2 3 4 Total ----------- --------------------- ----- 1 1520 266 124 66 1976 2 234 1512 432 78 2256 3 117 362 1772 205 2456 4 36 82 179 492 789 --------------------- ----- Total 1907 2222 2507 841 7477
Model of Marginal Homogeneity:
It is assumed that y <- Y ~ multinomial( n=7477, p), where the table probabilities satisfy:
pi+ = p+i, i=1,2,3, 4. That is, h(p) = [p1+ - p+1, p2+- p+2, p3+ - p+3]' = 0.
Note: The last constraint p4+ = p+4 is implied by the first three, so is omitted to avoid redundancies.
Equivalently, the model can be specified in terms of expected counts m:
mi+ = m+i, i=1,2,3, 4. That is, h(m) = [m1+ - m+1, m2+- m+2, m3+ - m+3]' = 0.
Alternatively, the marginal homogeneity model can be expressed in HLP form:
L(p) = [p1+, p2+, p3+, p4+, p+1, p+2, p+3, p+4 ]' = [b(1), b(2), b(3), b(4), b(1), b(2), b(3), b(5)]' = X b
Note: Given that (p1+ = p+1, p2+ = p+2 , p3+ = p+3) it follows that p4+ = p+4. That is, the last constraint is redundant. Hence, the introduction of b(5) in the HLP model.
y <- scan() 1520 266 124 66 234 1512 432 78 117 362 1772 205 36 82 179 492 RGrade <- gl(4,4,16) LGrade <- gl(4,1,16) data.frame(RGrade,LGrade,y) strata <- rep(1,16) MR <- M.fct(RGrade) #M.fct is included in mph.supp.Rcode.txt ML <- M.fct(LGrade) # Alternative code... # MR <- kronecker(diag(4),matrix(1,1,4)) # ML <- kronecker(matrix(1,1,4),diag(4)) h.fct <- function(p) { as.matrix(c(MR%*%p - ML%*%p)[-4]) } # Fit the h(p) = 0 version of the model... ap <- mph.fit(y,constraint=h.fct) # Fit the h(m) = 0 version of the model... am <- mph.fit(y,constraint=h.fct,h.mean=T)
#NOTE: When the model is specified in terms of m, rather than p, # mph.fit checks for Z homogeneity. In the HLP setting, mph.fit also # checks whether the link is an HLP link. # # Submitting, am <- mph.fit(y,constraint=h.fct,h.mean=T) # results in the following message: # # CHECKING whether h(.) is Z homogeneous...Necessary condition [tol=1e-09] passed. # # The necessary condition is met. There is no evidence that h(.) is not Z homogeneous. # # If the necessary condition alluded to in the message is not passed, it could be # for one of two reasons: (1) h(.) is actually not Z homogeneous or (2) the # necessary condition check was subject to too much round-off error. You can # try using larger tolerance values (default=1e-09), e.g. check.homog.tol = 1e-4. # If the condition does not pass with relatively large tolerance value, then it is # likely that h(.) is not Z homogeneous; which means that the model not only constrains the # table probabilities, it also constrains the expected sample sizes. # In this non-Z-homogeneous case, mph.fit can not be used to fit the model. # cbind(ap$Xsq,am$Xsq) PEARSON SCORE STATISTIC PEARSON SCORE STATISTIC 11.96980 11.96980 cbind(ap$m,am$m,ap$p,am$p) FV FV PROB PROB m1 1520.00000 1520.00000 0.203290090 0.203290090 m2 252.48209 252.48209 0.033767833 0.033767833 m3 111.84293 111.84293 0.014958263 0.014958263 m4 56.96589 56.96589 0.007618817 0.007618817 m5 247.23710 247.23710 0.033066350 0.033066350 m6 1512.00000 1512.00000 0.202220142 0.202220142 m7 409.41751 409.41751 0.054756923 0.054756923 m8 70.58517 70.58517 0.009440307 0.009440307 m9 131.26859 131.26859 0.017556318 0.017556318 m10 383.13268 383.13268 0.051241497 0.051241497 m11 1772.00000 1772.00000 0.236993447 0.236993447 m12 195.25849 195.25849 0.026114549 0.026114549 m13 42.78522 42.78522 0.005722245 0.005722245 m14 91.62502 91.62502 0.012254249 0.012254249 m15 188.39931 188.39931 0.025197179 0.025197179 m16 492.00000 492.00000 0.065801792 0.065801792
mph.summary(am) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 3 ): Gsq = 11.9872 (p = 0.0074271 ) Pearson's Score Stat (df= 3 ): Xsq = 11.9698 (p = 0.0074873 ) Generalized Wald Stat (df= 3 ): Wsq = 11.95657 (p = 0.0075334 ) Adj Resids: -3.179 -2.733 ... 2.733 3.179 , Number |Adj Resid| > 2: 6 SAMPLING PLAN INFORMATION... Number of strata: 1 Strata identifiers: 1 Strata with fixed sample sizes: all Observed strata sample sizes: 7477 CONVERGENCE INFORMATION... Original counts used. iterations = 8 , time elapsed = 0.04 norm.diff = 9.67905e-08 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 7.67784e-07 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09 # Alternative to numerically computing the derivative of h() with respect to m, # input the derivative explicitly...
derht.fct <- function(m) { t((MR- ML)[-4,]) } am2 <- mph.fit(y,constraint=h.fct, h.mean=T, derht.fct=derht.fct) CHECKING whether h(.) is Z homogeneous...Necessary condition [tol=1e-09] passed. mph.fit, version 3.0, 5/1/09 , running... Convergence criteria have been met. Time Elapsed: 0.03 seconds , Iterations: 8
# Fit the HLP version of the marginal homogeneity model...
L.fct <- function(p) { L <- as.matrix(c(MR%*%p,ML%*%p)) rownames(L) <- c("P(R=1)","P(R=2)","P(R=3)","P(R=4)", "P(L=1)","P(L=2)","P(L=3)","P(L=4)") L } X <- scan() 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 X <- matrix(X, 8,5,byrow=T) #L.fct = X beta, constrains the first three marginal probs to be equal. # This implies the fourth marginal probs are also equal. # ap.HLP <- mph.fit(y,link=L.fct,X=X) mph.summary(ap.HLP) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 3 ): Gsq = 11.9872 (p = 0.0074271 ) Pearson's Score Stat (df= 3 ): Xsq = 11.9698 (p = 0.0074873 ) Generalized Wald Stat (df= 3 ): Wsq = 11.97572 (p = 0.0074668 ) Adj Resids: -3.179 -2.733 ... 2.733 3.179 , Number |Adj Resid| > 2: 6 SAMPLING PLAN INFORMATION... Number of strata: 1 Strata identifiers: 1 Strata with fixed sample sizes: all Observed strata sample sizes: 7477 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value beta1 0.2596350 0.004683916 55.43119 0 beta2 0.2994837 0.004642237 64.51281 0 beta3 0.3319058 0.004827707 68.75019 0 beta4 0.1089755 0.003177685 34.29398 0 beta5 0.1089755 0.003177685 34.29398 0 OBS LINK ML LINK StdErr(L) LINK RESID P(R=1) 0.2642771 0.2596350 0.004683916 2.3908957 P(R=2) 0.3017253 0.2994837 0.004642237 0.8786678 P(R=3) 0.3284740 0.3319058 0.004827707 -1.3618671 P(R=4) 0.1055236 0.1089755 0.003177685 -2.0309320 P(L=1) 0.2550488 0.2596350 0.004683916 -2.3620900 P(L=2) 0.2971780 0.2994837 0.004642237 -0.9038095 P(L=3) 0.3352949 0.3319058 0.004827707 1.3449095 P(L=4) 0.1124783 0.1089755 0.003177685 2.0609046 CONVERGENCE INFORMATION... Original counts used. iterations = 37 , time elapsed = 0.39 norm.diff = 8.28358e-07 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 1.56281e-07 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 4. Loglinear models under different sampling plans
Bicycle Data (Table 16.4, p. 557, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System")
Wearing Helmet? Bicycle Type Yes No Mountain 34 32 Other 10 24
Fit the independence loglinear model under a variety of sampling plans.
d <- scan(what=list(bike="",helmet="",count=0)) Mountain Yes 34 Mountain No 32 Other Yes 10 Other No 24 d <- data.frame(d) y <- d$count a1 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1) # Full Multinomial a2 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1,fixed.strata="none") # Full Poisson a3 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=d$bike) # Prod (Row) Multinomial a4 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=d$helmet) # Prod (Col) Multinomial
c(a1$Xsq,a2$Xsq,a3$Xsq,a4$Xsq) [1] 4.449383 4.449383 4.449383 4.449383
cbind(a1$m,a2$m,a3$m,a4$m) FV FV FV FV m1 29.04 29.04 29.04 29.04 [Fitted values are identical] m2 36.96 36.96 36.96 36.96 m3 14.96 14.96 14.96 14.96 m4 19.04 19.04 19.04 19.04
cbind(sqrt(diag(a1$covm)), sqrt(diag(a2$covm)), sqrt(diag(a3$covm)), sqrt(diag(a4$covm))) [,1] [,2] [,3] [,4] m1 3.882984 4.848792 3.276154 2.084319 [Approx Std Errors of fitted values depend on sampling plan] m2 4.215491 5.606316 3.276154 2.652769 m3 2.681934 3.070958 1.687716 2.084319 m4 3.144132 3.675702 1.687716 2.652769
cbind(a1$beta,a2$beta,a3$beta,a4$beta) BETA BETA BETA BETA (Intercept) 3.6098362 3.6098362 3.6098362 3.6098362 bikeOther -0.6632942 -0.6632942 -0.6632942 -0.6632942 helmetYes -0.2411621 -0.2411621 -0.2411621 -0.2411621
cbind(sqrt(diag(a1$covbeta)),sqrt(diag(a2$covbeta)),sqrt(diag(a3$covbeta)),sqrt(diag(a4$covbeta))) [,1] [,2] [,3] [,4] (Intercept) 0.1140555 0.1516861 8.864053e-02 7.177406e-02 bikeOther 0.2111002 0.2111002 4.121552e-07 2.111002e-01 helmetYes 0.2014557 0.2014557 2.014557e-01 5.281921e-07
cbind(a1$p,sqrt(diag(a1$covp)), a2$p,sqrt(diag(a2$covp)), a3$p,sqrt(diag(a3$covp)),a4$p,sqrt(diag(a4$covp))) PROB PROB PROB PROB p1 0.2904 0.03882984 0.2904 0.03882984 0.44 0.04963869 0.66 0.04737088 p2 0.3696 0.04215491 0.3696 0.04215491 0.56 0.04963869 0.66 0.04737088 p3 0.1496 0.02681934 0.1496 0.02681934 0.44 0.04963869 0.34 0.04737088 p4 0.1904 0.03144132 0.1904 0.03144132 0.56 0.04963869 0.34 0.04737088
# Summarize the result for the product (row) multinomial sampling case... mph.summary(a3,T) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 1 ): Gsq = 4.55692 (p = 0.032786 ) Pearson's Score Stat (df= 1 ): Xsq = 4.44938 (p = 0.034914 ) Generalized Wald Stat (df= 1 ): Wsq = 4.33093 (p = 0.037426 ) Adj Resids: -2.109 -2.109 2.109 2.109 , Number |Adj Resid| > 2: 4 SAMPLING PLAN INFORMATION... Number of strata: 2 Strata identifiers: Mountain, Other Strata with fixed sample sizes: all Observed strata sample sizes: 66, 34 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value (Intercept) 3.6098362 8.864053e-02 4.072445e+01 0.0000000 bikeOther -0.6632942 4.121552e-07 -1.609331e+06 0.0000000 helmetYes -0.2411621 2.014557e-01 -1.197097e+00 0.2312688 OBS LINK ML LINK StdErr(L) LINK RESID link1 3.526361 3.368674 0.11281521 1.947417 link2 3.465736 3.609836 0.08864053 -2.264984 link3 2.302585 2.705380 0.11281521 -2.562617 link4 3.178054 2.946542 0.08864053 1.874599 CELL-SPECIFIC STATISTICS... strata OBS FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS y1 Mountain 34 29.04 3.276154 0.44 0.04963869 2.109356 y2 Mountain 32 36.96 3.276154 0.56 0.04963869 -2.109356 y3 Other 10 14.96 1.687716 0.44 0.04963869 -2.109356 y4 Other 24 19.04 1.687716 0.56 0.04963869 2.109356 CONVERGENCE INFORMATION... Original counts used. iterations = 5 , time elapsed = 0.05 norm.diff = 5.82122e-11 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 2.37238e-12 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
# By default, the derivative of the link was numerically computed in # the mph.fit calls above. In the log-linear model setting, the link # derivative has a simple explicit form. This derivative with # respect to m can be included as input. For example.... derLt.fct <- function(m) { diag(c(1/m)) } a11 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1,derLt.fct=derLt.fct)
EXAMPLE 5. Loglinear models of independence
Bicycle Data (Table 16.4, p. 557, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System")
Wearing Helmet? Bicycle Type Yes No Mountain 34 32 Other 10 24
Loglinear Model of Independence:
Assume that y <- multinomial(n=100, p), where the table probabilities p satisfy:
log(pij) = b(0) + b(Bike)i + b(Helmet)j ; generically log(p) = X b.
or equivalently, in constraint form,
h(p) = U'log(p) = 0, where U is an orthogonal complement of X.
This model can also be re-expressed in terms of the expected counts m:
log(mij) = b(0) + b(Bike)i + b(Helmet)j ; generically log(m) = X b.
or equivalently, in constraint form,
h(m) = U'log(m) = 0, where U is an orthogonal complement of X.
d <- scan(what=list(bike="",helmet="",count=0)) Mountain Yes 34 Mountain No 32 Other Yes 10 Other No 24 d <- data.frame(d) y <- d$count # Fit the log-linear models of independence... ap <- mph.fit(y,link="logp",X=model.matrix(~bike+helmet,d),strata=1) #log(p) = Xb am <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1) #log(m) = Xb # Fit the constraint specification of the independence model... U <- create.U(model.matrix(~bike+helmet,d)) h.fct <- function(p) { t(U)%*%log(p) } ap2 <- mph.fit(y,constraint=h.fct, strata=1) #h(p) = U'log(p) = 0 am2 <- mph.fit(y,constraint=h.fct,h.mean=T,strata=1) #h(m) = U'log(m) = 0 cbind(ap$m,am$m,ap2$m,am2$m,ap$p,am$p,ap2$p,am2$p) FV FV FV FV PROB PROB PROB PROB m1 29.04 29.04 29.04 29.04 0.2904 0.2904 0.2904 0.2904 [Same fitted values and probability estimates.] m2 36.96 36.96 36.96 36.96 0.3696 0.3696 0.3696 0.3696 m3 14.96 14.96 14.96 14.96 0.1496 0.1496 0.1496 0.1496 m4 19.04 19.04 19.04 19.04 0.1904 0.1904 0.1904 0.1904 #NOTE: When the models are specified in terms of m, rather than p, # mph.fit checks for Z homogeneity. In the HLP setting, mph.fit also # checks whether the link is an HLP link. # # Submitting, am <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1) # results in the following messages: # # CHECKING whether L(.) is an HLP link function...Necessary condition [tol=1e-09] passed. # CHECKING whether h(.) is Z homogeneous...Necessary condition [tol=1e-09] passed. # # The necessary conditions are met. There is no evidence that the model is not an HLP model. # # If the necessary conditions alluded to in the messages are not passed, you can # try using larger tolerance values, e.g. check.homog.tol = 1e-4 and check.HLP.tol= 1e-4. #
EXAMPLE 6. Logit models under different sampling plans
Bicycle Data (Table 16.4, p. 557, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System")
Wearing Helmet? Bicycle Type Yes No Mountain 34 32 Other 10 24
Logit Model:
log(pi1/pi2) = b(0) + b(Bike)i.
Candidate Sampling Distributions:
If the data are the result of a simple random sample from a single population, then pij = P(Bike=i, Helmet = j). The expected counts have the form mij = ν pij, where ν is the expected sample size, which may [full-multinomial sampling plan] or may not [full-Poisson sampling plan] be fixed a priori.
If the data are the result of a row-stratified random sample, then pij = P(Helmet = j | Bike = i) . The expected counts have the form mij = νi pij, where νi is the expected sample size for the stratum [Bike = i]. These expected sample sizes may or may not be fixed a priori.
If the data are the result of a column-stratified random sample, then pij = P(Bike= i | Helmet = j) . The expected counts have the form mij = νj pij, where νj is the expected sample size for the stratum [Helmet = j]. These expected sample sizes may or may not be fixed a priori. Technically, under this sampling plan, the "logit" model is not a log odds model at all. It actually has the form log(pi1/pi2) = log(P(Bike=i|Helmet=1)/P(Bike=i|Helmet=2)) = log(P(Helmet=1|Bike=i)/P(Helmet=2|Bike=i)) + log(P(Helmet=2)/P(Helmet=1)).
d <- scan(what=list(bike="",helmet="",count=0)) Mountain Yes 34 Mountain No 32 Other Yes 10 Other No 24 d <- data.frame(d) y <- d$count L.fct <- function(p) { p <- matrix(p,2,2,byrow=T) L <- as.matrix(c(log(p[1,1]/p[1,2]), log(p[2,1]/p[2,2]))) L } d1 <- subset(d,helmet=="Yes") d1 bike helmet count 1 Mountain Yes 34 3 Other Yes 10 a1 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=1) #Full Multinomial a2 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=1, fixed.strata="none") #Full Poisson a3 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=d$bike) #Product Multinomial a4 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=d$helmet) #Product Multinomial
cbind(a1$beta,a2$beta,a3$beta,a4$beta) BETA BETA BETA BETA (Intercept) 0.06062462 0.06062462 0.06062462 0.3017867 bikeOther -0.93609336 -0.93609336 -0.93609336 -0.9360934
cbind(sqrt(diag(a1$covbeta)),sqrt(diag(a2$covbeta)),sqrt(diag(a3$covbeta)),sqrt(diag(a4$covbeta))) [,1] [,2] [,3] [,4] (Intercept) 0.2462961 0.2462961 0.2462961 0.1416946 bikeOther 0.4498093 0.4498093 0.4498093 0.4498093 cbind(a1$p,sqrt(diag(a1$covp)),a2$p,sqrt(diag(a2$covp)),a3$p,sqrt(diag(a3$covp)),a4$p,sqrt(diag(a4$covp))) PROB PROB PROB PROB p1 0.34 0.04737088 0.34 0.04737088 0.5151515 0.06151748 0.7727273 0.06317721 p2 0.32 0.04664762 0.32 0.04664762 0.4848485 0.06151748 0.5714286 0.06613001 p3 0.10 0.03000000 0.10 0.03000000 0.2941176 0.07814249 0.2272727 0.06317721 p4 0.24 0.04270831 0.24 0.04270831 0.7058824 0.07814249 0.4285714 0.06613001
EXAMPLE 7. Very sparse table: fine-tuning the fitting algorithm
y <- c(rep(0,499),1) y [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [60] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [119] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [178] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [237] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [296] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [355] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [414] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [473] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Model Fitted: y <- Y ~ multinomial(1, p), where p1 =
p2 = ... = p500 , equivalently m1 = ... =
m500. (This is the model that states all the cell probabilities are 0.002.)
C <- cbind(1,-diag(499)) h.fct <- function(m) { C%*%m } derht.fct <- function(m) { t(C) } a <- mph.fit(y,h.fct=h.fct, h.mean=T, derht.fct=derht.fct, print.iter=T) #Does not converge!
REMARK: The MLe's exist for this model of equal probabilities. However, the many zero counts cause fitting problems. One way to mitigate fitting problems with zeroes is to begin the iterations using modified (non-zero) counts such as y+0.1. The next run does just this. By default, the modified counts y+y.eps are used until the 5th iteration, at which time the original counts y are used. |
a <- mph.fit(y,h.fct=h.fct,derht.fct=derht.fct,h.mean=T, y.eps=0.1) #Temporarily add 0.1 to counts # # Note: If derht.fct were not entered, then numerical derivatives of h wrt m would be used, and the # fitting would be quite a bit slower. # c(a$m) [1] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [20] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [39] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 ....[omitted output] [362] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [381] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [400] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [419] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [438] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [457] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [476] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [495] 0.002 0.002 0.002 0.002 0.002 0.002
# Another way to fine tune the algorithm is to set the 'step' parameter to a # smaller value (default is step=1). As an example, # a <- mph.fit(y,h.fct=h.fct,derht.fct=derht.fct,h.mean=T, step=0.5) does converge.
mph.summary(a) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 499 ): Gsq = 12.42922 (p = 1 ) Pearson's Score Stat (df= 499 ): Xsq = 499 (p = 0.49158 ) Generalized Wald Stat (df= 499 ): Wsq = 0.98008 (p = 1 ) WARNING: 100% of expected counts are less than 5. Chi-square approximation may be questionable. Adj Resids: -0.045 -0.045 ... -0.045 22.338 , Number |Adj Resid| > 2: 1 SAMPLING PLAN INFORMATION... Number of strata: 1 Strata identifiers: 1 Strata with fixed sample sizes: all Observed strata sample sizes: 1 CONVERGENCE INFORMATION... Original counts used. iterations = 52 , time elapsed = 61.01 norm.diff = 1.1344e-06 = dist between last and second last iterates. Did NOT meet norm diff convergence criterion [1e-06]! norm.score = 2.43078e-09 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
REMARK. Notice that the CONVERGENCE INFORMATION output tells you that the original counts y (not the modified counts y+y.eps) were used. |
EXAMPLE 8. Given row marginal probs = (0.2, 0.3, 0.5), column marginal probs = (0.4, 0.1, 0.5), and local odds ratios or11 = 3, or12=2, or21=1, or22=4, compute the joint distribution.
y <- matrix(1,9,1) # any reasonable positive numbers will work here A <- gl(3,3,9) B <- gl(3,1,9) MA <- M.fct(A) MB <- M.fct(B) # Alternative way to create MA and MB... # MA <- t(kronecker(diag(3),matrix(1,3,1))) # MB <- t(kronecker(matrix(1,3,1),diag(3))) # h.fct <- function(p) { or11 <- p[1]*p[5]/(p[2]*p[4]) or12 <- p[2]*p[6]/(p[3]*p[5]) or21 <- p[4]*p[8]/(p[5]*p[7]) or22 <- p[5]*p[9]/(p[6]*p[8]) h <- c( MA%*%p - c(0.2,0.3,0.5), MB%*%p - c(0.4,0.1,0.5), or11-3,or12-2,or21-1,or22-4 ) as.matrix(h[c(-3,-6)]) #Delete the two redundant marginal prob constraints. } a <- mph.fit(y,h.fct=h.fct) a$p PROB p1 0.15913038 p2 0.01804733 p3 0.02282230 p4 0.13631718 p5 0.04638010 p6 0.11730272 p7 0.10455244 p8 0.03557257 p9 0.35987499 # # When both h.fct and (L.fct, X) are input, mph.fit ignores any constraints imposed by L.fct = X b, but # it does compute ML estimates of L and b (under the model with constraint h.fct=0). # # This feature of mph.fit can be exploited to output estimates of functions of p. # For example,.... # L.fct <- function(p) { or11 <- p[1]*p[5]/(p[2]*p[4]) or12 <- p[2]*p[6]/(p[3]*p[5]) or21 <- p[4]*p[8]/(p[5]*p[7]) or22 <- p[5]*p[9]/(p[6]*p[8]) L <- as.matrix(c(MA%*%p,MB%*%p, or11,or12,or21,or22)) rownames(L) <- c("prow1","prow2","prow3","pcol1","pcol2","pcol3", "or11","or12","or21","or22") L } X <- diag(10) b <- mph.fit(y,h.fct=h.fct,L.fct=L.fct,X=X) mph.summary(b)
MODEL GOODNESS OF FIT: Test of Ho: h(p)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 8 ): Gsq = 7.36137 (pval = 0.4982 ) Pearson's Score Stat (df= 8 ): Xsq = 11.37638 (pval = 0.1813 ) Generalized Wald Stat (df= 8 ): Wsq = 13.37778 (pval = 0.0995 ) WARNING: 100% of expected counts are less than 5. Chi-square approximation may be questionable. Adj Resids: -1.555 -0.394 ... 1.774 2.097 , Number |Adj Resid| > 2: 1 SAMPLING PLAN INFORMATION... Number of strata: 1 Strata identifiers: 1 Strata with fixed sample sizes: all Observed strata sample sizes: 9 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value beta1 0.2 0 Inf 0 beta2 0.3 0 535095116 0 beta3 0.5 0 Inf 0 beta4 0.4 0 217202655 0 beta5 0.1 0 Inf 0 beta6 0.5 0 Inf 0 beta7 3.0 0 Inf 0 beta8 2.0 0 Inf 0 beta9 1.0 0 27493920 0 beta10 4.0 0 51205638 0 OBS LINK ML LINK StdErr(L) LINK RESID prow1 0.3333 0.2 0 1.0000 prow2 0.3333 0.3 0 0.2182 prow3 0.3333 0.5 0 -1.0000 pcol1 0.3333 0.4 0 -0.4082 pcol2 0.3333 0.1 0 2.3333 pcol3 0.3333 0.5 0 -1.0000 or11 1.0000 3.0 0 -0.2101 or12 1.0000 2.0 0 -0.1319 or21 1.0000 1.0 0 0.0000 or22 1.0000 4.0 0 -0.2881 CONVERGENCE INFORMATION... Original counts used. iterations = 7 , time elapsed = 0.03 norm.diff = 9.18893e-09 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 1.23277e-10 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/16/09
EXAMPLE 9. Model constraint: area under ROC manifest "curve" (AUC) equal to a constant.
Neonatal Radiograph Data (source: see Table 1, Lang and Aspelund (2001), Statistical Modelling--An International Journal)
Video Image Rating 1 2 3 4 5 Total ------------------- ----- Normals 6 17 4 5 1 33 Abnormals 4 5 5 15 38 67
Models Fitted Below...
Sampling Distribution: Product Multinomial
(Yi1, ..., Yi5) indep ~ mult(ni, pi1, ... , pi5), i=1,2.
Here, n1 = 33 and n2 = 67. The table probabilities are defined as pij = P(Rating=j| Strata=i).
Systematic Component:
Constraint: In words, "Area under the manifest ROC `curve' equals 0.9." In symbols, h(p) = AUC(p) - 0.9 = 0, where AUC(p) = the area under the manifest ROC "curve." Specifically,
AUC(p) = p11*(p22 + p23 + p24 + p25) + p12*(p23 + p24 + p25) + p13*(p24 + p25) + p14*p25 + 0.5*(p11*p21 + p12*p22 + ... + p15*p25) = P(B > A) + 0.5*P(B=A), where A ~ (p11, ..., p15) and B ~ (p21, ..., p25).
y <- scan() 6 17 4 5 1 4 5 5 15 38 strata <- c(rep("Normals",5),rep("Abnormals",5)) h.fct <- function(p) { p[1]*(p[7]+p[8]+p[9]+p[10]) + p[2]*(p[8]+p[9]+p[10]) + p[3]*(p[9]+p[10]) + p[4]*(p[10]) + 0.5*(p[1]*p[6]+p[2]*p[7]+p[3]*p[8]+p[4]*p[9]+p[5]*p[10]) - 0.90 } # # To have mph.fit output the observed and fitted value of AUC(p), we augment the model # specification by including both h.fct and (L.fct, X) as input. When h.fct and (L.fct,X), # are input, mph.fit ignores any constraints imposed by L.fct = X b, but # it does compute ML estimates of L(p) and b (under the model with constraint h.fct=0). # # This feature of mph.fit can be exploited to output estimates of function of p, here, AUC(p)... # L.fct <- function(p) { L <- as.matrix( p[1]*(p[7]+p[8]+p[9]+p[10]) + p[2]*(p[8]+p[9]+p[10]) + p[3]*(p[9]+p[10]) + p[4]*(p[10]) + 0.5*(p[1]*p[6]+p[2]*p[7]+p[3]*p[8]+p[4]*p[9]+p[5]*p[10]) ) rownames(L) <- "AUC" L } a <- mph.fit(y,h.fct=h.fct,L.fct=L.fct,X=as.matrix(1),strata=strata) mph.summary(a,T) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 1 ): Gsq = 1.94502 (p = 0.16312 ) Pearson's Score Stat (df= 1 ): Xsq = 2.19474 (p = 0.13848 ) Generalized Wald Stat (df= 1 ): Wsq = 1.52065 (p = 0.21752 ) Adj Resids: -1.481 -1.481 ... 1.481 1.481 , Number |Adj Resid| > 2: 0 SAMPLING PLAN INFORMATION... Number of strata: 2 Strata identifiers: Abnormals, Normals Strata with fixed sample sizes: all Observed strata sample sizes: 67, 33 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value beta1 0.9 4.418826e-10 2036740159 0 OBS LINK ML LINK StdErr(L) LINK RESID AUC 0.85346 0.9 4.418826e-10 -1.505451 CELL-SPECIFIC STATISTICS... strata OBS FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS y1 Normals 6 6.7996921 2.2599169 0.20605128 0.06848233 -1.481464 y2 Normals 17 17.8647239 2.8022957 0.54135527 0.08491805 -1.481464 y3 Normals 4 3.8316273 1.8367962 0.11610992 0.05566049 1.481464 y4 Normals 5 3.9681015 1.7337208 0.12024550 0.05253699 1.481464 y5 Normals 1 0.5358551 0.6549779 0.01623803 0.01984781 1.481464 y6 Abnormals 4 2.5477157 1.2205911 0.03802561 0.01821778 1.481464 y7 Abnormals 5 3.8380538 1.7329256 0.05728439 0.02586456 1.481464 y8 Abnormals 5 4.6833212 2.0761170 0.06990032 0.03098682 1.481464 y9 Abnormals 15 15.2579803 3.4282561 0.22773105 0.05116800 -1.481464 y10 Abnormals 38 40.6729290 3.5674591 0.60705864 0.05324566 -1.481464 CONVERGENCE INFORMATION... Original counts used. iterations = 56 , time elapsed = 0.25 norm.diff = 8.33704e-07 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 1.07391e-07 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 10. Marginal homogeneity of multidimensional contingency tables
3x3x3 Table. See S. Kullback, Annals of Mathematical Statistics, Vol. 42, No. 2. (Apr., 1971), pp. 594-606.
y <- scan() 223 24 6 40 42 2 19 4 12 28 6 9 25 218 6 3 13 9 26 3 18 18 30 24 12 16 164 A <- gl(3,9,27) B <- gl(3,3,27) C <- gl(3,1,27) data.frame(A,B,C,y) MA <- M.fct(A) MB <- M.fct(B) MC <- M.fct(C) # Alternatively, the M matrices can be computed as follows: # MA <- kronecker(diag(3),matrix(1,1,9))) # MB <- kronecker(matrix(1,1,3),kronecker(diag(3),matrix(1,1,3))) # MC <- kronecker(matrix(1,1,9),diag(3)) C.mat <- scan() 1 0 0 -1 0 0 0 0 0 1 0 0 0 0 0 -1 0 0 0 1 0 0 -1 0 0 0 0 0 1 0 0 0 0 0 -1 0 C.mat <- matrix(C.mat,4,9,byrow=T) L.fct <- function(p) { pmA <- MA%*%p pmB <- MB%*%p pmC <- MC%*%p L <- as.matrix(c(pmA,pmB,pmC)) rownames(L) <- c("P(A=1)","P(A=2)","P(A=3)","P(B=1)","P(B=2)","P(B=3)","P(C=1)","P(C=2)","P(C=3)") L } h.fct <- function(p) { C.mat%*%L.fct(p) } a <- mph.fit(y,h.fct=h.fct,L.fct=L.fct,X=diag(9)) mph.summary(a) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 4 ): Gsq = 50.39723 (p = 2.983e-10 ) Pearson's Score Stat (df= 4 ): Xsq = 48.9876 (p = 5.8737e-10 ) Generalized Wald Stat (df= 4 ): Wsq = 49.87851 (p = 3.828e-10 ) Adj Resids: -5.556 -4.739 ... 5.556 6.357 , Number |Adj Resid| > 2: 16 SAMPLING PLAN INFORMATION... Number of strata: 1 Strata identifiers: 1 Strata with fixed sample sizes: all Observed strata sample sizes: 1000 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value beta1 0.3687506 0.01291861 28.54413 0 beta2 0.3569100 0.01271853 28.06221 0 beta3 0.2743394 0.01210117 22.67047 0 beta4 0.3687506 0.01291861 28.54413 0 beta5 0.3569100 0.01271853 28.06221 0 beta6 0.2743394 0.01210117 22.67047 0 beta7 0.3687506 0.01291861 28.54413 0 beta8 0.3569100 0.01271853 28.06221 0 beta9 0.2743394 0.01210117 22.67047 0 OBS LINK ML LINK StdErr(L) LINK RESID P(A=1) 0.372 0.3687506 0.01291861 0.4003281 P(A=2) 0.317 0.3569100 0.01271853 -4.8482153 P(A=3) 0.311 0.2743394 0.01210117 5.0529720 P(B=1) 0.343 0.3687506 0.01291861 -3.1724923 P(B=2) 0.405 0.3569100 0.01271853 5.8419001 P(B=3) 0.252 0.2743394 0.01210117 -3.0790551 P(C=1) 0.394 0.3687506 0.01291861 3.1107435 P(C=2) 0.356 0.3569100 0.01271853 -0.1105505 P(C=3) 0.250 0.2743394 0.01210117 -3.3547171 CONVERGENCE INFORMATION... Original counts used. iterations = 56 , time elapsed = 0.8 norm.diff = 7.23424e-07 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 1.90481e-07 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 11. Contingency tables with given marginals.
3x4 Table. See C. T. Ireland, S. Kullback, Biometrika, Vol. 55, No. 1. (Mar., 1968), pp. 179-188.
Model Fitted Below:
y <- Y ~ multinomial(n, p), where the table probabilities p satisfy:
h(p) = [p1+, p2+, p+1, p+2, p+3] - [0.7837288, 0.14831812, 0.07827901, 0.46148631, 0.29658409] = 0.
y <- scan() 783 7426 4709 2145 517 928 622 703 207 373 337 425 V1 <- gl(3,4,12) V2 <- gl(4,1,12) M1 <- M.fct(V1) M2 <- M.fct(V2) # Alternatively, the M matrices can be computed as # M1 <- kronecker(diag(3),matrix(1,1,4)) # M2 <- kronecker(matrix(1,1,3),diag(4)) marg <- scan() 15028 2844 1303 1501 8849 5687 3138 margprob <- marg/19175 A <- scan() 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 A <- matrix(A,5,7,byrow=T) h.fct <- function(p) { obsmp <- rbind(M1%*%p,M2%*%p) A%*%(obsmp - margprob) } a <- mph.fit(y=y,h.fct=h.fct) # Using the augmentation trick as described in Examples 8 and 9... L.fct <- function(p) { L <- as.matrix(c(M1%*%p,M2%*%p)) rownames(L) <- c("p[1,+]","p[2,+]","p[3,+]","p[+,1]","p[+,2]","p[+,3]","p[+,4]") L } a2 <- mph.fit(y=y,h.fct=h.fct,L.fct=L.fct,X=diag(7)) mph.summary(a2)
MODEL GOODNESS OF FIT: Test of Ho: h(p)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 5 ): Gsq = 11.30702 (pval = 0.04562 ) Pearson's Score Stat (df= 5 ): Xsq = 11.37322 (pval = 0.04446 ) Generalized Wald Stat (df= 5 ): Wsq = 11.17887 (pval = 0.04795 ) Adj Resids: -2.372 -1.736 ... 2.15 2.816 , Number |Adj Resid| > 2: 3 SAMPLING PLAN INFORMATION... Number of strata: 1 Strata identifiers: 1 Strata with fixed sample sizes: all Observed strata sample sizes: 19175 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value beta1 0.7837 0 15782181923 0 beta2 0.1483 0 6489036323 0 beta3 0.0680 0 1838678421 0 beta4 0.0783 0 Inf 0 beta5 0.4615 0 17166239010 0 beta6 0.2966 0 7509947313 0 beta7 0.1637 0 3517050151 0 OBS LINK ML LINK StdErr(L) LINK RESID p[1,+] 0.7856 0.7837 0 0.6139 p[2,+] 0.1445 0.1483 0 -1.5036 p[3,+] 0.0700 0.0680 0 1.1191 p[+,1] 0.0786 0.0783 0 0.1613 p[+,2] 0.4551 0.4615 0 -1.7673 p[+,3] 0.2956 0.2966 0 -0.3004 p[+,4] 0.1707 0.1637 0 2.6352 CONVERGENCE INFORMATION... Original counts used. iterations = 38 , time elapsed = 0.15 norm.diff = 4.5344e-07 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 5.67182e-08 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/16/09 cbind(a2$L,margprob) ML LINK margprob p[1,+] 0.78372881 0.78372881 p[2,+] 0.14831812 0.14831812 p[3,+] 0.06795306 0.06795306 p[+,1] 0.07827901 0.07827901 p[+,2] 0.46148631 0.46148631 p[+,3] 0.29658409 0.29658409 p[+,4] 0.16365059 0.16365059 Note: The fitted marginal probabilities do match the given marginal probabilities
EXAMPLE 12. Testing for (conditional) pairwise independence when response includes "unknown" category.
3x3x3 Table. For a description of the model, see Michael Haber's Example 2, p. 433, in Biometrics (in Shorter Communications), Vol. 42, No. 2. (Jun., 1986), pp. 429-435.
y <- scan() 201 28 37 21 8 7 12 0 0 27 9 5 14 4 4 2 0 0 142 15 0 27 12 0 0 0 0 #Counts are entered as y[i,j,k], where right most subscripts move fastest, i,j,k=3. A <- gl(3,9,27) B <- gl(3,3,27) C <- gl(3,1,27) MAB <- M.fct(paste(A,B)) MAC <- M.fct(paste(A,C)) MBC <- M.fct(paste(B,C)) # Alternatively, # MAB <- kronecker(diag(9),matrix(1,1,3)) # MAC <- kronecker(diag(3),kronecker(matrix(1,1,3),diag(3))) # MBC <- kronecker(matrix(1,1,3),diag(9)) M <- rbind(MAB,MAC,MBC) Mr <- M[-c(3,6,7,8,9,12,15,16,17,18,21,24,25,26,27),] # Mr*p gives only those marginal probabilities that do NOT involve the last unknown category. C <- scan() 1 -1 -1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 -1 -1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 -1 -1 1 C <- matrix(C,3,12,byrow=T) L.fct <- function(m) { p <- m/sum(m) C%*%log(Mr%*%p) } X <- scan() 1 1 0 1 0 1 1 0 0 X <- matrix(X,3,3,byrow=T) h.fct <- function(m) { L.fct(m) } a <- mph.fit(y=y,h.fct=h.fct,step=0.5,maxiter=200,print.iter=T) #zero counts require some fine tuning mph.fit, version 3.0, 5/1/09 , running... iter= 1 [ 0 ] norm.diff= 9.519322 norm.score= 1.945013 iter= 2 [ 0 ] norm.diff= 5.295804 norm.score= 2.219742 iter= 3 [ 0 ] norm.diff= 2.827484 norm.score= 1.714237 iter= 4 [ 0 ] norm.diff= 1.75119 norm.score= 1.205030 iter= 5 [ 0 ] norm.diff= 1.603112 norm.score= 0.7939237 iter= 6 [ 0 ] norm.diff= 1.595768 norm.score= 0.5042983 iter= 7 [ 0 ] norm.diff= 1.596907 norm.score= 0.310447 iter= 8 [ 0 ] norm.diff= 1.596947 norm.score= 0.1872117 iter= 9 [ 0 ] norm.diff= 1.596726 norm.score= 0.1111047 iter= 10 [ 0 ] norm.diff= 1.596618 norm.score= 0.06525863 iter= 11 [ 0 ] norm.diff= 1.596599 norm.score= 0.03803102 iter= 12 [ 0 ] norm.diff= 1.596611 norm.score= 0.02205917 [omitted lines] iter= 39 [ 0 ] norm.diff= 1.596669 norm.score= 3.832624e-08 iter= 40 [ 0 ] norm.diff= 1.596669 norm.score= 2.224625e-08 iter= 41 [ 0 ] norm.diff= 1.596669 norm.score= 1.443151e-08 Did NOT meet norm diff convergence criterion [1e-06]! Norm score convergence criterion [1e-06] was met. Time Elapsed: 0.28 seconds , Iterations: 41 mph.summary(a,T) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 3 ): Gsq = 33.25029 (p = 2.852e-07 ) Pearson's Score Stat (df= 3 ): Xsq = 38.11701 (p = 2.6698e-08 ) Generalized Wald Stat (df= 3 ): Wsq = 33.77626 (p = 2.2088e-07 ) Adj Resids: -5.526 -4.59 ... 5.749 5.779 , Number |Adj Resid| > 2: 15 SAMPLING PLAN INFORMATION... Number of strata: 1 Strata identifiers: 1 Strata with fixed sample sizes: all Observed strata sample sizes: 575 CELL-SPECIFIC STATISTICS... strata OBS FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS y1 1 201 1.826391e+02 1.069712e+01 3.176332e-01 1.860368e-02 5.749431e+00 y2 1 28 3.712722e+01 5.169310e+00 6.456908e-02 8.990104e-03 -3.225310e+00 y3 1 37 3.531513e+01 5.745554e+00 6.141762e-02 9.992268e-03 4.589575e+00 y4 1 21 3.395460e+01 5.143368e+00 5.905148e-02 8.944987e-03 -5.526221e+00 y5 1 8 5.326702e+00 1.879169e+00 9.263829e-03 3.268120e-03 2.023091e+00 y6 1 7 9.337218e+00 2.987685e+00 1.623864e-02 5.195974e-03 -4.589575e+00 y7 1 12 1.174490e+01 3.389471e+00 2.042592e-02 5.894732e-03 1.986529e+00 y8 1 0 1.413466e-10 1.188893e-05 2.458203e-13 2.067640e-08 -1.413466e-10 y9 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11 y10 1 27 3.866533e+01 5.286542e+00 6.724404e-02 9.193986e-03 -4.094285e+00 y11 1 9 7.836827e+00 1.643760e+00 1.362926e-02 2.858713e-03 5.187327e-01 y12 1 5 6.490560e+00 2.512333e+00 1.128793e-02 4.369275e-03 -4.589575e+00 y13 1 14 7.183925e+00 1.831879e+00 1.249378e-02 3.185877e-03 3.525268e+00 y14 1 4 1.102329e+00 9.212980e-01 1.917094e-03 1.602257e-03 5.778889e+00 y15 1 4 1.814156e+00 1.257619e+00 3.155055e-03 2.187164e-03 4.589575e+00 y16 1 2 2.230031e+00 1.485925e+00 3.878315e-03 2.584218e-03 -1.986529e+00 y17 1 0 1.754106e-16 1.324427e-08 3.050619e-19 2.303351e-11 -1.754106e-16 y18 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11 y19 1 142 1.377167e+02 1.016841e+01 2.395074e-01 1.768419e-02 3.705642e+00 y20 1 15 1.821994e+01 4.109449e+00 3.168686e-02 7.146868e-03 -3.705642e+00 y21 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11 y22 1 27 3.187476e+01 5.327032e+00 5.543437e-02 9.264404e-03 -3.705642e+00 y23 1 12 6.420549e+00 2.020355e+00 1.116617e-02 3.513660e-03 3.705642e+00 y24 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11 y25 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11 y26 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11 y27 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11 CONVERGENCE INFORMATION... Original counts used. iterations = 41 , time elapsed = 0.28 norm.diff = 1.59667 = dist between last and second last iterates. Did NOT meet norm diff convergence criterion [1e-06]! norm.score = 1.44315e-08 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 13. Kappa regression models.
Analysis of Multiple Sclerosis Data. See Landis and Koch (Biometrics 1977) for data source and a WLS analysis. See Lang (HLP 2002) for a more detailed description of the models and the corresponding ML analyses outlined below.
Model Systematic Components:
DIRECT SMOOTHING MODEL: K(p) = Xb, where K(p) is the vector of eight weighted kappa statistics. The design matrix X forces certain kappa coefficients to be equal.
INDIRECT SMOOTHING MODEL: L(p) = [L1(p), L2(p)], where L1(p) = log(p), L2(p) = K(p), the eight kappa coefficients. The model constrains the data parameters through
L1(p) = Xb, L2(p) = X2 κ, where X2 = diag(8), the 8x8 identity matrix.
The model L1(p) = Xb has (k,i,j) component of the form
log(pkij) = bk + b(A)ki + b(B)kj + b(A*B)k*i*j + b(agree)k*I(i=j), where k indexes the two populations, i and j are the responses by neurologist A and B.
The models have the generic form L(p) = Xb. They can be seen to be HLP models.
#CREATE Weighted Kappa function... kappa.weighted.fct <- function(w,p,nr) { # # Computes Weighted Kappa Statistic # See Landis and Koch 1977. # Author: Joseph B. Lang # Created: June 7, 2002 # # Input: w = vectorized version of two-way # table of weights, w = (w11,w12,...,wRR) # p = vectorized version of two-way # probabilities, p = (p11,p12,...,pRR) # nr = number of rows (and columns) in the table. # Output: Weighted kappa. # p <- matrix(p,nr,nr,byrow=T) prow <- apply(p,1,sum) pcol <- apply(p,2,sum) pind <- prow%*%t(pcol) pind <- matrix(t(pind),nr*nr,1) p <- as.matrix(c(t(p))) w <- as.matrix(w) obsa <- t(w)%*%p expa <- t(w)%*%pind c((obsa - expa)/(1-expa)) } w1 <- scan() 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 w2 <- scan() 1 1 0 0 1 1 0 0 0 0 1 0 0 0 0 1 w3 <- scan() 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 1 w4 <- scan() 1 1 0 0 1 1 1 0 0 1 1 1 0 0 1 1 y <- scan() 38 5 0 1 33 11 3 0 10 14 5 6 3 7 3 10 5 3 0 0 3 11 4 0 2 13 3 4 1 2 4 14 strata <- c(rep("Popn1",16),rep("Popn2",16)) # Fit the DIRECT SMOOTHING MODEL L.fct <- function(p) { p1 <- p[1:16] p2 <- p[17:32] L <- rbind( kappa.weighted.fct(w1,p1 ,4), kappa.weighted.fct(w2,p1 ,4), kappa.weighted.fct(w3,p1 ,4), kappa.weighted.fct(w4,p1 ,4), kappa.weighted.fct(w1,p2 ,4), kappa.weighted.fct(w2,p2 ,4), kappa.weighted.fct(w3,p2 ,4), kappa.weighted.fct(w4,p2 ,4)) rownames(L) <- c("K(w1,Popn1)","K(w2,Popn1)","K(w3,Popn1)","K(w4,Popn1)", "K(w1,Popn2)","K(w2,Popn2)","K(w3,Popn2)","K(w4,Popn2)") L } XK <- scan() 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 XK <- matrix(XK,8,5,byrow=T) a <- mph.fit(y,L.fct=L.fct,X=XK,norm.diff.conv=10,norm.score.conv=1e-8,strata=strata) # Don't forget to input the strata variable, as this determines the definition of the table probabilities. mph.summary(a) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 3 ): Gsq = 2.16255 (p = 0.53936 ) Pearson's Score Stat (df= 3 ): Xsq = 2.11354 (p = 0.54918 ) Generalized Wald Stat (df= 3 ): Wsq = 2.26792 (p = 0.51869 ) Adj Resids: -1.392 -1.379 ... 1.412 1.413 , Number |Adj Resid| > 2: 0 SAMPLING PLAN INFORMATION... Number of strata: 2 Strata identifiers: Popn1, Popn2 Strata with fixed sample sizes: all Observed strata sample sizes: 149, 69 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value beta1 0.2349553 0.04241687 5.539195 3.038648e-08 beta2 0.3150272 0.04935842 6.382442 1.742859e-10 beta3 0.3888738 0.05743424 6.770766 1.281020e-11 beta4 0.5831200 0.06909629 8.439236 0.000000e+00 beta5 0.7934268 0.08080265 9.819316 0.000000e+00 OBS LINK ML LINK StdErr(L) LINK RESID K(w1,Popn1) 0.2079425 0.2349553 0.04241687 -0.97835070 K(w2,Popn1) 0.3275399 0.3150272 0.04935842 0.31584009 K(w3,Popn1) 0.4081121 0.3888738 0.05743424 0.44185351 K(w4,Popn1) 0.5964663 0.5831200 0.06909629 0.41284558 K(w1,Popn2) 0.2965166 0.2349553 0.04241687 0.94238307 K(w2,Popn2) 0.3324734 0.3150272 0.04935842 0.26087642 K(w3,Popn2) 0.3864188 0.3888738 0.05743424 -0.02982431 K(w4,Popn2) 0.7893773 0.7934268 0.08080265 -0.12157636 CONVERGENCE INFORMATION... Original counts used. iterations = 16 , time elapsed = 5.17 norm.diff = 2.26061 = dist between last and second last iterates. Norm diff convergence criterion [10] was met. norm.score = 5.97723e-09 = norm of score at last iteration. Norm score convergence criterion [1e-08] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09 # Fit the INDIRECT SMOOTHING MODEL L.fct <- function(p) { p1 <- p[1:16] p2 <- p[17:32] L <- rbind(log(p), kappa.weighted.fct(w1,p1 ,4), kappa.weighted.fct(w2,p1 ,4), kappa.weighted.fct(w3,p1 ,4), kappa.weighted.fct(w4,p1 ,4), kappa.weighted.fct(w1,p2 ,4), kappa.weighted.fct(w2,p2 ,4), kappa.weighted.fct(w3,p2 ,4), kappa.weighted.fct(w4,p2 ,4)) rownames(L) <- c(paste(sep=".","logp",1:32), "K(w1,Popn1)","K(w2,Popn1)","K(w3,Popn1)","K(w4,Popn1)", "K(w1,Popn2)","K(w2,Popn2)","K(w3,Popn2)","K(w4,Popn2)") L } A <- factor(gl(4,4)) B <- factor(gl(4,1,16)) Ascore <- c(A) Bscore <- c(B) agree <- scan() 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 XA1 <- model.matrix(~A + B + Ascore:Bscore + agree) XA2 <- XA1 X <- block.fct(XA1,XA2,diag(8)) #block.fct is available in mph.supp.Rcode.txt b <- mph.fit(y,L.fct=L.fct,X=X,strata=strata) mph.summary(b) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 14 ): Gsq = 18.25286 (p = 0.19551 ) Pearson's Score Stat (df= 14 ): Xsq = 22.48127 (p = 0.069252 ) Generalized Wald Stat (df= 14 ): Wsq = 13.13214 (p = 0.51615 ) Adj Resids: -1.735 -1.675 ... 2.314 2.748 , Number |Adj Resid| > 2: 3 SAMPLING PLAN INFORMATION... Number of strata: 2 Strata identifiers: Popn1, Popn2 Strata with fixed sample sizes: all Observed strata sample sizes: 149, 69 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value (Intercept) -2.18195695 0.21789484 -10.01380733 0.000000e+00 A2 -0.98252487 0.34280421 -2.86614005 4.155104e-03 A3 -2.63574417 0.59030845 -4.46502870 8.005802e-06 A4 -5.07053731 1.00283130 -5.05622162 4.276444e-07 B2 -2.50209975 0.36908649 -6.77916906 1.208700e-11 B3 -5.95129014 0.85057662 -6.99677137 2.619238e-12 B4 -8.20966143 1.38875396 -5.91153053 3.389434e-09 agree -0.02783032 0.24285556 -0.11459617 9.087652e-01 Ascore:Bscore 0.80384747 0.15516362 5.18064414 2.211210e-07 (Intercept) -3.82657082 0.40135119 -9.53422059 0.000000e+00 A2 -0.93504420 0.59981172 -1.55889618 1.190210e-01 A3 -3.00645110 1.21010157 -2.48446177 1.297474e-02 A4 -6.18631027 2.08586141 -2.96582997 3.018673e-03 B2 -1.26080589 0.64291838 -1.96106681 4.987123e-02 B3 -5.23293410 1.44644377 -3.61779298 2.971259e-04 B4 -8.36055037 2.47380527 -3.37963156 7.258306e-04 agree 0.02769186 0.34868656 0.07941764 9.367004e-01 Ascore:Bscore 1.04115569 0.29708388 3.50458495 4.573196e-04 [3.1] 0.20794246 0.05113013 4.06692644 4.763727e-05 [3.2] 0.33160222 0.05237560 6.33123511 2.432063e-10 [3.3] 0.41743362 0.06030132 6.92246222 4.438672e-12 [3.4] 0.55622648 0.07116497 7.81601533 5.551115e-15 [3.5] 0.29651657 0.07818321 3.79258641 1.490863e-04 [3.6] 0.38035962 0.07129014 5.33537446 9.534758e-08 [3.7] 0.47547451 0.07625832 6.23505065 4.516318e-10 [3.8] 0.73473413 0.09436958 7.78570910 6.883383e-15 OBS LINK ML LINK StdErr(L) LINK RESID logp.1 -1.3663601 -1.4059398 0.13749315 9.427814e-01 logp.2 -3.3945084 -3.0763617 0.28963410 -1.357818e+00 logp.3 -Inf -5.7217047 0.54095671 -Inf logp.4 -5.0039463 -7.1762285 0.87175123 7.674599e-01 logp.5 -1.5074387 -1.5567869 0.15132358 1.046012e+00 logp.6 -2.6060510 -2.4790220 0.24826770 -1.173576e+00 logp.7 -3.9053340 -4.2926871 0.37191660 6.585275e-01 logp.8 -Inf -4.9433634 0.51585603 -Inf logp.9 -2.7013612 -2.4061587 0.20273772 -1.809184e+00 logp.10 -2.3648890 -2.4967160 0.21627385 7.877068e-01 logp.11 -3.3945084 -3.5621943 0.36104443 5.317376e-01 logp.12 -3.2121868 -3.3811928 0.31556472 5.600586e-01 logp.13 -3.9053340 -4.0371044 0.44794027 3.168800e-01 logp.14 -3.0580362 -3.3238142 0.27927295 8.336371e-01 logp.15 -3.9053340 -3.5576147 0.36606789 -1.129845e+00 logp.16 -2.7013612 -2.6284264 0.26455789 -5.719039e-01 logp.17 -2.6246686 -2.7577233 0.40026306 5.738569e-01 logp.18 -3.1354942 -3.0050653 0.42768254 -4.227939e-01 logp.19 -Inf -5.9360378 0.76941354 -Inf logp.20 -Inf -8.0224984 1.42386504 -Inf logp.21 -3.1354942 -2.6793036 0.38587064 -2.085649e+00 logp.22 -1.8362112 -1.8301063 0.25548478 -5.932316e-02 logp.23 -2.8478121 -3.7476149 0.45024654 1.427064e+00 logp.24 -Inf -4.7929198 0.71595805 -Inf logp.25 -3.5409593 -3.7095548 0.48406780 2.878691e-01 logp.26 -1.6691571 -1.8468936 0.24432611 1.336046e+00 logp.27 -3.1354942 -2.6678629 0.38466892 -2.171738e+00 logp.28 -2.8478121 -2.6997039 0.35137338 -5.315800e-01 logp.29 -4.2341065 -5.8482583 1.00711518 8.075958e-01 logp.30 -3.5409593 -2.9444414 0.40136378 -1.888459e+00 logp.31 -2.8478121 -2.7519469 0.38094578 -3.688923e-01 logp.32 -1.5950492 -1.6872485 0.23696240 1.051840e+00 K(w1,Popn1) 0.2079425 0.2079425 0.05113013 1.854186e-11 K(w2,Popn1) 0.3275399 0.3316022 0.05237560 -1.162164e-01 K(w3,Popn1) 0.4081121 0.4174336 0.06030132 -2.387584e-01 K(w4,Popn1) 0.5964663 0.5562265 0.07116497 1.160200e+00 K(w1,Popn2) 0.2965166 0.2965166 0.07818321 9.511947e-12 K(w2,Popn2) 0.3324734 0.3803596 0.07129014 -1.198556e+00 K(w3,Popn2) 0.3864188 0.4754745 0.07625832 -1.556305e+00 K(w4,Popn2) 0.7893773 0.7347341 0.09436958 1.884445e+00 CONVERGENCE INFORMATION... Original counts used. iterations = 6 , time elapsed = 2.36 norm.diff = 1.27439e-08 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 1.20776e-08 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 14. Marginal cumulative probit model (in conjunction with loglinear association model)
Marijuana use among U.S. male teens. Data Source: U.S. National Youth Survey (Elliott et al. 1989).
Use at 17 0 1 2 -------------- 0 56 13 7 Use at 15 1 6 4 10 2 1 5 15 0 = None, 1=Used no more than once, 2=Used more than once per month in the past year
The models that are fitted below have the following form...
L(p) = [L1(p), L2(p)] = [X1b, X2a], where L1(p) = Φ-1[p2++ p3+, p3+, p+2 + p+3, p+3] and L2(p) = log(p).
Here, Φ(.) is the standard normal pdf.
The linear predictor for the marginal probit model is defined as
X1b = [b(cut1), b(cut2), b(cut1)+b(AGE), b(cut2)+b(AGE)],
and the loglinear association model L2(p) = log(p) = X2a has (i,j) component
log(pij) = a + a(Use15)i + a(Use17)j + a*Use15.scorei*Use17.scorej [linear-by-linear association model] or
log(pij) = a + a(Use15)i + a(Use17)j [independence association model].
The models, which have the generic form L(p) = Xb, are HLP models.
d <- scan(what=list(Use15="",Use17="",count=0)) 0 0 56 0 1 13 0 2 7 1 0 6 1 1 4 1 2 10 2 0 1 2 1 5 2 2 15 d <- data.frame(d) Use15.score <- c(d$Use15)-1 Use17.score <- c(d$Use17)-1 d <- data.frame(d,Use15.score,Use17.score) y <- d$count M15 <- M.fct(d$Use15) M17 <- M.fct(d$Use17) L.fct <- function(p) { p15 <- M15%*%p p17 <- M17%*%p L <- as.matrix(c( qnorm(p15[2]+p15[3]), qnorm(p15[3]), qnorm(p17[2]+p17[3]), qnorm(p17[3]), log(p) )) rownames(L) <- c("qnorm(P(Use15>0))","qnorm(P(Use15>1))", "qnorm(P(Use17>0))","qnorm(P(Use17>1))", paste("logp",1:9)) L } CUT <- factor(c(1,2,1,2)) AGE <- factor(c(15,15,17,17)) Xprobit <- model.matrix(~CUT+AGE-1) Xloglin <- model.matrix(~ Use15 + Use17 +Use15.score:Use17.score,d) X <- block.fct(Xprobit,Xloglin) a1 <- mph.fit(y,L.fct=L.fct,X=X) mph.summary(a1) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 4 ): Gsq = 1.77057 (p = 0.77786 ) Pearson's Score Stat (df= 4 ): Xsq = 1.87796 (p = 0.75819 ) Generalized Wald Stat (df= 4 ): Wsq = 1.84803 (p = 0.76368 ) Adj Resids: -1.197 -0.64 ... 0.831 1.038 , Number |Adj Resid| > 2: 0 SAMPLING PLAN INFORMATION... Number of strata: 1 Strata identifiers: 1 Strata with fixed sample sizes: all Observed strata sample sizes: 117 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value CUT1 -0.3909243 0.11400659 -3.428962 6.058944e-04 CUT2 -0.9091368 0.12846677 -7.076824 1.475042e-12 AGE17 0.2999332 0.09813575 3.056309 2.240799e-03 (Intercept) -0.7542226 0.09488147 -7.949103 1.776357e-15 Use151 -2.1584116 0.26134587 -8.258832 2.220446e-16 Use152 -3.7075766 0.62358890 -5.945546 2.755366e-09 Use171 -1.3618662 0.20239880 -6.728627 1.712719e-11 Use172 -2.0394637 0.36806236 -5.541082 3.006082e-08 Use15.score:Use17.score 1.1362827 0.20718661 5.484344 4.150058e-08 OBS LINK ML LINK StdErr(L) LINK RESID qnorm(P(Use15>0)) -0.38416697 -0.39092427 0.11400659 0.1952372 qnorm(P(Use15>1)) -0.91732119 -0.90913681 0.12846677 -0.1962242 qnorm(P(Use17>0)) -0.09655862 -0.09099107 0.11249628 -0.1955444 qnorm(P(Use17>1)) -0.60224878 -0.60920361 0.11879774 0.1950816 logp 1 -0.73682224 -0.75422257 0.09488147 0.6982065 logp 2 -2.19722458 -2.11608873 0.14875078 -0.4043946 logp 3 -2.81626379 -2.79368631 0.32721044 -0.1455725 logp 4 -2.97041447 -2.91263418 0.22644742 -0.1850601 logp 5 -3.37587957 -3.13821761 0.28176346 -0.7192649 logp 6 -2.45958884 -2.67953247 0.16875835 0.7432298 logp 7 -4.76217393 -4.46179917 0.59937397 -0.4919918 logp 8 -3.15273602 -3.55109988 0.25921308 0.8452161 logp 9 -2.05412373 -1.95613201 0.21405273 -1.2569055 CONVERGENCE INFORMATION... Original counts used. iterations = 6 , time elapsed = 0.05 norm.diff = 3.97857e-09 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 2.05338e-10 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09 #Fit the Probit Marginal model, along with the Independence Association model
Xprobit <- model.matrix(~CUT+AGE) Xloglin <- model.matrix(~Use15+Use17,d) X <- block.fct(Xprobit,Xloglin) a2 <- mph.fit(y,L.fct=L.fct,X=X)
a1$Xsq; a1$df; a2$Xsq; a2$df PEARSON SCORE STATISTIC 1.877965 [1] 4 PEARSON SCORE STATISTIC 45.3922 [1] 5 cbind(a1$beta[3],sqrt(a1$covbeta[3,3]),a2$beta[3],sqrt(a2$covbeta[3,3])) [,1] [,2] [,3] [,4] [1,] 0.2999332 0.09813575 0.2975494 0.1575182 #The b(AGE) coef estimates and ase's.
Remark: As expected, the independence/probit model appears to be untenable for these repeated measures data. Although the independence-model estimator of the marginal probit parameter b(AGE) is consistent, the reported standard error is not. In fact, it can be anticipated that the actual standard error of this independence-model estimator is significantly larger than the saturated-model estimator which correctly accommodates the association between the two responses, AGE15 and AGE17 marijuana use responses. |
#If we wanted to use log(m), rather than log(p), loglinear model, we'd have to use ... Lm.fct <- function(m) { p <- m/sum(m) p15 <- M15%*%p p17 <- M17%*%p L <- as.matrix(c( qnorm(p15[2]+p15[3]), qnorm(p15[3]), qnorm(p17[2]+p17[3]), qnorm(p17[3]), log(m) )) rownames(L) <- c("qnorm(P(Use15>0))","qnorm(P(Use15>1))", "qnorm(P(Use17>0))","qnorm(P(Use17>1))", paste("logm",1:9)) L } Xprobit <- model.matrix(~CUT+AGE-1) Xloglin <- model.matrix(~Use15+Use17+Use15.score:Use17.score,d) X <- block.fct(Xprobit,Xloglin) a3 <- mph.fit(y,L.fct=Lm.fct,X=X,L.mean=T)
EXAMPLE 15. Marginal Cumulative Logit and Linear-by-Linear Loglinear Association Model.
Opinions about Government Spending (data sources: General Social Survey 1989 and Table 1 of Lang and Agresti, JASA94).
Cities 1 2 3 Law Enforcement 1 2 3 1 2 3 1 2 3 Environment Health 1 1 62 17 5 90 42 3 74 31 11 2 11 7 0 22 18 1 19 14 3 3 2 3 1 2 0 1 1 3 1 2 1 11 3 0 21 13 2 20 8 3 2 1 4 0 6 9 0 6 5 2 3 1 0 1 2 1 1 4 3 1 3 1 3 0 0 2 1 0 9 2 1 2 1 0 0 2 1 0 4 2 0 3 1 0 0 0 0 0 1 2 3 1= too little, 2= about right, 3 = too much spending
Models fitted below:
y ← Y ~ multinomial(n, p), where the table probabilities p satisfy...
...the generalized loglinear model used in Lang and Agresti (JASA94): L(p) = [L1(p), L2(p)] = [X1a, X2b] = Xδ, where the components are defined as follows:
L1(p) = log(p).
L2(p) = vector of oneway cumulative logits = log (odd(Rt <= h) , t=1,2,3,4, h=1,2). Here, Rt is the tth response variable (t=1=Environment, 2=Health, 3=Cities, 4=Law Enforcement).
X1a has (ijkl)th element of the form:
a(0) + a(1)i + a(2)j+ a(3)k + a(4)l + a(1,2)*i*j + a(1,3)*i*k + a(1,4)*i*l + a(2,3)*j*k + a(2,4)*j*l + a(3,4)*k*l
X2b has (t,h)th element of the form: b(level)h + b(response)t
Note that L(p) = Xδ has the generalized loglinear model form ClogAp = Xδ. This model was denoted J(LxL)∩M(PO) in Lang and Agresti (JASA94), because the association structure is modeled via a linear-by-linear loglinear model and the one-way marginal structure is modeled via a proportional odds cumulative logit model.
For convenience, we will augment this model so that estimates of the oneway marginal probabilities along with their corresponding adjusted residuals will be output. Specifically, we will re-express the model in the equivalent form
L(p) = [L1(p), L2(p), L3(p)] = [X1a, X2b, X3θ] = Xβ, where L3(p) = (P(Rt=h), t=1,2,3,4, h=1,2,3) is the vector of one-way marginal probabilities and X3 = diag(12), the 12x12 identity matrix.
In this model, the marginal probabilities L3(p) are indirectly smoothed via the constraints implied by [L1(p), L2(p)] = [X1a, X2b]. See Lang (JASA 2005) for a discussion of these "indirect smoothing models."
The data model (data sampling model along with the data parameter model), y ←Y ~ multinomial(n, p) and L(p) = Xβ, is a homogeneous linear predictor model (see Lang JASA 2005).
y <- scan() 62 17 5 90 42 3 74 31 11 11 7 0 22 18 1 19 14 3 2 3 1 2 0 1 1 3 1 11 3 0 21 13 2 20 8 3 1 4 0 6 9 0 6 5 2 1 0 1 2 1 1 4 3 1 3 0 0 2 1 0 9 2 1 1 0 0 2 1 0 4 2 0 1 0 0 0 0 0 1 2 3 # The counts are entered as y = (yijkl: i,j,k,l=1,2,3), where the order follows the rule that # the rightmost subscripts move faster. The subscripts on yijkl, correspond # to the four responses as follows: # i=Environment, j=Health, k=Cities, and l= Law Enforcement. # For example, y1212 = 7 is the 11th observation in the vector y; it means # that 7 of those sampled responded (Environ=1, Health=2, Cities=1, Law=2). # See Table 1 of Lang and Agresti (JASA94). Environ <- factor(gl(3,27,81)) Health <- factor(gl(3,9,81)) Cities <- factor(gl(3,3,81)) Law <- factor(gl(3,1,81)) data.frame(Environ,Health,Cities,Law,y) Escore <- c(Environ) Hscore <- c(Health) Cscore <- c(Cities) Lscore <- c(Law) ## CREATE THE LINK FUNCTION L(m). ## ME <- M.fct(Environ) MH <- M.fct(Health) MC <- M.fct(Cities) ML <- M.fct(Law) M <- rbind(ME,MH,MC,ML) CUM0 <- scan() 1 0 0 0 1 1 1 1 0 0 0 1 CUM0 <- matrix(CUM0,4,3,byrow=T) CUM <- kronecker(diag(4),CUM0) AMarg <- CUM%*%M C0 <- scan() 1 -1 0 0 0 0 1 -1 C0 <- matrix(C0,2,4,byrow=T) CMarg <- kronecker(diag(4),C0) Lp.fct <- function(p) { logp <- log(p) onewayclogit <- CMarg%*%log(AMarg%*%p) onewayprob <- M%*%p L <- as.matrix(c(logp,onewayclogit,onewayprob)) rownames(L) <- c(paste("logp",1:81), "loddsE<2","loddsE<3", "loddsH<2","loddsH<3", "loddsC<2","loddsC<3", "loddsL<2","loddsL<3", "P(E=1)","P(E=2)","P(E=3)", "P(H=1)","P(H=2)","P(H=3)", "P(C=1)","P(C=2)","P(C=3)", "P(L=1)","P(L=2)","P(L=3)") L } ##CREATE DESIGN MATRIX X = block[XAssoc, XMarg, diag(12)] ## XAssoc <- model.matrix(~Environ+Health+Cities+Law+Escore:Hscore + Escore:Cscore + Escore:Lscore + Hscore:Cscore + Hscore:Lscore + Cscore:Lscore) dm <- scan(what=list(CUT="",RESP="")) 1 E 2 E 1 H 2 H 1 C 2 C 1 L 2 L dm <- data.frame(dm) XMarg <- model.matrix(~CUT + RESP - 1,data=dm) X <- block.fct(XAssoc, XMarg, diag(12)) ## FIT THE MODEL AND SUMMARIZE ## ap <- mph.fit(y, L.fct=Lp.fct, X=X) mph.summary(ap) MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho... Likelihood Ratio Stat (df= 69 ): Gsq = 71.535 (p = 0.39365 ) Pearson's Score Stat (df= 69 ): Xsq = 64.33728 (p = 0.6365 ) Generalized Wald Stat (df= 69 ): Wsq = 43.12471 (p = 0.99382 ) Adj Resids: -1.996 -1.743 ... 2.043 2.28 , Number |Adj Resid| > 2: 2 SAMPLING PLAN INFORMATION... Number of strata: 1 Strata identifiers: 1 Strata with fixed sample sizes: all Observed strata sample sizes: 607 LINEAR PREDICTOR MODEL RESULTS... BETA StdErr(BETA) Z-ratio p-value (Intercept) -3.856486433 0.178483120 -21.60700935 0.000000e+00 Environ2 -2.588205013 0.308823990 -8.38084183 0.000000e+00 Environ3 -5.493086736 0.655267978 -8.38296227 0.000000e+00 Health2 -2.595621274 0.297230071 -8.73270078 0.000000e+00 Health3 -5.610605951 0.637689942 -8.79832907 0.000000e+00 Cities2 -0.034223012 0.204332191 -0.16748713 8.669868e-01 Cities3 -0.937638792 0.413278262 -2.26878323 2.328151e-02 Law2 -1.826510927 0.263654376 -6.92767159 4.278133e-12 Law3 -4.197613831 0.551070670 -7.61719695 2.597922e-14 Escore:Hscore 0.498689176 0.112274533 4.44169449 8.925322e-06 Escore:Cscore 0.314319282 0.103927942 3.02439627 2.491299e-03 Escore:Lscore -0.003494676 0.112011470 -0.03119927 9.751106e-01 Hscore:Cscore 0.051558696 0.099563584 0.51784692 6.045651e-01 Hscore:Lscore 0.454699678 0.103442225 4.39568735 1.104228e-05 Cscore:Lscore 0.198569482 0.089578780 2.21670224 2.664345e-02 CUT1 -1.342154170 0.086191456 -15.57177748 0.000000e+00 CUT2 0.515427959 0.079998129 6.44300014 1.171345e-10 RESPE 2.336871534 0.117010601 19.97145142 0.000000e+00 RESPH 2.255916301 0.120197721 18.76837833 0.000000e+00 RESPL 1.874491508 0.111986680 16.73852200 0.000000e+00 [3.1] 0.730018682 0.017844267 40.91054536 0.000000e+00 [3.2] 0.215418742 0.013701431 15.72235327 0.000000e+00 [3.3] 0.054562575 0.005782026 9.43658465 0.000000e+00 [3.4] 0.713769394 0.018116177 39.39955879 0.000000e+00 [3.5] 0.227338141 0.013759216 16.52260869 0.000000e+00 [3.6] 0.058892465 0.006166318 9.55066935 0.000000e+00 [3.7] 0.207156029 0.014156292 14.63349475 0.000000e+00 [3.8] 0.418922021 0.014035304 29.84773458 0.000000e+00 [3.9] 0.373921950 0.018727908 19.96602862 0.000000e+00 [3.10] 0.630028094 0.019252284 32.72484993 0.000000e+00 [3.11] 0.286027282 0.014261072 20.05650639 0.000000e+00 [3.12] 0.083944625 0.007931189 10.58411547 0.000000e+00 OBS LINK ML LINK StdErr(L) LINK RESID logp 1 -2.28139441 -2.34214479 0.095988109 0.7667174334 logp 2 -3.57531545 -3.51888124 0.121653595 -0.2852222757 logp 3 -4.79909088 -5.24020966 0.263546567 0.9008338501 logp 4 -1.90871912 -1.81192035 0.055480544 -1.3221152708 logp 5 -2.67085917 -2.79008731 0.083716619 0.8845427650 logp 6 -5.30991650 -4.31284624 0.146242602 -3.1537711693 logp 7 -2.10446370 -2.15088867 0.091828124 0.7271838532 logp 8 -2.97454159 -2.93048614 0.099761757 -0.3173819288 logp 9 -4.01063352 -4.25467560 0.194390396 0.8817534685 logp 10 -4.01063352 -3.93281852 0.147114161 -0.3155472181 logp 11 -4.46261864 -4.65485528 0.151960064 0.4990257493 logp 12 -Inf -5.92148402 0.258577660 -Inf logp 13 -3.31748634 -3.35103537 0.103701811 0.1803506853 logp 14 -3.51815703 -3.87450266 0.116467520 1.4071862606 logp 15 -6.40852879 -4.94256192 0.143423649 -3.2094351956 logp 16 -3.46408981 -3.63844500 0.136629080 0.8473253486 logp 17 -3.76947146 -3.96334280 0.125889658 0.7369118170 logp 18 -5.30991650 -4.83283257 0.183270761 -1.1515979690 logp 19 -5.71538161 -5.94285565 0.314590661 0.3133186698 logp 20 -5.30991650 -6.21019273 0.279577646 1.0463497281 logp 21 -6.40852879 -7.02212179 0.399062617 0.4725321071 logp 22 -5.71538161 -5.30951381 0.215228332 -0.7599555379 logp 23 -Inf -5.37828141 0.161798208 -Inf logp 24 -6.40852879 -5.99164099 0.265133659 -0.5440565922 logp 25 -6.40852879 -5.54536473 0.282673600 -1.4797191620 logp 26 -5.30991650 -5.41556285 0.217235401 0.1862820936 logp 27 -6.40852879 -5.83035295 0.309194311 -0.8490914514 logp 28 -4.01063352 -4.12083603 0.159071976 0.4036179341 logp 29 -5.30991650 -5.30106714 0.189103424 -0.0163488085 logp 30 -Inf -7.02589024 0.330757519 -Inf logp 31 -3.36400635 -3.27629230 0.101436265 -0.4927805100 logp 32 -3.84357943 -4.25795393 0.130274768 1.3250353670 logp 33 -5.71538161 -5.78420754 0.207945842 0.0982490088 logp 34 -3.41279652 -3.30094133 0.120324284 -0.6616066827 logp 35 -4.32908725 -4.08403349 0.132070516 -0.8732852306 logp 36 -5.30991650 -5.41171762 0.225586007 0.1809478286 logp 37 -6.40852879 -5.21282057 0.184460807 -2.3148291504 logp 38 -5.02223443 -5.93835201 0.191118931 1.1960147896 logp 39 -Inf -7.20847543 0.292969107 -Inf logp 40 -4.61676932 -4.31671815 0.132339573 -0.9290704495 logp 41 -4.21130421 -4.84368011 0.145148255 1.4646959078 logp 42 -Inf -5.91523404 0.168305243 -Inf logp 43 -4.61676932 -4.28980849 0.142729601 -1.0435538445 logp 44 -4.79909088 -4.61820096 0.137112069 -0.4726692938 logp 45 -5.71538161 -5.49118542 0.179786309 -0.3707831882 logp 46 -6.40852879 -6.72416852 0.313328218 0.2799401698 logp 47 -Inf -6.99500029 0.280566245 -Inf logp 48 -6.40852879 -7.81042402 0.394972611 0.7094006523 logp 49 -5.71538161 -5.77650740 0.193018061 0.0870906645 logp 50 -6.40852879 -5.84876968 0.141283077 -0.7549586446 logp 51 -6.40852879 -6.46562394 0.238341328 0.0570865128 logp 52 -5.02223443 -5.69803905 0.236612342 1.0261032969 logp 53 -5.30991650 -5.57173184 0.168630327 0.4124166195 logp 54 -6.40852879 -5.99001662 0.258680890 -0.5451050874 logp 55 -5.30991650 -6.21620397 0.340140679 1.0773149652 logp 56 -Inf -7.39992976 0.370492121 -Inf logp 57 -Inf -9.12824753 0.552626047 -Inf logp 58 -5.71538161 -5.05734095 0.193879185 -1.4039179534 logp 59 -6.40852879 -6.04249727 0.221722074 -0.4565869832 logp 60 -Inf -7.57224555 0.398403024 -Inf logp 61 -4.21130421 -4.76767071 0.227096973 1.4838118658 logp 62 -5.71538161 -5.55425754 0.232411843 -0.2649112170 logp 63 -6.40852879 -6.88543634 0.400463523 0.3961777810 logp 64 -6.40852879 -6.80949934 0.327427053 0.3407838090 logp 65 -Inf -7.53852545 0.326104252 -Inf logp 66 -Inf -8.81214355 0.475427281 -Inf logp 67 -5.71538161 -5.59907763 0.179375460 -0.1813535926 logp 68 -6.40852879 -6.12953426 0.167189288 -0.3272239667 logp 69 -Inf -7.20458287 0.310174095 -Inf logp 70 -5.02223443 -5.25784869 0.205559437 0.4513390204 logp 71 -5.71538161 -5.58973584 0.171184375 -0.1962228393 logp 72 -Inf -6.46621497 0.308626547 -Inf logp 73 -6.40852879 -7.82215811 0.468598685 0.7167739163 logp 74 -Inf -8.09648455 0.431739288 -Inf logp 75 -Inf -8.91540296 0.560422556 -Inf logp 76 -Inf -6.56017771 0.314340321 -Inf logp 77 -Inf -6.63593466 0.255312333 -Inf logp 78 -Inf -7.25628360 0.380583676 -Inf logp 79 -6.40852879 -6.16739007 0.340998263 -0.2950815133 logp 80 -5.71538161 -6.04457754 0.265725370 0.4171827178 logp 81 -5.30991650 -6.46635699 0.380972464 1.2104544224 loddsE<2 1.00207436 0.99471736 0.090538025 0.5786279423 loddsE<3 2.79379093 2.85229949 0.112086258 -0.4203606869 loddsH<2 0.91975294 0.91376213 0.088673239 0.4227572342 loddsH<3 2.79379093 2.77134426 0.111256896 0.1704312075 loddsC<2 -1.26125559 -1.34215417 0.086191456 1.5860601196 loddsC<3 0.47321734 0.51542796 0.079998130 -1.6717484121 loddsL<2 0.50117219 0.53233734 0.082594968 -1.9876428816 loddsL<3 2.65147985 2.38991947 0.103139187 2.5184851520 P(E=1) 0.73146623 0.73001868 0.017844267 0.5776478132 P(E=2) 0.21087315 0.21541874 0.013701431 -0.4772650886 P(E=3) 0.05766063 0.05456258 0.005782026 0.4314828693 P(H=1) 0.71499176 0.71376939 0.018116177 0.4222152620 P(H=2) 0.22734761 0.22733814 0.013759216 0.0009466647 P(H=3) 0.05766063 0.05889246 0.006166318 -0.1687532296 P(C=1) 0.22075783 0.20715603 0.014156292 1.6236398142 P(C=2) 0.39538715 0.41892202 0.014035304 -1.6476003693 P(C=3) 0.38385502 0.37392195 0.018727908 1.6804419408 P(L=1) 0.62273476 0.63002809 0.019252284 -1.9955680999 P(L=2) 0.31136738 0.28602728 0.014261072 2.1968010087 P(L=3) 0.06589786 0.08394462 0.007931189 -2.2597073814 CONVERGENCE INFORMATION... Original counts used. iterations = 10 , time elapsed = 1.5 norm.diff = 4.26534e-07 = dist between last and second last iterates. Norm diff convergence criterion [1e-06] was met. norm.score = 3.3329e-07 = norm of score at last iteration. Norm score convergence criterion [1e-06] was met. FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09 # REMARK: # The model fitted above was expressed in terms of the table probabilities p, as L(p) = X b. # Alternatively, the model could have been specified in terms of the expected counts m, # as L(m) = X b, where L(.) is slightly modified from the definition above. # Below, the relevant code is given... # Lm.fct <- function(m) { logm <- log(m) onewayclogit <- CMarg%*%log(AMarg%*%m) onewayprob <- M%*%m/sum(m) L <- as.matrix(c(logm,onewayclogit,onewayprob)) rownames(L) <- c(paste("logm",1:81), "loddsE<2","loddsE<3", "loddsH<2","loddsH<3", "loddsC<2","loddsC<3", "loddsL<2","loddsL<3", "P(E=1)","P(E=2)","P(E=3)", "P(H=1)","P(H=2)","P(H=3)", "P(C=1)","P(C=2)","P(C=3)", "P(L=1)","P(L=2)","P(L=3)") L } am <- mph.fit(y, L.fct=Lm.fct, L.mean=T, X=X) mph.summary(am)
Document Last Updated:
05/22/2009 03:37:07 PM, Joseph B. Lang