Introduction

This report describes the analysis of the RIKZ dataset, a traditional dataset used to illustrate mixed models.

We read in the data

RIKZ <- read.delim("RIKZ.txt")
#recoding exposure so that 8 and 10 become 0's and 11's become 1
RIKZ$Exposure=ifelse(RIKZ$Exposure==11,1,0)

we look at the data structure

str(RIKZ)
## 'data.frame':    45 obs. of  5 variables:
##  $ Sample  : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Richness: int  11 10 13 11 10 8 9 8 19 17 ...
##  $ Exposure: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ NAP     : num  0.045 -1.036 -1.336 0.616 -0.684 ...
##  $ Beach   : int  1 1 1 1 1 2 2 2 2 2 ...
head(RIKZ)
##   Sample Richness Exposure    NAP Beach
## 1      1       11        0  0.045     1
## 2      2       10        0 -1.036     1
## 3      3       13        0 -1.336     1
## 4      4       11        0  0.616     1
## 5      5       10        0 -0.684     1
## 6      6        8        0  1.190     2

We can look at the response variable, species richness, as a function of NA

plot(Richness~NAP,data=RIKZ)

When we see it like this, there is no notion about the structure of the data, in particular that there are 5 measurements in each beach

plot(Richness~NAP,data=RIKZ)
abline(lm(Richness~NAP,data=RIKZ),lwd=2,lty=2)

but we could make it a bit more explicit by coloring each beach with a diferent symbol

plot(Richness~NAP,pch=as.character(Beach),data=RIKZ)
abline(lm(Richness~NAP,data=RIKZ),lwd=2,lty=2)

Now the two stage approach: (1) we consider the relation between R and NAP, for each beach, and (2) we model the estimated coefficients per beach as a function of exposure

plot(Richness~NAP,pch=as.character(Beach),data=RIKZ)
as=numeric(9);bs=numeric(9);exp=numeric(9)
for(i in 1:9){
  m=lm(Richness~NAP,data=RIKZ[RIKZ$Beach==i,])
  cs=coef(m)
  as[i]=cs[1]
  bs[i]=cs[2]
  exp[i]=RIKZ$Exposure[i*5]
  abline(cs,lty=exp[i]+1,col=i)
}
legend("topright",lty=c(1,2),legend=c("Exposure 0","Exposure 1"),inset=0.05)

Now the second stage

boxplot(bs~exp)

The way to do all this in one go is to consider a mixed model, where exposure is a fixed effect, but beach is considered a random effect.

Implementing the mixed model, with just random intercepts

Using lme from nlme

library(nlme)
RIKZ$fbeach=as.factor(RIKZ$Beach)
lme1=lme(Richness~NAP,random=~1|fbeach,data=RIKZ)
summary(lme1)
## Linear mixed-effects model fit by REML
##  Data: RIKZ 
##        AIC     BIC    logLik
##   247.4802 254.525 -119.7401
## 
## Random effects:
##  Formula: ~1 | fbeach
##         (Intercept) Residual
## StdDev:    2.944065  3.05977
## 
## Fixed effects: Richness ~ NAP 
##                 Value Std.Error DF   t-value p-value
## (Intercept)  6.581893 1.0957618 35  6.006682       0
## NAP         -2.568400 0.4947246 35 -5.191574       0
##  Correlation: 
##     (Intr)
## NAP -0.157
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -1.4227495 -0.4848006 -0.1576462  0.2518966  3.9793918 
## 
## Number of Observations: 45
## Number of Groups: 9

Using lmer from lme4

library(lme4)
lme2=lmer(Richness~NAP+(1|fbeach),data=RIKZ)
summary(lme2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Richness ~ NAP + (1 | fbeach)
##    Data: RIKZ
## 
## REML criterion at convergence: 239.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.4228 -0.4848 -0.1576  0.2519  3.9794 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  fbeach   (Intercept) 8.668    2.944   
##  Residual             9.362    3.060   
## Number of obs: 45, groups:  fbeach, 9
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   6.5819     1.0958   6.007
## NAP          -2.5684     0.4947  -5.192
## 
## Correlation of Fixed Effects:
##     (Intr)
## NAP -0.157

Make predictions from lme

Level0=fitted(lme1,level=0)
Level1=fitted(lme1,level=1)

Look at the model plotted

I=order(RIKZ$NAP)
NAPs=sort(RIKZ$NAP)
plot(Richness~NAP,pch=as.character(Beach),col=Beach,data=RIKZ)
lines(NAPs,Level0[I],lwd=3)
for(j in 1:9){
  #index for beach
  bi=RIKZ$Beach==j
  xs=RIKZ$NAP[bi]
  ys=Level1[bi]
  Oxs=order(xs)
  lines(sort(xs),ys[Oxs],col=j)
}

Implementing the mixed model, with random intercepts and slopes

Using lme from nlme

lme3=lme(Richness~NAP,random=~NAP|fbeach,data=RIKZ)
summary(lme3)
## Linear mixed-effects model fit by REML
##  Data: RIKZ 
##        AIC      BIC    logLik
##   244.3839 254.9511 -116.1919
## 
## Random effects:
##  Formula: ~NAP | fbeach
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev   Corr  
## (Intercept) 3.549064 (Intr)
## NAP         1.714963 -0.99 
## Residual    2.702820       
## 
## Fixed effects: Richness ~ NAP 
##                 Value Std.Error DF   t-value p-value
## (Intercept)  6.588706  1.264761 35  5.209448   0e+00
## NAP         -2.830028  0.722940 35 -3.914610   4e-04
##  Correlation: 
##     (Intr)
## NAP -0.819
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -1.8213326 -0.3411043 -0.1674617  0.1921216  3.0397049 
## 
## Number of Observations: 45
## Number of Groups: 9

Using lmer from lme4

lme5=lmer(Richness~NAP+(NAP|fbeach),data=RIKZ)
summary(lme5)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Richness ~ NAP + (NAP | fbeach)
##    Data: RIKZ
## 
## REML criterion at convergence: 232.4
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.8216 -0.3415 -0.1673  0.1931  3.0412 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  fbeach   (Intercept) 12.630   3.554         
##           NAP          2.942   1.715    -0.99
##  Residual              7.303   2.702         
## Number of obs: 45, groups:  fbeach, 9
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    6.588      1.266   5.203
## NAP           -2.830      0.723  -3.914
## 
## Correlation of Fixed Effects:
##     (Intr)
## NAP -0.820
## convergence code: 0
## Model failed to converge with max|grad| = 0.00721061 (tol = 0.002, component 1)

Make predictions from lme

Level0.3=fitted(lme3,level=0)
Level1.3=fitted(lme3,level=1)

and ploting

I=order(RIKZ$NAP)
NAPs=sort(RIKZ$NAP)
plot(Richness~NAP,pch=as.character(Beach),col=Beach,data=RIKZ)
lines(NAPs,Level0.3[I],lwd=3)
for(j in 1:9){
  #index for beach
  bi=RIKZ$Beach==j
  xs=RIKZ$NAP[bi]
  ys=Level1.3[bi]
  Oxs=order(xs)
  lines(sort(xs),ys[Oxs],col=j)
}

Zuur says in page 110 we could compare the models by AIC, but that is just nonsense. Having been fitted by REML, no AIC is available

AIC(lme1)
## [1] 247.4802
AIC(lme3)
## [1] 244.3839

Just for comparison, we could try to see if a random effect model would be a better model. In other words, was NAP required at all or a different mean by beach would suffice?

lme6=lme(Richness~1,random=~1|fbeach,data=RIKZ)

We can look at the summary and AIC of such a model

summary(lme6)
## Linear mixed-effects model fit by REML
##  Data: RIKZ 
##        AIC      BIC    logLik
##   267.1142 272.4668 -130.5571
## 
## Random effects:
##  Formula: ~1 | fbeach
##         (Intercept) Residual
## StdDev:    3.237112 3.938415
## 
## Fixed effects: Richness ~ 1 
##                Value Std.Error DF  t-value p-value
## (Intercept) 5.688889  1.228419 36 4.631066       0
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -1.77968689 -0.50704111 -0.09795286  0.25468670  3.80631705 
## 
## Number of Observations: 45
## Number of Groups: 9
AIC(lme6)
## [1] 267.1142

Comparing the 3 models

AIC(lme1,lme3,lme6)
## Warning in AIC.default(lme1, lme3, lme6): models are not all fitted to the
## same number of observations
##      df      AIC
## lme1  4 247.4802
## lme3  6 244.3839
## lme6  3 267.1142