# Measurement Invariance + 2nd order Growth Model (ECLS-K)

#### Nilam Ram, Kevin Grimm, et al.

# 1 Overview

In this tutorial we walk through the very basics of testing measurement invariance in the context if a longitudinal factor model - and how a 2nd order growth model can then be used to describe change in an invariant factor.

This tutorial follows the example in Chapter 14 of Growth Modeling: Structural Equation and Multilevel Modeling Approaches (Grimm, Ram & Estabrook, 2017). Using 3-occasion data from the ECLS-K, we test for factorial invariance, and then use a second-order growth model to describe change in the factor scores across time.

# 2 Introduction to the Common Factor Model

The basic factor analysis model can be written as a matrix equation â€¦

\[ \boldsymbol{Y_{i}} = \boldsymbol{\tau} + \boldsymbol{\Lambda}\boldsymbol{F_{i}} + \boldsymbol{U_{i}} \]

where \(\boldsymbol{Y_{i}}\) is a \(p\) x 1 vector of observed variable scores, \(\boldsymbol{\Lambda}\) is a *p* x *q* matrix of factor loadings, \(\boldsymbol{F_{i}}\) is a \(q\) x 1 vector of common factor scores, and \(\boldsymbol{U_{i}}\) is a *p* x 1 vector of unique factor scores.

We can rewrite the model in terms of variance-covariance and mean expectations. For example, the expected covariance matrix, \(\boldsymbol{\Sigma} = \boldsymbol{Y}'\boldsymbol{Y}\), becomes

\[ \boldsymbol{\Sigma} = \boldsymbol{\Lambda}\boldsymbol{\Psi}\boldsymbol{\Lambda}' + \boldsymbol{\Theta} \] where \(\boldsymbol{\Sigma}\) is a *p* x *p* covariance (or correlation) matrix of the observed variables, \(\boldsymbol{\Lambda}\) is a *p* x *q* matrix of factor loadings, \(\boldsymbol{\Psi}\) is a *q* x *q* covariance matrix of the latent factor variables, and \(\boldsymbol{\Theta}\) is a diagonal matrix of unique factor variances.

and the expected *p* x 1 mean vector, $ $ becomes \[ \boldsymbol{\mu} = \boldsymbol{\tau} + \boldsymbol{\Lambda}\boldsymbol{\alpha} \]

where \(\boldsymbol{\tau}\) is a *p* x 1 vector of manifest variable means, \(\boldsymbol{\Lambda}\) is a *p* x *q* matrix of factor loadings, and \(\boldsymbol{\alpha}\) is a *q* x 1 vector of latent variable means.

We can then extend the model to multiple-occasion settings with occasion-specific subscript, so that

\[ \boldsymbol{\Sigma_t} = \boldsymbol{\Lambda_t}\boldsymbol{\Psi_t}\boldsymbol{\Lambda_t}' + \boldsymbol{\Theta_t} \] and \[ \boldsymbol{\mu_t} = \boldsymbol{\tau_t} + \boldsymbol{\Lambda_t}\boldsymbol{\alpha_t} \]

Different levels of *measurement invariance* are established by testing (or requiring) that various matrices are equal.

Specifically, â€¦

For *Configural Invaraince* we establish that the structure of \(\boldsymbol{\Lambda_t}\) is equivalent across occasions.

For *Weak Invariance* we additionally establish that the factor loadings are equivalent across occasions, \(\boldsymbol{\Lambda_t} = \boldsymbol{\Lambda}\).

For *Strong Invariance* we additionally establish that the manifest means are equivalent across occasions, \(\boldsymbol{\tau_t} = \boldsymbol{\tau}\). and

For *Strict Invariance* we additionally establish that the residual/unique variances are also equivalent across occasions \(\boldsymbol{\Theta_t} = \boldsymbol{\Theta}\).

Measurement Invariance testing is usually conducted within a Structural Equation Modeling (SEM) framework. Here we illustrate how this may be done using the `lavaan`

package.

Lots of good information and instruction (including about invariance testing - in a multiple-group setting) can be found on the package website â€¦ http://lavaan.ugent.be.

### 2.0.1 Prelim - Loading libraries used in this script.

```
library(psych)
library(ggplot2)
library(corrplot) #plotting correlation matrices
library(lavaan) #for fitting structural equation models
library(semPlot) #for automatically making diagrams
```

### 2.0.2 Prelim - Reading in Repeated Measures Data

For this example, we use an ECLS-K dataset that is in wideform. There are variables for childrenâ€™s science, reading, and math aptitude scores, obtained in 3rd, 5th, and 8th grade. The three aptitude scores are considered indicators of an *academic achievement* latent factor.

```
#set filepath for data file
<- "https://raw.githubusercontent.com/LRI-2/Data/main/GrowthModeling/ECLS_Science.dat"
filepath #read in the text data file using the url() function
<- read.table(file=url(filepath),na.strings = ".")
dat
names(dat) <- c("id", "s_g3", "r_g3", "m_g3", "s_g5", "r_g5", "m_g5", "s_g8",
"r_g8", "m_g8", "st_g3", "rt_g3", "mt_g3", "st_g5", "rt_g5",
"mt_g5", "st_g8", "rt_g8", "mt_g8")
#selecting only the variables of interest
<- dat[ ,c("id", "s_g3", "r_g3", "m_g3", "s_g5", "r_g5", "m_g5", "s_g8",
dat "r_g8", "m_g8")]
```

### 2.0.3 Prelim - Descriptives

Lets have a quick look at the data file and the descriptives.

```
#data structure
head(dat,10)
```

id | s_g3 | r_g3 | m_g3 | s_g5 | r_g5 | m_g5 | s_g8 | r_g8 | m_g8 |
---|---|---|---|---|---|---|---|---|---|

1 | NA | NA | NA | NA | NA | NA | NA | NA | NA |

3 | NA | NA | NA | NA | NA | NA | NA | NA | NA |

8 | NA | NA | NA | NA | NA | NA | 103.90 | 204.10 | 166.67 |

16 | 51.57 | 142.18 | 115.59 | 65.94 | 141.02 | 133.67 | 86.90 | 169.83 | 156.67 |

28 | NA | NA | NA | NA | NA | NA | NA | NA | NA |

44 | NA | NA | NA | NA | NA | NA | NA | NA | NA |

46 | 72.09 | 154.43 | 96.87 | 79.44 | 170.57 | 116.28 | 89.08 | 192.07 | 132.40 |

62 | 34.71 | 106.40 | 87.86 | 47.44 | 145.72 | 104.68 | NA | NA | NA |

66 | NA | NA | NA | NA | NA | NA | NA | NA | NA |

74 | 62.94 | 126.06 | 92.47 | 73.70 | 145.17 | 124.73 | 92.67 | 193.43 | 133.93 |

```
#descriptives (means, sds)
::describe(dat[,-1]) #-1 to remove the id column psych
```

vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|

s_g3 | 1 | 1442 | 50.99316 | 15.61923 | 50.935 | 50.82393 | 16.76079 | 18.37 | 92.66 | 74.29 | 0.0789253 | -0.5926226 | 0.4113174 |

r_g3 | 2 | 1430 | 127.65780 | 29.21852 | 126.975 | 128.44100 | 31.33475 | 51.46 | 195.82 | 144.36 | -0.2126704 | -0.4974589 | 0.7726633 |

m_g3 | 3 | 1442 | 99.71625 | 25.53598 | 102.605 | 99.98539 | 27.70979 | 35.72 | 159.40 | 123.68 | -0.0954525 | -0.7022825 | 0.6724653 |

s_g5 | 4 | 1135 | 65.25489 | 16.18239 | 67.530 | 65.99278 | 16.51616 | 22.57 | 103.23 | 80.66 | -0.3888726 | -0.4883320 | 0.4803355 |

r_g5 | 5 | 1133 | 151.09049 | 27.30787 | 152.330 | 153.09954 | 26.73128 | 64.69 | 202.22 | 137.53 | -0.6153502 | -0.0358596 | 0.8112841 |

m_g5 | 6 | 1136 | 124.34706 | 25.16900 | 128.645 | 126.08475 | 25.13748 | 50.87 | 169.53 | 118.66 | -0.5818852 | -0.2415627 | 0.7467526 |

s_g8 | 7 | 947 | 84.88585 | 16.71213 | 88.930 | 86.88478 | 14.81117 | 29.61 | 107.90 | 78.29 | -0.9948362 | 0.4785809 | 0.5430713 |

r_g8 | 8 | 941 | 172.04634 | 27.72934 | 179.700 | 175.53752 | 24.90768 | 89.15 | 208.44 | 119.29 | -0.9835795 | 0.2411465 | 0.9039507 |

m_g8 | 9 | 945 | 142.46749 | 22.50262 | 147.360 | 145.06745 | 21.23083 | 67.75 | 172.20 | 104.45 | -0.9360756 | 0.3588947 | 0.7320104 |

```
#correlation matrix
round(cor(dat[,-1], use = "pairwise.complete"),2)
```

```
## s_g3 r_g3 m_g3 s_g5 r_g5 m_g5 s_g8 r_g8 m_g8
## s_g3 1.00 0.76 0.71 0.85 0.73 0.68 0.75 0.68 0.66
## r_g3 0.76 1.00 0.75 0.73 0.85 0.70 0.70 0.76 0.68
## m_g3 0.71 0.75 1.00 0.70 0.72 0.88 0.71 0.66 0.81
## s_g5 0.85 0.73 0.70 1.00 0.77 0.74 0.81 0.73 0.70
## r_g5 0.73 0.85 0.72 0.77 1.00 0.75 0.74 0.80 0.71
## m_g5 0.68 0.70 0.88 0.74 0.75 1.00 0.74 0.68 0.85
## s_g8 0.75 0.70 0.71 0.81 0.74 0.74 1.00 0.78 0.78
## r_g8 0.68 0.76 0.66 0.73 0.80 0.68 0.78 1.00 0.75
## m_g8 0.66 0.68 0.81 0.70 0.71 0.85 0.78 0.75 1.00
```

```
#visusal correlation matrix
corrplot(cor(dat[,-1], use = "pairwise.complete"), order = "original", tl.col='black', tl.cex=.75)
```