2.8 Reshaping Repeated Measures Data
Behavioral science tends to use relational data structures; in their most basic form, these are spreadsheets. Typically, the data are stored in a data frame (a “fancy” matrix) with multiple rows and columns. Two common schemata used to accommodate repeated measures data are wide format, with one row per person and one column per measurement occasion, and long format, with one row per person per occasion. Different analysis and plotting functions require different kinds of data input, so it is important to be able to convert the data back and forth between wide and long formats.
There are lots of ways to do this. We illustrate one way.
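To make the two layouts concrete, here is a minimal sketch using made-up scores for two hypothetical children measured at grades 1 and 2. The object names toy_wide and toy_long are illustrative only and are not part of the WISC data.

```r
# Toy data (made-up values): two children, one measure at grades 1 and 2

# Wide format: one row per child, one column per measurement occasion
toy_wide <- data.frame(
  id    = c(1, 2),
  verb1 = c(24, 12),
  verb2 = c(27, 14)
)

# Long format: one row per child per occasion, with the occasion
# stored in its own column (grade)
toy_long <- data.frame(
  id    = c(1, 1, 2, 2),
  grade = c(1, 2, 1, 2),
  verb  = c(24, 27, 12, 14)
)

dim(toy_wide)  # rows = number of children
dim(toy_long)  # rows = number of children times number of occasions
```

Both objects hold the same four scores; only the arrangement differs.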
Sidebar: The dput() function provides a convenient method to get the variable names (or any R object) into a format that can be read back into R. For example, this can be helpful when working with a long vector of strings.
## c("id", "verb1", "verb2", "verb4", "verb6", "perfo1", "perfo2",
## "perfo4", "perfo6", "info1", "comp1", "simu1", "voca1", "info6",
## "comp6", "simu6", "voca6", "momed", "grad", "constant")
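The call that produced the output above is not shown. As a rough sketch of the idea, the following applies dput() to the column names of a small, hypothetical data frame (toy, with made-up values) and then checks that the printed text, pasted back into a script, recreates the same vector.

```r
# Hypothetical toy data frame standing in for the real data
toy <- data.frame(
  id    = 1:2,
  verb1 = c(24.4, 12.4),
  momed = c(9.5, 5.5)
)

# Print the column names as R code that can be pasted into a script
dput(colnames(toy))

# Pasting the printed text back in recreates the same character vector
pasted <- c("id", "verb1", "momed")
identical(pasted, colnames(toy))
```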
First, let’s subset our data to only include the variables we need for this analysis.
var_names_sub <- c(
"id", "verb1", "verb2", "verb4", "verb6",
"perfo1", "perfo2", "perfo4", "perfo6",
"momed", "grad"
)
wiscraw <- wisc3raw[,var_names_sub]
head(wiscraw)
## id verb1 verb2 verb4 verb6 perfo1 perfo2 perfo4 perfo6 momed grad
## 1 1 24.42 26.98 39.61 55.64 19.84 22.97 43.90 44.19 9.5 0
## 2 2 12.44 14.38 21.92 37.81 5.90 13.44 18.29 40.38 5.5 0
## 3 3 32.43 33.51 34.30 50.18 27.64 45.02 46.99 77.72 14.0 1
## 4 4 22.69 28.39 42.16 44.72 33.16 29.68 45.97 61.66 14.0 1
## 5 5 28.23 37.81 41.06 70.95 27.64 44.42 65.48 64.22 11.5 0
## 6 6 16.06 20.12 38.02 39.94 8.45 15.78 26.99 39.08 14.0 1
2.8.1 Reshape Wide to Long
One way to go from wide to long is using the reshape() function from base R.
Notice that the varying argument contains the repeated measures columns we want to stack, and that timevar names a new variable holding the grade-level information previously appended to the end of the column names listed in varying.
# reshape data from wide to long
wisclong <- reshape(
data = wiscraw,
varying = c("verb1", "verb2", "verb4","verb6", "perfo1","perfo2","perfo4","perfo6"),
timevar = c("grade"),
idvar = c("id"),
direction = "long",
sep = ""
)
# reorder by id and grade
wisclong <- wisclong[ order(wisclong$id, wisclong$grade), ]
head(wisclong, 8)
## id momed grad grade verb perfo
## 1.1 1 9.5 0 1 24.42 19.84
## 1.2 1 9.5 0 2 26.98 22.97
## 1.4 1 9.5 0 4 39.61 43.90
## 1.6 1 9.5 0 6 55.64 44.19
## 2.1 2 5.5 0 1 12.44 5.90
## 2.2 2 5.5 0 2 14.38 13.44
## 2.4 2 5.5 0 4 21.92 18.29
## 2.6 2 5.5 0 6 37.81 40.38
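Again, notice how reshape() automatically split verb1, verb2, etc. into a variable name (verb) and a grade value (1, 2, 4, 6). With sep = "", each column name is split where the letters end and the digits begin.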
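The automatic split depends on the column names ending in the time value. When they do not, one option is to tell reshape() explicitly which columns belong together. The sketch below uses made-up data with hypothetical column names (verb_g1, perf_g1, and so on); varying is given as a list with one vector per measure, and v.names and times supply the names and grade values that cannot be guessed.

```r
# Toy data (made-up values) whose column names do not end in a bare number
toy <- data.frame(
  id      = 1:2,
  verb_g1 = c(24, 12), verb_g2 = c(27, 14),
  perf_g1 = c(20, 6),  perf_g2 = c(23, 13)
)

toy_long <- reshape(
  data      = toy,
  # one character vector per measure, columns listed in time order
  varying   = list(c("verb_g1", "verb_g2"), c("perf_g1", "perf_g2")),
  v.names   = c("verb", "perfo"),  # names for the stacked columns
  timevar   = "grade",
  times     = c(1, 2),             # grade values, in the same order as above
  idvar     = "id",
  direction = "long"
)

# reorder by id and grade
toy_long <- toy_long[order(toy_long$id, toy_long$grade), ]
toy_long
```

Because nothing is inferred from the names here, the order of the columns inside each vector of varying must match the order of times.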
2.8.2 Reshape Long to Wide
Now we go from long to wide, again using the reshape() function. The v.names argument specifies the variables to be expanded column-wise based on the repeated measure specified in timevar.
# reshape data from long to wide
wiscwide <- reshape(
data = wisclong,
timevar = c("grade"),
idvar = c("id"),
v.names = c("verb","perfo"),
direction = "wide",
sep = ""
)
# reordering columns
wiscwide <- wiscwide[, c(
"id", "verb1", "verb2", "verb4", "verb6",
"perfo1", "perfo2", "perfo4", "perfo6",
"momed","grad"
)]
head(wiscwide)
## id verb1 verb2 verb4 verb6 perfo1 perfo2 perfo4 perfo6 momed grad
## 1.1 1 24.42 26.98 39.61 55.64 19.84 22.97 43.90 44.19 9.5 0
## 2.1 2 12.44 14.38 21.92 37.81 5.90 13.44 18.29 40.38 5.5 0
## 3.1 3 32.43 33.51 34.30 50.18 27.64 45.02 46.99 77.72 14.0 1
## 4.1 4 22.69 28.39 42.16 44.72 33.16 29.68 45.97 61.66 14.0 1
## 5.1 5 28.23 37.81 41.06 70.95 27.64 44.42 65.48 64.22 11.5 0
## 6.1 6 16.06 20.12 38.02 39.94 8.45 15.78 26.99 39.08 14.0 1
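A quick way to gain confidence in both steps is to run a round trip and compare the result with the starting data. The sketch below does this on a small stand-in data frame (toy_wide, using a few values copied from the printed output above) rather than on the full data set. It is an illustration, not part of the original analysis.

```r
# Small stand-in for the wide data (three children, one measure, one covariate)
toy_wide <- data.frame(
  id    = 1:3,
  verb1 = c(24.42, 12.44, 32.43),
  verb2 = c(26.98, 14.38, 33.51),
  momed = c(9.5, 5.5, 14.0)
)

# wide -> long
toy_long <- reshape(
  data      = toy_wide,
  varying   = c("verb1", "verb2"),
  timevar   = "grade",
  idvar     = "id",
  direction = "long",
  sep       = ""
)
toy_long <- toy_long[order(toy_long$id, toy_long$grade), ]

# long -> wide
toy_back <- reshape(
  data      = toy_long,
  timevar   = "grade",
  idvar     = "id",
  v.names   = "verb",
  direction = "wide",
  sep       = ""
)

# put the columns back in the original order before comparing
toy_back <- toy_back[, names(toy_wide)]

# compare values only; row names and reshape attributes differ by design
isTRUE(all.equal(toy_back, toy_wide, check.attributes = FALSE))
```

The comparison ignores attributes because reshape() changes the row names and attaches bookkeeping attributes to its result.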
Using functions included in base R can be useful in a number of situations. One example is package development, where one may want to limit dependencies.
That said, many people find reshape to be unnecessarily complicated. A similar, and potentially more convenient, set of functions has been developed for reshaping data in the tidyr (Wickham 2021) package. For those interested, take a look at the pivot_longer() and pivot_wider() functions.
For examples using tidyr to reshape data, see the tidyr vignette on pivoting.
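As a rough sketch of what the tidyr approach can look like, the example below reshapes a small, made-up data frame (toy_wide) in both directions. It assumes tidyr version 1.1.0 or later is installed, since it uses the names_transform argument. The regular expression in names_pattern is an assumption that fits column names made of lowercase letters followed by digits.

```r
library(tidyr)

# Toy wide data (values are illustrative)
toy_wide <- data.frame(
  id     = 1:2,
  verb1  = c(24.42, 12.44), verb2  = c(26.98, 14.38),
  perfo1 = c(19.84, 5.90),  perfo2 = c(22.97, 13.44),
  momed  = c(9.5, 5.5)
)

# wide -> long: ".value" keeps the letter part (verb, perfo) as column
# names, and the digit part goes into a new grade column
toy_long <- pivot_longer(
  toy_wide,
  cols            = c(verb1, verb2, perfo1, perfo2),
  names_to        = c(".value", "grade"),
  names_pattern   = "([a-z]+)([0-9]+)",
  names_transform = list(grade = as.integer)
)

# long -> wide: paste each measure name and grade back together
toy_back <- pivot_wider(
  toy_long,
  names_from  = grade,
  values_from = c(verb, perfo),
  names_sep   = ""
)

toy_long
toy_back
```

Unlike reshape(), these functions return tibbles and do not alter row names.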