Course notes for HDFS 523
1
About This Book
1.1
Why this book?
1.2
Code Folding
1.3
Acknowledgements
2
Data Cleaning
2.1
Example Data
2.2
Reading in Repeated Measures Data
2.3
Familiarize Yourself with the Data
2.4
Look for Duplicated IDs
2.5
Using table() to Spot Irregularities
2.6
Missing Data
2.6.1
Generating Example Data
2.6.2
Recoding Values with NA
2.6.3
Missing Data Visualization
2.7
Exporting Data
2.8
Reshaping Repeated Measures Data
2.8.1
Reshape Wide to Long
2.8.2
Reshape Long to Wide
3
Describing Longitudinal Data
3.1
Example Data
3.2
Describing Means and Variances
3.2.1
Verbal Ability (All Persons and Occasions)
3.2.2
Verbal Ability (Across Time)
3.3
Describing Covariances
3.4
Individual-Level Descriptives
3.5
References
4
Matrix Algebra
4.1
Types of matrices
4.1.1
Square
4.1.2
Symmetric
4.1.3
Diagonal
4.1.4
Identity
4.2
Operations on Matrices
4.2.1
Matrix Transpose
4.2.2
Matrix Trace
4.2.3
Addition
4.2.4
Subtraction
4.2.5
Matrix Multiplication
4.2.6
Matrix Division
5
Ordinary Least Squares
5.1
Linear Regression Model
5.2
Ordinary Least Squares (OLS)
5.3
Assumptions of OLS
5.3.1
Assumption 1. \(\mathbb{E}(\epsilon_{i}) = 0\)
5.3.2
Assumption 2. Homoscedasticity
5.3.3
Assumption 3. \(\mathbb{E}(\epsilon_{i}\epsilon_{j}) = 0\)
5.3.4
Assumption 4. No Perfect Collinearity
5.3.5
Assumption 5. \(\mathbb{C}(\epsilon_{i},x_{ki}) = 0\)
5.4
Properties of the OLS Estimator
5.4.1
1. Consistency of \(\boldsymbol{\beta}\)
5.4.2
2. Asymptotic Normality
5.4.3
Variance of \(\hat{\beta}\)
5.5
Failure to Meet Assumptions
5.5.1
Failure of Assumption 1.
5.5.2
Failure of Assumption 2 or 3.
5.5.3
Failure of Assumption 5.
5.6
Regression and Matrix Notation
5.6.1
An Intercept-Only Model
5.6.2
Intercept-Only Model in Matrix Form
5.6.3
Simple Regression in Matrix Form
5.6.4
Multiple Regression in Matrix Form
5.7
Solving the Regression Equation
5.7.1
Matrix Multiplication and Transpose
5.7.2
Matrix Inverse
5.8
The Linear Probability Model
5.8.1
Advantages of the LPM
5.8.2
Disadvantages of the LPM
6
Linear Regression
6.1
Example Data
6.2
Intercept-Only Model
6.2.1
Intercept-Only Equation
6.2.2
Intercept-Only Model in R
6.2.3
Intercept as Mean of Outcome
6.2.4
Intercept-Only Model \(R^2\)
6.3
Simple Linear Regression
6.3.1
Regression Equation and Model Fitting
6.3.2
Path Diagram
6.3.3
Interpreting Model Parameters
6.3.4
Plotting Regression Line
6.4
Mean Centering Predictors
6.4.1
Interpreting Model Parameters
6.4.2
Plotting Regression Line
6.5
Multiple Linear Regression
6.5.1
Regression Equation
6.5.2
Fit Model in R
6.5.3
Path Diagram
6.5.4
A Note on Interpretation
6.6
Categorical Variable Interaction
6.6.1
Interaction as Moderation
6.6.2
Moderation by Categorical Variable
6.6.3
Interpretation
6.6.4
Fit Regression Model in R
6.6.5
Path Diagram
7
Logistic Regression
7.1
Categorical Data in the Social Sciences
7.1.1
Examples of Categorical Data
7.2
Introduction to GLMs
7.2.1
Linear Regression as GLM
7.2.2
Logistic Regression as GLM
7.2.3
Poisson Regression as GLM
7.2.4
Additional Remarks
7.3
Binary Logistic Regression
7.3.1
Overcoming LPM
7.3.2
Model
7.4
Example Data
7.4.1
Variables
7.5
Intercept-Only Model
7.5.1
Intercept-Only Model in R
7.5.2
Interpretation
7.6
Single Predictor Model
7.6.1
Overdispersion
7.6.2
Coefficients
7.7
Marginal Effects
7.7.1
A Definition of Marginal Effects
7.7.2
A Few Observations
7.7.3
Types of Marginal Effects
7.7.4
Example Model
7.7.5
Marginal Effects at Representative Values (MER)
8
Poisson Regression
8.1
Poisson Regression
8.1.1
Review of GLM
8.1.2
Poisson Regression as GLM
8.2
Poisson Distribution
8.3
Notes on Interpretation
8.3.1
One Predictor Model
8.3.2
Similarity to Logistic Regression
8.3.3
Percent Change
8.4
Example Data
8.4.1
Dependent variable
8.4.2
Explanatory Variables
8.5
Single Predictor Model
8.5.1
Read in Data
8.5.2
Single Predictor Model in GLM
8.5.3
Deviance and Goodness of Fit
8.5.4
Overdispersion?
8.5.5
Interpretation of Single Predictor Model
8.6
Multiple Predictor Model
8.7
Revisiting Overdispersion
8.7.1
Quasi-Poisson Family
8.7.2
Negative Binomial Regression
8.7.3
References
9
Two-Occasion Change
9.1
Example Data I
9.2
Introduction
9.2.1
A Thought Experiment
9.2.2
Equivalent Models in Psychology
9.2.3
Equivalent Models: An Example
9.2.4
Examples of Equivalent Models
9.3
Residualized Change
9.4
Raw Change
9.5
Autoregressive Models
9.5.1
Autoregressive Residuals
9.5.2
Autoregressive Model in R
9.6
Difference Score Model
9.6.1
Calculating Difference Scores
9.6.2
Comparison to Residualized Change
9.6.3
A Difference Score Regression Model
9.7
Critique of Residualized Change
9.8
Critique of Difference Scores
9.8.1
Alternative Interpretation
9.9
Comparing Models
9.10
Lord’s Paradox
9.11
Example Data II
9.11.1
Generate Some Data According to (Castro-Schilo and Grimm 2018)
9.11.2
Plot Data
9.11.3
Fit Residualized Change Model
9.11.4
Fit Difference Score Model
9.12
Example Data III
9.12.1
Generate Some Data According to (Castro-Schilo and Grimm 2018) Example B
9.12.2
Plot Data
9.12.3
Fit Residualized Change Model
9.12.4
Fit Difference Score Model
9.13
Closing Thoughts
9.14
References
10
Introduction to Growth
10.1
Introduction
10.1.1
What is a multilevel model?
10.1.2
Two Faces of MLM
10.1.3
Two-Level Longitudinal Data
10.2
Example Data
10.2.1
Data Preparation and Description
10.2.2
Sample Moments
10.3
A General Model
10.4
Unconditional Means Model
10.4.1
Level 1
10.4.2
Level 2
10.4.3
Single Equation
10.4.4
Model Elaboration
10.4.5
Estimated Quantities
10.4.6
More Notation
10.4.7
Unconditional Means Model in R
10.4.8
Intra-Class Correlation
10.4.9
Model-Implied Moments
10.4.10
Model Residuals
10.5
Repeated Measures ANOVA
10.5.1
Intra-Class Correlation
10.5.2
Model-Implied Mean Vector
10.5.3
Model-Implied Covariance Matrix
10.6
Repeated Measures MANOVA
10.6.1
Model-Implied Mean Vector
10.6.2
Model-Implied Covariance Matrix
10.7
Repeated Measures MANOVA (Unstructured)
10.7.1
Model-Implied Mean Vector
10.7.2
Model-Implied Covariance Matrix
10.7.3
References
11
Growth Curve Modeling
11.1
Introduction
11.2
Data Preparation and Description
11.2.1
Loading libraries used in this script.
11.3
Individual Growth Models
11.3.1
Visualizing Individual Change
11.3.2
Multiple Individuals
11.4
Unconditional Means Model
11.4.1
Predicted Trajectories
11.5
Linear Growth Model
11.5.1
Random Intercept Model
11.5.2
Random Intercept and Slopes Model
11.5.3
Model Comparison
11.5.4
MLM and Individual Models
11.6
Quadratic Growth Model
11.7
Conditional Growth Model
11.7.1
Conditional Growth Equation
11.7.2
Conditional Growth Model 1
11.7.3
Conditional Growth Model 1
11.8
Alternative Time Metrics
11.8.1
Recentering time metrics
11.8.2
Rescaling time metric
11.8.3
Remapping Time
11.8.4
Compare Growth Metrics
12
Nonlinear Growth Curves
12.1
Review of Linear Growth
12.1.1
Theory of Linear Growth
12.1.2
Characteristics of Linear Growth
12.1.3
No Growth Model
12.1.4
Random Intercept Model
12.1.5
Linear Growth Model
12.1.6
Quadratic Growth Model
12.1.7
Quadratic Growth Equations
12.2
Introducing Nonlinear Growth
12.2.1
Types of Nonlinearity
12.2.2
Flexibility of Nonlinear Growth
12.2.3
Utility of Nonlinear Growth
12.2.4
Some Nonlinear Growth Models
12.2.5
Need for Nonlinear Models
12.3
Example Data (Ram & Grimm, 2007)
12.3.1
Read in Cortisol Data
12.3.2
Reshaping Data
12.3.3
Descriptives
12.3.4
Density Plots
12.3.5
Individual-level Trajectories
12.4
Linear Growth (Cortisol)
12.4.1
Equation
12.4.2
Fit Model
12.4.3
Predicted Trajectories
12.4.4
Interpretation
12.5
Quadratic Growth (Cortisol)
12.5.1
Equation
12.5.2
Fit Model
12.5.3
Predicted Trajectories
12.5.4
Interpretation
12.5.5
Interpretational Caution
12.5.6
Nonlinear or Linear Model?
12.6
Latent Basis (Cortisol)
12.6.1
Equation
12.6.2
Fit Model
12.6.3
Predicted Trajectories
12.6.4
Interpretation
12.7
Exponential Growth (Cortisol)
12.7.1
Fit Model
12.7.2
Predicted Trajectories
12.7.3
Interpretation
12.7.4
Nonlinear or Linear Model?
12.8
Multiphase Growth (Cortisol)
12.8.1
Equation
12.8.2
Fit Model
12.8.3
Predicted Trajectories
12.8.4
Interpretation
12.9
Bilinear Spline (Cortisol)
12.9.1
Equation
12.9.2
Fit Model
12.9.3
Predicted Trajectories
12.9.4
Interpretation
12.9.5
Fit Model
12.10
References
13
Dyadic Data Analysis
13.1
Introduction
13.1.1
Interpersonal Phenomena
13.1.2
Dyadic Measurement
13.1.3
Discussion Question
13.2
Interdependence
13.2.1
Definition
13.2.2
Ignoring Interdependence
13.2.3
Linkage Types
13.2.4
Sources of Interdependence
13.2.5
Discussion Question
13.3
Basic Definitions
13.3.1
Distinguishability
13.3.2
Variable Types
13.4
Dyadic Designs
13.4.1
Standard Dyadic Design
13.4.2
Social Relations Model
13.4.3
One-with-many Design
13.4.4
Discussion Question
13.5
Actor Partner Interdependence Model (APIM)
13.5.1
Model
13.5.2
Conceptual Interpretations
13.5.3
Actor-Partner Interactions
13.6
Longitudinal APIM
13.6.1
Model
13.6.2
Estimated Parameters
13.6.3
Parameter Covariation
13.7
Data Example
13.7.1
Preliminaries
13.7.2
Modeling Scenario
13.7.3
Descriptives
13.7.4
Dyadic Data Prep
13.7.5
APIM
13.7.6
Null Model
13.7.7
Full APIM
13.8
Data Example 2
13.8.1
Overview
13.8.2
Outline
13.8.3
The Modeling Enterprise.
13.8.4
The Data.
13.8.5
Plotting the Data.
13.8.6
The Multilevel Model.
13.8.7
Fit Male Model
13.8.8
Fit Female Model
13.8.9
Fit Full Model
13.8.10
Conclusion
13.9
Reference
14
Cluster Analysis
14.1
Introduction
14.1.1
Supervised Learning
14.1.2
Unsupervised Learning
14.2
Cluster Analysis
14.3
Clustering Algorithms
14.4
Hierarchical Clustering
14.4.1
Distances
14.4.2
Distance Between Clusters
14.4.3
Linkages Applied Example
14.4.4
Dendrograms
14.5
K-Means Clustering
14.5.1
Within-Cluster Variation
14.5.2
K-Means Algorithm
14.5.3
K-means Example
14.5.4
Choosing K
14.6
DBSCAN
14.6.1
Example of DBSCAN
14.7
Issues in Clustering
14.7.1
Recommendations
14.8
Applied Examples
14.8.1
Preliminaries
14.8.2
Preparing Data
14.8.3
Scaling
14.8.4
Plotting
14.8.5
Distances
14.8.6
K-Means
14.8.7
Hierarchical Clustering
14.8.8
Two-step Approach
14.8.9
Final Thoughts
14.9
Reference
HDFS 523: Strategies for Data Analysis in Developmental Research
14.9
Reference
Castro-Schilo, Laura, and Kevin J. Grimm. 2018. “Using Residualized Change Versus Difference Scores for Longitudinal Research.” Journal of Social and Personal Relationships 35 (1): 32–58.