Course notes for HDFS 523
1
About This Book
1.1
Why this book?
1.2
Code Folding
1.3
Acknowledgements
2
Data Cleaning
2.1
Example Data
2.2
Reading in Repeated Measures Data
2.3
Familiarize Yourself with the Data
2.4
Look for Duplicated IDs
2.5
Using table() to Spot Irregularities
2.6
Missing Data
2.6.1
Generating Example Data
2.6.2
Recoding Values with NA
2.6.3
Missing Data Visualization
2.7
Exporting Data
2.8
Reshaping Repeated Measures Data
2.8.1
Reshape Wide to Long
2.8.2
Reshape Long to Wide
3
Describing Longitudinal Data
3.1
Example Data
3.2
Describing Means and Variances
3.2.1
Verbal Ability (All Persons and Occasions)
3.2.2
Verbal Ability (Across Time)
3.3
Describing Covariances
3.4
Individual-Level Descriptives
3.5
References
4
Matrix Algebra
4.1
Types of matrices
4.1.1
Square
4.1.2
Symmetric
4.1.3
Diagonal
4.1.4
Identity
4.2
Operations on Matrices
4.2.1
Matrix Transpose
4.2.2
Matrix Trace
4.2.3
Addition
4.2.4
Subtraction
4.2.5
Matrix Multiplication
4.2.6
Matrix Division
4.3
References
5
Ordinary Least Squares
5.1
Linear Regression Model
5.2
Ordinary Least Squares (OLS)
5.3
Assumptions of OLS
5.3.1
Assumption 1. \(\mathbb{E}(\epsilon_{i}) = 0\)
5.3.2
Assumption 2. Homoscedasticity
5.3.3
Assumption 3. \(\mathbb{E}(\epsilon_{i}\epsilon_{j}) = 0\)
5.3.4
Assumption 4. No Perfect Collinearity
5.3.5
Assumption 5. \(\mathbb{C}(\epsilon_{i},x_{ki}) = 0\)
5.4
Properties of the OLS Estimator
5.4.1
1. Consistency of \(\boldsymbol{\beta}\)
5.4.2
2. Asymptotic Normality
5.4.3
Variance of \(\hat{\beta}\)
5.5
Failure to Meet Assumptions
5.5.1
Failure of Assumption 1.
5.5.2
Failure of Assumption 2 or 3.
5.5.3
Failure of Assumption 5.
5.6
Regression and Matrix Notation
5.6.1
An Intercept-Only Model
5.6.2
Intercept-Only Model in Matrix Form
5.6.3
Simple Regression in Matrix Form
5.6.4
Multiple Regression in Matrix Form
5.7
Solving the Regression Equation
5.7.1
Matrix Multiplication and Transpose
5.7.2
Matrix Inverse
5.8
The Linear Probability Model
5.8.1
Advantages of the LPM
5.8.2
Disadvantages of the LPM
6
Linear Regression
6.1
Example Data
6.2
Intercept-Only Model
6.2.1
Intercept-Only Equation
6.2.2
Intercept-Only Model in R
6.2.3
Intercept as Mean of Outcome
6.2.4
Intercept-Only Model \(R^2\)
6.3
Simple Linear Regression
6.3.1
Regression Equation and Model Fitting
6.3.2
Path Diagram
6.3.3
Interpreting Model Parameters
6.3.4
Plotting Regression Line
6.4
Mean Centering Predictors
6.4.1
Interpreting Model Parameters
6.4.2
Plotting Regression Line
6.5
Multiple Linear Regression
6.5.1
Regression Equation
6.5.2
Fit Model in R
6.5.3
Path Diagram
6.5.4
A Note on Interpretation
6.6
Categorical Variable Interaction
6.6.1
Interaction as Moderation
6.6.2
Moderation by Categorical Variable
6.6.3
Interpretation
6.6.4
Fit Regression Model in R
6.6.5
Path Diagram
7
Statistical Control
7.1
Statistical Control
7.2
Directed Acyclic Graphs (DAGs)
7.2.1
Introduction to DAGs
7.2.2
Introduction to DAGs: Paths
7.2.3
Introduction to DAGs: Chains
7.2.4
Introduction to DAGs: Descendants and Ancestors
7.2.5
Introduction to DAGs: Forks
7.2.6
Introduction to DAGs: Inverted Forks
7.2.7
Introduction to DAGs: Acyclicity
7.3
Statistical Control Done Right
7.3.1
Building a DAG
7.3.2
Building a DAG: Back-Door Paths
7.4
Statistical Control Gone Wrong
7.4.1
Collider Bias
7.4.2
Conditioning on a Collider
7.4.3
Avoiding Collider Bias
7.4.4
Variations on Collider Bias: Nonresponse Bias
7.4.5
Controlling for Mediators
HDFS 523: Strategies for Data Analysis in Developmental Research