8.1 Categorical Data in the Social Sciences
Linear regression is a workhorse procedure of modern statistics. Our introduction to regression in this class was framed around the idea of a continuous dependent (outcome) variable. However, categorical data is extremely common in many health, behavioral and social science applications.
8.1.1 Examples of Categorical Data
Binary variables have two categories and are often used to indicate that an event has occurred or a characteristic is present. Are you sick? Did you vote in the last election? Are you married?
Ordinal variables have categories that can be ranked. Surveys often ask respondents to indicate their agreement with a statement, how frequently they engage in a behavior, or their educational attainment.
Nominal variables occur when there are multiple outcomes that cannot be ordered, for example handedness (left or right) or occupation.
Censored variables occur when the value of a variable is unknown over some range of the variable. For example, observed hourly wages might be restricted on the lower end by minimum wage laws, so values below that bound are not recorded.
Counts indicate the number of times that some event has occurred. How many drinks did you have last week? How many people live in the house? How many years of education have you completed?
Censored and count variables are often lumped in with more traditional categorical variables under the umbrella of limited dependent variables.
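To make these distinctions concrete, the following is a minimal sketch of how each variable type might be represented in code. It uses only the Python standard library; the class names, the field names, the wage floor of 7.25, and the two respondent records are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class Agreement(IntEnum):
    """Ordinal: categories can be ranked (integer codes give the order only)."""
    DISAGREE = 1
    NEUTRAL = 2
    AGREE = 3


class Occupation(Enum):
    """Nominal: categories are labels with no ordering."""
    TEACHER = "teacher"
    NURSE = "nurse"
    FARMER = "farmer"


WAGE_FLOOR = 7.25  # assumed lower bound, for illustration only


@dataclass
class Respondent:
    voted: bool              # binary: did the event occur?
    agreement: Agreement     # ordinal
    occupation: Occupation   # nominal
    hourly_wage: float       # censored from below at WAGE_FLOOR
    drinks_last_week: int    # count: a non-negative integer

    @property
    def wage_is_censored(self) -> bool:
        # A wage recorded at the floor tells us nothing about lower values.
        return self.hourly_wage <= WAGE_FLOOR


# Made-up records, not real data.
respondents = [
    Respondent(True, Agreement.AGREE, Occupation.NURSE, 15.50, 3),
    Respondent(False, Agreement.DISAGREE, Occupation.FARMER, 7.25, 0),
]

# Ordinal categories support ranking comparisons.
print(Agreement.DISAGREE < Agreement.AGREE)

# Nominal categories support equality checks only.
print(Occupation.NURSE == Occupation.FARMER)

# Flag which wage observations sit at the assumed lower bound.
print([r.wage_is_censored for r in respondents])
```

The choice of `IntEnum` for the ordinal variable and plain `Enum` for the nominal one mirrors the definitions above: comparing two `Agreement` values with `<` works, while attempting the same with two `Occupation` values raises a `TypeError`.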