Pre-test and post-test survey data from a civic education study involving 794 students across 16 schools. The study measured a wide range of civic attitudes, knowledge, behaviors, and participation before and after a civic education intervention. With 237 variables covering survey items (Likert scales, binary, and open-ended), composite scale scores, and demographic information, this dataset supports analyses of treatment effects, regression to the mean, and scale construction.
Format
A tibble with 794 rows and 237 columns. Variables are organized into the following groups:
Identifiers and treatment indicators:
- id_match
Unique student identifier for matching pre and post records. Type: numeric. Range: (201180001, 351190031).
- school
School identifier. Type: character. 16 unique schools coded as: 20, 21, 22, 23, 25, 26, 27, 29, and others.
- wtp
We the People program participation. Type: numeric. Binary indicator (0/1) where 1 = participated in the civic education program, 0 = comparison group. About 63% of students participated.
- ap
AP (Advanced Placement) class indicator. Type: numeric. Binary indicator (0/1) where 1 = AP class, 0 = non-AP class.
Pre-test survey items (t_q1 through t_q99):
Variables prefixed with t_ represent pre-test (Time 1) survey
responses. These include:
- t_q1 to t_q23
Civic knowledge and attitude items. Type: numeric. Mostly 4-point Likert scales (1-4); some items use 3-point scales (1-3). Little missing data (0 to 8 NA values per item).
- t_numorg
Number of organizations participated in (pre-test). Type: numeric. Range: (0, 5).
- t_orgs_y
Any organization membership (pre-test). Type: numeric. Binary (0/1) where 1 = member of at least one organization.
- t_q24
Political interest item. Type: numeric. 6-point scale (1-6).
- t_q25 to t_q28
Political activity items (pre-test). Type: numeric. Binary (0/1).
- t_numldr
Number of leadership roles (pre-test). Type: numeric. Range: (0, 4).
- t_q29 to t_q35
Civic activity items (pre-test). Type: numeric. Binary (0/1) with some missing values.
- t_numact
Number of civic activities (pre-test). Type: numeric. Range: (0, 7).
- t_q36 to t_q47
Civic attitude items. Type: numeric. 4-point Likert scales (1-4).
- t_q48, t_q49, t_q50
Knowledge test items. Type: character. Multiple choice responses (A, B, C, D, E, F) with -1 indicating missing or no response.
- t_q51 to t_q52
Attitude items. Type: numeric. 4-point scales (1-4).
- t_q53 to t_q64
Political feeling thermometer items. Type: numeric. 7-point scales (1-7). Higher values indicate warmer feelings.
- t_q65 to t_q78
Civic engagement and obligation items. Type: numeric. 4-point Likert scales (1-4).
- t_q79 to t_q84
Additional feeling thermometer items. Type: numeric. 7-point scales (1-7).
- t_q85 to t_q89
Tolerance and ambiguity items. Type: numeric. 4- or 5-point scales.
- t_q92_a to t_q92_f
Race/ethnicity indicators (pre-test). Type: numeric. Binary (0/1), one indicator per category (e.g., a = Asian, b = Black, c = Hispanic).
- t_q90, t_q91
Media exposure items. Type: numeric. 6-point scales (0-5).
- t_q93
Political discussion at home. Type: numeric. Binary (0/1) where 1 = discusses politics at home.
- t_q94
Religious attendance. Type: numeric. Range: (0, 3).
- t_q95
Self-reported grades. Type: numeric. Range: (1, 8).
- t_q96_a to t_q96_k
Subject enrollment indicators. Type: numeric. Binary (0/1) for various courses taken.
- t_q97 to t_q99
Additional survey items. Type: numeric. 4- or 5-point scales.
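The knowledge test items (t_q48, t_q49, t_q50) use -1 rather than NA to mark a missing response, so they usually need recoding before tabulation. The following is a minimal sketch on a small made-up data frame standing in for civic_ed; it assumes the code is stored as the string "-1", which should be checked against the actual data.

```r
# Toy stand-in for civic_ed; these values are invented for illustration.
toy <- data.frame(
  t_q48 = c("A", "-1", "C"),
  t_q49 = c("B", "B", "-1"),
  t_q50 = c("-1", "D", "F"),
  stringsAsFactors = FALSE
)

knowledge_items <- c("t_q48", "t_q49", "t_q50")

# Assumption: missing responses are stored as the string "-1".
# %in% is used instead of == so that existing NA values are left alone.
toy[knowledge_items] <- lapply(toy[knowledge_items], function(x) {
  x[x %in% "-1"] <- NA_character_
  x
})

# Number of missing responses per knowledge item
n_missing <- colSums(is.na(toy[knowledge_items]))
n_missing
```

The same lapply() call should work on civic_ed itself by replacing toy with civic_ed, provided the "-1" assumption holds.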
Post-test survey items (r_q1 through r_q84):
Variables prefixed with r_ represent post-test (Time 2) survey
responses. These mirror the pre-test items with the same numbering
(e.g., r_q1 corresponds to t_q1). Scales and coding are
identical to the pre-test versions.
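Because post-test names differ from pre-test names only in the prefix, matching items can be paired by name. A minimal sketch, using a made-up vector of column names in place of names(civic_ed):

```r
# Hypothetical subset of column names; in practice use names(civic_ed).
toy_names <- c("id_match", "t_q1", "t_q2", "t_q90", "r_q1", "r_q2")

# Pre-test items, and the post-test name each one would have
pre  <- grep("^t_", toy_names, value = TRUE)
post <- sub("^t_", "r_", pre)

# Keep only the pre-test items that actually have a post-test counterpart
has_post <- post %in% toy_names
pairs <- data.frame(
  pre  = pre[has_post],
  post = post[has_post],
  stringsAsFactors = FALSE
)
pairs
```

In this toy example t_q90 is dropped because no r_q90 column is present.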
Demographic and background variables:
- par_coll
Parent has college education. Type: numeric. Binary (0/1) where 1 = at least one parent attended college. 40 missing values (NA).
- hi_math
Highest math course taken. Type: numeric. Range: (0, 5). Higher values indicate more advanced math courses.
Composite scale scores (standardized, mean approximately 0):
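Variables ending in _m are IRT-based or factor-analytic composite
scores. Within these names, _t_ marks the pre-test score and _r_ the
post-test score (e.g., eff_t_m and eff_r_m). The attention and
information composites instead end in _pre and _pos or _post.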
- attn_pre, attn_pos
Attention to politics (pre and post). Type: numeric. Standardized scores ranging from approximately -1.5 to 1.8.
- inf_pre, inf_post
Political information/knowledge (pre and post). Type: numeric. Standardized scores.
- ldr_t_m, ldr_r_m
Leadership composite (pre and post). Type: numeric. Standardized.
- govm_t_m, govm_r_m
Government knowledge (pre and post). Type: numeric. Standardized.
- posg_t_m, posg_r_m
Positive government attitudes (pre and post). Type: numeric. Standardized.
- eff_t_m, eff_r_m
Political efficacy (pre and post). Type: numeric. Standardized.
- res_t_m, res_r_m
Political responsibility (pre and post). Type: numeric. Standardized.
- obl_t_m, obl_r_m
Civic obligation (pre and post). Type: numeric. Standardized.
- tol_t_m, tol_r_m
Political tolerance (pre and post). Type: numeric. Standardized.
- amb_t_m, amb_r_m
Ambiguity tolerance (pre and post). Type: numeric. Standardized.
- comm_t_m, comm_r_m
Community engagement (pre and post). Type: numeric. Standardized.
- all_t_m, all_r_m
Overall civic competence composite (pre and post). Type: numeric. Standardized. No missing values.
- liked_cl
Liked civic learning environment. Type: numeric. Standardized. Mean approximately 0.
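Since each _m composite comes in a pre/post pair, gain scores can be computed for all scales at once by matching on the scale stem. The sketch below uses a made-up data frame with two of the composite pairs; the values are invented and only the column naming follows the dataset.

```r
# Toy stand-in for civic_ed with two composite pairs (invented values).
toy <- data.frame(
  eff_t_m = c(-0.5, 0.0, 0.5),
  eff_r_m = c(-0.2, 0.1, 0.3),
  tol_t_m = c(1.0, -1.0, 0.0),
  tol_r_m = c(0.5, -0.5, 0.0)
)

# Scale stems are recovered from the pre-test column names
pre_cols <- grep("_t_m$", names(toy), value = TRUE)
stems <- sub("_t_m$", "", pre_cols)

# Gain = post-test composite minus pre-test composite, one column per scale
gains <- sapply(stems, function(s) {
  toy[[paste0(s, "_r_m")]] - toy[[paste0(s, "_t_m")]]
})

# Average gain per scale
colMeans(gains)
```

Raw gain scores are only one way to summarize change; the regression-based approach in the Examples section adjusts for the pre-test score instead.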
Source
Tolo, K. W. (1999). The Civic Education of American Youth: From
State Policies to School District Practices. Policy Research Project
Report, Lyndon B. Johnson School of Public Affairs, The University of
Texas at Austin. Original data file: civic_ed.dta
Details
This dataset is used in Chapter 1 (Simple Linear Regression) to illustrate regression to the mean: regressing post-test attention to politics on pre-test attention yields a slope less than 1, so students who score below average before the course tend to score closer to the average afterward, and so do students who score above average. The dataset also supports analyses in later chapters, including ANCOVA (comparing treatment and comparison groups while controlling for pre-test scores), multiple regression with many predictors, and model building.
Key analyses include: simple regression of post-test on pre-test scores to demonstrate regression to the mean, ANCOVA comparing WTP program participants with comparison students, and exploring relationships among multiple civic attitude dimensions.
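To see what a slope below 1 means in practice, the fitted line can be evaluated by hand. This sketch plugs in the coefficients printed in the Examples section (intercept -0.003737, slope 0.771140); the three pre-test values are arbitrary illustration points, not observed students.

```r
# Coefficients from lm(attn_pos ~ attn_pre, data = civic_ed), as printed
# in the Examples section.
b0 <- -0.003737
b1 <- 0.771140

# Illustrative pre-test scores: one unit below, at, and above the mean
pre <- c(-1, 0, 1)

# Predicted post-test attention for each pre-test score
pred_post <- b0 + b1 * pre

data.frame(pre = pre, pred_post = pred_post)
#>   pre pred_post
#> 1  -1 -0.774877
#> 2   0 -0.003737
#> 3   1  0.767403
```

A student one unit below the mean is predicted to end up about 0.77 units below it, and a student one unit above is predicted to end up about 0.77 units above: both predictions are closer to the mean than the starting score.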
Examples
data(civic_ed)
# Regression to the mean: post-test on pre-test attention to politics
lm(attn_pos ~ attn_pre, data = civic_ed)
#>
#> Call:
#> lm(formula = attn_pos ~ attn_pre, data = civic_ed)
#>
#> Coefficients:
#> (Intercept)     attn_pre
#>   -0.003737     0.771140
#>
# ANCOVA: treatment effect controlling for pre-test
lm(attn_pos ~ attn_pre + wtp, data = civic_ed)
#>
#> Call:
#> lm(formula = attn_pos ~ attn_pre + wtp, data = civic_ed)
#>
#> Coefficients:
#> (Intercept)     attn_pre          wtp
#>    -0.07049      0.75878      0.10703
#>
