10 Control Comparison
Forensic question: Is the target author’s convergence signal anomalous relative to a peer control author who has similar publishing volume?
Inputs:
- `data/analysis/comparison_report.json` (loaded if present, generated otherwise)
- `data/analysis/{slug}_convergence.json` for both target and control
- `data/analysis/{slug}_drift.json` for both target and control
Outputs (in-notebook):
- Side-by-side change-point timelines (target vs control)
- Per-feature-family contrast bar chart (significant ones)
- Velocity-trajectory overlay (centroid drift over time)
- Verdict: count of feature families where target activity is significantly higher than control
Run metadata: (auto-populated by first code cell)
10.1 Methodology note
Control comparison tests whether the target’s signal is anomalous relative to a peer. If the same change-point density, the same family elevations, and the same centroid-velocity trajectory show up in a control author publishing into the same outlet over the same period, the signal is editorial (CMS, copy desk, house style) rather than a change in the target’s own writing.
- Pair selection: the default pair (`colby-hall` target, `sarah-rumpf` control) is chosen because both authors have similar publishing volume but the target has the strongest convergence signal in the corpus. To pick a different pair, override the parameters cell: `quarto render notebooks/08_control_comparison.ipynb -P target_slug:foo -P control_slug:bar`.
- Statistical test: Welch’s two-sample t-test per feature (target vs pooled controls). Family-level elevation requires at least one feature with `p < 0.05` AND `target_mean > control_mean`.
- Editorial-vs-author signal: for every target convergence window, count the fraction of controls with no overlapping window. Mean across all target windows. `0` = controls always agree (outlet-wide), `1` = controls never agree (author-specific).
- Provenance: when the persisted `comparison_report.json` does not cover the requested pair, this notebook regenerates it via `forensics.analysis.compare_target_to_controls` and writes the result back to disk.
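The family-level elevation rule in the statistical-test bullet can be sketched as follows. This is an illustration only, not the code behind `forensics.analysis.compare_target_to_controls`: the function names, the feature and family names, and the sample values are all hypothetical, and the sketch assumes SciPy is available and that each feature is a plain list of per-article values.

```python
from scipy import stats

ALPHA = 0.05  # significance threshold stated in the methodology note


def feature_elevated(target_values, control_values, alpha=ALPHA):
    """Return True when the target is significantly HIGHER than the controls.

    Uses Welch's two-sample t-test (equal_var=False), which does not
    assume equal variances in the two groups.
    """
    result = stats.ttest_ind(target_values, control_values, equal_var=False)
    target_mean = sum(target_values) / len(target_values)
    control_mean = sum(control_values) / len(control_values)
    return bool(result.pvalue < alpha and target_mean > control_mean)


def elevated_families(target, control, families):
    """List the families with at least one elevated feature.

    target / control: dict mapping feature name -> list of values
    families:         dict mapping family name -> list of feature names
    (All names here are hypothetical placeholders.)
    """
    return sorted(
        family
        for family, features in families.items()
        if any(feature_elevated(target[f], control[f]) for f in features)
    )


# Made-up values for three hypothetical features.
target = {
    "type_token_ratio": [0.71, 0.73, 0.72, 0.74, 0.70],   # clearly higher
    "mean_sentence_len": [18.0, 19.0, 20.0, 21.0, 22.0],  # same mean as control
    "passive_rate": [1.00, 1.10, 0.90, 1.00, 1.05],       # significantly LOWER
}
control = {
    "type_token_ratio": [0.50, 0.52, 0.51, 0.49, 0.53],
    "mean_sentence_len": [18.5, 19.5, 20.5, 20.0, 21.5],
    "passive_rate": [2.00, 2.10, 1.90, 2.00, 2.05],
}
families = {
    "lexical": ["type_token_ratio"],
    "syntactic": ["mean_sentence_len", "passive_rate"],
}

verdict = elevated_families(target, control, families)
print(verdict)
```

Note that a feature where the target is significantly lower than the control does not count toward elevation, because the rule requires both conditions to hold. The verdict count reported by the notebook would correspond to `len(verdict)` in this sketch.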
Summary finding: Side-by-side control comparisons distinguish author-specific drift from outlet-wide editorial shifts.
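The author-specific versus outlet-wide distinction rests on the editorial-vs-author signal described in the methodology note. A minimal sketch of that score is given here, under stated assumptions: convergence windows are represented as `(start, end)` date tuples, two windows overlap when their closed intervals intersect, and the helper names and control slugs are hypothetical. The persisted JSON schema and the real implementation may differ.

```python
from datetime import date


def windows_overlap(a, b):
    """Closed intervals (start, end) overlap if each starts before the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def editorial_vs_author_score(target_windows, control_windows_by_slug):
    """Mean, over target windows, of the fraction of controls with no overlapping window.

    0.0 -> every control has an overlapping window (outlet-wide signal)
    1.0 -> no control ever overlaps (author-specific signal)
    Returns None when there are no target windows or no controls.
    """
    if not target_windows or not control_windows_by_slug:
        return None
    n_controls = len(control_windows_by_slug)
    fractions = []
    for target_window in target_windows:
        # Count controls that have no window overlapping this target window.
        disagreeing = sum(
            1
            for windows in control_windows_by_slug.values()
            if not any(windows_overlap(target_window, w) for w in windows)
        )
        fractions.append(disagreeing / n_controls)
    return sum(fractions) / len(fractions)


# Made-up windows for illustration.
target_windows = [
    (date(2021, 1, 1), date(2021, 3, 31)),
    (date(2022, 6, 1), date(2022, 8, 31)),
]
control_windows_by_slug = {
    "control-a": [(date(2021, 2, 15), date(2021, 4, 15))],  # overlaps first window only
    "control-b": [],                                        # no windows at all
}

score = editorial_vs_author_score(target_windows, control_windows_by_slug)
print(score)  # 0.75
```

In this example the first target window is shared by one of two controls (fraction 0.5) and the second by none (fraction 1.0), so the mean is 0.75, which leans toward an author-specific signal.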