forensics.pipeline
End-to-end pipeline orchestration (scrape → extract → analyze → report).
`forensics.cli.run_all` calls `run_all_pipeline` here. Order of operations:

- Audit — `PipelineContext` records `forensics all` in `analysis_runs` (best-effort; failures log a warning and the run continues).
- Scrape — `asyncio.run(dispatch_scrape(...))` with all boolean stage flags false, which selects the same full scrape handler as a plain `forensics scrape` (discover → metadata → fetch → dedup → JSONL export). See `forensics.cli.scrape.dispatch_scrape`.
- Extract — `extract_all_features` for all authors, embeddings on.
- Analyze — `run_analyze(AnalyzeRequest(stages=AnalyzeStageFlags(...)))` with `timeseries=True` and `convergence=True` only (no changepoint, drift, compare-only, or AI baseline unless you edit this module).
- Report — `run_report` with `ReportArgs` built from `get_settings().report.output_format`.
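The stage ordering and the best-effort audit step can be sketched as follows. The stub functions here (`record_audit_run`, `dispatch_scrape`, and the stage placeholders) are hypothetical stand-ins for the real entry points named above; the real signatures and flag names may differ.

```python
import asyncio
import logging

log = logging.getLogger("forensics.pipeline")

async def dispatch_scrape(**stage_flags) -> str:
    # Stand-in: with all boolean stage flags false, the real dispatcher
    # runs the full scrape (discover → metadata → fetch → dedup → export).
    return "full-scrape"

def record_audit_run() -> None:
    # Stand-in for the PipelineContext audit write; here it always fails
    # to demonstrate the best-effort behaviour.
    raise RuntimeError("audit store unavailable")

def run_all_sketch() -> int:
    # Audit is best-effort: a failure logs a warning and the run continues.
    try:
        record_audit_run()
    except Exception:
        log.warning("audit record failed; continuing", exc_info=True)
    # Scrape runs inside asyncio.run with every stage flag false.
    asyncio.run(dispatch_scrape(discover=False, metadata=False, fetch=False))
    # Extract, analyze, and report would follow here in the same order.
    return 0
```

The point of the sketch is the control flow: only preflight (see below) can stop the run early; an audit failure is deliberately non-fatal.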
Operational details and the artifact layout are covered in docs/RUNBOOK.md and docs/ARCHITECTURE.md.
run_all_pipeline
run_all_pipeline

def run_all_pipeline(*, show_progress: bool = True, observer: PipelineObserver | None = None) -> int

Run the default full pipeline; returns the process exit code.
The pipeline refuses to start when preflight checks hard-fail (returns
exit code 2) — this prevents cascading errors deeper in the run when
the environment is known to be broken.
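The preflight guard can be illustrated with a minimal sketch. The `preflight` helper and its failure messages are hypothetical; only the behaviour (hard failures abort with exit code 2 before any stage runs) comes from the documentation above.

```python
def preflight() -> list[str]:
    # Hypothetical check list; the real checks are not specified here.
    # Return value: human-readable descriptions of hard failures.
    return ["output directory not writable"]

def run_all_pipeline_sketch() -> int:
    failures = preflight()
    if failures:
        # Refuse to start: aborting here prevents cascading errors
        # deeper in the run when the environment is known to be broken.
        return 2
    # ...run the scrape → extract → analyze → report stages...
    return 0
```

Callers (such as the CLI) can pass this value straight to `sys.exit`, so a broken environment surfaces as a distinct exit status rather than a mid-run traceback.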
Arguments:
- `show_progress` — When true and `observer` is `None`, attach a `forensics.progress.RichPipelineObserver` for scrape + phase labels.
- `observer` — Optional pre-constructed observer (e.g. a Rich session owned by the CLI). When set, `show_progress` only controls the feature-extract Rich bar, not observer construction.
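The interaction between the two arguments can be sketched as a small resolution rule. `NullObserver` and `resolve_observer` are illustrative stand-ins, not part of the module's API; in the real code the default observer would be `RichPipelineObserver`.

```python
class NullObserver:
    """Hypothetical stand-in for a PipelineObserver implementation."""

def resolve_observer(show_progress: bool, observer=None):
    # An explicitly supplied observer always wins; show_progress then
    # only governs the feature-extract Rich bar, not construction.
    if observer is not None:
        return observer
    # No observer supplied: show_progress decides whether to attach one
    # (the real code would build a RichPipelineObserver here).
    return NullObserver() if show_progress else None
```

For example, `resolve_observer(True)` attaches a default observer, while `resolve_observer(False, my_observer)` keeps `my_observer` and merely suppresses the extract-stage progress bar.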