Data

Democracy Monitor is open data. All assessment results, source documents, and computed metrics are available for download and programmatic access.

CSV Downloads

Weekly Aggregates

One row per category-week with flattened structural, AI, thematic, and concern metrics.

Download CSV

Document Scores

One row per scored document with keyword matches, severity scores, and document class.

Download CSV

Full Database

For developers and researchers who need the complete dataset, a PostgreSQL dump is available for download. The dump is a single custom-format archive (pg_dump -Fc, ~2 GB) containing all tables: source documents, AI assessments, weekly aggregates, baselines, narratives, and vector embeddings. It is updated weekly. Custom-format archives are restored with pg_restore, not psql.

Download PostgreSQL dump

Key tables for researchers

| Table | Description |
| --- | --- |
| documents | Source documents with metadata, content, source type |
| ai_document_assessments | Per-document AI review (P1 flag, P2 classification, reasoning) |
| weekly_aggregates | Category-week rollups with structural/AI/thematic/concern scores |
| baselines | Biden-era baseline statistics for comparison |
| narratives | AI-generated weekly and term summaries |
| document_scores | Per-document keyword assessment scores |

Setup

# Automatic (recommended)
createdb democracy_monitor
pnpm db:init --force

# Manual (the democracy_monitor database must already exist; create it with createdb as above)
curl -LO https://democracymonitor.us/api/data/dump
pg_restore --clean --if-exists --no-owner -d democracy_monitor dump
pnpm db:migrate

See DEPLOYMENT.md for full setup instructions. The schema is defined in lib/db/schema.ts.
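Before running the manual restore, it can help to confirm that the download really is a custom-format archive and not, say, an HTML error page. The sketch below checks for the `PGDMP` signature that pg_dump writes at the start of custom-format files and builds the same pg_restore invocation shown above as an argument list. The function names are illustrative and are not part of the project.

```python
from typing import List

PG_CUSTOM_MAGIC = b"PGDMP"  # first bytes of a pg_dump custom-format archive


def looks_like_custom_dump(path: str) -> bool:
    """Return True if the file starts with the custom-format signature."""
    with open(path, "rb") as fh:
        return fh.read(len(PG_CUSTOM_MAGIC)) == PG_CUSTOM_MAGIC


def restore_command(dump_path: str, dbname: str = "democracy_monitor") -> List[str]:
    """Build the pg_restore call from the manual steps as an argv list."""
    return ["pg_restore", "--clean", "--if-exists", "--no-owner",
            "-d", dbname, dump_path]
```

Once `looks_like_custom_dump("dump")` returns True, the list from `restore_command("dump")` can be passed to `subprocess.run(..., check=True)`.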

API Endpoints

| Endpoint | Params | Description |
| --- | --- | --- |
| /api/export/weekly | format (csv\|json), category, from, to | Weekly aggregate data per category with structural/AI/thematic scores |
| /api/export/scores | format (csv\|json), category, from, to | Per-document keyword assessment scores with match details |

All endpoints default to JSON. Add ?format=csv for CSV output with flattened columns (no JSON blobs).
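As a rough illustration of calling the export endpoints, the sketch below builds a request URL from the documented parameters and downloads the body using only the Python standard library. It assumes the exports are served from the same host as the dump URL in the setup steps, and that `from` and `to` accept ISO dates such as `2025-01-20`. Neither assumption is confirmed here, and the helper names are made up for this example.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://democracymonitor.us"  # assumption: same host as the dump URL


def export_url(endpoint: str, fmt: str = "json", category: Optional[str] = None,
               date_from: Optional[str] = None, date_to: Optional[str] = None) -> str:
    """Build an export URL; `endpoint` is "weekly" or "scores"."""
    params = {"format": fmt, "category": category,
              "from": date_from, "to": date_to}
    # Drop unset parameters so the server applies its own defaults.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/api/export/{endpoint}?{query}"


def fetch_text(url: str) -> str:
    """Download the response body as text (performs a network request)."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")
```

For example, `fetch_text(export_url("weekly", fmt="csv"))` would return the weekly aggregates as CSV text. A `category` value such as `"courts"` is purely hypothetical; use whichever category identifiers the site actually exposes.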

CSV Column Reference

CSV exports flatten nested JSON fields into individual columns. Weekly aggregates include prefixed columns for each detection layer:

structural_* — Composite score, per-dimension z-scores with raw values, baseline means, and baseline standard deviations (volume, type composition, functional distribution, agency activity, publication tempo, source convergence), anomalous flag, drift trend, long-horizon cumulative deviation/window, functional shifts (bucket:direction pairs)
ai_* — Flag count, total documents, flag/concern rates, P2 classification distribution (routine, novel, potentially/clearly concerning), audit false negative rate
thematic_* — Centroid distance, z-score, novel document rate, variance ratio, cross-admin distance, rolling window metadata (weeks, mean distance, std dev), cross-admin baseline period, bootstrap flag
concern_* — Status, pattern description, per-layer elevation flags

Document score exports flatten the matches and suppressed arrays into count columns and comma-joined keyword columns.
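Because each detection layer uses its own column prefix, a CSV header can be split into layers mechanically. The sketch below groups column names by the four prefixes listed above and splits a comma-joined keyword cell back into a list. The sample header and values are invented for illustration; the real export has many more columns, and its exact column names should be read from a downloaded file.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

LAYER_PREFIXES = ("structural_", "ai_", "thematic_", "concern_")


def columns_by_layer(header: List[str]) -> Dict[str, List[str]]:
    """Group column names by layer prefix; unprefixed ones go under "other"."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in header:
        layer = next((p.rstrip("_") for p in LAYER_PREFIXES
                      if name.startswith(p)), "other")
        groups[layer].append(name)
    return dict(groups)


def split_keywords(cell: str) -> List[str]:
    """Split a comma-joined keyword cell into a list, skipping empty entries."""
    return [kw.strip() for kw in cell.split(",") if kw.strip()]


# Hypothetical sample; these column names are placeholders, not the real schema.
sample = ("category,week,structural_composite,ai_flag_count,concern_status\n"
          "courts,2025-W10,1.8,3,elevated\n")
reader = csv.DictReader(io.StringIO(sample))
layers = columns_by_layer(list(reader.fieldnames or []))
```

Grouping by prefix keeps analysis code stable if further columns are added within a layer.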

Rate Limits

Export endpoints are rate-limited to 1 request per second per IP address. Throttled responses include a Retry-After header indicating how long to wait before retrying.
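A client can respect this limit by waiting for the interval given in Retry-After before trying again. The sketch below is one way to do that, assuming throttling is signalled with HTTP status 429 and that Retry-After carries a number of seconds. Both are assumptions, since the status code and header format are not stated here. The `fetch` callable is supplied by the caller, which keeps the retry logic independent of any particular HTTP library.

```python
import time
from typing import Callable, Dict, Tuple

# (status code, headers, body), as returned by a caller-supplied fetch function
Response = Tuple[int, Dict[str, str], str]


def retry_delay(headers: Dict[str, str], default: float = 1.0) -> float:
    """Seconds to wait, from an integer Retry-After header, else `default`."""
    value = headers.get("Retry-After", "")
    try:
        return float(max(int(value), 0))
    except ValueError:
        return default  # header missing or given as an HTTP date


def fetch_with_backoff(fetch: Callable[[str], Response], url: str,
                       max_attempts: int = 5,
                       sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `fetch`, waiting and retrying while the server answers 429."""
    for _ in range(max_attempts):
        response = fetch(url)
        status, headers, _body = response
        if status != 429:  # assumption: throttling is signalled with HTTP 429
            return response  # caller still checks for other error statuses
        sleep(retry_delay(headers))
    raise RuntimeError(f"still throttled after {max_attempts} attempts: {url}")
```

The default delay of one second matches the documented limit of one request per second. The header lookup is case-sensitive, so a real `fetch` wrapper should normalize the header name to `Retry-After`.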