Monitoring democratic institutions through public records

← Back to overview

Data

Democracy Monitor is open data. All assessment results, source documents, and computed metrics are available for download and programmatic access.

CSV Downloads#

Weekly Aggregates

One row per category-week with flattened structural, AI, thematic, and concern metrics.

Download CSV

Document Scores

One row per scored document with keyword matches, severity scores, and document class.

Download CSV

Full Database#

For developers and researchers who need the complete dataset, a PostgreSQL dump is available for download. The dump is a single pg_dump -Fc file including all tables (~2 GB): source documents, AI assessments, weekly aggregates, baselines, narratives, and vector embeddings. Updated weekly.

Key tables for researchers

TableDescription
documentsSource documents with metadata, content, source type
ai_document_assessmentsPer-document AI review (P1 flag, P2 classification, reasoning)
weekly_aggregatesCategory-week rollups with structural/AI/thematic/concern scores
baselinesBiden-era baseline statistics for comparison
narrativesAI-generated weekly and term summaries
document_scoresPer-document keyword assessment scores

Setup

# Automatic (recommended)

createdb democracy_monitor

pnpm db:init --force

# Manual

curl -LO https://democracymonitor.us/api/data/dump

pg_restore --clean --if-exists --no-owner -d democracy_monitor dump

pnpm db:migrate

See DEPLOYMENT.md for full setup instructions. Schema is defined in lib/db/schema.ts.

API Endpoints#

EndpointParamsDescription
/api/export/weeklyformat (csv|json), category, from, toWeekly aggregate data per category with structural/AI/thematic scores
/api/export/scoresformat (csv|json), category, from, toPer-document keyword assessment scores with match details

All endpoints default to JSON. Add ?format=csv for CSV output with flattened columns (no JSON blobs).

CSV Column Reference#

CSV exports flatten nested JSON fields into individual columns. Weekly aggregates include prefixed columns for each detection layer:

  • structural_* — Composite score, per-dimension z-scores with raw values, baseline means, and baseline standard deviations (volume, type composition, functional distribution, agency activity, publication tempo, source convergence), anomalous flag, drift trend, long-horizon cumulative deviation/window, functional shifts (bucket:direction pairs)
  • ai_* — Flag count, total documents, flag/concern rates, P2 classification distribution (routine, novel, potentially/clearly concerning), audit false negative rate
  • thematic_* — Centroid distance, z-score, novel document rate, variance ratio, cross-admin distance, rolling window metadata (weeks, mean distance, std dev), cross-admin baseline period, bootstrap flag
  • concern_* — Status, pattern description, per-layer elevation flags

Document scores flatten matches and suppressed arrays into count + comma-joined keyword columns.

Rate Limits#

Export endpoints are rate-limited to 1 request per second per IP address. Responses include a Retry-After header when throttled.