Monitoring democratic institutions through public records

System Architecture

Overview

Democracy Monitor is a Next.js application backed by PostgreSQL (with pgvector for embeddings) and Redis for caching. It ingests documents from nine government data source types, processes them through an AI-driven detection pipeline, and presents findings in an interactive dashboard.

Weekly cron jobs fetch new documents, score them, compute weekly aggregates, generate embeddings, run a two-pass AI content assessment, and produce narrative summaries. The pipeline is fully automated and requires no manual intervention.

Detection Pipeline

Documents flow through a multi-stage pipeline: they are fetched from external sources, stored in PostgreSQL, scored with keyword annotations, aggregated weekly, embedded with OpenAI, and assessed by the L2 AI content assessment, which is the sole active detection layer. Descriptive context layers (structural anomaly, silence/source health, thematic drift) run alongside that assessment, and narratives are generated for elevated categories.
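The stage order described above can be sketched as a small sequential runner. This is an illustration only: the stage names, the `StageHandlers` type, and the `runPipeline` function are hypothetical and not taken from the Democracy Monitor codebase. Stopping at the first failed stage is also an assumption about how such a pipeline might behave.

```typescript
// Hypothetical stage names, listed in the order the text describes.
const STAGES = [
  "fetch",     // pull documents from external sources
  "store",     // persist them in PostgreSQL
  "score",     // attach keyword annotations
  "aggregate", // compute weekly aggregates
  "embed",     // generate embeddings
  "assess",    // L2 AI content assessment plus context layers
  "narrate",   // narratives for elevated categories
] as const;

type Stage = (typeof STAGES)[number];

interface StageResult {
  stage: Stage;
  ok: boolean;
  error?: string;
}

// One async handler per stage; real handlers would talk to the database and APIs.
type StageHandlers = Record<Stage, () => Promise<void>>;

// Runs the stages in order and stops at the first failure (an assumption),
// so that later stages never operate on incomplete data.
async function runPipeline(handlers: StageHandlers): Promise<StageResult[]> {
  const results: StageResult[] = [];
  for (const stage of STAGES) {
    try {
      await handlers[stage]();
      results.push({ stage, ok: true });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ stage, ok: false, error: message });
      break;
    }
  }
  return results;
}
```

Returning a per-stage result list, rather than throwing, would let a cron job log exactly which stage failed before exiting.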

Data Sources

Nine source types provide coverage of different government activities: the Federal Register (executive orders, rules), GovInfo (presidential documents via CPD, GAO/congressional reports), Congressional Record (CREC floor speeches), CourtListener (federal courts), DOJ press releases, Inspector General reports (HHS, DOJ, SSA), LegiScan (federal legislation), FEC filings, and GDELT global news.
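One way to keep the nine source types consistent across fetchers and the dashboard is a single typed list. The sketch below is an assumption about how that might look: the identifier strings, `SOURCE_LABELS`, and `isSourceType` are hypothetical names chosen for illustration, and only the human-readable labels come from the list above.

```typescript
// Hypothetical identifiers for the nine source types named in the text.
const SOURCE_TYPES = [
  "federal_register",
  "govinfo",
  "congressional_record",
  "courtlistener",
  "doj_press",
  "inspector_general",
  "legiscan",
  "fec",
  "gdelt",
] as const;

type SourceType = (typeof SOURCE_TYPES)[number];

// Display labels; the Record type forces an entry for every source type.
const SOURCE_LABELS: Record<SourceType, string> = {
  federal_register: "Federal Register (executive orders, rules)",
  govinfo: "GovInfo (presidential documents via CPD, GAO/congressional reports)",
  congressional_record: "Congressional Record (CREC floor speeches)",
  courtlistener: "CourtListener (federal courts)",
  doj_press: "DOJ press releases",
  inspector_general: "Inspector General reports (HHS, DOJ, SSA)",
  legiscan: "LegiScan (federal legislation)",
  fec: "FEC filings",
  gdelt: "GDELT global news",
};

// Type guard for untrusted input such as query parameters.
function isSourceType(value: string): value is SourceType {
  return (SOURCE_TYPES as readonly string[]).indexOf(value) !== -1;
}
```

Because `SOURCE_LABELS` is typed as `Record<SourceType, string>`, adding a tenth source to `SOURCE_TYPES` without a label would fail to compile.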

Deployment

The application is deployed on Render.com with a web service (Next.js), managed PostgreSQL, Redis key-value store, and three weekly cron jobs: LegiScan fetch (Monday 01:00 UTC), snapshot pipeline (Monday 03:00 UTC), and database dump (Monday 05:00 UTC).
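The weekly schedule above maps onto standard five-field cron expressions (minute, hour, day of month, month, day of week, with Monday as 1). The sketch below derives them, assuming the scheduler evaluates expressions in UTC. The job names and the `weeklyCron` helper are hypothetical and are not taken from the project's deployment configuration.

```typescript
interface WeeklyJob {
  name: string;      // hypothetical job identifier
  dayOfWeek: number; // 0 = Sunday ... 6 = Saturday
  hourUtc: number;   // 0-23
}

// Builds a five-field cron expression that fires once a week on the hour.
function weeklyCron(job: WeeklyJob): string {
  if (job.dayOfWeek < 0 || job.dayOfWeek > 6) throw new RangeError("dayOfWeek must be 0-6");
  if (job.hourUtc < 0 || job.hourUtc > 23) throw new RangeError("hourUtc must be 0-23");
  return `0 ${job.hourUtc} * * ${job.dayOfWeek}`;
}

// The three jobs from the text, all on Monday, two hours apart.
const JOBS: WeeklyJob[] = [
  { name: "legiscan-fetch", dayOfWeek: 1, hourUtc: 1 },
  { name: "snapshot-pipeline", dayOfWeek: 1, hourUtc: 3 },
  { name: "database-dump", dayOfWeek: 1, hourUtc: 5 },
];

const SCHEDULES = JOBS.map((job) => ({ name: job.name, cron: weeklyCron(job) }));
```

Spacing the jobs two hours apart gives each one time to finish before the next begins, which matters if the snapshot pipeline reads the LegiScan data and the dump is meant to capture the pipeline's output.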