Election Analytics Chatbot - Backend Guide

Overview

The backend is a Python-based FastAPI application that leverages LangGraph to provide a stateful, hierarchical multi-agent workflow for election data analysis. It handles complex queries using an Orchestrator-Workers pattern, decomposing tasks and delegating them to specialized subgraphs (Data Analyst, Researcher) with built-in reflection and error recovery.

1. Architecture Overview

Framework: LangGraph for hierarchical workflow orchestration and state management.
API: FastAPI for providing REST and streaming (SSE) endpoints.
State Management: Persistent state using LangGraph's StateGraph with a PostgreSQL checkpointer. Maintains global state (AgentState) and isolated worker states (WorkerState).
Virtual File System (VFS): An in-memory abstraction passed between nodes to manage intermediate artifacts (scripts, CSVs, charts) without bloating the context window.
Database: PostgreSQL.
- Application data: Uses users table for local and OIDC users (String IDs).
- History: Persists chat history and artifacts.
- Election Data: Structured datasets for analysis.

2. Core Components

2.1. State Management (`src/ea_chatbot/graph/state.py` & `workers/*/state.py`)

Global State: Tracks the conversation context, the high-level task checklist, execution progress (current_step), and the VFS.
Worker State: Isolated snapshot for specialized subgraphs, tracking internal retry loops (iterations), worker-specific prompts, and raw results.

2.2. The Orchestrator

Located in src/ea_chatbot/graph/nodes/:

query_analyzer: Analyzes the user query to determine the intent and required data. If ambiguous, routes to clarification.
planner: Decomposes the user request into a strategic checklist of sub-tasks assigned to specific workers.
delegate: The traffic controller. Routes the current task to the appropriate worker and enforces a strict retry budget to prevent infinite loops.
reflector: The quality control node. Evaluates a worker's summary against the sub-task requirements. Can trigger a retry if unsatisfied.
synthesizer: Aggregates all worker results into a final, cohesive response for the user.
clarification: Asks the user for more information if the query is critically ambiguous.

2.3. Specialized Workers (Sub-Graphs)

Located in src/ea_chatbot/graph/workers/:

data_analyst: Generates Python/SQL code, executes it securely, and captures dataframes/plots. Contains an internal retry loop (coder -> executor -> error check -> coder).
researcher: Performs web searches for general election information and synthesizes factual findings.

2.4. The Workflow

The global graph connects the Orchestrator nodes, wrapping the Worker subgraphs as self-contained nodes with mapped inputs and outputs.

3. Key Modules

src/ea_chatbot/api/: Contains FastAPI routers for authentication, conversation management, and the agent streaming endpoint.
src/ea_chatbot/graph/: Core LangGraph logic, including state definitions, node implementations, and the workflow graph.
src/ea_chatbot/history/: Manages persistent chat history and message mapping between application models and LangGraph state.
src/ea_chatbot/utils/: Utility functions for database inspection, LLM factory, and logging.

4. Development & Execution

Entry Point

The main entry point for the API is src/ea_chatbot/api/main.py.

Running the API

cd backend
uv run python -m ea_chatbot.api.main

Database Migrations

Handled by Alembic.

uv run alembic upgrade head

Testing

Tests are located in the tests/ directory and use pytest.

uv run pytest

3.7 KiB Raw Blame History