# Election Analytics Chatbot - Backend Guide

## Overview

The backend is a Python-based FastAPI application that leverages **LangGraph** to provide a stateful, hierarchical multi-agent workflow for election data analysis. It handles complex queries using an Orchestrator-Workers pattern, decomposing tasks and delegating them to specialized subgraphs (Data Analyst, Researcher) with built-in reflection and error recovery.

## 1. Architecture Overview

- **Framework**: LangGraph for hierarchical workflow orchestration and state management.
- **API**: FastAPI for providing REST and streaming (SSE) endpoints.
- **State Management**: Persistent state using LangGraph's `StateGraph` with a PostgreSQL checkpointer. Maintains global state (`AgentState`) and isolated worker states (`WorkerState`).
- **Virtual File System (VFS)**: An in-memory abstraction passed between nodes to manage intermediate artifacts (scripts, CSVs, charts) without bloating the context window.
- **Database**: PostgreSQL.
  - Application data: Uses `users` table for local and OIDC users (String IDs).
  - History: Persists chat history and artifacts.
  - Election Data: Structured datasets for analysis.

## 2. Core Components

### 2.1. State Management (`src/ea_chatbot/graph/state.py` & `workers/*/state.py`)

- **Global State**: Tracks the conversation context, the high-level task `checklist`, execution progress (`current_step`), and the VFS.
- **Worker State**: Isolated snapshot for specialized subgraphs, tracking internal retry loops (`iterations`), worker-specific prompts, and raw results.

### 2.2. The Orchestrator

Located in `src/ea_chatbot/graph/nodes/`:

- **`query_analyzer`**: Analyzes the user query to determine the intent and required data. If ambiguous, routes to `clarification`.
- **`planner`**: Decomposes the user request into a strategic `checklist` of sub-tasks assigned to specific workers.
- **`delegate`**: The traffic controller. Routes the current task to the appropriate worker and enforces a strict retry budget to prevent infinite loops.
- **`reflector`**: The quality control node. Evaluates a worker's summary against the sub-task requirements. Can trigger a retry if unsatisfied.
- **`synthesizer`**: Aggregates all worker results into a final, cohesive response for the user.
- **`clarification`**: Asks the user for more information if the query is critically ambiguous.

### 2.3. Specialized Workers (Sub-Graphs)

Located in `src/ea_chatbot/graph/workers/`:

- **`data_analyst`**: Generates Python/SQL code, executes it securely, and captures dataframes/plots. Contains an internal retry loop (`coder` -> `executor` -> error check -> `coder`).
- **`researcher`**: Performs web searches for general election information and synthesizes factual findings.

### 2.4. The Workflow

The global graph connects the Orchestrator nodes, wrapping the Worker subgraphs as self-contained nodes with mapped inputs and outputs.

## 3. Key Modules

- **`src/ea_chatbot/api/`**: Contains FastAPI routers for authentication, conversation management, and the agent streaming endpoint.
- **`src/ea_chatbot/graph/`**: Core LangGraph logic, including state definitions, node implementations, and the workflow graph.
- **`src/ea_chatbot/history/`**: Manages persistent chat history and message mapping between application models and LangGraph state.
- **`src/ea_chatbot/utils/`**: Utility functions for database inspection, LLM factory, and logging.

## 4. Development & Execution

### Entry Point

The main entry point for the API is `src/ea_chatbot/api/main.py`.

### Running the API

```bash
cd backend
uv run python -m ea_chatbot.api.main
```

### Database Migrations

Handled by Alembic.

```bash
uv run alembic upgrade head
```

### Testing

Tests are located in the `tests/` directory and use `pytest`.

```bash
uv run pytest
```
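
## Appendix: Retry-Budget Sketch

The interplay between `delegate` and `reflector` described in section 2.2 can be illustrated with a small, framework-free sketch. It is not the project's implementation: it uses plain Python instead of LangGraph, and `Task`, `run_checklist`, `MAX_RETRIES`, the worker functions, and the VFS path are hypothetical names chosen for illustration. Only `AgentState`, `checklist`, `current_step`, and the worker names come from this guide, and the fields shown here are an assumed shape.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

MAX_RETRIES = 2  # assumed budget; the real value is not stated in this guide


@dataclass
class Task:
    """One checklist entry (hypothetical shape)."""
    worker: str               # e.g. "data_analyst" or "researcher"
    description: str
    retries: int = 0
    result: str = ""
    status: str = "pending"   # pending | done | failed


@dataclass
class AgentState:
    """Simplified stand-in for the global state."""
    checklist: List[Task]
    current_step: int = 0
    vfs: Dict[str, str] = field(default_factory=dict)  # path -> artifact content


Worker = Callable[[Task, Dict[str, str]], str]


def run_checklist(state: AgentState,
                  workers: Dict[str, Worker],
                  reflect: Callable[[Task], bool]) -> AgentState:
    """Delegate each task, reflect on the result, retry within the budget."""
    while state.current_step < len(state.checklist):
        task = state.checklist[state.current_step]
        # "delegate": route the current task to its assigned worker
        task.result = workers[task.worker](task, state.vfs)
        # "reflector": accept the result or request another attempt
        if reflect(task):
            task.status = "done"
            state.current_step += 1
        elif task.retries < MAX_RETRIES:
            task.retries += 1          # stay on the same step and retry
        else:
            task.status = "failed"     # budget exhausted, move on
            state.current_step += 1
    return state


def flaky_analyst(task: Task, vfs: Dict[str, str]) -> str:
    # Fails on the first attempt, succeeds on the retry
    if task.retries == 0:
        return "error: query failed"
    vfs["results/turnout.csv"] = "region,turnout\nA,0.61\n"
    return "ok: turnout table written"


def broken_researcher(task: Task, vfs: Dict[str, str]) -> str:
    # Never succeeds, so the retry budget is what ends the loop
    return "error: no sources found"


def reflect(task: Task) -> bool:
    # Toy quality check: accept any result that starts with "ok"
    return task.result.startswith("ok")


final = run_checklist(
    AgentState(checklist=[
        Task("data_analyst", "Compute turnout by region"),
        Task("researcher", "Find background on the election"),
    ]),
    workers={"data_analyst": flaky_analyst, "researcher": broken_researcher},
    reflect=reflect,
)
print([(t.worker, t.status, t.retries) for t in final.checklist])
```

The first task succeeds on its retry, while the second exhausts the budget and is marked failed, so the loop always terminates. Artifacts are written to the VFS dictionary by path, and only the short result string travels with the task.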