Files
ea-chatbot-lg/backend/GEMINI.md

3.7 KiB

Election Analytics Chatbot - Backend Guide

Overview

The backend is a Python-based FastAPI application that leverages LangGraph to provide a stateful, hierarchical multi-agent workflow for election data analysis. It handles complex queries using an Orchestrator-Workers pattern, decomposing tasks and delegating them to specialized subgraphs (Data Analyst, Researcher) with built-in reflection and error recovery.

1. Architecture Overview

  • Framework: LangGraph for hierarchical workflow orchestration and state management.
  • API: FastAPI for providing REST and streaming (SSE) endpoints.
  • State Management: Persistent state using LangGraph's StateGraph with a PostgreSQL checkpointer. Maintains global state (AgentState) and isolated worker states (WorkerState).
  • Virtual File System (VFS): An in-memory abstraction passed between nodes to manage intermediate artifacts (scripts, CSVs, charts) without bloating the context window.
  • Database: PostgreSQL.
    • Application data: Uses users table for local and OIDC users (String IDs).
    • History: Persists chat history and artifacts.
    • Election Data: Structured datasets for analysis.

2. Core Components

2.1. State Management (src/ea_chatbot/graph/state.py & workers/*/state.py)

  • Global State: Tracks the conversation context, the high-level task checklist, execution progress (current_step), and the VFS.
  • Worker State: Isolated snapshot for specialized subgraphs, tracking internal retry loops (iterations), worker-specific prompts, and raw results.

2.2. The Orchestrator

Located in src/ea_chatbot/graph/nodes/:

  • query_analyzer: Analyzes the user query to determine the intent and required data. If ambiguous, routes to clarification.
  • planner: Decomposes the user request into a strategic checklist of sub-tasks assigned to specific workers.
  • delegate: The traffic controller. Routes the current task to the appropriate worker and enforces a strict retry budget to prevent infinite loops.
  • reflector: The quality control node. Evaluates a worker's summary against the sub-task requirements. Can trigger a retry if unsatisfied.
  • synthesizer: Aggregates all worker results into a final, cohesive response for the user.
  • clarification: Asks the user for more information if the query is critically ambiguous.

2.3. Specialized Workers (Sub-Graphs)

Located in src/ea_chatbot/graph/workers/:

  • data_analyst: Generates Python/SQL code, executes it securely, and captures dataframes/plots. Contains an internal retry loop (coder -> executor -> error check -> coder).
  • researcher: Performs web searches for general election information and synthesizes factual findings.

2.4. The Workflow

The global graph connects the Orchestrator nodes, wrapping the Worker subgraphs as self-contained nodes with mapped inputs and outputs.

3. Key Modules

  • src/ea_chatbot/api/: Contains FastAPI routers for authentication, conversation management, and the agent streaming endpoint.
  • src/ea_chatbot/graph/: Core LangGraph logic, including state definitions, node implementations, and the workflow graph.
  • src/ea_chatbot/history/: Manages persistent chat history and message mapping between application models and LangGraph state.
  • src/ea_chatbot/utils/: Utility functions for database inspection, LLM factory, and logging.

4. Development & Execution

Entry Point

The main entry point for the API is src/ea_chatbot/api/main.py.

Running the API

cd backend
uv run python -m ea_chatbot.api.main

Database Migrations

Handled by Alembic.

uv run alembic upgrade head

Testing

Tests are located in the tests/ directory and use pytest.

uv run pytest