Intelligent Insurance Claim Processing Agent — Agentic RAG for Auto Insurance
Explore the Intelligent Insurance Claim Processing Agent, a production-grade Agentic RAG (Retrieval-Augmented Generation) system that automates the evaluation of auto insurance claims end to end. By combining semantic policy retrieval, CSV-backed validation, real-time web search for repair cost estimation, and two interchangeable AI pipelines, the system delivers structured, auditable coverage decisions, turning a labor-intensive manual process into a transparent automated workflow with sub-second policy retrieval.
Project Overview
The Claim Processing Agent ingests auto insurance claims as structured JSON, validates them against policy records, retrieves relevant policy language from a ChromaDB vector store, estimates market repair costs via DuckDuckGo web search, and produces structured ClaimDecision objects — all orchestrated by an AI agent. The system features two fully interchangeable pipelines: a deterministic LangChain/LangGraph state graph and an autonomous Smolagents agent, selectable via a single Hydra configuration change. With 58 automated tests, a FastAPI REST backend, a Streamlit UI with reasoning trace viewer, and Docker deployment, the project represents a complete production-grade agentic system built on modern AI engineering practices.
Technology Stack & Architecture
Core Technologies & Architectural Decisions
- Dual Agentic Pipelines: LangChain/LangGraph stateful graph and Smolagents autonomous agent — fully interchangeable via Hydra config with no code changes required
- Agentic RAG: ChromaDB vector store with OpenAI `text-embedding-3-small` embeddings for semantic retrieval of policy document chunks
- REST Backend: FastAPI with Pydantic v2 request/response schemas, structured middleware for logging and error handling, and async Uvicorn serving
- Interactive Frontend: Streamlit UI with claim form (manual and JSON input modes), coverage decision result card, and step-by-step agent reasoning trace viewer
- Configuration Management: Hydra / OmegaConf for structured YAML config — all pipeline, LLM, vector store, and server settings externalized
- Production Deployment: Multi-stage Docker builds with Docker Compose orchestration, environment variable injection, and health-check configuration
- Code Quality: Poetry packaging, Ruff linting/formatting, 58 pytest tests, MkDocs documentation, and Loguru structured logging
AI Architecture Innovation
- Deterministic LangGraph Orchestration: Eight-node `StateGraph` with explicit conditional edges encoding business logic; invalid policy and inflated cost checks cause deterministic graph branching
- Autonomous Smolagents Alternative: `ToolCallingAgent` with custom system prompt enforcing the sequential workflow, offering a flexible autonomous alternative to the graph-based approach
- Repair Cost Intelligence: Real-time DuckDuckGo web search estimates market repair costs per claim description, flagging inflated claims exceeding 40% above market estimates
- Persistent Vector Store: Load-or-build pattern that detects an existing ChromaDB collection and skips re-embedding on subsequent runs, reducing startup overhead
- Full Execution Tracing: Every graph node records its name, entry timestamp, and elapsed time in a `trace` list, surfaced in the Streamlit UI for complete observability
- Pipeline Factory Pattern: Lazy-import factory reads `cfg.pipeline.type` and instantiates the correct pipeline, allowing runtime swapping with no dependency overhead
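The load-or-build idea from the list above can be sketched without any vector database at all. The example below is a minimal stand-in, not the project's code: it persists a list of chunks to a JSON file instead of a ChromaDB collection, and the names `load_or_build`, `persist_file`, and `build` are hypothetical. With ChromaDB, the equivalent existence check would more likely look at whether a persisted collection already contains items.

```python
import json
from pathlib import Path
from typing import Callable


def load_or_build(
    persist_file: Path, build: Callable[[], list[dict]]
) -> tuple[list[dict], bool]:
    """Return (chunks, was_built), reusing the persisted index when present.

    JSON on disk stands in for a persistent vector store (assumption).
    """
    if persist_file.exists() and persist_file.stat().st_size > 0:
        # An index already exists: skip the expensive build step entirely.
        return json.loads(persist_file.read_text(encoding="utf-8")), False

    # First run: build the index (chunking + embedding in a real system).
    chunks = build()
    persist_file.parent.mkdir(parents=True, exist_ok=True)
    persist_file.write_text(json.dumps(chunks), encoding="utf-8")
    return chunks, True
```

Calling `load_or_build` twice with the same path invokes `build` only on the first call, which is the startup saving the pattern is meant to provide.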
Technical Innovation & System Design
LangGraph Claim Processing Workflow
The LangChain pipeline implements an eight-node StateGraph that maps directly to the business workflow, with fully deterministic routing between nodes:
Processing Nodes & Conditional Edges:
- parse_claim: Validates incoming JSON against the `ClaimInfo` Pydantic schema, surfacing malformed input early
- validate_claim: Checks policy existence, premium dues status, and coverage date range in the CSV records; a conditional edge routes to `finalize_invalid` on failure
- check_policy: LLM generates 3–5 targeted semantic search queries from claim details; ChromaDB returns the most relevant policy text chunks
- price_check: DuckDuckGo web search retrieves market repair cost estimates; a conditional edge routes to `finalize_inflated` when the claimed amount exceeds market by more than 40%
- generate_recommendation: GPT-4o synthesizes claim data, retrieved policy text, and cost validation into a structured `PolicyRecommendation`
- finalize_decision / finalize_invalid / finalize_inflated: Three terminal nodes assembling the final `ClaimDecision` with appropriate coverage flags, deductible, payout, and notes
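The branching described in the node list can be illustrated with plain Python router functions. This is a sketch under assumptions rather than the project's graph definition: the state keys (`is_valid`, `claimed_cost`, `market_cost`), the linear node order, and the `walk` helper are hypothetical, and no LangGraph import is used. In LangGraph, routers of this shape are typically registered through `add_conditional_edges`.

```python
from typing import Callable

State = dict  # hypothetical stand-in for the graph's typed state


def route_after_validation(state: State) -> str:
    """Invalid claims short-circuit to their own terminal node."""
    return "check_policy" if state["is_valid"] else "finalize_invalid"


def route_after_price_check(state: State, threshold: float = 0.40) -> str:
    """Flag claims whose cost exceeds the market estimate by more than threshold.

    Assumes market_cost is positive.
    """
    excess = (state["claimed_cost"] - state["market_cost"]) / state["market_cost"]
    return "finalize_inflated" if excess > threshold else "generate_recommendation"


# Each non-terminal node maps to a function that picks the next node.
EDGES: dict[str, Callable[[State], str]] = {
    "parse_claim": lambda s: "validate_claim",
    "validate_claim": route_after_validation,
    "check_policy": lambda s: "price_check",
    "price_check": route_after_price_check,
    "generate_recommendation": lambda s: "finalize_decision",
}
TERMINAL = {"finalize_decision", "finalize_invalid", "finalize_inflated"}


def walk(state: State, start: str = "parse_claim") -> list[str]:
    """Return the node names visited for a state (routing only, no node work)."""
    path, node = [start], start
    while node not in TERMINAL:
        node = EDGES[node](state)
        path.append(node)
    return path
```

Because routing depends only on explicit state fields, the same input state always yields the same path, which is what makes each branch straightforward to cover with a fixture.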
Smolagents Autonomous Pipeline:
- ToolCallingAgent Orchestration: Five registered tools covering parsing, validation, query generation, retrieval, and recommendation — the agent autonomously decides execution order within guardrails
- Custom Prompt Engineering: Structured system prompt enforces the sequential workflow, preventing the agent from skipping validation or retrieval steps
- Schema Guardrails: Agent output is validated against the `ClaimDecision` Pydantic schema with one automatic retry on malformed responses
- OpenAIServerModel Backend: Provider-agnostic LLM interface supporting GPT-4o-mini, with extensibility for Anthropic Claude or local Ollama models
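The validate-then-retry-once guardrail can be sketched as follows. To keep the example dependency-free it uses a standard-library dataclass in place of the Pydantic model, and the field names, `parse_decision`, and `decide_with_retry` are illustrative assumptions rather than the project's actual schema or API.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class ClaimDecision:
    # Hypothetical fields; the real schema may differ.
    claim_number: str
    covered: bool
    deductible: float
    recommended_payout: float
    notes: str = ""


def parse_decision(raw: str) -> ClaimDecision:
    """Parse and type-check agent output; raise ValueError when malformed."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    try:
        decision = ClaimDecision(**data)
    except TypeError as exc:  # missing or unexpected fields
        raise ValueError(str(exc)) from exc
    if not isinstance(decision.claim_number, str) or not isinstance(decision.covered, bool):
        raise ValueError("claim_number must be str and covered must be bool")
    for name in ("deductible", "recommended_payout"):
        value = getattr(decision, name)
        if isinstance(value, bool) or not isinstance(value, (int, float)) or value < 0:
            raise ValueError(f"{name} must be a non-negative number")
    return decision


def decide_with_retry(
    run_agent: Callable[[str], str], prompt: str, max_retries: int = 1
) -> ClaimDecision:
    """Call the agent, validating output and retrying on malformed responses."""
    last_error = None
    for _ in range(max_retries + 1):
        raw = run_agent(prompt)
        try:
            return parse_decision(raw)
        except ValueError as exc:
            last_error = exc
            # Feed the validation error back so the next attempt can correct it.
            prompt = f"{prompt}\n\nPrevious answer rejected ({exc}). Return only valid JSON."
    raise ValueError(f"agent output still malformed after retries: {last_error}")
```

Here `run_agent` is any callable that takes a prompt and returns text, so the guardrail can be exercised with a stub in tests and with the real agent in production.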
Technical Excellence & Innovation Leadership
Measurable Technical Achievements
Delivered quantifiable results demonstrating advanced agentic AI engineering and insurance domain expertise:
- 58 Automated Tests: Comprehensive pytest suite covering schemas, validation logic, retrieval accuracy, both pipeline implementations, and FastAPI endpoints
- Dual Pipeline Architecture: Two fully interchangeable orchestration strategies — deterministic LangGraph graph and autonomous Smolagents agent — selectable without code changes
- Fraud Detection via Web Search: Real-time DuckDuckGo integration flags inflated repair cost claims by comparing against market estimates, adding an AI-powered anti-fraud layer
- Full Reasoning Transparency: Step-by-step agent trace surfaced in the Streamlit UI, enabling auditors and adjusters to inspect every decision the agent made
- Sub-Second Policy Retrieval: Persistent ChromaDB load-or-build pattern delivers instant vector store access after the initial ingestion run
Architectural Excellence & System Design
Engineered a clean separation of concerns across all system layers:
- Layered Architecture: Streamlit frontend communicates exclusively via REST API — no direct Python imports from the backend, ensuring true microservices separation
- Abstract Pipeline Interface: `BasePipeline(ABC)` contract enforces a consistent `process_claim()` API across both pipeline implementations
- Core Logic Isolation: Validation, ingestion, and retrieval are plain Python functions in `src/claim_agent/core/`, framework-agnostic and directly unit-testable
- Configuration-Driven Behavior: Hydra YAML controls pipeline type, LLM model, embedding model, chunk sizes, retrieval depth, and server settings, with no hardcoded values anywhere
- Structured Observability: Loguru logging with contextual fields (claim number, pipeline type, tool name, timing) for production-grade diagnostics
AI Engineering & LLM Orchestration
Pioneered practical agentic RAG patterns for insurance domain automation:
- LLM-Powered Query Generation: GPT-4o generates 3–5 targeted semantic search queries from claim context, dramatically improving retrieval precision over keyword-based approaches
- Grounded Recommendation Generation: Coverage recommendations are anchored to retrieved policy text chunks, preventing hallucination on legally sensitive insurance language
- Multi-Framework Competency: Side-by-side implementations in both LangChain/LangGraph and Smolagents demonstrate deep familiarity with the current agentic AI ecosystem
- Strict Pydantic Validation: All inputs and outputs are enforced through Pydantic v2 models, catching schema violations at system boundaries before they propagate
- Conditional Business Logic as Graph Edges: Insurance adjudication rules (policy validity, cost inflation thresholds) encoded as LangGraph conditional edges rather than buried in LLM prompts
Quality Assurance & Production Excellence
Comprehensive Testing Framework
- 58 Automated Tests: Full pytest suite across schemas, claim validation, ChromaDB retrieval, LangChain pipeline, Smolagents pipeline, and FastAPI routes
- Scenario-Based Validation: Four representative claim fixtures covering valid claims, expired policies, invalid policies, and inflated repair costs — ensuring all graph branches are exercised
- Async API Testing: `pytest-asyncio` with `httpx` test client validates REST endpoint behavior including error handling and malformed input responses
- Ruff Zero-Warning Codebase: Enforced import sorting, type annotation checks, and best-practice linting rules across the entire codebase
- Makefile Automation: Single-command workflows (`make test`, `make lint`, `make format`, `make ingest`, `make docker-build`) streamlining the full development lifecycle
Engineering Excellence & Problem-Solving Innovation
Dual Pipeline Architecture with Single Interface
Challenge: Supporting two fundamentally different orchestration paradigms (explicit graph vs. autonomous agent) without forcing consumers to know which pipeline is active. Solution: Designed the BasePipeline(ABC) abstract interface with a process_claim() contract, a lazy-import pipeline factory reading from Hydra config, and completely independent tool implementations in each pipeline's subdirectory — zero shared mutable state between pipelines.
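A minimal sketch of the abstract interface plus lazy-import factory might look like the following. The registry's module paths and class names are hypothetical placeholders, and the `claim: dict -> dict` signature is an assumption. The point is that `importlib.import_module` runs only for the selected pipeline, so the other framework is never imported.

```python
import importlib
from abc import ABC, abstractmethod
from typing import Optional


class BasePipeline(ABC):
    """Contract shared by every pipeline implementation."""

    @abstractmethod
    def process_claim(self, claim: dict) -> dict:
        """Evaluate one claim and return a decision payload."""


# Hypothetical "module:Class" targets; real paths may differ.
PIPELINE_REGISTRY = {
    "langchain": "claim_agent.pipelines.langchain_pipeline:LangChainPipeline",
    "smolagents": "claim_agent.pipelines.smolagents_pipeline:SmolagentsPipeline",
}


def create_pipeline(
    pipeline_type: str, registry: Optional[dict] = None
) -> BasePipeline:
    """Instantiate the configured pipeline, importing its module on demand."""
    registry = PIPELINE_REGISTRY if registry is None else registry
    try:
        target = registry[pipeline_type]
    except KeyError:
        raise ValueError(f"Unknown pipeline type: {pipeline_type!r}") from None

    module_path, class_name = target.split(":")
    module = importlib.import_module(module_path)  # deferred import
    pipeline_cls = getattr(module, class_name)
    if not issubclass(pipeline_cls, BasePipeline):
        raise TypeError(f"{class_name} does not implement BasePipeline")
    return pipeline_cls()
```

In the configuration-driven setup described here, the caller would pass the value of `cfg.pipeline.type` as `pipeline_type` and interact with the result only through `process_claim()`.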
Repair Cost Fraud Detection
Challenge: Auto insurance claims frequently include inflated repair cost estimates. Detecting fraud without access to internal adjuster databases requires creative tooling. Solution: Integrated DuckDuckGo web search as a real-time market intelligence tool — the agent searches for repair cost estimates matching the claim's vehicle and damage description, then flags claims where the submitted estimate exceeds market rates by more than 40%, triggering the finalize_inflated graph path.
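The 40% comparison itself is simple arithmetic. The sketch below shows one way it could work; how a market estimate is actually derived from search results is not described here, so the regex-based `estimate_from_snippets` helper (median of dollar amounts found in snippet text) is purely an illustrative assumption.

```python
import re
from statistics import median
from typing import Optional

# Matches amounts such as "$1,200", "$1500", or "$1,200.50".
_DOLLARS = re.compile(r"\$\s?(\d[\d,]*(?:\.\d+)?)")


def estimate_from_snippets(snippets: list[str]) -> Optional[float]:
    """Hypothetical heuristic: median of all dollar amounts in the snippets."""
    amounts = [
        float(match.replace(",", ""))
        for text in snippets
        for match in _DOLLARS.findall(text)
    ]
    return median(amounts) if amounts else None


def inflation_ratio(claimed: float, market_estimate: float) -> float:
    """Fraction by which the claimed cost exceeds the market estimate."""
    if market_estimate <= 0:
        raise ValueError("market_estimate must be positive")
    return (claimed - market_estimate) / market_estimate


def is_inflated(claimed: float, market_estimate: float, threshold: float = 0.40) -> bool:
    """True when the claim exceeds the market estimate by more than threshold."""
    return inflation_ratio(claimed, market_estimate) > threshold
```

For example, with a market estimate of 1,500, a claim of 1,800 exceeds it by 20% and passes, while a claim of 2,500 exceeds it by about 67% and would be flagged.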
Deterministic Agent Tracing for Auditability
Challenge: Insurance coverage decisions require full audit trails — "the AI said so" is not acceptable in regulated environments. Solution: Every LangGraph node appends a structured entry to the trace list in ClaimState, recording node name, timestamp, inputs, and elapsed time. The FastAPI response includes the full trace, and the Streamlit trace viewer renders it step-by-step, giving adjusters a complete view of every decision the agent made.
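One way to get this kind of per-node trace is a decorator that wraps each node function. The sketch below is an assumption about shape, not the project's implementation: the entry keys (`node`, `started_at`, `inputs`, `elapsed_ms`) are hypothetical, `inputs` records only state key names, and the wrapper returns a fully merged state for simplicity.

```python
import time
from datetime import datetime, timezone
from functools import wraps
from typing import Callable


def traced(node_fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
    """Wrap a node so every call appends a structured entry to state['trace']."""

    @wraps(node_fn)
    def wrapper(state: dict) -> dict:
        started_at = datetime.now(timezone.utc).isoformat()
        t0 = time.perf_counter()
        updates = node_fn(state)  # the node returns only the keys it changes
        elapsed_ms = (time.perf_counter() - t0) * 1000
        entry = {
            "node": node_fn.__name__,
            "started_at": started_at,
            "inputs": sorted(state.keys() - {"trace"}),  # key names only
            "elapsed_ms": round(elapsed_ms, 3),
        }
        # Build a new state rather than mutating the caller's dict.
        return {**state, **updates, "trace": [*state.get("trace", []), entry]}

    return wrapper


@traced
def parse_claim(state: dict) -> dict:
    return {"claim_number": state["raw"]["claim_number"]}


@traced
def validate_claim(state: dict) -> dict:
    return {"is_valid": bool(state["claim_number"])}
```

Running `validate_claim(parse_claim({"raw": {"claim_number": "CLM-1"}}))` produces a state whose `trace` list holds one entry per node in execution order, ready to be serialized into an API response.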
Framework-Agnostic Core Logic
Challenge: Keeping domain logic (policy validation, PDF ingestion, vector retrieval) portable across both pipeline frameworks without code duplication. Solution: Implemented all business logic as plain Python functions in src/claim_agent/core/ — no LangChain or Smolagents imports — then wrapped them with framework-specific tool decorators in each pipeline's tools.py. This allows the core logic to be unit-tested in complete isolation from the agent frameworks.
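As an illustration of framework-free core logic, the policy checks (existence, premium dues, coverage dates) could be written as a plain function over CSV rows. The column names (`policy_number`, `premium_dues_remaining`, `coverage_start`, `coverage_end`) and the claim keys are assumptions made for this sketch; the actual CSV layout is not shown here.

```python
import csv
import io
from datetime import date


def load_policies(csv_text: str) -> list[dict]:
    """Read policy records from CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def validate_policy(claim: dict, policy_rows: list[dict]) -> tuple[bool, str]:
    """Return (is_valid, reason) using only plain Python and CSV data."""
    policy = next(
        (row for row in policy_rows if row["policy_number"] == claim["policy_number"]),
        None,
    )
    if policy is None:
        return False, "policy not found"

    # Hypothetical boolean column stored as the text "true"/"false".
    if policy["premium_dues_remaining"].strip().lower() == "true":
        return False, "outstanding premium dues"

    start = date.fromisoformat(policy["coverage_start"])
    end = date.fromisoformat(policy["coverage_end"])
    loss = date.fromisoformat(claim["date_of_loss"])
    if not (start <= loss <= end):
        return False, "date of loss outside coverage period"

    return True, "ok"
```

Because nothing here imports an agent framework, the function can be unit-tested with an in-memory CSV string and then wrapped by each pipeline's own tool decorator.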
Flexible Configuration & Tuning
Every aspect of the system is configurable through Hydra YAML, enabling rapid experimentation without code changes:
| Parameter | Config Key | Default | Effect |
|---|---|---|---|
| Pipeline type | pipeline.type | langchain | Switch between LangGraph and Smolagents |
| LLM model | llm.model | gpt-4o-mini | Swap to gpt-4o, Claude, Ollama, etc. |
| Temperature | llm.temperature | 0.1 | Lower = more deterministic decisions |
| Max tokens | llm.max_tokens | 4096 | Cap on recommendation length |
| Embedding model | vectordb.embedding_model | all-MiniLM-L6-v2 | Swap for OpenAI or other embeddings |
| Chunk size | vectordb.chunk_size | 500 | Larger = more context per retrieval chunk |
| Retrieval depth | vectordb.n_results | 5 | More results = richer policy context |
| Inflation threshold | pipeline.inflation_threshold | 0.40 | Flag claims exceeding the market estimate by more than this fraction (0.40 = 40%) |
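The defaults in the table can be mirrored as typed dataclasses, which is one common way to give a Hydra-style configuration a schema. This is a standard-library sketch only: the class names and the `apply_override` helper are hypothetical, and it imitates a single `group.key=value` override rather than using Hydra or OmegaConf directly.

```python
from dataclasses import dataclass, field, replace
from typing import Any


@dataclass(frozen=True)
class PipelineConfig:
    type: str = "langchain"
    inflation_threshold: float = 0.40


@dataclass(frozen=True)
class LLMConfig:
    model: str = "gpt-4o-mini"
    temperature: float = 0.1
    max_tokens: int = 4096


@dataclass(frozen=True)
class VectorDBConfig:
    embedding_model: str = "all-MiniLM-L6-v2"
    chunk_size: int = 500
    n_results: int = 5


@dataclass(frozen=True)
class AppConfig:
    pipeline: PipelineConfig = field(default_factory=PipelineConfig)
    llm: LLMConfig = field(default_factory=LLMConfig)
    vectordb: VectorDBConfig = field(default_factory=VectorDBConfig)


def apply_override(cfg: AppConfig, dotted_key: str, value: Any) -> AppConfig:
    """Return a new config with one 'group.key' setting replaced."""
    group_name, key = dotted_key.split(".", 1)
    group = getattr(cfg, group_name)
    if not hasattr(group, key):
        raise KeyError(dotted_key)
    return replace(cfg, **{group_name: replace(group, **{key: value})})
```

For instance, `apply_override(AppConfig(), "pipeline.type", "smolagents")` yields a config that selects the Smolagents pipeline while leaving every other default untouched.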
Scalability Vision & Future Roadmap
Planned Enhancements
- Multi-Line Insurance Support: Extend beyond auto insurance to home, life, and health claims with domain-specific validation rules and policy corpora
- LLM-as-Judge Evaluation: Automated scoring of coverage recommendation quality — groundedness, completeness, and regulatory compliance — for continuous quality monitoring
- Human-in-the-Loop Escalation: Configurable confidence thresholds that pause the pipeline and route borderline claims to a human adjuster queue before finalizing decisions
- Structured Output Streaming: Server-Sent Events (SSE) from FastAPI for real-time step-by-step trace streaming to the Streamlit UI during long-running agent runs
- Multi-Modal Claim Intake: Accept photo evidence and repair shop invoices alongside JSON claims, using vision models to cross-validate damage descriptions against submitted images
- Enterprise Auth & Multi-Tenancy: JWT-based authentication with per-tenant policy corpora, enabling deployment across multiple insurance carriers from a single platform
Leadership Impact & Technical Excellence
The Intelligent Insurance Claim Processing Agent represents a complete demonstration of production-grade agentic AI engineering — from LangGraph state machine design and ChromaDB RAG pipelines to FastAPI backend development, Streamlit UI construction, and Docker deployment. The dual-pipeline architecture, with both a deterministic LangGraph graph and an autonomous Smolagents agent sharing the same interface, showcases deep familiarity with the modern agentic AI ecosystem and the engineering discipline to keep complex systems clean and testable.
The project's emphasis on auditability and transparency — full execution traces, Pydantic-validated outputs at every boundary, structured logging, and a reasoning trace viewer in the UI — reflects the real-world requirements of deploying AI in regulated industries where "the model decided" is never a sufficient explanation. Every architectural decision prioritizes explainability alongside automation.
Quantified Technical Achievements: 58 automated tests across five test modules, two fully interchangeable agentic pipelines, real-time web search fraud detection, sub-second vector store retrieval, and single-command Docker deployment. This project exemplifies the intersection of agentic AI architecture, software engineering rigor, and practical insurance domain expertise.
Technical Resources & Documentation
Comprehensive technical implementation showcasing production-grade agentic RAG architecture and insurance claim automation system design.