Intelligent Insurance Claim Processing Agent — Agentic RAG for Auto Insurance

Intelligent Insurance Claim Processing Agent - Agentic RAG by Ali Zarreh

Explore the Intelligent Insurance Claim Processing Agent, a production-grade Agentic RAG (Retrieval-Augmented Generation) system that automates the evaluation of auto insurance claims. By combining semantic policy retrieval, CSV-backed validation, real-time web search for repair cost estimation, and dual interchangeable AI pipelines, the system delivers structured, auditable coverage decisions and turns a labor-intensive manual process into a transparent automated workflow.

Project Overview

The Claim Processing Agent ingests auto insurance claims as structured JSON, validates them against policy records, retrieves relevant policy language from a ChromaDB vector store, estimates market repair costs via DuckDuckGo web search, and produces structured ClaimDecision objects — all orchestrated by an AI agent. The system features two fully interchangeable pipelines: a deterministic LangChain/LangGraph state graph and an autonomous Smolagents agent, selectable via a single Hydra configuration change. With 58 automated tests, a FastAPI REST backend, a Streamlit UI with reasoning trace viewer, and Docker deployment, the project represents a complete production-grade agentic system built on modern AI engineering practices.

Technology Stack & Architecture

Core Technologies & Architectural Decisions

Dual Agentic Pipelines: LangChain/LangGraph stateful graph and Smolagents autonomous agent — fully interchangeable via Hydra config with no code changes required
Agentic RAG: ChromaDB vector store with configurable embeddings (OpenAI text-embedding-3-small, or the all-MiniLM-L6-v2 config default) for semantic retrieval of policy document chunks
REST Backend: FastAPI with Pydantic v2 request/response schemas, structured middleware for logging and error handling, and async Uvicorn serving
Interactive Frontend: Streamlit UI with claim form (manual and JSON input modes), coverage decision result card, and step-by-step agent reasoning trace viewer
Configuration Management: Hydra / OmegaConf for structured YAML config — all pipeline, LLM, vector store, and server settings externalized
Production Deployment: Multi-stage Docker builds with Docker Compose orchestration, environment variable injection, and health-check configuration
Code Quality: Poetry packaging, Ruff linting/formatting, 58 pytest tests, MkDocs documentation, and Loguru structured logging

AI Architecture Innovation

Deterministic LangGraph Orchestration: Eight-node StateGraph with explicit conditional edges encoding business logic — invalid policy and inflated cost checks cause deterministic graph branching
Autonomous Smolagents Alternative: ToolCallingAgent with custom system prompt enforcing the sequential workflow, offering a flexible autonomous alternative to the graph-based approach
Repair Cost Intelligence: Real-time DuckDuckGo web search estimates market repair costs from the claim description, flagging claims that exceed the market estimate by more than 40%
Persistent Vector Store: Load-or-build pattern that detects an existing ChromaDB collection and skips re-embedding on subsequent runs, reducing startup overhead
Full Execution Tracing: Every graph node records its name, entry timestamp, and elapsed time in a trace list — surfaced in the Streamlit UI for complete observability
Pipeline Factory Pattern: Lazy-import factory reads cfg.pipeline.type and instantiates the correct pipeline, allowing runtime swapping with no dependency overhead
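
The load-or-build pattern listed above can be illustrated with a minimal sketch. It uses a JSON file as a stand-in for the persisted ChromaDB collection, and the names load_or_build and fake_embed are hypothetical; the project's actual ingestion code is not shown in this write-up.

```python
import json
import tempfile
from pathlib import Path


def load_or_build(store_path: Path, chunks: list[str], embed) -> tuple[dict, bool]:
    """Return (store, built). Embedding runs only when no persisted store exists."""
    if store_path.exists():
        # Existing store found: load it and skip the embedding step entirely.
        return json.loads(store_path.read_text()), False
    # First run: embed every chunk once and persist the result.
    store = {chunk: embed(chunk) for chunk in chunks}
    store_path.write_text(json.dumps(store))
    return store, True


def fake_embed(text: str) -> list[float]:
    # Placeholder embedding (hypothetical); a real system would call an embedding model.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "policy_store.json"
    chunks = ["policy chunk one", "policy chunk two"]
    first_store, built_first = load_or_build(path, chunks, fake_embed)
    second_store, built_second = load_or_build(path, chunks, fake_embed)
```

The second call finds the persisted file and returns without calling the embedding function, which is the startup saving the pattern is meant to provide.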

Technical Innovation & System Design

LangGraph Claim Processing Workflow

The LangChain pipeline implements an eight-node StateGraph that maps directly to the business workflow with deterministic routing:

Processing Nodes & Conditional Edges:

parse_claim: Validates incoming JSON against the ClaimInfo Pydantic schema, surfacing malformed input early
validate_claim: Checks policy existence, premium dues status, and coverage date range in the CSV records — conditional edge routes to finalize_invalid on failure
check_policy: LLM generates 3–5 targeted semantic search queries from claim details; ChromaDB returns the most relevant policy text chunks
price_check: DuckDuckGo web search retrieves market repair cost estimates — conditional edge routes to finalize_inflated when the claimed amount exceeds market by more than 40%
generate_recommendation: The LLM (GPT-4o, or whichever model llm.model selects) synthesizes claim data, retrieved policy text, and cost validation into a structured PolicyRecommendation
finalize_decision / finalize_invalid / finalize_inflated: Three terminal nodes assembling the final ClaimDecision with appropriate coverage flags, deductible, payout, and notes
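
The branching described above can be sketched in plain Python, without LangGraph itself. The routing functions below have the shape a graph library would expect from a conditional-edge callback (state in, next node name out); the state keys is_valid, claimed_amount, and market_estimate are assumptions made for illustration, not the project's actual ClaimState fields.

```python
INFLATION_THRESHOLD = 0.40  # the 40% rule described in the node list


def route_after_validation(state: dict) -> str:
    # Failed validation goes straight to the invalid terminal node.
    return "check_policy" if state["is_valid"] else "finalize_invalid"


def route_after_price_check(state: dict) -> str:
    # A claim is inflated when it exceeds the market estimate by more than 40%.
    limit = state["market_estimate"] * (1 + INFLATION_THRESHOLD)
    if state["claimed_amount"] > limit:
        return "finalize_inflated"
    return "generate_recommendation"


def walk(state: dict) -> list[str]:
    """Return the ordered node names a claim would visit."""
    path = ["parse_claim", "validate_claim"]
    nxt = route_after_validation(state)
    path.append(nxt)
    if nxt == "finalize_invalid":
        return path
    path.append("price_check")
    nxt = route_after_price_check(state)
    path.append(nxt)
    if nxt == "generate_recommendation":
        path.append("finalize_decision")
    return path


invalid_path = walk({"is_valid": False})
normal_path = walk({"is_valid": True, "claimed_amount": 1000, "market_estimate": 900})
inflated_path = walk({"is_valid": True, "claimed_amount": 2000, "market_estimate": 1000})
```

Because the routing lives in ordinary functions rather than in a prompt, each branch can be unit-tested with a hand-built state.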

Smolagents Autonomous Pipeline:

ToolCallingAgent Orchestration: Five registered tools covering parsing, validation, query generation, retrieval, and recommendation — the agent autonomously decides execution order within guardrails
Custom Prompt Engineering: Structured system prompt enforces the sequential workflow, preventing the agent from skipping validation or retrieval steps
Schema Guardrails: Agent output is validated against the ClaimDecision Pydantic schema with one automatic retry on malformed responses
OpenAIServerModel Backend: Provider-agnostic LLM interface supporting GPT-4o-mini, with extensibility for Anthropic Claude or local Ollama models
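
The schema guardrail with one automatic retry might look roughly like the sketch below. It uses a hand-written field check as a stand-in for the ClaimDecision Pydantic model, and the field names, run_with_retry, and the fake agent are all hypothetical.

```python
import json

# Assumed fields for illustration; the real ClaimDecision schema is not shown here.
REQUIRED_FIELDS = {"covered": bool, "deductible": float, "payout": float, "notes": str}


def parse_decision(raw: str) -> dict:
    """Parse and validate raw agent output; raise ValueError if malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        if expected is float:
            # Accept ints where floats are expected, but never bools as numbers.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        else:
            ok = isinstance(value, expected)
        if not ok:
            raise ValueError(f"bad type for field: {name}")
    return data


def run_with_retry(call_agent, max_retries: int = 1) -> dict:
    """Call the agent, re-asking up to max_retries times with the last error as a hint."""
    last_error = None
    for _ in range(max_retries + 1):
        raw = call_agent(last_error)
        try:
            return parse_decision(raw)
        except ValueError as exc:
            last_error = str(exc)
    raise ValueError(f"agent output still invalid after retry: {last_error}")


# Fake agent: malformed on the first call, valid on the second.
responses = iter(["not json", '{"covered": true, "deductible": 500, "payout": 1200.0, "notes": "ok"}'])
hints = []


def fake_agent(error_hint):
    hints.append(error_hint)
    return next(responses)


decision = run_with_retry(fake_agent)
```

Passing the validation error back to the agent on the retry is one possible design; the source only states that a single automatic retry occurs.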

Technical Excellence & Innovation Leadership

Measurable Technical Achievements

Delivered quantifiable results demonstrating advanced agentic AI engineering and insurance domain expertise:

58 Automated Tests: Comprehensive pytest suite covering schemas, validation logic, retrieval accuracy, both pipeline implementations, and FastAPI endpoints
Dual Pipeline Architecture: Two fully interchangeable orchestration strategies — deterministic LangGraph graph and autonomous Smolagents agent — selectable without code changes
Fraud Detection via Web Search: Real-time DuckDuckGo integration flags inflated repair cost claims by comparing against market estimates, adding an AI-powered anti-fraud layer
Full Reasoning Transparency: Step-by-step agent trace surfaced in the Streamlit UI, enabling auditors and adjusters to inspect every decision the agent made
Sub-Second Policy Retrieval: Persistent ChromaDB load-or-build pattern gives fast vector store access after the initial ingestion run, because re-embedding is skipped

Architectural Excellence & System Design

Engineered a clean separation of concerns across all system layers:

Layered Architecture: Streamlit frontend communicates exclusively via REST API, with no direct Python imports from the backend, giving a clean client/server separation
Abstract Pipeline Interface: BasePipeline(ABC) contract enforces a consistent process_claim() API across both pipeline implementations
Core Logic Isolation: Validation, ingestion, and retrieval are plain Python functions in src/claim_agent/core/ — framework-agnostic and directly unit-testable
Configuration-Driven Behavior: Hydra YAML controls pipeline type, LLM model, embedding model, chunk sizes, retrieval depth, and server settings, so these values are not hardcoded
Structured Observability: Loguru logging with contextual fields (claim number, pipeline type, tool name, timing) for production-grade diagnostics

AI Engineering & LLM Orchestration

Applied practical agentic RAG patterns to insurance domain automation:

LLM-Powered Query Generation: The LLM generates 3–5 targeted semantic search queries from claim context, aiming for better retrieval precision than plain keyword matching
Grounded Recommendation Generation: Coverage recommendations are anchored to retrieved policy text chunks, preventing hallucination on legally sensitive insurance language
Multi-Framework Competency: Side-by-side implementations in both LangChain/LangGraph and Smolagents demonstrate deep familiarity with the current agentic AI ecosystem
Strict Pydantic Validation: All inputs and outputs enforced through Pydantic v2 models, catching schema violations at system boundaries before they propagate
Conditional Business Logic as Graph Edges: Insurance adjudication rules (policy validity, cost inflation thresholds) encoded as LangGraph conditional edges rather than buried in LLM prompts

Quality Assurance & Production Excellence

Comprehensive Testing Framework

58 Automated Tests: Full pytest suite across schemas, claim validation, ChromaDB retrieval, LangChain pipeline, Smolagents pipeline, and FastAPI routes
Scenario-Based Validation: Four representative claim fixtures covering valid claims, expired policies, invalid policies, and inflated repair costs — ensuring all graph branches are exercised
Async API Testing: pytest-asyncio with httpx test client validates REST endpoint behavior including error handling and malformed input responses
Ruff Zero-Warning Codebase: Enforced import sorting, type annotation checks, and best-practice linting rules across the entire codebase
Makefile Automation: Single-command workflows — make test, make lint, make format, make ingest, make docker-build — streamlining the full development lifecycle

Engineering Excellence & Problem-Solving Innovation

Dual Pipeline Architecture with Single Interface

Challenge: Supporting two fundamentally different orchestration paradigms (explicit graph vs. autonomous agent) without forcing consumers to know which pipeline is active.
Solution: Designed the BasePipeline(ABC) abstract interface with a process_claim() contract, a lazy-import pipeline factory reading from Hydra config, and completely independent tool implementations in each pipeline's subdirectory, with zero shared mutable state between pipelines.
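
A minimal sketch of the abstract interface and lazy factory follows. BasePipeline and process_claim() are named in the text; the loader functions, the stub pipeline classes, and create_pipeline are hypothetical, and the factory takes the pipeline type as a plain string rather than reading a Hydra config object.

```python
from abc import ABC, abstractmethod


class BasePipeline(ABC):
    """Contract shared by both pipeline implementations."""

    @abstractmethod
    def process_claim(self, claim: dict) -> dict:
        ...


def _load_langchain() -> type[BasePipeline]:
    # In a real project the heavy framework import would sit inside this
    # function, so it only runs when this pipeline is selected.
    class LangChainPipeline(BasePipeline):
        def process_claim(self, claim: dict) -> dict:
            return {"pipeline": "langchain", "claim_number": claim["claim_number"]}

    return LangChainPipeline


def _load_smolagents() -> type[BasePipeline]:
    class SmolagentsPipeline(BasePipeline):
        def process_claim(self, claim: dict) -> dict:
            return {"pipeline": "smolagents", "claim_number": claim["claim_number"]}

    return SmolagentsPipeline


_LOADERS = {"langchain": _load_langchain, "smolagents": _load_smolagents}


def create_pipeline(pipeline_type: str) -> BasePipeline:
    """Instantiate the pipeline named by the config value (for example pipeline.type)."""
    try:
        loader = _LOADERS[pipeline_type]
    except KeyError:
        raise ValueError(f"unknown pipeline type: {pipeline_type!r}") from None
    return loader()()


pipeline = create_pipeline("smolagents")
result = pipeline.process_claim({"claim_number": "C-1"})
```

Callers depend only on BasePipeline, so switching the configured type changes the implementation without touching the calling code.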

Repair Cost Fraud Detection

Challenge: Auto insurance claims frequently include inflated repair cost estimates. Detecting fraud without access to internal adjuster databases requires creative tooling.
Solution: Integrated DuckDuckGo web search as a real-time market intelligence tool: the agent searches for repair cost estimates matching the claim's vehicle and damage description, then flags claims where the submitted estimate exceeds market rates by more than 40%, triggering the finalize_inflated graph path.
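
The threshold arithmetic can be written out as a short sketch. Taking the median of several web-derived estimates is an assumption made here for illustration; the source does not say how multiple search results are combined, and the function names are hypothetical.

```python
from statistics import median


def inflation_ratio(claimed: float, market_estimates: list[float]) -> float:
    """Fraction by which the claimed amount exceeds the median market estimate."""
    if not market_estimates:
        raise ValueError("no market estimates available")
    market = median(market_estimates)
    if market <= 0:
        raise ValueError("market estimate must be positive")
    return (claimed - market) / market


def is_inflated(claimed: float, market_estimates: list[float], threshold: float = 0.40) -> bool:
    # Strictly greater than: a claim exactly 40% above market is not flagged.
    return inflation_ratio(claimed, market_estimates) > threshold


estimates = [1800, 2000, 2400]  # illustrative figures, median is 2000
flag_high = is_inflated(3000, estimates)   # 50% above the median
flag_edge = is_inflated(2800, estimates)   # exactly 40% above the median
flag_low = is_inflated(2600, estimates)    # 30% above the median
```

With a median of 2000, a 3000 claim is 50% above market and is flagged, while 2800 sits exactly on the threshold and passes.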

Deterministic Agent Tracing for Auditability

Challenge: Insurance coverage decisions require full audit trails; "the AI said so" is not acceptable in regulated environments.
Solution: Every LangGraph node appends a structured entry to the trace list in ClaimState, recording node name, timestamp, inputs, and elapsed time. The FastAPI response includes the full trace, and the Streamlit trace viewer renders it step-by-step, giving adjusters a complete view of every decision the agent made.
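
One way to record such trace entries is a decorator applied to each node function, sketched below. The decorator name traced, the entry keys, and the use of a plain dict as state are assumptions for illustration; the project's ClaimState structure is not shown in this write-up.

```python
import time
from datetime import datetime, timezone
from functools import wraps


def traced(node_fn):
    """Wrap a node so each call appends a trace entry to state['trace']."""

    @wraps(node_fn)
    def wrapper(state: dict) -> dict:
        started_at = datetime.now(timezone.utc).isoformat()
        t0 = time.perf_counter()
        # Snapshot the inputs, leaving out the trace itself to avoid nesting.
        inputs = {k: v for k, v in state.items() if k != "trace"}
        new_state = node_fn(state)
        entry = {
            "node": node_fn.__name__,
            "timestamp": started_at,
            "inputs": inputs,
            "elapsed_s": time.perf_counter() - t0,
        }
        new_state["trace"] = state.get("trace", []) + [entry]
        return new_state

    return wrapper


@traced
def parse_claim(state: dict) -> dict:
    return {**state, "parsed": True}


@traced
def validate_claim(state: dict) -> dict:
    return {**state, "is_valid": True}


final_state = validate_claim(parse_claim({"claim_number": "C-1"}))
node_names = [entry["node"] for entry in final_state["trace"]]
```

Keeping the tracing in a wrapper means node functions contain only business logic, and every node is traced the same way.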

Framework-Agnostic Core Logic

Challenge: Keeping domain logic (policy validation, PDF ingestion, vector retrieval) portable across both pipeline frameworks without code duplication.
Solution: Implemented all business logic as plain Python functions in src/claim_agent/core/, with no LangChain or Smolagents imports, then wrapped them with framework-specific tool decorators in each pipeline's tools.py. This allows the core logic to be unit-tested in complete isolation from the agent frameworks.
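
As an illustration of this split, the sketch below keeps the CSV-backed validation in a plain function and exposes it through a thin adapter that a framework's tool decorator could wrap. The CSV column names, the claim keys, and both function names are hypothetical; only the three checks (policy existence, premium dues, coverage dates) come from the description above.

```python
import csv
import io
import json
from datetime import date

# Hypothetical CSV layout; the real column names are not given in the source.
POLICY_CSV = """policy_number,premium_dues_remaining,coverage_start,coverage_end
P-100,false,2024-01-01,2024-12-31
P-200,true,2024-01-01,2024-12-31
"""


def load_policies(csv_text: str) -> list[dict]:
    return list(csv.DictReader(io.StringIO(csv_text)))


def validate_policy(claim: dict, policies: list[dict]) -> tuple[bool, str]:
    """Core logic: no agent-framework imports, so it can be unit-tested directly."""
    policy = next((p for p in policies if p["policy_number"] == claim["policy_number"]), None)
    if policy is None:
        return False, "policy not found"
    if policy["premium_dues_remaining"].strip().lower() == "true":
        return False, "premium dues outstanding"
    loss = date.fromisoformat(claim["date_of_loss"])
    start = date.fromisoformat(policy["coverage_start"])
    end = date.fromisoformat(policy["coverage_end"])
    if not (start <= loss <= end):
        return False, "date of loss outside coverage period"
    return True, "ok"


def validate_claim_tool(claim_json: str) -> str:
    """Thin adapter; a framework-specific tool decorator would be applied here."""
    ok, reason = validate_policy(json.loads(claim_json), load_policies(POLICY_CSV))
    return json.dumps({"is_valid": ok, "reason": reason})


policies = load_policies(POLICY_CSV)
valid_result = validate_policy({"policy_number": "P-100", "date_of_loss": "2024-06-15"}, policies)
```

The adapter handles only serialization, so each pipeline's tools.py can stay small while the rules live in one place.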

Flexible Configuration & Tuning

Every aspect of the system is configurable through Hydra YAML, enabling rapid experimentation without code changes:

Parameter	Config Key	Default	Effect
Pipeline type	`pipeline.type`	langchain	Switch between LangGraph and Smolagents
LLM model	`llm.model`	gpt-4o-mini	Swap to gpt-4o, Claude, Ollama, etc.
Temperature	`llm.temperature`	0.1	Lower = more deterministic decisions
Max tokens	`llm.max_tokens`	4096	Cap on recommendation length
Embedding model	`vectordb.embedding_model`	all-MiniLM-L6-v2	Swap for OpenAI or other embeddings
Chunk size	`vectordb.chunk_size`	500	Larger = more context per retrieval chunk
Retrieval depth	`vectordb.n_results`	5	More results = richer policy context
Inflation threshold	`pipeline.inflation_threshold`	0.40	Flag claims exceeding the market estimate by more than this fraction (0.40 = 40%)
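
To make the effect of dotted config keys concrete, the sketch below applies key=value overrides to a nested dictionary holding the defaults from the table. It is a plain-Python stand-in for what Hydra and OmegaConf do, not their API; apply_overrides is a hypothetical helper.

```python
import copy

# Defaults copied from the table above.
DEFAULTS = {
    "pipeline": {"type": "langchain", "inflation_threshold": 0.40},
    "llm": {"model": "gpt-4o-mini", "temperature": 0.1, "max_tokens": 4096},
    "vectordb": {"embedding_model": "all-MiniLM-L6-v2", "chunk_size": 500, "n_results": 5},
}


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Return a copy of config with dotted 'key=value' overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {item!r}")
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node[part]  # KeyError for an unknown section
        if leaf not in node:
            raise KeyError(dotted_key)
        # Coerce the string to the type of the existing default value.
        node[leaf] = type(node[leaf])(raw)
    return result


cfg = apply_overrides(
    DEFAULTS,
    ["pipeline.type=smolagents", "vectordb.chunk_size=800", "pipeline.inflation_threshold=0.25"],
)
```

Here the pipeline switches to Smolagents, the chunk size grows to 800, and the inflation threshold tightens to 25%, all without editing code.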

Scalability Vision & Future Roadmap

Planned Enhancements

Multi-Line Insurance Support: Extend beyond auto insurance to home, life, and health claims with domain-specific validation rules and policy corpora
LLM-as-Judge Evaluation: Automated scoring of coverage recommendation quality — groundedness, completeness, and regulatory compliance — for continuous quality monitoring
Human-in-the-Loop Escalation: Configurable confidence thresholds that pause the pipeline and route borderline claims to a human adjuster queue before finalizing decisions
Structured Output Streaming: Server-Sent Events (SSE) from FastAPI for real-time step-by-step trace streaming to the Streamlit UI during long-running agent runs
Multi-Modal Claim Intake: Accept photo evidence and repair shop invoices alongside JSON claims, using vision models to cross-validate damage descriptions against submitted images
Enterprise Auth & Multi-Tenancy: JWT-based authentication with per-tenant policy corpora, enabling deployment across multiple insurance carriers from a single platform

Leadership Impact & Technical Excellence

The Intelligent Insurance Claim Processing Agent represents a complete demonstration of production-grade agentic AI engineering — from LangGraph state machine design and ChromaDB RAG pipelines to FastAPI backend development, Streamlit UI construction, and Docker deployment. The dual-pipeline architecture, with both a deterministic LangGraph graph and an autonomous Smolagents agent sharing the same interface, showcases deep familiarity with the modern agentic AI ecosystem and the engineering discipline to keep complex systems clean and testable.

The project's emphasis on auditability and transparency — full execution traces, Pydantic-validated outputs at every boundary, structured logging, and a reasoning trace viewer in the UI — reflects the real-world requirements of deploying AI in regulated industries where "the model decided" is never a sufficient explanation. Every architectural decision prioritizes explainability alongside automation.

Quantified Technical Achievements: 58 automated tests across five test modules, two fully interchangeable agentic pipelines, real-time web search fraud detection, sub-second vector store retrieval, and single-command Docker deployment. This project exemplifies the intersection of agentic AI architecture, software engineering rigor, and practical insurance domain expertise.
