How LangGraph powers legal AI agents and why it matters

The simplest way to build a legal document Q&A system is a single retrieval call followed by a single generation call. Embed the question, find the nearest chunks, pass them to the LLM, return the answer. You can build this in an afternoon.
It works fine for simple, self-contained questions against clean documents. The moment your queries get more complex — follow-up questions, multi-clause analysis, references to linked documents, questions that require reasoning across multiple retrieved passages — the single-path architecture starts failing in ways that are hard to debug and harder to fix without rearchitecting.
LangGraph is the architecture that solves this. LexReviewer uses it as the core of its agent layer. This post explains how it works and why it is the right choice for legal AI specifically.
A single retrieval path cannot adapt to query complexity. Legal documents generate queries that range from simple lookups to multi-step reasoning problems — often in the same conversation.
What LangGraph is
LangGraph is a library for building stateful, graph-based agent workflows on top of LangChain. Instead of a linear chain of steps, you define a graph of nodes and edges. Each node performs a specific operation. Edges control which node runs next based on the current state.
The key word is stateful. The graph maintains an AgentState object that persists across node executions within a run and across turns within a conversation. Every node reads from and writes to this state. That is what makes multi-step reasoning and multi-turn conversations possible without losing context.
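To make this concrete, here is a minimal sketch of what such a state object can look like in LangGraph's style, where list fields annotated with a reducer are merged across node updates. The field names here are illustrative, not LexReviewer's actual schema:

```python
import operator
from typing import Annotated, TypedDict

class AgentState(TypedDict):
    # Illustrative fields -- not LexReviewer's actual schema.
    question: str                                    # current user question
    history: Annotated[list[str], operator.add]      # appended to across turns
    retrieved: Annotated[list[dict], operator.add]   # passages from any tool
    tool_decision: str                               # which retrieval path to take

# Each node returns a partial update. Annotated list fields are merged
# by concatenation; plain fields are overwritten.
state: AgentState = {"question": "What are the payment terms?",
                     "history": [], "retrieved": [], "tool_decision": ""}
update = {"retrieved": [{"source": "doc-1", "text": "Payment due in 30 days"}]}
merged = {**state, "retrieved": state["retrieved"] + update["retrieved"]}
```

Because every node reads and writes the same state, a later node can see what an earlier node retrieved or decided without any out-of-band plumbing.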
The problem with linear chains for legal queries
In a linear RAG chain, the flow is fixed: retrieve then generate. There is no branching, no tool selection, no way to handle a query that requires a different retrieval strategy than the one hardcoded into the chain.
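For contrast, the entire shape of a linear chain fits in a few lines. The function and stubs below are hypothetical, but they show why there is nowhere to put branching logic:

```python
def linear_rag(question: str, retrieve, generate) -> str:
    # Fixed pipeline: retrieve once, generate once. There is no branching,
    # no tool selection, and no way to change strategy per query.
    chunks = retrieve(question)
    return generate(question, chunks)

# Stub retrieval and generation to show the fixed shape.
answer = linear_rag(
    "What are the payment terms?",
    retrieve=lambda q: ["Payment is due within 30 days of invoice."],
    generate=lambda q, ctx: f"Based on the document: {ctx[0]}",
)
```

Any query that needs a different retrieval strategy has to be forced through the same two calls.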
Legal queries do not fit a single retrieval strategy. Consider these four queries against the same contract:
What are the payment terms? — semantic, single-document
Does Section 4.2(b) apply if the force majeure clause is triggered? — exact reference plus cross-clause reasoning
How does the indemnification clause here compare to what the MSA says? — multi-document retrieval
Given what we just discussed about termination rights, what notice period applies? — follow-up requiring prior context
A linear chain handles the first query adequately. It struggles with the second, fails on the third, and has no mechanism for the fourth. LangGraph handles all four because the graph can branch based on what the query actually requires.
How LexReviewer's agent graph is structured
The DocumentReviewer graph in LexReviewer has three core nodes, each responsible for a distinct part of the agent's reasoning process.
Node 1: Required tools generator
This node analyses the incoming question and the current conversation state to decide which tools are needed. It has access to two tools:
document_retriever — hybrid vector and BM25 search scoped to the current document_id
linked_documents — fetches and queries documents referenced by the current document
The node outputs a tool selection decision that gets written to AgentState. This decision drives the edges that follow — if only in-document retrieval is needed, the graph routes directly to the agent prompt generator. If linked document retrieval is also needed, it routes through that tool first.
This is the critical difference from a linear chain. The retrieval strategy is not hardcoded. It is determined per query based on what the question actually requires.
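In LangGraph, a conditional edge calls a routing function over the current state and maps its return value to the next node. A sketch of what that routing predicate could look like (the field and node names are illustrative):

```python
def route_after_tool_selection(state: dict) -> str:
    # Illustrative router: a LangGraph conditional edge calls a function
    # like this and maps its return value to the next node to run.
    tools = state.get("tools_needed", [])
    if "linked_documents" in tools:
        return "linked_documents"        # fetch referenced docs first
    return "agent_prompt_generator"      # in-document retrieval was enough

# A query that needs cross-document context routes through the tool.
next_node = route_after_tool_selection(
    {"tools_needed": ["document_retriever", "linked_documents"]})
```

Adding a new retrieval strategy later means adding a branch here and a node for it, not rewriting the pipeline.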
Node 2: Agent prompt generator
This node takes the retrieved context from whichever tools ran and composes the prompt for the LLM. It injects:
The user's question
Retrieved passages with their source metadata
The conversation history from AgentState
The legal-answer prompt template that enforces citation and grounding requirements
The prompt template is what enforces legal output quality. It instructs the LLM to ground every claim in retrieved text, to cite sources explicitly, and to decline rather than speculate when the answer is not in the retrieved content.
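A minimal sketch of that composition step, assuming a simple template with slots for history, passages, and the question (the template text here is illustrative; the real one lives in LexReviewer):

```python
def compose_prompt(question, passages, history, template):
    # Illustrative composition: passages carry source metadata so the
    # LLM can cite them, and history gives follow-up questions context.
    context = "\n\n".join(f"[{p['source']}] {p['text']}" for p in passages)
    past = "\n".join(history) if history else "(no prior turns)"
    return template.format(question=question, context=context, history=past)

TEMPLATE = (
    "Answer ONLY from the passages below. Cite sources like [source]. "
    "If the answer is not present in the passages, say so.\n\n"
    "History:\n{history}\n\nPassages:\n{context}\n\nQuestion: {question}"
)

prompt = compose_prompt(
    "What notice period applies?",
    [{"source": "contract.pdf p.12",
      "text": "Either party may terminate on 60 days' notice."}],
    ["User asked about termination rights; agent cited Section 9."],
    TEMPLATE,
)
```

The grounding instruction lives in the template, so every path through the graph inherits it.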
Node 3: Agent node
This node runs the OpenAI LLM against the composed prompt and streams the output. It can run in standard mode or reasoning mode depending on query complexity. The output streams as NDJSON with three interleaved content types:
{ "type": "chunk", "content": "The termination clause..." }
{ "type": "thought", "content": "Checking Section 4.2 for..." }
{ "type": "reference_positions", "content": { "page": 12, "bbox": [...] } }
References and bounding boxes stream alongside the answer text rather than being appended after. This means the UI can highlight source passages in the original document while the answer is still generating.
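NDJSON is simply one JSON object per line, which makes the interleaving trivial to produce and consume. A sketch of both sides (event shapes follow the example above; the bbox values are made up):

```python
import json

def emit_ndjson(events):
    # Serialise interleaved event types as one JSON object per line.
    return "\n".join(json.dumps(e) for e in events)

stream = emit_ndjson([
    {"type": "thought", "content": "Checking Section 4.2 for..."},
    {"type": "chunk", "content": "The termination clause..."},
    {"type": "reference_positions",
     "content": {"page": 12, "bbox": [0.1, 0.2, 0.4, 0.25]}},
])

# A client parses line by line and can act on each event as it arrives,
# e.g. highlight a source passage while answer chunks are still streaming.
events = [json.loads(line) for line in stream.splitlines()]
```

Because each line is self-contained, the client never has to buffer the full response before acting on a citation.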
How conversation state works
AgentState is the object that persists across the entire graph execution for a given turn, and its history component persists across turns within a session.
Each turn appends to the conversation history stored in MongoDB using a session ID scoped to user_id and document_id. When a follow-up question arrives, the agent prompt generator node reads the full history and injects it into the prompt. The LLM has the complete prior context and can answer follow-up questions that reference earlier parts of the conversation.
For long conversations, a history summariser condenses older turns into a summary that preserves the essential context without consuming the full context window. This keeps multi-turn conversations stable without hitting token limits.
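One common compaction policy, shown here as a sketch (the function, the cut-off of four turns, and the placeholder summary are all illustrative assumptions, not LexReviewer's actual implementation):

```python
def compact_history(turns, keep_recent=4, summarise=None):
    # Illustrative policy: keep the last `keep_recent` turns verbatim and
    # collapse everything older into a single synthetic summary turn.
    if len(turns) <= keep_recent:
        return turns
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = summarise(older) if summarise else f"[summary of {len(older)} earlier turns]"
    return [summary] + recent

# Ten turns compact to one summary turn plus the four most recent.
turns = [f"turn {i}" for i in range(10)]
compacted = compact_history(turns)
```

In practice the `summarise` callable would be an LLM call; the structural point is that token usage grows with the summary plus a fixed window, not with the full transcript.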
Context that evaporates after two messages is not conversation. It is a series of disconnected lookups. Legal workflows require real continuity across turns.
Linked document routing in practice
The linked documents tool is what lets LexReviewer handle real contracts. When a user asks about something defined in a referenced document, the required tools generator node detects that the question may require external context and routes to the linked_documents tool.
This tool calls the LINKED_DOCUMENT_FETCH_URL endpoint, retrieves the referenced document, runs retrieval against it, and returns the relevant passages with their source metadata. These passages are then included in the agent prompt generator node alongside the in-document results.
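The flow of that tool, sketched with stubbed fetch and retrieval steps (the function shape and field names are assumptions for illustration; the real endpoint call goes to LINKED_DOCUMENT_FETCH_URL):

```python
def linked_documents_tool(question, fetch, retrieve):
    # Illustrative flow: fetch the referenced document, run retrieval
    # against it, and tag each passage with its source for citation.
    doc = fetch()  # in LexReviewer, a call to the fetch endpoint
    passages = retrieve(question, doc)
    return [{"source": doc["id"], "text": p} for p in passages]

results = linked_documents_tool(
    "What does the MSA say about indemnification?",
    fetch=lambda: {"id": "msa-2023",
                   "text": "Supplier shall indemnify Customer against..."},
    retrieve=lambda q, d: [d["text"]],
)
```

Because the results carry source metadata, passages from the linked document remain distinguishable from in-document passages when both reach the prompt.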
The answer the user receives reflects the full contract context — not just the document they uploaded, but the amendment, the schedule, or the MSA that the document points to. Without this routing, a significant proportion of real contract questions produce incomplete answers.
Why this architecture is the right fit for legal AI
Three properties of LangGraph make it specifically well-suited for legal workflows.
Explicit state management
Legal conversations require tracking what was established earlier in the session — what documents were reviewed, what questions were already answered, what context was set. AgentState makes this explicit and auditable rather than implicit and fragile.
Branching without complexity explosion
The graph structure lets you add new tools and new routing logic without rewriting the core agent. Adding a new retrieval source, a new tool, or a new reasoning path is a matter of adding a node and an edge — not refactoring a monolithic chain.
Streaming support
LangGraph's streaming output model maps directly onto the NDJSON streaming response pattern LexReviewer uses. Citations and answer text stream together because the graph emits them together. This is architecturally clean in a way that retrofitting streaming onto a linear chain is not.
Getting started
The full LangGraph agent implementation is in the agent_graph directory of the LexReviewer repo. The node definitions, state schema, and graph assembly are all readable and extensible. If you want to add a new tool — a different legal data source, a custom classification step, a domain-specific prompt layer — the graph structure makes that a contained change.
Repo: LexReviewer
LexStack is open-source infrastructure for legal AI. It includes LexReviewer for document RAG, Law MCP for structured legal tools, and MicroEvals for CI-native evaluation.