Why we open-sourced LexStack — and what we are building on top of it

For three years, we built a contract lifecycle management tool. It had AI-powered contract review, drafting assistance, and data extraction. We spent most of that time on one problem: making the AI outputs trustworthy enough that lawyers would actually rely on them.
We got good at it. The retrieval architecture, the citation system, the evaluation layer — these were the parts that made the product work. They were also the parts that took the longest to build, and the parts that every other legal AI team was rebuilding independently, making the same mistakes and hitting the same walls.
Every legal AI team is solving the same infrastructure problems in isolation. That is the waste we wanted to fix.
We eventually concluded the full CLM product was not the right thing to keep building. The market for standalone CLM tools is crowded and the value we were adding was in the infrastructure layer, not the application layer. So we made a decision: stop building the product, open-source the best parts, and build a shared foundation that any team can use.
Why open source
The case for open-sourcing infrastructure is straightforward. Legal AI infrastructure is not a competitive advantage — it is a prerequisite. No team wins because they built a better PDF chunker or a smarter retrieval merge strategy. Teams win because of the legal knowledge they encode, the workflows they design, and the trust they build with their users.
Keeping infrastructure closed means every team pays the same tax. Open-sourcing it means the tax gets paid once — by the community — and everyone benefits from the improvements that accumulate.
The current trend across the AI tooling space confirms this. The most successful AI infrastructure companies open-source the core and build commercial products on top. LangChain, Qdrant, DeepEval — the pattern is consistent. Open source builds trust, drives adoption, and creates a feedback loop that makes the core better faster than any single team could.
What we open-sourced and why
LexReviewer
The contract review backend — PDF ingestion, hybrid retrieval, LangGraph agent routing, citation streaming, linked document awareness, chat history, observability. This is the part we spent the most time on and the part most teams would have to build themselves. It is the foundation.
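To make "hybrid retrieval" concrete, here is a minimal sketch of one common way to merge keyword and vector search results: reciprocal rank fusion. The function name, document IDs, and the `k` constant are illustrative assumptions, not LexReviewer's actual API or merge strategy.

```python
# Illustrative hybrid-retrieval merge using reciprocal rank fusion (RRF).
# Names and constants are assumptions for the sketch, not LexReviewer's API.

def rrf_merge(keyword_ranked, vector_ranked, k=60):
    """Merge two ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in;
    higher combined scores rank first. The constant k dampens the
    influence of a single retriever's top-ranked outliers.
    """
    scores = {}
    for ranking in (keyword_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A clause found by both retrievers outranks one found by only one.
merged = rrf_merge(["clause_12", "clause_7"], ["clause_7", "clause_3"])
# → ["clause_7", "clause_12", "clause_3"]
```

The appeal of rank-based fusion is that it sidesteps score normalisation: BM25 scores and cosine similarities live on incompatible scales, but ranks are always comparable.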
MicroEvals
The evaluation layer — legal-specific metrics, small curated datasets, a CLI that runs in CI. This is the part most teams skip and regret. Making it open source lowers the barrier to building legal AI with evaluation from the start rather than bolting it on later.
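The kind of check this enables can be sketched in a few lines: a small legal-specific metric that runs as an assertion in CI, failing the build when quality drops. The metric name, regex-based sentence splitting, and threshold below are illustrative assumptions, not MicroEvals' actual API.

```python
# Hypothetical example of an eval metric gating a CI run: the share of
# sentences in a model answer that carry an inline [n]-style citation.
# Metric and threshold are illustrative, not MicroEvals' actual API.
import re

def citation_coverage(answer: str) -> float:
    """Fraction of sentences containing at least one [n]-style citation."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    if not sentences:
        return 0.0
    cited = sum(1 for s in sentences if re.search(r"\[\d+\]", s))
    return cited / len(sentences)

answer = (
    "Clause 4 caps liability at fees paid [1]. "
    "The cap excludes fraud [2]. This is common."
)
score = citation_coverage(answer)
assert score >= 0.5, f"citation coverage too low: {score:.2f}"  # CI gate
```

Because the check is just an assertion, it drops into any CI pipeline without extra infrastructure — which is the point of running evals from the start instead of bolting them on later.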
What we are building on top
Open-sourcing the core does not mean everything is free. The commercial layer is what funds the work that keeps the open-source layer moving.
Law MCP is the first commercial product: structured access to legal data sources as callable agent tools. It is priced per tool call with prepaid credits. No subscription to start.
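The prepaid, per-tool-call model described above can be sketched as a simple credit meter: each call deducts from a balance, and calls are rejected once the balance runs out. The class and method names are illustrative, not Law MCP's actual interface.

```python
# Minimal sketch of per-call metering against a prepaid credit balance.
# Class, method, and tool names are assumptions, not Law MCP's interface.

class CreditMeter:
    def __init__(self, credits: int):
        self.credits = credits

    def charge(self, tool: str, cost: int = 1) -> bool:
        """Deduct cost if the balance covers it; reject the call otherwise."""
        if self.credits < cost:
            return False
        self.credits -= cost
        return True

meter = CreditMeter(credits=2)
meter.charge("case_law_search")   # allowed, balance drops to 1
meter.charge("statute_lookup")    # allowed, balance drops to 0
meter.charge("case_law_search")   # rejected: no credits left
```

The design choice this reflects: prepaid credits put a hard ceiling on spend, which lowers the barrier to trying the product compared with an open-ended subscription.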
The hosted infrastructure for MicroEvals is the second layer. Running evals in our infrastructure means we handle the compute, the scaling, and the dataset maintenance. You get the results.
Full evals — comprehensive benchmarking that shows you where your legal AI stands relative to the market — are private and available as a paid product. MicroEvals are open. Full evals are not. The distinction makes sense: MicroEvals need to run frequently and be auditable, while full evals are a periodic benchmark whose value lies partly in the market comparison data we maintain.
Who this is for
The primary audience is developers. Not legal teams, not enterprises buying software — developers building legal AI products, either in service companies or early-stage startups. They are the ones who hit the infrastructure wall first and who benefit most from having it solved.
The funnel is GitHub and Reddit, not sales calls. If you are a developer building something in legal AI, you should be able to find LexStack through the work itself — the repos, the posts, the tools — not through a marketing campaign.
The bigger vision
The vision is straightforward: standardise the legal AI stack the way cloud standardised infrastructure. Right now, every team is making independent decisions about chunking strategies, retrieval architectures, evaluation metrics, and legal data access. There is no shared foundation.
LexStack is that foundation. Not because we decided it should be, but because we built the thing and the thing turned out to be useful for more people than just us.
The roadmap extends beyond legal. The same infrastructure pattern applies to finance, compliance, healthcare — any domain where AI outputs need to be grounded, cited, and evaluated against domain-specific standards. Legal is where we started because it is where we had the deepest knowledge. It will not be where we stop.
Repo: LexReviewer
LexStack is open-source infrastructure for legal AI. It includes LexReviewer for document RAG, Law MCP for structured legal tools, and MicroEvals for CI-native evaluation.