Layered Intelligence: Building the Next Generation of AI in eDiscovery

Over this series we’ve seen that AI is not one thing but a decades-long layering of ideas. In the first post, we looked back — from handcrafted rules to modern large language models — and saw that new breakthroughs don’t erase the value of older methods. In the second, we went from theory to practice: exploring how our PrivFinder tool breaks down the hard problem of privilege review into simpler, cost-effective parts using heuristics, corpus-level statistics, and user control rather than reaching straight for the biggest model.

Now we want to look forward. What does that mindset mean for the future of eDiscovery? Where do we think AI is truly headed, and how is Lineal preparing for it?

Why “Data, Not Documents” Is the Foundation

The phrase we keep returning to is data, not documents. Documents are messy: inconsistent, duplicative, full of noise. Feeding raw documents to any advanced AI wastes time and compute while it tries to separate the useful from the irrelevant.

Transform those same documents into structured, meaningful data and everything downstream becomes easier, cheaper, and more accurate. That means mapping senders and recipients, normalizing email signatures, filtering out bots, and modeling relationships. Tools like Bots and PrivFinder quietly do this work first.
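To make this concrete, here is a minimal sketch of what "documents into data" can look like for email: normalize addresses, drop automated senders, and count sender-to-recipient edges. The patterns and field names here are illustrative assumptions, not how Bots or PrivFinder actually work.

```python
import re
from collections import Counter

# Illustrative bot-sender patterns; a real system would combine richer
# heuristics with corpus-level statistics.
BOT_PATTERNS = [r"no-?reply@", r"notifications?@", r"mailer-daemon@"]

def normalize_address(addr: str) -> str:
    """Lowercase and strip display names like 'Jane Doe <jane@x.com>'."""
    match = re.search(r"<([^>]+)>", addr)
    email = match.group(1) if match else addr
    return email.strip().lower()

def is_bot(addr: str) -> bool:
    """Flag addresses that match an automated-sender pattern."""
    return any(re.search(p, addr) for p in BOT_PATTERNS)

def build_actor_graph(messages):
    """Turn raw (sender, recipients) pairs into weighted relationship
    edges, filtering out automated senders first."""
    edges = Counter()
    for msg in messages:
        sender = normalize_address(msg["from"])
        if is_bot(sender):
            continue
        for rcpt in msg["to"]:
            edges[(sender, normalize_address(rcpt))] += 1
    return edges

messages = [
    {"from": "Jane Doe <Jane@Acme.com>", "to": ["bob@acme.com"]},
    {"from": "noreply@calendar.acme.com", "to": ["jane@acme.com"]},
    {"from": "jane@acme.com", "to": ["Bob@Acme.com"]},
]
print(build_actor_graph(messages))
```

After normalization and bot filtering, the three raw messages collapse into a single weighted edge between two people, which is exactly the kind of "map" the next section's layers can build on.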

Think of it as building a map before asking for directions. If the paths and intersections are clear, you can get anywhere faster and the system guiding you won’t get lost.

Layering the AI Toolbox

A layered approach means using the right level of intelligence at the right time:

  • First pass: fast, interpretable filters. Tools like Bots remove obviously irrelevant material before deeper analysis begins.
  • Second pass: data modeling and metrics. Systems like PrivFinder turn unstructured email into structured actor data, combining heuristics with corpus-level statistics, an idea borrowed from traditional machine learning.
  • Third pass: targeted advanced AI. Only once the data is clean and organized do GenAI and small, domain-tuned language models enter. At this stage they are focused, efficient, and far more accurate because they’re working on high-quality inputs.
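The three passes above can be wired together as a simple funnel, where each layer only sees what the previous one let through. This is a hypothetical sketch with stub filters; the field names and the `classify` callback (standing in for an expensive model call) are assumptions for illustration.

```python
def first_pass(docs):
    """Fast, interpretable filters: drop bots and empty bodies."""
    return [d for d in docs if not d["is_bot"] and d["body"]]

def second_pass(docs):
    """Data modeling: attach structured actor metadata via cheap heuristics."""
    for d in docs:
        d["actors"] = sorted({d["from"], *d["to"]})
    return docs

def third_pass(docs, classify):
    """Targeted advanced AI: only clean, structured records reach the
    expensive model, represented here by the `classify` callback."""
    return [(d, classify(d)) for d in docs]

def pipeline(docs, classify):
    return third_pass(second_pass(first_pass(docs)), classify)

docs = [
    {"from": "a@x.com", "to": ["b@x.com"], "body": "Q3 numbers", "is_bot": False},
    {"from": "bot@x.com", "to": ["a@x.com"], "body": "reminder", "is_bot": True},
    {"from": "c@x.com", "to": ["a@x.com"], "body": "", "is_bot": False},
]
calls = []
def classify(d):
    calls.append(d)  # count how many records reach the "expensive" stage
    return "responsive"

results = pipeline(docs, classify)
```

Of three input documents, only one survives to the model stage: the cost savings fall out of the structure, not from the model itself.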

Each layer strengthens the next. The early passes are not just cost savers; they are what make later, more advanced AI reliable.

The Future This Enables

Because of this foundation, we see a near future where legal teams interact with their data in dramatically smarter ways:

  • Smarter retrieval: Instead of raw keyword search, imagine asking: “Show me all conversations among finance leadership after March 1, excluding automated system emails.” Upstream tools have already filtered bots, mapped relationships, and labeled actors.
  • Context-aware assistants: A GenAI tool that knows who is on an email, what role they play, and how communications flow can answer focused questions and summarize with confidence, rather than hallucinating irrelevant content.
  • Trustworthy, domain-specific LMs: Smaller language models trained securely on cleaned legal corpora can run faster and cost less than giant general-purpose models because the messy prep work is done.
  • Agentic workflows: AI agents that don’t just answer but act — surfacing key documents, flagging privilege risks, or building matter summaries — become feasible once the data landscape is coherent and structured.
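The smarter-retrieval scenario becomes almost trivial once the upstream layers have done their work: the natural-language request reduces to a filter over structured fields. In this hypothetical sketch, the `actor_roles` and `is_bot` fields are assumed to have been populated by earlier passes; none of these names come from a real Lineal API.

```python
from datetime import date

def smarter_retrieval(records, after, roles, exclude_bots=True):
    """Filter conversations by date, actor role, and bot status —
    fields assumed to be populated by upstream processing layers."""
    return [
        r for r in records
        if r["date"] > after
        and roles & set(r["actor_roles"])
        and not (exclude_bots and r["is_bot"])
    ]

records = [
    {"id": 1, "date": date(2024, 3, 5),  "actor_roles": ["finance-leadership"], "is_bot": False},
    {"id": 2, "date": date(2024, 2, 10), "actor_roles": ["finance-leadership"], "is_bot": False},
    {"id": 3, "date": date(2024, 3, 9),  "actor_roles": ["engineering"],        "is_bot": False},
    {"id": 4, "date": date(2024, 3, 12), "actor_roles": ["finance-leadership"], "is_bot": True},
]
hits = smarter_retrieval(records, date(2024, 3, 1), {"finance-leadership"})
```

Only the first record survives: one is too early, one involves the wrong actors, and one is automated. A GenAI assistant answering over this filtered set starts from relevant context instead of the whole corpus.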

Strategic Innovation, Not Hype

This is how we think about AI at Lineal: pragmatic but ambitious. We build each layer deliberately, guided by deep domain expertise. When new technology like GenAI is ready to add value, we’re positioned to use it intelligently because the foundation is solid.

We’re actively experimenting with GenAI and new agentic approaches. But we’ll deploy them the way we believe good AI should be deployed: on top of clean data, with transparency and control for our users, and with an eye toward real-world cost and performance.

Takeaway

The future of AI in eDiscovery isn’t to replace everything with a giant model. It’s to combine thoughtful preprocessing, smart structure, and targeted advanced AI where it truly helps. That approach makes discovery faster, more reliable, and more affordable — and it’s why we’re so excited for what’s ahead.

Missed the earlier posts in this series? Catch up here:
Part 1 – A Short History of AI: Why LLMs Aren’t the Whole Story
Part 2 – From History to Practice: How We Find Privileged Communications

_

About the Author   

Matthew Heston is Lead Data Scientist at Lineal, where he leads the design and implementation of AI, machine learning, and scalable data systems that transform how legal teams work with complex information. He received a PhD in Technology and Social Behavior from Northwestern University. 

_

About Lineal 

Lineal is an innovative eDiscovery and legal technology solutions company that empowers law firms and corporations with modern data management and review strategies. Established in 2009, Lineal specializes in comprehensive eDiscovery services, leveraging its proprietary technology suite, Amplify™, to enhance efficiency and accuracy in handling large volumes of electronic data. With a global presence and a team of experienced professionals, Lineal is dedicated to delivering custom-tailored solutions that drive optimal legal outcomes for its clients. For more information, visit lineal.com.