GenAI in Action: Transforming Unstructured Data into Value for Environmental Science Associates

Maximized the value of more than 70 million documents through unstructured data ingestion, classification, and GenAI, enabling faster, smarter decision-making across teams of environmental consultants.

Key takeaways
90%+ classification accuracy
<30s RAG system response time
Industry
Environmental Consulting
Location
San Francisco, CA
SERVICES
Artificial Intelligence
Empower your business with pragmatic applications of AI
Decision Sciences
Empowering decision-makers one model at a time
TECH STACK
Databricks
LangGraph
MLflow

The Challenge

Environmental Science Associates (ESA) manages more than 70 million documents, including contracts, proposals, resumes, and reports, that hold valuable information and critical institutional knowledge. Despite the richness of this data, limited search and discovery capabilities made much of that knowledge effectively inaccessible. During proposal generation and project staffing, for example, teams frequently spent significant time recreating documents or analyses that already existed because they had no visibility into previously developed materials. The result was duplicated effort, lost productivity, and delayed timelines, which limited the organization’s ability to capitalize on its historical work and expertise. Without intelligent document classification and retrieval, ESA needed a scalable, AI-driven solution that could unlock its institutional knowledge and provide teams with timely, context-aware insights.

Our Approach

We started with an unstructured data ingestion pipeline and a classification model that accurately categorized documents into predefined groups. Building on this foundation, we implemented a Retrieval-Augmented Generation (RAG) solution using LangGraph that enabled semantic, context-aware search across the classified document corpus. Users could find relevant documents and specific passages with natural language queries rather than exact keyword matches. The solution returned the most relevant documents along with synthesized, citation-backed answers, significantly reducing time spent searching. A comprehensive evaluation framework built on MLflow supported the solution to ensure performance, accuracy, and reliability.
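
To make the retrieve-then-cite flow concrete, here is a minimal, dependency-free sketch in Python. It is not the LangGraph implementation delivered to ESA: the corpus, document IDs, categories, and function names are all hypothetical, and a simple token-overlap cosine score stands in for the semantic (embedding-based) search a real system would use. It only illustrates the shape of the pipeline: filter by predicted category, rank passages against a natural language query, and return an answer that carries its source citations.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus: IDs, categories, and text are invented for illustration.
CORPUS = [
    {"id": "proposal-001", "category": "proposal",
     "text": "Wetland restoration proposal for a coastal marsh site."},
    {"id": "resume-014", "category": "resume",
     "text": "Senior biologist with wetland delineation and permitting experience."},
    {"id": "contract-207", "category": "contract",
     "text": "Master services contract covering air quality monitoring."},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, category=None, k=2):
    """Return up to k documents ranked by similarity to the query.

    Assumption: token overlap is a toy stand-in for semantic search.
    """
    q = Counter(tokenize(query))
    candidates = [d for d in CORPUS
                  if category is None or d["category"] == category]
    scored = [(cosine(q, Counter(tokenize(d["text"]))), d) for d in candidates]
    scored = [(s, d) for s, d in scored if s > 0]  # drop non-matching docs
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def answer_with_citations(query, category=None):
    """Assemble a response whose every passage is tagged with its source ID.

    A production system would pass the retrieved passages to an LLM for
    synthesis; here the passages are simply concatenated.
    """
    hits = retrieve(query, category)
    if not hits:
        return {"answer": "No relevant documents found.", "citations": []}
    answer = " ".join(f"{d['text']} [{d['id']}]" for d in hits)
    return {"answer": answer, "citations": [d["id"] for d in hits]}


result = answer_with_citations("wetland experience")
print(result["citations"])
```

Restricting the candidate set by category before ranking is one way the upstream classification step can narrow retrieval, for example searching only resumes when staffing a project.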


Results

RESULT #01
90%+ Document Classification Accuracy

The AI-powered model categorized documents into defined groups with over 90% accuracy, significantly improving organization and enabling efficient search and retrieval.
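
As a rough illustration of how an accuracy figure like this is computed, the Python sketch below compares predicted categories against human-assigned labels. The label names and the ten sample records are hypothetical and are not ESA data; the per-category breakdown is an added assumption about what an evaluation might report.

```python
from collections import defaultdict


def classification_accuracy(y_true, y_pred):
    """Fraction of documents whose predicted category matches the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_accuracy(y_true, y_pred):
    """Accuracy broken down by true category, to expose weak classes."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            correct[t] += 1
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical labels for ten documents (not real ESA data).
y_true = ["contract", "contract", "proposal", "proposal", "proposal",
          "resume", "resume", "report", "report", "report"]
y_pred = ["contract", "contract", "proposal", "proposal", "proposal",
          "resume", "resume", "report", "report", "proposal"]

print(classification_accuracy(y_true, y_pred))
print(per_class_accuracy(y_true, y_pred))
```

A per-category view is useful because a high overall score can hide one category that is frequently confused with another.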

RESULT #02
Insights in Under 30 Seconds

The RAG solution delivered context-aware responses in under 30 seconds, allowing teams to surface relevant information across projects, people, and historical documents, reducing redundancy and saving time.

RESULT #03
Evaluation-Driven AI System

The team delivered a ground truth builder application to ESA, powered by Databricks Apps and MLflow, to accelerate ground truth dataset creation and enable more robust, consistent, and reliable evaluations over time.
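
The sketch below suggests how a ground truth dataset can drive repeatable evaluation of retrieval. It is an assumption-laden illustration in plain Python, not the Databricks Apps or MLflow implementation: the record schema, the questions, the document IDs, the stub retriever, and the choice of hit rate as the metric are all hypothetical.

```python
# Hypothetical ground truth records: each pairs a question with the IDs of
# documents a reviewer marked as correct sources.
GROUND_TRUTH = [
    {"question": "Who has wetland delineation experience?",
     "expected_ids": {"resume-014"}},
    {"question": "Which proposals cover marsh restoration?",
     "expected_ids": {"proposal-001"}},
    {"question": "Which contract covers air quality monitoring?",
     "expected_ids": {"contract-207"}},
    {"question": "Which reports discuss groundwater sampling?",
     "expected_ids": {"report-330"}},
]

# Stub standing in for the real RAG retriever: maps a question to ranked IDs.
STUB_RESULTS = {
    "Who has wetland delineation experience?": ["resume-014", "proposal-001"],
    "Which proposals cover marsh restoration?": ["proposal-001"],
    "Which contract covers air quality monitoring?": ["contract-207", "resume-014"],
    "Which reports discuss groundwater sampling?": ["proposal-001", "contract-207"],
}


def stub_retriever(question):
    """Return the ranked document IDs for a question (empty if unknown)."""
    return STUB_RESULTS.get(question, [])


def hit_rate_at_k(ground_truth, retriever, k=3):
    """Share of questions with at least one expected document in the top k."""
    if not ground_truth:
        return 0.0
    hits = 0
    for item in ground_truth:
        top_k = set(retriever(item["question"])[:k])
        if item["expected_ids"] & top_k:
            hits += 1
    return hits / len(ground_truth)


print(hit_rate_at_k(GROUND_TRUTH, stub_retriever, k=3))
```

Because the ground truth is stored as data rather than hard-coded in tests, the same check can be rerun whenever the retriever, prompts, or corpus change, which is what makes evaluations consistent over time.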

