The Semantic Layer

November 29, 2025•Miguel Garcia

Preserving Conceptual Simplicity in a Fragmented World

Part of the Semantic Layer Series (Part 3)


Introduction

We've established three key insights across this series:

Enterprise data models are conceptually simple (Article 1)
Fragmentation across multiple systems is inevitable (Article 2)
Our typical responses to fragmentation make things worse, not better

But the central point is this: the problem isn't fragmentation itself. The problem is that we've lost the conceptual model.

When ModernRetail's data architect wants to know "what are our total sales by customer segment this quarter," they need to understand:

Customer master data is in Salesforce
But billing customers are in SAP
E-commerce customers might exist only in the web platform
Orders are split across three systems depending on channel
Product categorization for segments is in SAP
But web-specific categories are in the e-commerce platform
The finance team reconciles all this manually in spreadsheets

The conceptual question is simple: "sales by customer segment." The implementation is a nightmare of system knowledge, data lineage understanding, and manual reconciliation.

What if we could restore the conceptual simplicity? What if someone—or something—could ask "show me sales by customer segment" in terms of the simple domain model, and a layer beneath that question would handle all the complexity of knowing where data lives, which system is authoritative, and how to assemble a coherent answer?

How do we preserve the conceptual simplicity while accepting the implementation complexity?

The answer is a Semantic Layer—a knowledge layer that sits between users and the fragmented systems, maintaining the simple conceptual model while handling the messy reality underneath. Combined with the advent of agentic AI, this approach finally makes it practical to work with enterprise data the way we think about it, not the way it's implemented.


What is a Semantic Layer?

Think of a semantic layer as a knowledgeable translator who understands both what you're asking for and where to find it.

When you ask "Show me Patient Martinez's recent lab results," you're thinking in simple conceptual terms: one patient, their labs. But the reality is complex:

Patient demographics live in Epic
Lab orders originated in Epic
Actual lab results live in the specialty lab system (Cerner PathNet)
Insurance coverage for the labs is in Epic's billing module
The supplies used for the lab draw are tracked in the supply chain system

A semantic layer knows:

The conceptual model: What a "Patient" is, what "lab results" means, how they relate
The system of record: Epic owns patient identity, PathNet owns lab results
The data lineage: How lab orders flow from Epic to PathNet, how results flow back
The access patterns: Which API to call, what credentials to use, how to join the data

Instead of you needing to understand all this complexity, the semantic layer handles it. You ask a simple question using simple concepts. The layer translates it into the complex queries needed to fetch and assemble the answer.

The Healthcare Example: RegionalHealth's Semantic Layer

Let's make this concrete with RegionalHealth, our hospital system from Article 2. Remember their fragmented reality:

Epic (clinical records)
Workday (employee/provider HR data)
Clinical supply chain system (inventory)
Specialty systems (cardiology, labs)

The Conceptual Model in the Semantic Layer

The semantic layer maintains RegionalHealth's simple conceptual model:

Patient: A person receiving care

System of Record: Epic
Key attributes: MRN (medical record number), demographics, insurance
Related entities: Encounters, Orders, Insurance

Provider: A credentialed clinician

System of Record: Epic (credentials) + Workday (employment)
Key attributes: NPI number, specialties, schedule
Relationship complexity: A Provider is both an Epic credential and a Workday employee

Encounter: A patient visit or admission

System of Record: Epic
Key attributes: Date/time, location, diagnosis, providers involved
Generates: Orders, charges, clinical documentation

Lab Order: A request for laboratory testing

System of Record: Epic (order) + PathNet (results)
Data flow: Created in Epic → sent to PathNet → results return to Epic
The semantic layer knows this is actually two systems working together

Clinical Supplies: Medical supplies and medications

System of Record: Supply chain system
Usage tracking: Linked to Encounters in Epic through integration
The semantic layer knows how to connect supply usage to patient care
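
One way to picture this model is as a small registry that records, for each entity, which systems own which part of it. The sketch below is illustrative only: the class, field names, and the `systems_for` helper are assumptions made for this example, and the entries are simply transcribed from the descriptions above.

```python
from dataclasses import dataclass


@dataclass
class EntityDefinition:
    """One entry in the conceptual model (field names are illustrative)."""
    description: str
    systems_of_record: dict[str, str]  # system name -> what that system owns
    key_attributes: list[str]


# Entries transcribed from the model above; the structure itself is an assumption.
CONCEPTUAL_MODEL = {
    "Patient": EntityDefinition(
        description="A person receiving care",
        systems_of_record={"Epic": "patient record"},
        key_attributes=["MRN", "demographics", "insurance"],
    ),
    "Provider": EntityDefinition(
        description="A credentialed clinician",
        systems_of_record={"Epic": "credentials", "Workday": "employment"},
        key_attributes=["NPI number", "specialties", "schedule"],
    ),
    "Lab Order": EntityDefinition(
        description="A request for laboratory testing",
        systems_of_record={"Epic": "order", "PathNet": "results"},
        key_attributes=[],
    ),
}


def systems_for(entity: str) -> list[str]:
    """Return the systems that must be consulted for an entity, sorted by name."""
    return sorted(CONCEPTUAL_MODEL[entity].systems_of_record)
```

Asking `systems_for("Provider")` makes the split explicit: a question about one conceptual Provider touches both Epic and Workday.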

How It Works in Practice

Let's walk through a real business question: "What were the total supply costs for cardiac procedures last month?"

This seems simple, but it requires three pieces of knowledge that span two systems and the integration between them:

Epic: Which encounters were cardiac procedures?
Supply chain system: What supplies were used and their costs?
The integration layer: How do we link encounter IDs to supply transactions?

Without a semantic layer, a business analyst would need to:

Understand Epic's procedure coding system
Know which supply chain tables track usage by encounter
Figure out how encounter IDs map between systems
Write queries against multiple databases
Manually join and reconcile the data
Hope the integration didn't drop any records

With a semantic layer, the analyst asks: "Show me supply costs by procedure type for cardiology last month."

The semantic layer:

Recognizes "procedure type" means Epic procedure codes
Knows "cardiology" translates to specific department codes
Understands "supply costs" requires the supply chain system
Knows how encounters in Epic link to supply transactions
Executes the necessary queries across both systems
Assembles a coherent answer

The analyst thinks in business concepts. The semantic layer handles the technical complexity.
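
The cross-system assembly described above can be sketched in a few lines. This is a simplified illustration, not how any real product does it: the in-memory lists stand in for query results from Epic and the supply chain system, and every field name, procedure name, and cost is invented for the example.

```python
from collections import defaultdict

# Hypothetical rows standing in for results returned by each system.
epic_encounters = [
    {"encounter_id": "E1", "department": "cardiology", "procedure_type": "CABG"},
    {"encounter_id": "E2", "department": "cardiology", "procedure_type": "PCI"},
    {"encounter_id": "E3", "department": "orthopedics", "procedure_type": "Knee replacement"},
]
supply_transactions = [
    {"encounter_id": "E1", "item": "suture kit", "cost": 120.0},
    {"encounter_id": "E1", "item": "graft", "cost": 900.0},
    {"encounter_id": "E2", "item": "stent", "cost": 1500.0},
    {"encounter_id": "E3", "item": "implant", "cost": 4000.0},
]


def supply_costs_by_procedure_type(encounters, transactions, department):
    """Join supply transactions to encounters on encounter_id and total the costs."""
    # Step 1: keep only encounters in the requested department.
    procedure_by_encounter = {
        e["encounter_id"]: e["procedure_type"]
        for e in encounters
        if e["department"] == department
    }
    # Step 2: attribute each supply transaction to its encounter's procedure type.
    totals = defaultdict(float)
    for tx in transactions:
        procedure = procedure_by_encounter.get(tx["encounter_id"])
        if procedure is not None:
            totals[procedure] += tx["cost"]
    return dict(totals)


print(supply_costs_by_procedure_type(epic_encounters, supply_transactions, "cardiology"))
# prints {'CABG': 1020.0, 'PCI': 1500.0}
```

The join key is the part the analyst never has to think about: the layer knows that the encounter ID is what ties a supply transaction back to a procedure.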

The Knowledge It Maintains

The semantic layer for RegionalHealth knows:

Entity Definitions: What is a Patient, Provider, Encounter, Order? What attributes does each have? How do they relate to each other?

System Mapping: Where does each entity live? Epic has Patients and Encounters. PathNet has lab results. The supply chain system has inventory and usage.

Ownership Rules: Epic is the system of record for patient identity. If patient data conflicts between systems, Epic wins.

Data Lineage: Lab orders flow Epic → PathNet. Results flow PathNet → Epic. The semantic layer tracks these flows and knows when data is stale.

Business Rules: "Active provider" means Epic credential status = active AND Workday employment status = active. The semantic layer encodes these composite rules.

Integration Patterns: When you ask for "encounter supply costs," the layer knows to query Epic for encounter details, then query the supply chain system using the encounter ID that was sent during the integration.
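
Three of these kinds of knowledge (ownership rules, lineage freshness, and composite business rules) are simple enough to sketch directly. The example below is a minimal illustration under stated assumptions: the function names, the flow label, and the one-hour freshness budget are invented here, while the "Epic wins" rule and the "active provider" definition come from the descriptions above.

```python
from datetime import datetime, timedelta, timezone

# Ownership rule from the text: Epic is the system of record for patient identity.
SYSTEM_OF_RECORD = {"Patient": "Epic"}

# Hypothetical freshness budget; a real value depends on the integration schedule.
MAX_SYNC_AGE = {"PathNet->Epic lab results": timedelta(hours=1)}


def resolve_conflict(entity: str, values_by_system: dict[str, str]) -> str:
    """Return the value held by the entity's system of record."""
    return values_by_system[SYSTEM_OF_RECORD[entity]]


def is_stale(flow: str, last_sync: datetime, now: datetime) -> bool:
    """True when the last successful sync is older than the flow's budget."""
    return now - last_sync > MAX_SYNC_AGE[flow]


def is_active_provider(epic_credential_status: str, workday_employment_status: str) -> bool:
    """Composite rule from the text: both systems must report 'active'."""
    return epic_credential_status == "active" and workday_employment_status == "active"


now = datetime(2025, 11, 29, 12, 0, tzinfo=timezone.utc)
print(resolve_conflict("Patient", {"Epic": "1980-04-02", "PathNet": "1980-02-04"}))
print(is_stale("PathNet->Epic lab results", now - timedelta(hours=2), now))
print(is_active_provider("active", "inactive"))
```

Keeping these rules in one place is the point: every consumer, human or agent, gets the same answer to "which value wins" and "is this provider active."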

The AI Revolution: Why This Matters Now

Semantic layers aren't new conceptually. What's new is agentic AI that can actually use them effectively.

The Old Way: Humans Wrestling with Complexity

Traditionally, business users would:

Ask IT for a report
Wait days or weeks for development
Get a rigid report that answers one specific question
Need a new report for the next question

Or they'd try to access data themselves using BI tools, getting lost in the complexity of joins, system-specific IDs, and integration timing issues.

The New Way: AI Agents Using the Semantic Layer

With agentic AI, a user can have a natural conversation:

User: "Show me our highest-cost cardiac patients last quarter"

AI Agent (using the semantic layer):

Understands "cardiac patients" means encounters with cardiac procedure codes
Queries Epic for cardiac encounters in Q3
Queries the supply chain system for supplies used in those encounters
Queries Epic's billing module for total charges
Assembles and ranks by total cost
Returns: "Here are the top 20 patients by total cost, ranging from $180K to $340K"

User: "For the top patient, what drove the cost?"

AI Agent:

Drills into that specific patient's encounters
Breaks down costs by category (supplies, procedures, length of stay)
Returns: "Patient had complications requiring extended ICU stay (15 days) and a second procedure. ICU supplies and medications were 60% of total cost."

User: "How does this compare to typical cardiac surgery patients?"

AI Agent:

Queries for average cardiac surgery costs and length of stay
Compares the outlier to the average
Returns analysis with context

This conversation happens in minutes, not weeks. The user thinks in business terms. The AI agent uses the semantic layer to navigate the technical complexity.

Why AI Agents Need the Semantic Layer

AI agents can't work effectively with fragmented systems directly. They need:

Conceptual grounding: The agent needs to understand "Patient" and "Encounter" as business concepts, not as "Epic.PAT_ENC_TABLE" and "SCM.ENCOUNTER_LINK."

Reliable access patterns: The agent needs to know which system to query for what, and how to join data correctly across systems.

Business rule enforcement: The agent needs to apply business logic (like "active provider" rules) consistently.

Error handling: When integrations fail or data is missing, the agent needs to know what to do—which is encoded in the semantic layer.

The semantic layer provides the structure that lets AI agents be genuinely useful rather than confidently wrong.
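
As a rough illustration of conceptual grounding and error handling together, the sketch below lets an agent ask for a business concept and get back a physical source, and it fails loudly for concepts the layer does not cover instead of letting the agent guess. The physical names are the illustrative ones used above, not real Epic or supply chain schema, and the pairing of each concept with a table, the `EncounterSupplyLink` concept name, and the `source_for` helper are all assumptions made for this example.

```python
# Illustrative mapping only; the concept-to-table pairing is an assumption.
PHYSICAL_SOURCES = {
    "Encounter": "Epic.PAT_ENC_TABLE",
    "EncounterSupplyLink": "SCM.ENCOUNTER_LINK",
}


class UnknownConceptError(KeyError):
    """Raised when an agent asks about a concept the layer does not cover."""


def source_for(concept: str) -> str:
    """Translate a business concept into the physical source that holds it."""
    try:
        return PHYSICAL_SOURCES[concept]
    except KeyError:
        # Refuse explicitly rather than let the caller invent a table name.
        raise UnknownConceptError(f"no mapping for concept {concept!r}") from None


print(source_for("Encounter"))
```

An explicit refusal is a deliberate design choice here: an agent that receives an error can tell the user the question is out of scope, which is better than a plausible answer built on a guessed table.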

The Practical Benefits

For RegionalHealth, implementing a semantic layer with AI agents delivers:

Faster insights: Questions that took days or weeks now take minutes. Clinicians, administrators, and executives get answers when they need them.

Reduced IT bottleneck: Business users can ask questions in natural language. IT focuses on maintaining the semantic layer, not fielding endless custom report requests.

Consistent answers: Everyone uses the same conceptual model and business rules. No more conflicting reports because different people pulled data differently.

Easier onboarding: New employees learn the simple conceptual model, not the complex system landscape. The semantic layer handles the technical details.

Better decisions: When it's easy to ask questions and get answers, people make more data-informed decisions. The barrier to using data drops dramatically.

What About Data Quality?

A common concern: "If the underlying systems have bad data, won't the semantic layer just return bad data faster?"

Yes—but with a crucial difference: the semantic layer makes data quality issues visible and fixable.

Without a semantic layer, data quality problems hide in system corners. With a semantic layer:

The layer can flag when systems disagree (Epic says Provider Smith is active, Workday says inactive)
The layer enforces consistency rules (patient age can't be negative)
The layer tracks data lineage (this result came from the PathNet sync that's 2 hours old)
Users see quality metadata alongside their answers

The semantic layer doesn't magically fix bad data, but it makes the problems visible so they can be addressed systematically.
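
A small sketch of what "quality metadata alongside the answer" could look like follows. All of the names here (`Answer`, `check_agreement`, `check_age`, `answer_with_quality`) are hypothetical, and a real layer would carry far more checks; the two shown mirror the disagreement and negative-age examples above.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    """A result plus any data quality warnings found while assembling it."""
    value: object
    warnings: list[str] = field(default_factory=list)


def check_agreement(label: str, values_by_system: dict[str, str]) -> list[str]:
    """Warn when systems report different values for the same fact."""
    if len(set(values_by_system.values())) > 1:
        details = ", ".join(f"{s} says {v}" for s, v in sorted(values_by_system.items()))
        return [f"{label}: systems disagree ({details})"]
    return []


def check_age(age_years: int) -> list[str]:
    """Consistency rule from the text: a patient's age cannot be negative."""
    return [] if age_years >= 0 else [f"patient age {age_years} is negative"]


def answer_with_quality(value: object, checks: list[list[str]]) -> Answer:
    """Attach the warnings from every check to the answer."""
    return Answer(value=value, warnings=[w for check in checks for w in check])


answer = answer_with_quality(
    {"provider": "Smith"},
    [check_agreement("Provider Smith status", {"Epic": "active", "Workday": "inactive"})],
)
print(answer.warnings)
```

The answer is still returned; the warning travels with it, so the person reading the result can decide how much to trust it.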

The Implementation Reality

Building a semantic layer isn't trivial, but it's increasingly practical:

Define the conceptual model: Document your core entities, relationships, and business rules. This is the hardest part because it requires close collaboration between business and IT.

Map to systems: Identify which systems own which entities. Document the integration flows and timing.

Build the access layer: Create APIs or connectors that expose the conceptual model while querying the actual systems underneath.

Add AI capabilities: Deploy agentic AI that can interpret natural language questions and use the semantic layer to answer them.

Iterate: Start with high-value entities (Patient, Provider, Encounter for healthcare). Expand coverage over time.

The key: you don't need to solve everything at once. Start with the entities that matter most to your business questions.
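
To show how the access layer can start small and grow, here is a minimal sketch in which entities are added one connector at a time. The `SemanticLayer` class, the connector signature, and the fake Epic connector are all assumptions for illustration; a real connector would call the source system's API rather than filter an in-memory list.

```python
from typing import Callable

# A connector takes filter criteria and returns rows as plain dicts.
Connector = Callable[[dict], list[dict]]


class SemanticLayer:
    """Routes questions about conceptual entities to registered connectors."""

    def __init__(self) -> None:
        self._connectors: dict[str, Connector] = {}

    def register(self, entity: str, connector: Connector) -> None:
        self._connectors[entity] = connector

    def query(self, entity: str, **filters) -> list[dict]:
        if entity not in self._connectors:
            raise LookupError(f"entity {entity!r} is not covered yet")
        return self._connectors[entity](filters)


# Stand-in for a real Epic connector: filters an in-memory list.
_FAKE_EPIC_ENCOUNTERS = [
    {"encounter_id": "E1", "department": "cardiology"},
    {"encounter_id": "E2", "department": "orthopedics"},
]


def fake_epic_encounters(filters: dict) -> list[dict]:
    return [
        row for row in _FAKE_EPIC_ENCOUNTERS
        if all(row.get(key) == value for key, value in filters.items())
    ]


layer = SemanticLayer()
layer.register("Encounter", fake_epic_encounters)
print(layer.query("Encounter", department="cardiology"))
```

Coverage grows by registering more connectors, and a query for an entity that has not been onboarded yet fails with a clear message instead of a silent empty result.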

Conclusion

Enterprise data fragmentation isn't going away. The forces we explored in Article 2—historical accumulation, organizational boundaries, specialization, acquisitions, technology evolution—ensure that systems will remain fragmented.

But we don't have to let that fragmentation destroy the conceptual simplicity that makes data useful.

A semantic layer preserves the simple conceptual model while managing the complex implementation. It knows what "Patient" means, where patient data lives, how it flows between systems, and how to access it correctly.

Combined with agentic AI, the semantic layer finally delivers on the promise of "self-service analytics." Business users can ask questions in natural language, thinking in simple business concepts. The AI agent uses the semantic layer to navigate the technical maze and return accurate answers.

The result: enterprise data becomes accessible and useful rather than intimidating and confusing. We preserve conceptual simplicity in a fragmented world.

RegionalHealth's business users don't need to know that patient demographics are in Epic, lab results are in PathNet, and supply costs are in the supply chain system. They just ask about patients, labs, and costs. The semantic layer and AI agent handle the rest.

That's the future of enterprise data: conceptually simple, implementationally complex, but practically accessible.