The Semantic Layer
Preserving Conceptual Simplicity in a Fragmented World
Introduction
We've established three key insights across this series:
- Enterprise data models are conceptually simple (Article 1)
- Fragmentation across multiple systems is inevitable (Article 2)
- Our typical responses to fragmentation make things worse, not better
But here's the crux: the problem isn't fragmentation itself. The problem is that we've lost the conceptual model.
When ModernRetail's data architect wants to know "what are our total sales by customer segment this quarter," they need to understand:
- Customer master data is in Salesforce
- But billing customers are in SAP
- E-commerce customers might exist only in the web platform
- Orders are split across three systems depending on channel
- Product categorization for segments is in SAP
- But web-specific categories are in the e-commerce platform
- The finance team reconciles all this manually in spreadsheets
The conceptual question is simple: "sales by customer segment." The implementation is a nightmare of system knowledge, data lineage understanding, and manual reconciliation.
What if we could restore the conceptual simplicity? What if someone—or something—could ask "show me sales by customer segment" in terms of the simple domain model, and a layer beneath that question would handle all the complexity of knowing where data lives, which system is authoritative, and how to assemble a coherent answer?
How do we preserve the conceptual simplicity while accepting the implementation complexity?
The answer is a Semantic Layer—a knowledge layer that sits between users and the fragmented systems, maintaining the simple conceptual model while handling the messy reality underneath. Combined with the advent of agentic AI, this approach finally makes it practical to work with enterprise data the way we think about it, not the way it's implemented.

What is a Semantic Layer?
Think of a semantic layer as a knowledgeable translator who understands both what you're asking for and where to find it.
When you ask "Show me Patient Martinez's recent lab results," you're thinking in simple conceptual terms: one patient, their labs. But the reality is complex:
- Patient demographics live in Epic
- Lab orders originated in Epic
- Actual lab results live in the specialty lab system (Cerner PathNet)
- Insurance coverage for the labs is in Epic's billing module
- The supplies used for the lab draw are tracked in the supply chain system
A semantic layer knows:
- The conceptual model: What a "Patient" is, what "lab results" means, how they relate
- The system of record: Epic owns patient identity, PathNet owns lab results
- The data lineage: How lab orders flow from Epic to PathNet, how results flow back
- The access patterns: Which API to call, what credentials to use, how to join the data
Instead of you needing to understand all this complexity, the semantic layer handles it. You ask a simple question using simple concepts. The layer translates it into the complex queries needed to fetch and assemble the answer.
The Healthcare Example: RegionalHealth's Semantic Layer
Let's make this concrete with RegionalHealth, our hospital system from Article 2. Remember their fragmented reality:
- Epic (clinical records)
- Workday (employee/provider HR data)
- Clinical supply chain system (inventory)
- Specialty systems (cardiology, labs)
The Conceptual Model in the Semantic Layer
The semantic layer maintains RegionalHealth's simple conceptual model:
Patient: A person receiving care
- System of Record: Epic
- Key attributes: MRN (medical record number), demographics, insurance
- Related entities: Encounters, Orders, Insurance
Provider: A credentialed clinician
- System of Record: Epic (credentials) + Workday (employment)
- Key attributes: NPI number, specialties, schedule
- Relationship complexity: A Provider is both an Epic credential and a Workday employee
Encounter: A patient visit or admission
- System of Record: Epic
- Key attributes: Date/time, location, diagnosis, providers involved
- Generates: Orders, charges, clinical documentation
Lab Order: A request for laboratory testing
- System of Record: Epic (order) + PathNet (results)
- Data flow: Created in Epic → sent to PathNet → results return to Epic
- The semantic layer knows this is actually two systems working together
Clinical Supplies: Medical supplies and medications
- System of Record: Supply chain system
- Usage tracking: Linked to Encounters in Epic through integration
- The semantic layer knows how to connect supply usage to patient care
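One way to picture the conceptual model above is as a small registry that records, for each entity, what it means and which systems own it. The following is a minimal sketch, not RegionalHealth's actual implementation: the `EntityDefinition` class, its field names, and the `systems_for` helper are all hypothetical, and only the entity contents come from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EntityDefinition:
    """One entry in the conceptual model (class and field names are illustrative)."""
    name: str
    description: str
    systems_of_record: tuple[str, ...]
    key_attributes: tuple[str, ...] = ()


# Entity contents mirror the list above; the registry structure is an assumption.
CONCEPTUAL_MODEL = {
    entity.name: entity
    for entity in (
        EntityDefinition("Patient", "A person receiving care",
                         ("Epic",), ("MRN", "demographics", "insurance")),
        EntityDefinition("Provider", "A credentialed clinician",
                         ("Epic", "Workday"), ("NPI number", "specialties", "schedule")),
        EntityDefinition("Encounter", "A patient visit or admission",
                         ("Epic",), ("date/time", "location", "diagnosis")),
        EntityDefinition("Lab Order", "A request for laboratory testing",
                         ("Epic", "PathNet")),
        EntityDefinition("Clinical Supplies", "Medical supplies and medications",
                         ("Supply chain system",)),
    )
}


def systems_for(entity_names: list[str]) -> list[str]:
    """Return the sorted, de-duplicated systems needed to cover the given entities."""
    systems: set[str] = set()
    for name in entity_names:
        systems.update(CONCEPTUAL_MODEL[name].systems_of_record)
    return sorted(systems)


# A question about a patient's lab orders touches two systems.
needed = systems_for(["Patient", "Lab Order"])
```

The point of the sketch is that the person asking only names entities. Which systems get involved is something the registry works out.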
How It Works in Practice
Let's walk through a real business question: "What were the total supply costs for cardiac procedures last month?"
This seems simple, but it requires data from two systems, plus knowledge of how the integration links them:
- Epic: Which encounters were cardiac procedures?
- Supply chain system: What supplies were used and their costs?
- The integration layer: How do we link encounter IDs to supply transactions?
Without a semantic layer, a business analyst would need to:
- Understand Epic's procedure coding system
- Know which supply chain tables track usage by encounter
- Figure out how encounter IDs map between systems
- Write queries against multiple databases
- Manually join and reconcile the data
- Hope the integration didn't drop any records
With a semantic layer, the analyst asks: "Show me supply costs by procedure type for cardiology in October 2024."
The semantic layer:
- Recognizes "procedure type" means Epic procedure codes
- Knows "cardiology" translates to specific department codes
- Understands "supply costs" requires the supply chain system
- Knows how encounters in Epic link to supply transactions
- Executes the necessary queries across both systems
- Assembles a coherent answer
The analyst thinks in business concepts. The semantic layer handles the technical complexity.
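The translation steps above can be sketched in a few lines. This is a toy illustration under stated assumptions: the in-memory lists stand in for extracts from Epic and the supply chain system, and every name here (`EPIC_ENCOUNTERS`, `SUPPLY_TRANSACTIONS`, `DEPARTMENT_CODES`, the `"CARD"` code, the record fields, and the sample costs) is made up for the example rather than taken from any real schema.

```python
from collections import defaultdict

# Hypothetical extract from Epic: one row per encounter.
EPIC_ENCOUNTERS = [
    {"encounter_id": "E1", "department": "CARD", "procedure_type": "Angioplasty", "month": "2024-10"},
    {"encounter_id": "E2", "department": "CARD", "procedure_type": "Bypass", "month": "2024-10"},
    {"encounter_id": "E3", "department": "ORTH", "procedure_type": "Knee replacement", "month": "2024-10"},
    {"encounter_id": "E4", "department": "CARD", "procedure_type": "Angioplasty", "month": "2024-09"},
]

# Hypothetical extract from the supply chain system, keyed by the encounter ID
# that the integration passed along.
SUPPLY_TRANSACTIONS = [
    {"encounter_id": "E1", "item": "stent", "cost": 1200.0},
    {"encounter_id": "E1", "item": "catheter", "cost": 300.0},
    {"encounter_id": "E2", "item": "graft kit", "cost": 2500.0},
    {"encounter_id": "E3", "item": "implant", "cost": 4000.0},
    {"encounter_id": "E4", "item": "stent", "cost": 1200.0},
]

# The layer's mapping from a business term to department codes (assumed values).
DEPARTMENT_CODES = {"cardiology": {"CARD"}}


def supply_costs_by_procedure(department: str, month: str) -> dict[str, float]:
    """Total supply cost per procedure type for one department and month."""
    codes = DEPARTMENT_CODES[department]
    # Step 1: ask "Epic" which encounters match the business question.
    procedure_by_encounter = {
        e["encounter_id"]: e["procedure_type"]
        for e in EPIC_ENCOUNTERS
        if e["department"] in codes and e["month"] == month
    }
    # Step 2: join supply transactions to those encounters via the shared ID.
    totals: dict[str, float] = defaultdict(float)
    for tx in SUPPLY_TRANSACTIONS:
        procedure = procedure_by_encounter.get(tx["encounter_id"])
        if procedure is not None:
            totals[procedure] += tx["cost"]
    return dict(totals)


october_costs = supply_costs_by_procedure("cardiology", "2024-10")
```

In a real deployment each step would be a query against a live system rather than a list comprehension, but the shape is the same: resolve the business terms, fetch from each owning system, and join on the key the integration provides.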
The Knowledge It Maintains
The semantic layer for RegionalHealth knows:
Entity Definitions: What is a Patient, Provider, Encounter, Order? What attributes does each have? How do they relate to each other?
System Mapping: Where does each entity live? Epic has Patients and Encounters. PathNet has lab results. The supply chain system has inventory and usage.
Ownership Rules: Epic is the system of record for patient identity. If patient data conflicts between systems, Epic wins.
Data Lineage: Lab orders flow Epic → PathNet. Results flow PathNet → Epic. The semantic layer tracks these flows and knows when data is stale.
Business Rules: "Active provider" means Epic credential status = active AND Workday employment status = active. The semantic layer encodes these composite rules.
Integration Patterns: When you ask for "encounter supply costs," the layer knows to query Epic for encounter details, then query the supply chain system using the encounter ID that was sent during the integration.
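Two of these kinds of knowledge, ownership rules and business rules, are simple enough to sketch directly. The snippet below is an illustration only: the record fields (`credential_status`, `employment_status`), the `SYSTEM_OF_RECORD` table, and both function names are assumptions made for this example, not fields or APIs from Epic or Workday.

```python
# Which system wins when values conflict (from the ownership rule above).
SYSTEM_OF_RECORD = {"patient_identity": "Epic"}


def is_active_provider(epic_record: dict, workday_record: dict) -> bool:
    """Composite rule: active only if both systems agree the provider is active."""
    return (
        epic_record.get("credential_status") == "active"
        and workday_record.get("employment_status") == "active"
    )


def resolve_conflict(attribute_group: str, values_by_system: dict):
    """Return the value held by the system of record for this attribute group."""
    owner = SYSTEM_OF_RECORD[attribute_group]
    if owner not in values_by_system:
        raise LookupError(f"no value from system of record {owner!r}")
    return values_by_system[owner]


# Example: the two systems disagree about a provider and about a patient's name.
smith_active = is_active_provider(
    {"credential_status": "active"},
    {"employment_status": "inactive"},
)
patient_name = resolve_conflict(
    "patient_identity",
    {"Epic": "Martinez, Ana", "PathNet": "Martinez, A."},
)
```

Encoding the rules once, in one place, is what lets every report and every agent apply them the same way.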
The AI Revolution: Why This Matters Now
Semantic layers aren't new conceptually. What's new is agentic AI that can actually use them effectively.
The Old Way: Humans Wrestling with Complexity
Traditionally, business users would:
- Ask IT for a report
- Wait days or weeks for development
- Get a rigid report that answers one specific question
- Need a new report for the next question
Or they'd try to access data themselves using BI tools, getting lost in the complexity of joins, system-specific IDs, and integration timing issues.
The New Way: AI Agents Using the Semantic Layer
With agentic AI, a user can have a natural conversation:
User: "Show me our highest-cost cardiac patients last quarter"
AI Agent (using the semantic layer):
- Understands "cardiac patients" means encounters with cardiac procedure codes
- Queries Epic for cardiac encounters in the last quarter
- Queries the supply chain system for supplies used in those encounters
- Queries Epic's billing module for total charges
- Assembles and ranks by total cost
- Returns: "Here are the top 20 patients by total cost, ranging from $180K to $340K"
User: "For the top patient, what drove the cost?"
AI Agent:
- Drills into that specific patient's encounters
- Breaks down costs by category (supplies, procedures, length of stay)
- Returns: "Patient had complications requiring extended ICU stay (15 days) and a second procedure. ICU supplies and medications were 60% of total cost."
User: "How does this compare to typical cardiac surgery patients?"
AI Agent:
- Queries for average cardiac surgery costs and length of stay
- Compares the outlier to the average
- Returns analysis with context
This conversation happens in minutes, not weeks. The user thinks in business terms. The AI agent uses the semantic layer to navigate the technical complexity.
Why AI Agents Need the Semantic Layer
AI agents can't work effectively with fragmented systems directly. They need:
Conceptual grounding: The agent needs to understand "Patient" and "Encounter" as business concepts, not as "Epic.PAT_ENC_TABLE" and "SCM.ENCOUNTER_LINK."
Reliable access patterns: The agent needs to know which system to query for what, and how to join data correctly across systems.
Business rule enforcement: The agent needs to apply business logic (like "active provider" rules) consistently.
Error handling: When integrations fail or data is missing, the agent needs a defined fallback, and the semantic layer is where that fallback is encoded.
The semantic layer provides the structure that lets AI agents be genuinely useful rather than confidently wrong.
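To make these needs concrete, here is a rough sketch of the kind of interface an agent might call. It is an assumption about one possible design, not a description of any product: the `SemanticLayer` class, `UnknownConceptError`, the resolver functions, and the response fields are all hypothetical. The agent asks for a business concept by name, unknown concepts are rejected instead of guessed at, and an integration failure comes back as an explicit status rather than a silent gap.

```python
from __future__ import annotations

from typing import Callable


class UnknownConceptError(KeyError):
    """Raised when the agent asks for a concept the layer does not define."""


class SemanticLayer:
    """Maps business concepts to resolver functions that query the real systems."""

    def __init__(self) -> None:
        self._resolvers: dict[str, Callable[..., list[dict]]] = {}

    def register(self, concept: str, resolver: Callable[..., list[dict]]) -> None:
        self._resolvers[concept] = resolver

    def query(self, concept: str, **filters) -> dict:
        # Conceptual grounding: only registered concepts can be queried.
        if concept not in self._resolvers:
            raise UnknownConceptError(concept)
        try:
            rows = self._resolvers[concept](**filters)
        except ConnectionError as exc:
            # Error handling: report the failure so the agent can say so.
            return {"concept": concept, "status": "unavailable", "rows": [], "detail": str(exc)}
        return {"concept": concept, "status": "ok", "rows": rows, "detail": ""}


# Hypothetical resolvers standing in for real system access.
def encounters_from_epic(department: str) -> list[dict]:
    return [{"encounter_id": "E1", "department": department}]


def lab_results_from_pathnet(mrn: str) -> list[dict]:
    raise ConnectionError("PathNet sync unavailable")


layer = SemanticLayer()
layer.register("Encounter", encounters_from_epic)
layer.register("Lab Result", lab_results_from_pathnet)

encounter_answer = layer.query("Encounter", department="cardiology")
lab_answer = layer.query("Lab Result", mrn="12345")
```

With a response like `lab_answer`, the agent can tell the user that lab results are temporarily unavailable instead of presenting an incomplete answer as complete.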
The Practical Benefits
For RegionalHealth, implementing a semantic layer with AI agents delivers:
Faster insights: Questions that took days or weeks now take minutes. Clinicians, administrators, and executives get answers when they need them.
Reduced IT bottleneck: Business users can ask questions in natural language. IT focuses on maintaining the semantic layer, not fielding endless custom report requests.
Consistent answers: Everyone uses the same conceptual model and business rules. No more conflicting reports because different people pulled data differently.
Easier onboarding: New employees learn the simple conceptual model, not the complex system landscape. The semantic layer handles the technical details.
Better decisions: When it's easy to ask questions and get answers, people make more data-informed decisions. The barrier to using data drops dramatically.
What About Data Quality?
A common concern: "If the underlying systems have bad data, won't the semantic layer just return bad data faster?"
Yes—but with a crucial difference: the semantic layer makes data quality issues visible and fixable.
Without a semantic layer, data quality problems hide in system corners. With a semantic layer:
- The layer can flag when systems disagree (Epic says Provider Smith is active, Workday says inactive)
- The layer enforces consistency rules (patient age can't be negative)
- The layer tracks data lineage (this result came from the PathNet sync that's 2 hours old)
- Users see quality metadata alongside their answers
The semantic layer doesn't magically fix bad data, but it makes the problems visible so they can be addressed systematically.
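The checks in the list above could be sketched as a function that returns quality flags alongside an answer. This is a simplified illustration: the function name, its parameters, and the one-hour staleness threshold are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def quality_flags(
    epic_status: str,
    workday_status: str,
    patient_age: int,
    last_sync: datetime,
    now: datetime,
    max_staleness: timedelta = timedelta(hours=1),  # assumed threshold
) -> list[str]:
    """Return human-readable data quality warnings for one answer."""
    flags = []
    # Cross-system disagreement, e.g. Epic says active but Workday says inactive.
    if epic_status != workday_status:
        flags.append(f"provider status conflict: Epic={epic_status}, Workday={workday_status}")
    # Consistency rule: a patient's age can't be negative.
    if patient_age < 0:
        flags.append("invalid patient age: negative value")
    # Lineage: warn when the last sync is older than the allowed staleness.
    if now - last_sync > max_staleness:
        flags.append("stale data: last sync exceeds allowed staleness")
    return flags


now = datetime(2024, 10, 15, 12, 0, tzinfo=timezone.utc)
problem_flags = quality_flags("active", "inactive", -3, now - timedelta(hours=2), now)
clean_flags = quality_flags("active", "active", 40, now - timedelta(minutes=5), now)
```

Returning the flags with the answer, rather than logging them somewhere out of sight, is what puts quality metadata in front of the user.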
The Implementation Reality
Building a semantic layer isn't trivial, but it's increasingly practical:
Define the conceptual model: Document your core entities, relationships, and business rules. This is the hardest part because it requires business and IT to work together.
Map to systems: Identify which systems own which entities. Document the integration flows and timing.
Build the access layer: Create APIs or connectors that expose the conceptual model while querying the actual systems underneath.
Add AI capabilities: Deploy agentic AI that can interpret natural language questions and use the semantic layer to answer them.
Iterate: Start with high-value entities (Patient, Provider, Encounter for healthcare). Expand coverage over time.
The key: you don't need to solve everything at once. Start with the entities that matter most to your business questions.
Conclusion
Enterprise data fragmentation isn't going away. The forces we explored in Article 2—historical accumulation, organizational boundaries, specialization, acquisitions, technology evolution—ensure that systems will remain fragmented.
But we don't have to let that fragmentation destroy the conceptual simplicity that makes data useful.
A semantic layer preserves the simple conceptual model while managing the complex implementation. It knows what "Patient" means, where patient data lives, how it flows between systems, and how to access it correctly.
Combined with agentic AI, the semantic layer finally delivers on the promise of "self-service analytics." Business users can ask questions in natural language, thinking in simple business concepts. The AI agent uses the semantic layer to navigate the technical maze and return accurate answers.
The result: enterprise data becomes accessible and useful rather than intimidating and confusing. We preserve conceptual simplicity in a fragmented world.
RegionalHealth's business users don't need to know that patient demographics are in Epic, lab results are in PathNet, and supply costs are in the supply chain system. They just ask about patients, labs, and costs. The semantic layer and AI agent handle the rest.
That's the future of enterprise data: simple in concept, complex in implementation, and accessible in practice.