The Intelligence Layer

There is a problem every team building AI agents at scale eventually runs into, and it does not announce itself cleanly. The agents work. They give answers. They pass the demo. Then usage grows and something starts to feel wrong: costs climb faster than value, answers that should be fast take too long, and context is re-assembled from scratch on every run, as though the system has no memory of what happened before.

The problem is architectural, and it is one we ran into ourselves. By late 2024, VH3 AI had production AI agents running against real field service operations: early conversational interfaces processing hundreds of thousands of jobs across multiple large clients. What we learned over that period, and particularly in the months that followed, was that the classic retrieval approach (semantic search over job records, then an answer from the closest chunks) was not enough for the kind of operational work field service agents actually need to do.

A chatbot that answers “how many jobs did we complete last week?” is a different machine from an agent that prepares a site briefing, investigates a recurring fault pattern, monitors for emerging SLA risk, and synthesises five years of operational history into a coherent picture. The first can run on retrieval. The second needs something else. That something else is the intelligence layer.

The retrieval problem

The AI infrastructure market is converging on the same diagnosis. Semantic search vendors are shipping knowledge layers and new query contracts. Enterprise vendors are investing heavily in structured operational data. The shared recognition is that agents built on search-only memory systems hit a ceiling: they spend most of their compute budget re-assembling context they should already have.

The root cause is straightforward. Classic retrieval (embed a query, find the closest chunks, pass them to a model) was designed for question-answering: one question, one lookup, one answer. Agents run tasks. They cross-reference entities, check history, evaluate patterns, and produce recommendations. That workflow needs a different kind of memory. It needs a retrieval contract: the agent receives operating context, a prepared bundle of resolved, current, authoritative information, rather than fragments it has to reassemble on every run. VH3 AI is built around that contract: four complementary memory primitives, combined into a single synthesis layer.

The four primitives

1. Relational memory: the knowledge graph

Some of the most important questions in field service are structurally relational. Which engineer has worked at this site before? Which sites share a parent customer? Which jobs cluster around the same asset? Which fault patterns connect across engineers who have never worked together? These are relationship questions. The answer does not live in any single document; it lives in the connections between entities. Neither similarity search nor keyword search can find those connections on its own. They require a data structure that understands relationships natively.

The VH3 AI knowledge graph links every job to its engineer, site, customer, outcome, equipment, and history, traversable in any direction, from any starting point. Entity resolution handles the messy reality of source data written by different people across different systems: “Tesco PLC”, “Tesco Stores”, and “Tesco Express Watford” resolve to the same customer. The same fault described differently across twenty engineers resolves to the same underlying pattern. The graph is not a reporting layer. It is the foundation everything else is built on.
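To make the relational idea concrete, here is a minimal sketch of a graph with alias-based entity resolution. It is an illustration only, not the VH3 AI schema or API: the `KnowledgeGraph` class, the relation names (`at_site`, `attended_by`, `belongs_to`), the identifiers, and the alias table are all hypothetical, and a simple lookup table stands in for real resolution rules.

```python
from collections import defaultdict

# Hypothetical alias table: source spellings mapped to one canonical customer id.
CUSTOMER_ALIASES = {
    "tesco plc": "customer:tesco",
    "tesco stores": "customer:tesco",
    "tesco express watford": "customer:tesco",
}


def resolve_customer(name: str) -> str:
    """Normalise case and spacing, then look up the canonical id."""
    key = " ".join(name.lower().split())
    return CUSTOMER_ALIASES.get(key, f"customer:unresolved:{key}")


class KnowledgeGraph:
    """Tiny in-memory graph that stores every edge in both directions."""

    def __init__(self) -> None:
        self._edges = defaultdict(set)

    def link(self, source: str, relation: str, target: str) -> None:
        self._edges[(source, relation)].add(target)
        # The inverse edge is what makes the graph traversable from either end.
        self._edges[(target, f"inverse:{relation}")].add(source)

    def neighbours(self, node: str, relation: str) -> set:
        return set(self._edges.get((node, relation), ()))


def engineers_at_site(graph: KnowledgeGraph, site: str) -> set:
    """Which engineers have worked at this site? Two hops: site -> jobs -> engineers."""
    engineers = set()
    for job in graph.neighbours(site, "inverse:at_site"):
        engineers |= graph.neighbours(job, "attended_by")
    return engineers


graph = KnowledgeGraph()
graph.link("job:1", "at_site", "site:watford")
graph.link("job:1", "attended_by", "engineer:ana")
graph.link("job:2", "at_site", "site:watford")
graph.link("job:2", "attended_by", "engineer:ben")
graph.link("site:watford", "belongs_to", resolve_customer("Tesco  Express Watford"))

print(engineers_at_site(graph, "site:watford"))
```

The point of the sketch is that the answer comes from following edges, not from matching text: no single job record contains the list of engineers who have attended a site.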

2. Semantic memory: meaning-based search

Not every question is relational. Some questions are about similarity: faults that look like other faults, job outcomes that resemble precedents in the history, descriptions that match without sharing a single keyword. The VH3 AI semantic index stores every fault description, job note, worksheet answer, and engineer observation by meaning. “Boiler intermittent fault” finds jobs recorded as “heating system cutting out sporadically” or “intermittent heating loss.” That is precedent discovery across language that was never standardised, at any scale, with no keyword configuration required.

Critically, this runs on top of the knowledge graph, not as a separate retrieval layer. A semantic search does not return raw chunks from a document store. It returns graph-linked records: jobs with their engineers, their sites, their outcomes, their history, all already resolved and connected.
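A rough sketch of meaning-based search that returns linked records rather than raw text. Everything here is an assumption made for illustration: the `embed` function is a toy concept lookup standing in for a real embedding model, and the `JOBS` records, field names, and `search` function are hypothetical rather than part of the VH3 AI index.

```python
import math

# Toy "embedding": words that mean similar things share an axis.
# A real system would use a learned embedding model instead of this table.
CONCEPTS = {
    "boiler": 0, "heating": 0,
    "intermittent": 1, "sporadically": 1,
    "fault": 2, "cutting": 2, "loss": 2,
    "door": 3, "hinge": 3,
    "replaced": 4,
}
DIMENSIONS = 5

# Hypothetical job records, already linked to engineer, site, and outcome.
JOBS = {
    "job:1": {"note": "heating system cutting out sporadically",
              "engineer": "engineer:ana", "site": "site:watford", "outcome": "completed"},
    "job:2": {"note": "intermittent heating loss",
              "engineer": "engineer:ben", "site": "site:luton", "outcome": "follow_up_required"},
    "job:3": {"note": "door hinge replaced",
              "engineer": "engineer:ana", "site": "site:luton", "outcome": "completed"},
}


def embed(text: str) -> list:
    vector = [0.0] * DIMENSIONS
    for word in text.lower().split():
        axis = CONCEPTS.get(word.strip(".,"))
        if axis is not None:
            vector[axis] += 1.0
    return vector


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query: str, minimum_score: float = 0.5) -> list:
    """Rank jobs by similarity and return whole linked records, not text chunks."""
    query_vector = embed(query)
    hits = []
    for job_id, record in JOBS.items():
        score = cosine(query_vector, embed(record["note"]))
        if score >= minimum_score:
            hits.append({"job_id": job_id, "score": score, **record})
    return sorted(hits, key=lambda hit: hit["score"], reverse=True)


results = search("boiler intermittent fault")
for hit in results:
    print(hit["job_id"], hit["engineer"], hit["site"], hit["outcome"])
```

The query shares no keyword with the first job note, yet both heating jobs match while the door repair does not, and each hit arrives with its engineer, site, and outcome attached.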

3. Structured memory: enriched operational records

Field service data arrives messy. Fault descriptions are free text. Worksheet answers are inconsistent. The same outcome appears in three different formats depending on who completed the job and which device they used. Every job that enters the VH3 AI pipeline passes through an enrichment process once. Fault type is extracted and classified. Work performed is structured. Equipment involved is identified. The operational outcome is resolved. The resulting record is compact, token-efficient, and ready to be read by any downstream agent, report, or automation without re-extraction. This is the pre-computation principle. The expensive interpretive work happens when the data arrives, not when a question is asked. Every report, briefing, automation, and agent query that touches that job reads the same prepared artefact. One enrichment, unlimited consumption, no degradation.
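The enrich-once, read-many pattern can be sketched as follows. This is a hedged illustration, not the VH3 AI pipeline: the keyword rules are a stand-in for whatever classification actually runs, and the `EnrichmentStore` class, the raw field names (`id`, `notes`, `status`), and the outcome labels are all hypothetical.

```python
import re
from dataclasses import dataclass

# Illustrative keyword rules; a stand-in for the real classification step.
FAULT_RULES = [
    (re.compile(r"leak", re.IGNORECASE), "leak"),
    (re.compile(r"cutting out|intermittent|sporadic", re.IGNORECASE), "intermittent_fault"),
]
# The same outcome arrives in several formats; map them to one label.
OUTCOME_ALIASES = {
    "done": "completed",
    "complete": "completed",
    "completed": "completed",
    "revisit": "follow_up_required",
}


@dataclass(frozen=True)
class EnrichedJob:
    """Compact record that downstream readers consume without re-extraction."""
    job_id: str
    fault_type: str
    outcome: str


class EnrichmentStore:
    def __init__(self) -> None:
        self._records = {}
        self.enrich_calls = 0  # counts how often the expensive step runs

    def ingest(self, raw: dict) -> EnrichedJob:
        """Interpret the raw job once, at arrival, and store the result."""
        self.enrich_calls += 1
        fault_type = next(
            (label for pattern, label in FAULT_RULES if pattern.search(raw["notes"])),
            "unclassified",
        )
        outcome = OUTCOME_ALIASES.get(raw["status"].strip().lower(), "unknown")
        record = EnrichedJob(raw["id"], fault_type, outcome)
        self._records[record.job_id] = record
        return record

    def get(self, job_id: str) -> EnrichedJob:
        """Reads are plain lookups; no interpretation happens here."""
        return self._records[job_id]


store = EnrichmentStore()
store.ingest({"id": "job:1", "notes": "Heating system cutting out sporadically", "status": " Done "})

# A report, a briefing, and an agent query all read the same prepared record.
readers = [store.get("job:1") for _ in range(3)]
print(readers[0], store.enrich_calls)
```

However many consumers read the record, the interpretive step has run exactly once.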

4. Temporal memory: continuous monitoring

The intelligence layer does not wait to be asked. Sentinels run continuously against the full operational record, watching for the patterns that matter: performance slips, site deterioration, SLA drift, dormant customers, overdue service intervals, emerging fault clusters, growth signals hidden in job history. When a pattern crosses a threshold, it surfaces as an actionable alert: not a dashboard metric for someone to find, but a signal delivered to the right person at the right time. The temporal layer is always on. It does not cost a model call to operate. It does not require a question to trigger it. This is memory that acts, not just answers.
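A sentinel of this kind can be as simple as a deterministic rule over the job record. The sketch below flags sites with repeat visits inside a time window; the function name, the record fields, the 30-day window, and the threshold of three visits are all assumptions chosen for illustration, not VH3 AI defaults.

```python
from collections import Counter
from datetime import date, timedelta


def repeat_visit_sentinel(jobs: list, today: date,
                          window_days: int = 30, threshold: int = 3) -> list:
    """Return one alert per site whose visit count in the window meets the threshold.

    Plain counting over structured records: no model call is involved.
    """
    cutoff = today - timedelta(days=window_days)
    recent_visits = Counter(
        job["site_id"] for job in jobs if job["completed_on"] >= cutoff
    )
    return [
        {"site_id": site_id, "visits": visits, "signal": "repeat_visits"}
        for site_id, visits in sorted(recent_visits.items())
        if visits >= threshold
    ]


jobs = [
    {"site_id": "site:a", "completed_on": date(2025, 3, 1)},
    {"site_id": "site:a", "completed_on": date(2025, 3, 10)},
    {"site_id": "site:a", "completed_on": date(2025, 3, 20)},
    {"site_id": "site:b", "completed_on": date(2025, 3, 15)},
    {"site_id": "site:a", "completed_on": date(2024, 11, 1)},  # outside the window
]

alerts = repeat_visit_sentinel(jobs, today=date(2025, 3, 25))
print(alerts)
```

Run on a schedule against enriched records, a rule like this produces alerts without anyone asking a question first.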

How the primitives combine

The power of the synthesis layer is not in any one primitive. It is in how they work together. A pre-visit engineer briefing draws from all four:

Relational: every prior job at this site, who attended, what the outcome was
Semantic: similar faults seen elsewhere in the operation, with precedents
Structured: the compacted record of the last three visits, token-efficient and ready
Temporal: any active sentinel alerts for this site or customer

The engineer does not receive a search result. They receive operating context, assembled, resolved, current, and scoped to the work they are about to do.

A fault investigation works the same way. The relational layer provides the job history and engineer connections. The semantic layer finds similar incidents across the operation. The structured layer supplies the classified fault data. The temporal layer flags whether this pattern is new or recurring.

An account review synthesises the relational picture of every job linked to a customer against the structured performance data, semantic patterns in communication history, and temporal alerts about dormancy or risk.

None of these workflows work reliably on a single retrieval primitive. They require synthesis.
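The pre-visit briefing described above can be sketched as one function that draws on four sources and returns a single bundle. This is a minimal illustration under stated assumptions: the `Briefing` type, the `assemble_briefing` function, and the four callables are hypothetical stand-ins for the primitives, not the VH3 AI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Briefing:
    """Operating context for one site visit, gathered from four sources."""
    site_id: str
    prior_jobs: list       # relational: who attended and what the outcome was
    precedents: list       # semantic: similar faults seen elsewhere
    recent_records: list   # structured: compact records of the latest visits
    alerts: list           # temporal: active sentinel alerts


def assemble_briefing(*, site_id: str, fault_text: str,
                      relational, semantic, structured, temporal,
                      recent_visits: int = 3) -> Briefing:
    """Call each primitive once and hand back a single prepared bundle."""
    return Briefing(
        site_id=site_id,
        prior_jobs=relational(site_id),
        precedents=semantic(fault_text),
        recent_records=structured(site_id, recent_visits),
        alerts=temporal(site_id),
    )


# Stub providers so the sketch runs on its own; real ones would query each layer.
briefing = assemble_briefing(
    site_id="site:watford",
    fault_text="boiler intermittent fault",
    relational=lambda site_id: [
        {"job_id": "job:1", "engineer": "engineer:ana", "outcome": "completed"}
    ],
    semantic=lambda text: [{"job_id": "job:7", "note": "intermittent heating loss"}],
    structured=lambda site_id, limit: [
        {"job_id": "job:1", "fault_type": "intermittent_fault"}
    ][:limit],
    temporal=lambda site_id: ["repeat_visits"],
)
print(briefing)
```

The consumer of the briefing sees one object scoped to the visit; which layer supplied which field is an implementation detail behind the function.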

The domain model: why the substrate speaks one language

The four primitives only hold together if what they contain is consistent. That requires an opinionated decision most intelligence platforms avoid: a defined model of what field service concepts actually are.

Field service is not generic business data. It is work, performed at a place, for a customer, by an engineer, on equipment, with an outcome. Those concepts have specific meanings, specific relationships, and specific cardinalities. A job is not a row. A customer is not a string. A site is not an address field. They are entities with histories, hierarchies, and connections to each other that have operational consequences.

VH3 AI maps all incoming data to a canonical model of these concepts. Source systems spell names differently, use different field structures, and make different assumptions about what a record means. The model does not inherit those inconsistencies. By the time a job enters the intelligence layer, it is classified, resolved, and linked to the right customer, site, and engineer, with its outcome structured and its equipment identified.

This has three practical consequences:

Queries are consistent across sources: an aggregation of engineer performance, a semantic search for a fault type, a sentinel watching for repeat visits all ask the same question regardless of which FMS the underlying data came from. You do not write different queries for different systems.
Entity resolution works because the model has rules: resolution is not guesswork or fuzzy string matching alone. The model defines what a customer is, what makes two customer records the same customer, and what relationships must hold before a link is drawn. Those rules are applied consistently at ingest and maintained as new data arrives.
Intelligence is portable: if you connect a second field system, or replace the first, the operational picture does not fork. New data resolves into the same model. Historical records stay linked to the same entities. The intelligence you have built belongs to the model, not to the source system it came from.

The model is not published. Its specific schema, classification taxonomy, and resolution rules represent significant accumulated design work and are part of what the platform provides. What matters for builders is the contract it delivers: consistent, resolved, connected records at every endpoint, regardless of what is upstream.

Pre-computation and the cost of retrieval

There is a second reason this architecture matters: cost. Most AI tools in field service re-run the full pipeline on every interaction. New session, new search, new context assembly, new model call. Every user query, every automation run, every report generation starts from zero. Total cost grows linearly with usage, because the cost of each query never falls.

VH3 AI is built the other way. Enrichment happens once, at ingest. The graph is built once, at onboarding, and maintained incrementally. The compacted record is produced once and read everywhere. A hundred users asking questions from the same foundation cost less per query than one user asking from a system that rebuilds context on every call.

The practical result: onboarding the second team does not double the AI cost. Wiring in another automation does not triple it. The per-query cost goes down as usage goes up, because more consumers are sharing work that has already been done.
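The amortisation argument can be written as a small cost model. The numbers below are made-up units chosen only to show the shape of the curve, not VH3 AI pricing or measurements, and the two function names are hypothetical.

```python
def rebuild_cost_per_query(context_cost: float, answer_cost: float) -> float:
    """Rebuild-every-time design: each query pays for context assembly again."""
    return context_cost + answer_cost


def precomputed_cost_per_query(ingest_cost: float, answer_cost: float,
                               queries: int) -> float:
    """Pre-computed design: the one-off ingest cost is shared across all queries."""
    if queries <= 0:
        raise ValueError("queries must be positive")
    return ingest_cost / queries + answer_cost


# Illustrative units only: ingest costs 100 once, assembling context costs 5
# per query in the rebuild design, and producing an answer costs 1 either way.
INGEST, CONTEXT, ANSWER = 100.0, 5.0, 1.0

for queries in (10, 100, 1000):
    print(
        queries,
        rebuild_cost_per_query(CONTEXT, ANSWER),
        precomputed_cost_per_query(INGEST, ANSWER, queries),
    )
```

Under these assumed numbers the pre-computed design is more expensive per query at low volume and becomes cheaper once the query count exceeds the ingest cost divided by the per-query context cost (here, 20 queries). After that point its per-query cost keeps falling toward the answer cost alone, while the rebuild design stays flat.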

Mature on day one

The intelligence layer is most valuable when it is mature, when it has enough history to surface patterns, resolve conflicts, and produce reliable synthesis. With VH3 AI, that maturity is delivered as part of onboarding. Up to five years of operational history, every job, every outcome, every engineer record, every site and customer relationship, is ingested, enriched, and connected into the graph before go-live. Every entity is resolved. Every relationship is mapped. Every historical conflict in the source data is reconciled. Your agents work from that foundation on day one. The understanding is already there. From that point, every new job flows through the same enrichment pipeline and joins the graph. The understanding compounds. But it never had to warm up.

What this means for builders

The intelligence layer is exposed through a full API and an MCP surface. Every primitive is queryable. Bring your own model, your own framework, your own agents.

Operational discovery

Precedent search, entity resolution, and customer knowledge sections on the layer.

Building on the layer

Citizen builders, coding agents, integrations, and programmatic access.

API Reference

Every primitive accessible via REST. Semantic search, graph queries, enriched job feed, sentinel runs, report generation.

MCP Server

Connect any MCP-compatible agent directly to the intelligence layer. No middleware.

Agent Starter Kits

Drop-in AGENTS.md, Cursor rules, and Claude Project instructions to make any AI tool VH3-fluent in minutes.

n8n Community Node

The intelligence layer as a native n8n node. 100+ workflow templates. PRO operations connect directly to enriched data.

When you build on the intelligence layer, you are starting from a prepared, resolved, always-current understanding of the operation. Your agents read. They do not scavenge. That is the difference between access and understanding.
