The Catalog Is Not the Control Plane

OpenMetadata, Collibra, Alation, DataHub. The catalog category has built brilliant context layers. None of them has built the decision layer the agentic era requires.

A post crosses my feed praising OpenMetadata, the open-source data catalog, as the answer to the silo problem. One place to discover, understand, and trust your data. 120 connectors. Apache 2.0. Deploy anywhere. The post is correct about everything it claims and misleading about what it implies. OpenMetadata is the most polished version of a category that solved the wrong problem.

The category is data catalogs. The problem they solve is human discoverability: analysts asking "what does this column mean, who owns it, where did this number come from." The problem they do not solve, and were never built to solve, is whether a transaction should execute. Catalogs answer what is this. They do not answer should this happen. The agentic era needs an answer to the second question, and no amount of better cataloging will produce it.

The thirty-year frame

Data catalogs are the third generation of the same instinct. Before the catalog there was the data dictionary, then the metadata repository, then the master data hub. Each generation made the same promise: a central place that knows what every data asset means, owns, contains, and connects to. Each generation arrived with new technology that made the promise newly plausible. Each generation discovered the same hard limit.

The limit is staleness. A catalog is a snapshot of metadata at the moment of ingestion. OpenMetadata pulls schemas from 120 sources on a schedule. So does Collibra. So does Alation. So does DataHub. The graph in the catalog reflects the world as it was last night, or last hour, or in the worst case last week. For a human analyst running a Tuesday-morning report, last-night accuracy is fine. The catalog is a reference, not an oracle, and the human supplies the freshness check.

Take the human out of the loop and the snapshot becomes a liability. An agent that reads OpenMetadata to decide whether a column is sensitive does not know that the column was retagged at 9:14 this morning and the catalog refresh is not scheduled until 2 AM tomorrow. The catalog has the wrong answer. The agent has no way to know it has the wrong answer. The transaction proceeds on stale truth.

What changed

Three shifts have arrived together, and they expose the limit catalogs were always under but never had to face.

The first is execution latency. Agents act in milliseconds. Catalog refreshes happen in hours. The gap between the freshest source and the catalog's view of it used to be a paper cut; now it is the decision window itself.

The second is the rise of the MCP layer. Every major catalog vendor, OpenMetadata included, has shipped or announced an MCP server in the last six months. The pitch is the same in each case: agents need context, the catalog has context, expose it over MCP. This is correct as far as it goes. It is also a tell. The fact that a catalog vendor needs to bolt an agent-facing interface onto a human-facing product confirms the agentic era requires something the catalog itself is not. Cataloging context is a feature. Governing the action that uses the context is a different product.

The third is the composition problem. A modern action touches the source system, the catalog, the IdP, the lineage graph, the policy engine, and the token usage history in a single decision window. Each of those sources holds a fragment of the truth. None of them is the truth. The catalog being the best version of one fragment does not make it the adjudicator across all of them.

A catalog can afford to be a day old. A transaction governance decision cannot.

The principle: staleness tolerance

The distinction that matters is staleness tolerance. Every system in the data stack has one, the longest acceptable gap between the source of truth and the system's view of it. For a data warehouse, days is fine. For a catalog, hours. For an operational source system, seconds. For a control plane that decides whether an agentic transaction should execute, the answer must be zero. The decision happens at query time, against the live state of every source it consults, with no persisted snapshot acting as the answer.

This is the line that separates a context layer from a control plane. A context layer caches truth and serves it. A control plane refuses to cache and adjudicates instead. The architectures look similar from the outside, both are multi-source, both speak to many systems, both can expose an MCP interface, but the staleness tolerance is opposite. A catalog tolerates staleness because its users tolerate it. A control plane cannot tolerate it because its users are agents that will not notice.

Once the distinction is named, the entire catalog category resolves to its proper place in the stack. OpenMetadata, Collibra, Alation, DataHub: each of them is an excellent witness to consult when a decision is being made. None of them is the decision-maker. The witnesses live where the data lives. The decision-maker lives one layer above them.

Stack position Agentic action requested │ ▼ ┌──────────────────────────┐ │ CONTROL PLANE │ staleness tolerance: 0 │ Decides: should this │ queries witnesses live │ transaction proceed? │ never caches the verdict └──────────────────────────┘ │ ┌────────┼────────────────────┐ ▼ ▼ ▼ ▼ Source Catalog IdP Lineage graph system (witness) (witness) (witness) (witness) staleness tolerance: hours-to-days persisted snapshots of source truth

What this means for the catalog category

For OpenMetadata and its peers, the implication is not subtraction but repositioning. The catalog category is not threatened by the rise of the control plane. It is liberated from a role it never fit. Catalogs become substrate: the witness layer that control planes consult. They get to be the best version of what they actually are, instead of pretending to be the governance answer they cannot deliver.

The vendors that resist this repositioning will spend the next two years bolting more and more agentic-sounding features onto the catalog and discovering that the staleness ceiling is real. The vendors that accept it will ship better connectors, richer lineage, sharper classifications, and more useful MCP interfaces. They will be consumed by control planes operating one layer up, which is the right outcome for both sides.

For buyers, the takeaway is more direct. If your governance strategy is anchored on a catalog, open-source or commercial, you have a context strategy, not a governance strategy. The catalog is fine. The strategy is incomplete. The missing piece is the layer that decides, in real time, whether the agent acting on the catalog's information should be allowed to act at all.

The closing observation

OpenMetadata is excellent at being a catalog. Collibra is excellent at being a catalog. Alation, DataHub, and the rest of the category are excellent at being catalogs. None of them is bad. None of them is wrong. The mistake is treating any of them as the answer to a question they were not built to ask.

The question they answer is what data exists and what it means. The question the agentic era asks is whether a specific action against that data should be allowed to proceed right now. These are different questions, asked at different latencies, by different actors, with different consequences for getting them wrong. They require different architectures and different products. Pretending otherwise produces brilliant catalogs and absent control planes, which is exactly the position the industry is in today.

Catalogs answer what is this. The control plane answers should this happen. The difference is staleness tolerance, and it is the entire game.

End N° 002