
Semantic Rot: Why Complex Systems Gradually Lose Their Meaning

by Craig Greenhouse

A short essay on language, identifiers, and the slow decay of meaning inside complex systems.

There is a famous sentence in the English language:

Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.

It is grammatically correct. It is also almost impossible to understand.

The sentence works because the word buffalo can act as a noun (the animal), a verb (to bully), and a proper noun (the city of Buffalo). Unpacked, it says roughly: buffalo from Buffalo, which other buffalo from Buffalo bully, themselves bully buffalo from Buffalo. With careful arrangement the grammar still holds together, even though the meaning becomes extremely difficult to recover.

In other words, the structure survives while the semantics collapse.

Large organisations quietly produce systems that behave in exactly the same way.

Individually, every component appears reasonable. Database schemas are normalised. APIs validate requests. Documentation uses familiar words. Support scripts follow approved terminology. Yet somewhere along the way the meaning of things has begun to drift.

Fields refer to slightly different concepts depending on the system. Identifiers refer to different entities depending on the channel. Organisations involved in the same process perform overlapping roles whose boundaries are no longer entirely clear.

Everything still works.

But nobody is completely sure what some of the words, identifiers, or relationships actually mean anymore.

This phenomenon has a name:

Semantic Rot

Semantic Rot is the gradual loss of meaning in complex systems. It rarely appears suddenly. Instead it accumulates slowly, through thousands of small decisions that each seem harmless at the time. A field name is shortened. An identifier is repurposed. A translation layer is introduced to preserve compatibility between two systems that no longer quite agree. Years later, the system is still running - but the conceptual model that once explained it has begun to decay.

Most system failures are blamed on technology, but many of the hardest problems in large systems are actually failures of language.

My First Encounter with Semantic Rot

My first encounter with Semantic Rot happened long before I worked in software architecture.

It was the late 1980s.

Picture the scene: a slightly geeky teenager studying engineering and mathematics, the sort of person who spent most of his time taking systems apart to understand how they worked.

I was standing in a queue inside a bank branch. Back in those days banks had physical buildings. Inside those buildings were people called cashiers, and you could walk up to them with a cheque and actually interact with a human being.

When it was my turn, the cashier asked a simple question.

"Account name?"

I paused.

What did she mean?

Being the sort of person who instinctively tries to interpret language precisely, I assumed she must be asking about the type of account.

So I replied:

"Current account."

The cashier looked slightly puzzled.

After a moment it became clear what she had actually meant. She was asking for the name of the person who held the account.

In other words: Account holder name.

Somewhere along the way, the word holder had quietly disappeared.

Perhaps a form field had been shortened. Perhaps a training manual had simplified the wording. Perhaps someone had simply decided that "Account Name" sounded cleaner.

To the cashier, the phrase had become a routine piece of language - something she had probably repeated hundreds of times that week.

But removing that single word had introduced ambiguity.

"Account Name" could just as easily refer to:

  • the name of the account holder
  • the type of account
  • the label attached to the account
  • or something else entirely

The system still worked. The cheque was processed. The interaction moved on.

But the precision of the original concept had already begun to erode.

Looking back, that small moment was my first encounter with what would later become a familiar pattern in much larger systems. A tiny piece of Semantic Rot had already taken hold.

Identifier Rot

Semantic Rot rarely remains confined to language. Once systems begin to evolve, it tends to appear in a particularly visible form: identifiers.

Most organisations believe they have a single identifier for a customer or account. In reality they often have several.

Consider a fairly typical modern interaction with a financial or utility company.

When the account is first created, the customer is given a sixteen-digit reference number. That number appears to be the primary identifier for the relationship.

So far, so good.

But then the communications begin.

Emails from the company refer to a nine-digit reference number. This appears to be derived from the original sixteen digits, taken not from the beginning or the end but from somewhere mysteriously in the middle.

When the customer calls the support line, the agent asks for an eight-digit number. This turns out to be yet another subset of the same identifier.

Finally, when logging into the online portal, the customer is asked to enter an eighteen-character identifier consisting of two letters followed by the original sixteen digits. Those two letters identify which legacy brand the account belonged to before that brand was merged into the current organisation.

From inside the system this probably makes perfect sense.

The sixteen-digit value may be the canonical identifier. The shorter numbers might exist for legacy systems that only accept eight or nine digits. The two-character prefix may have been introduced after a merger to distinguish accounts originating from different subsidiaries.

Each decision was locally reasonable.

But from the outside the result looks like this:

16-digit master reference
9-digit email reference
8-digit phone reference
18-character portal identifier

The customer simply experiences a confusing landscape of numbers that all appear to represent the same thing.
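To make the relationships between those four numbers concrete, here is a minimal Python sketch of how they might all derive from one canonical value. The essay does not say which digits the shorter references use, so the slice offsets, the brand prefix "AB", and every class and property name below are illustrative assumptions, not any real company's scheme.

```python
from dataclasses import dataclass

# Hypothetical offsets: the text only says the shorter references come
# from "somewhere in the middle" of the sixteen digits.
EMAIL_SLICE = slice(4, 13)  # assumed position of the 9-digit email reference
PHONE_SLICE = slice(5, 13)  # assumed position of the 8-digit phone reference


@dataclass(frozen=True)
class CustomerIdentifiers:
    master: str        # 16-digit canonical reference
    brand_prefix: str  # 2-letter legacy brand code (assumed format)

    def __post_init__(self) -> None:
        if len(self.master) != 16 or not self.master.isdigit():
            raise ValueError("master reference must be exactly 16 digits")
        if len(self.brand_prefix) != 2 or not self.brand_prefix.isalpha():
            raise ValueError("brand prefix must be exactly 2 letters")

    @property
    def email_reference(self) -> str:
        return self.master[EMAIL_SLICE]

    @property
    def phone_reference(self) -> str:
        return self.master[PHONE_SLICE]

    @property
    def portal_identifier(self) -> str:
        return self.brand_prefix + self.master


ids = CustomerIdentifiers(master="1234567890123456", brand_prefix="AB")
print(ids.email_reference)    # 567890123
print(ids.phone_reference)    # 67890123
print(ids.portal_identifier)  # AB1234567890123456
```

The point of the sketch is that the derivation is written down once, in one place. Without something like it, each channel's number is just a string whose relationship to the others lives only in someone's memory.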

To make matters worse, the same identifier may be described using different names depending on the communication channel.

  • A text message might refer to a Reference Number.
  • An email might refer to a Customer Number.
  • The web portal might display a Customer ID.
  • A call centre agent might ask for the Account Number.

Each term sounds plausible. Each may be technically defensible somewhere inside the organisation.

But for the person interacting with the system, the meaning becomes blurred. Which number is the system actually asking for?

The underlying relationship has not changed. There is still one customer and one account. But the identifiers and the language surrounding them have slowly diverged.

This is a classic form of Identifier Rot.

Over time, systems accumulate identifiers faster than they remove them. Legacy systems require compatibility. New systems introduce new keys. Mergers bring additional identifier schemes into the organisation. Very few identifiers are ever retired. Instead they accumulate.

Eventually the system contains multiple identifiers that all appear to refer to the same thing, even though each originated for a different historical reason.

The structure still works. The numbers still resolve to the correct records. But the meaning of those numbers - and the language used to describe them - has quietly begun to decay.

Responsibility Rot

If Identifier Rot confuses customers about which number represents their account, the next stage of Semantic Rot appears when organisations themselves begin to lose clarity about who actually owns the relationship with the customer.

Insurance provides a particularly good example.

Some years ago a tree on our freehold property was damaged and needed to be removed. Fortunately the property was insured, so the process seemed straightforward enough.

The policy had been arranged through an insurance broker, who had placed the cover with an insurance company. So when the damage occurred, the natural first step was to contact the broker.

At this point the structure of the system began to reveal itself.

The broker explained that the policy had been placed with Insurance Company B, but that company was in turn operating under the umbrella of a larger top-level insurance company that actually carried the risk.

Fine.

But the claim itself would not be handled by either of those companies.

Instead the process had been outsourced to a claims handling company.

Progress, we thought.

Until it became clear that the claims handling company was not actually handling the claim either. They had subcontracted the operational work to another claims handling firm.

At this point the conceptual model looked something like this:

Customer
↓
Broker
↓
Insurance Company
↓
Top-Level Insurance Company
↓
Claims Handling Company
↓
Outsourced Claims Handler

Six layers deep before reaching the person who could actually deal with the problem.

Eventually we were introduced to Laura, the actual human claims handler who managed the case and, to her credit, did an excellent job resolving the situation. The claim was processed. The tree was removed. The system ultimately worked.

But the experience revealed something interesting.

From inside the industry, each layer of this structure almost certainly had a clear purpose. Brokers specialise in distribution. Insurers manage underwriting. Claims management firms provide operational services. Outsourced teams handle case processing. Each role exists for a reason.

But from the customer's perspective the mapping between roles and organisations had become opaque.

  • Who actually owned the relationship?
  • Who was responsible for the claim?
  • Which company was ultimately making the decision?

The system still functioned, but the meaning of responsibility had been stretched across so many entities that it was difficult to see where the authority actually lived.

This is another form of Semantic Rot. Not in language. Not in identifiers. But in responsibility.

Over time, the connection between roles and the organisations performing those roles becomes blurred. Layers of abstraction accumulate. Outsourcing and mergers introduce additional entities. The conceptual model that once explained the system quietly fades from view.

Structural Rot

Semantic Rot does not just appear in customer communications or organisational structures. It also appears inside the software systems that organisations build to run those processes.

A common example appears in database schemas.

At first glance the schema may look perfectly sensible. Tables are properly normalised. Primary keys and foreign keys are defined correctly. Relationships between entities are clear.

Imagine a simple relational structure representing customers, accounts, and transactions. On paper it might expose identifiers like:

customer_id
customer_ref
id
account_id
account_ref

Technically, nothing here is wrong. The schema may be fully compliant with third normal form. The joins work. The database engine is perfectly happy.

But the semantics are already beginning to blur.

Is customer_ref the same concept as customer_id, or does it represent something different? Does the column simply called id refer to the same entity as account_id, or something else entirely? Why do some tables use _id while others use _ref?

The system functions correctly, but understanding the meaning of the identifiers requires increasing amounts of context.

This pattern often continues into application code. Developers encounter identifiers such as:

CustomerID
CustomerRef
CustomerNumber
AccountID
AccountNumber

Each of these names appears reasonable in isolation. Yet across a large system they may refer to subtly different concepts: internal database keys, externally visible identifiers, legacy account numbers, or identifiers inherited from previous systems.

Once again the system still runs. But the conceptual model that once explained the relationships between these identifiers has begun to fragment.
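One way to keep such concepts apart in code is to give each a distinct type, so that a static type checker flags a legacy number passed where an internal key is expected. The sketch below uses Python's `typing.NewType`; the type names, the sample data, and the `load_customer` function are hypothetical, chosen only to mirror the kinds of identifier listed above.

```python
from typing import NewType

# Each identifier concept gets its own type. The names are illustrative.
CustomerKey = NewType("CustomerKey", int)  # internal database key
CustomerRef = NewType("CustomerRef", str)  # externally visible reference
LegacyAccountNumber = NewType("LegacyAccountNumber", str)  # from an older system

# Stand-in for a database table keyed by the internal key.
_CUSTOMERS: dict[CustomerKey, str] = {CustomerKey(42): "A. Holder"}


def load_customer(key: CustomerKey) -> str:
    """Look up a customer by internal key only."""
    return _CUSTOMERS[key]


print(load_customer(CustomerKey(42)))  # A. Holder

# A type checker such as mypy rejects the next line, because a CustomerRef
# is not a CustomerKey, even though both are plain values at runtime:
# load_customer(CustomerRef("0000000000000042"))
```

`NewType` is checked only statically; at runtime the values are ordinary integers and strings. The benefit is documentary as much as mechanical: the signature of `load_customer` states which concept it means.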

A database can be perfectly normalised and still be semantically rotten.

In more extreme cases, the structure of the system begins to obscure meaning entirely. One platform I worked on - I won't name the company here to protect the innocent - pushed this pattern even further. Instead of a conventional codebase where behaviour could be discovered through files and functions, much of the system logic lived inside MongoDB documents linked together by GUID references.

To navigate the system, developers had to use a custom tree-based user interface provided by the vendor. The tree represented workflows as a series of nodes connected by those GUIDs. The difficulty was that the tree only made sense if you already understood the structure. A new developer opening the interface would see a forest of nodes and identifiers with no obvious starting point. Finding the logic behind a feature meant following a chain of GUID references through multiple documents until the relevant configuration fragment appeared.

The machine could execute the workflow perfectly.

But for a human trying to understand the system, the semantics had effectively disappeared.

This is the end stage of Structural Rot.

A system reaches full Semantic Rot when the machine can still execute it, but humans can no longer easily explain how it works.

A Short Taxonomy of Semantic Rot

The examples so far will feel familiar to anyone who has worked inside large organisations or complex software systems. Although they appear in different forms, they tend to fall into a small number of recurring patterns.

Semantic Rot
  ├── Linguistic Rot
  ├── Identifier Rot
  ├── Responsibility Rot
  └── Structural Rot

Linguistic Rot

The simplest form of Semantic Rot begins with language itself. Words are shortened, simplified, or reused until their original meaning becomes ambiguous. The bank cashier's request for an "Account Name" is a small example of this process. Somewhere along the way the phrase "Account Holder Name" was compressed, and a piece of semantic precision disappeared with it. The system still works, but the language now requires interpretation.

Identifier Rot

As systems evolve, they tend to accumulate identifiers. A single customer or account may acquire multiple reference numbers: internal database keys, externally visible account numbers, legacy identifiers from older systems, and new identifiers introduced during mergers or platform migrations. Over time these identifiers proliferate. The system still resolves the records correctly. But the meaning of the identifiers becomes increasingly difficult to explain.

Responsibility Rot

In organisational systems, Semantic Rot often appears as confusion about who actually owns a process or relationship. The insurance example illustrates this pattern. Brokers, insurers, underwriters, claims management firms, and outsourced handlers may all participate in the same process. Each organisation performs a specific role, but the mapping between roles and entities becomes opaque to the customer. Responsibility has been distributed across so many layers that it is difficult to see where authority actually resides.

Structural Rot

Finally, Semantic Rot appears inside software systems themselves. Database schemas contain identifiers whose relationships are not immediately clear. Application code introduces additional variants. In more extreme cases, system behaviour is scattered across configuration fragments and GUID-linked documents that can only be navigated through specialised tooling. The system continues to execute correctly. But recovering the conceptual model behind it becomes increasingly difficult.

These patterns share a common characteristic. The system still functions. The code runs. The records resolve. The processes complete. But the meaning of the system - the relationship between words, identifiers, and responsibilities - gradually becomes harder to recover.

This is the essence of Semantic Rot.

Why Systems Rot

Semantic Rot does not appear suddenly. Most systems do not wake up one morning in a state of confusion. Instead the decay happens slowly, through a long sequence of small, locally reasonable decisions. Over time those decisions accumulate until the conceptual clarity of the system begins to erode.

Several mechanisms drive this process.

Semantic Compression

One of the simplest ways meaning begins to decay is through Semantic Compression. Language is shortened or simplified until the context required to preserve meaning is gradually removed.

The bank cashier's question - "Account Name" - illustrates this perfectly. At some point the original phrase was almost certainly "Account Holder Name." Compressing the phrase by removing a single word made it shorter and arguably cleaner. But it also introduced ambiguity.

Small compressions like this happen constantly inside systems:

Account Holder Name
→ Account Name
→ Account
→ ID

Each step feels harmless. But with each step the semantic precision decreases. Semantic Rot rarely begins with a dramatic failure. It begins with small acts of Semantic Compression.

Identifier Accretion

Systems are very good at adding identifiers and very bad at removing them. A new platform introduces a new primary key. A legacy system requires compatibility with an older identifier. A merger brings an additional numbering scheme into the organisation. Rather than retiring the old identifiers, systems typically keep them and add translation layers between them. Over time the system accumulates identifiers faster than it accumulates understanding.
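A translation layer of this kind often reduces to a lookup from (scheme, value) pairs to one canonical key. The sketch below is a hypothetical in-memory version; the scheme names, sample values, and both functions are assumptions made for illustration.

```python
from typing import Optional

# Crosswalk from (scheme, value) to a canonical customer key.
# Scheme names and values are invented for illustration.
CROSSWALK: dict[tuple[str, str], int] = {
    ("platform_v2_key", "42"): 42,
    ("legacy_billing_number", "00817"): 42,
    ("merged_brand_ref", "AB-9913"): 42,
}


def resolve(scheme: str, value: str) -> Optional[int]:
    """Return the canonical key for an identifier, or None if unknown."""
    return CROSSWALK.get((scheme, value))


def schemes_for(canonical_key: int) -> list[str]:
    """List every scheme that still points at one canonical key."""
    return sorted(s for (s, _), k in CROSSWALK.items() if k == canonical_key)


print(resolve("legacy_billing_number", "00817"))  # 42
print(resolve("legacy_billing_number", "99999"))  # None
print(schemes_for(42))
```

`schemes_for` makes the accretion visible: every scheme it returns is an identifier that was added and never retired.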

Local Optimisation

Most systems are not designed as coherent wholes. They evolve through a series of local optimisations. A team modifies a field name to simplify a user interface. Another team introduces a new identifier for a reporting system. A third team creates a translation table to integrate two databases that use different keys. Each decision is locally rational. But the global semantics of the system drift further apart with each change.

Organisational Boundaries

Conway's Law tells us that systems reflect the communication structures of the organisations that build them. When different teams or organisations design different parts of a system, they often develop their own terminology and conceptual models. Over time those models diverge. The result is a system where multiple teams use slightly different language and identifiers to describe the same underlying concepts.

Mergers and Outsourcing

Corporate mergers and outsourcing arrangements accelerate this process dramatically. When two organisations combine their systems, both identifier schemes typically survive. Additional prefixes or translation tables are introduced to distinguish between them. Outsourcing introduces further layers of abstraction, separating operational roles from the organisations performing them. Each step preserves functionality. But the semantic model becomes more complex and harder to explain.

None of these mechanisms are unusual. In fact they are almost inevitable in systems that evolve over years or decades. The result is that many large systems continue to operate long after the clarity of their original conceptual models has begun to decay.

And yet, for the most part, the systems continue to work. Humans are remarkably good at compensating for semantic ambiguity.

Humans Compensate

One reason Semantic Rot can persist for so long is that human beings are remarkably good at compensating for it. When systems become semantically inconsistent, people rarely stop and analyse the problem. Instead they adapt.

A customer calling a support line may hear the agent ask for a reference number.

The customer pauses.

"Which one?"

"The one from the email."

A few seconds later the correct number is exchanged and the conversation continues. Neither side needs to fully understand the underlying system. They simply negotiate meaning through conversation and context.

The same pattern appears inside organisations.

Developers learn which identifier a particular system expects. Support staff know which number appears on the monthly statement. Operations teams remember which prefix corresponds to which legacy platform.

Much of this knowledge never appears in documentation. It lives in conversations, Slack messages, and institutional memory.

In other words, humans act as a kind of semantic repair layer.

When language becomes ambiguous, people ask questions. When identifiers multiply, they learn which ones are usually required. When organisational roles overlap, they work out who to call.

This ability to infer meaning from context is one of the reasons complex systems can continue operating long after their conceptual models have begun to decay. The system may be semantically inconsistent, but humans quietly compensate for the gaps.

For decades this has been good enough.

But a new class of systems is beginning to interact with these environments. Systems that do not possess the same ability to negotiate meaning.

The Collision with Agentic AI

When a call centre agent asks for "the reference number", the customer and the agent usually manage to work out which number is meant.

Autonomous systems will not have that luxury.

When software agents begin interacting with complex systems on our behalf, ambiguity about identifiers, roles, and responsibilities becomes a much more serious problem.

Machines operate very differently from people in this respect. Where humans tolerate semantic fuzziness, machines require semantic precision.

A human can infer that "customer number", "reference number", and "account ID" probably refer to the same underlying relationship. An autonomous system cannot safely make that assumption.

That inability to negotiate meaning is not a minor limitation. It is a fundamental incompatibility with systems that have quietly accumulated semantic ambiguity for years.
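One conservative reading of "cannot safely make that assumption" is an agent that consults an explicit glossary and refuses to proceed when a term is unknown or maps to more than one concept. The following is a sketch under stated assumptions: the glossary contents, the concept names, and `resolve_term` are all hypothetical.

```python
class AmbiguousTermError(Exception):
    """Raised when a term cannot be mapped to exactly one concept."""


# Hypothetical glossary: channel wording -> canonical concepts it may denote.
GLOSSARY: dict[str, set[str]] = {
    "customer number": {"customer.master_reference"},
    "customer id": {"customer.master_reference"},
    "reference number": {"customer.master_reference", "claim.reference"},
    "account number": {"account.number"},
}


def resolve_term(term: str) -> str:
    """Map a term to one canonical concept, or fail loudly."""
    concepts = GLOSSARY.get(term.strip().lower(), set())
    if len(concepts) != 1:
        options = sorted(concepts) or ["<unknown term>"]
        raise AmbiguousTermError(f"{term!r} is not unambiguous: {options}")
    return next(iter(concepts))


print(resolve_term("Customer ID"))  # customer.master_reference
try:
    resolve_term("Reference Number")
except AmbiguousTermError as err:
    print("refused:", err)
```

A human in the same position would ask "which one?" and carry on. The sketch can only stop, which is why the ambiguity has to be resolved in the glossary rather than in the conversation.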

For decades, Semantic Rot has been tolerable because humans quietly compensated for it. As agentic systems begin to interact directly with the same environments, that tolerance may begin to disappear.

Organisations may discover that the greatest obstacle to automation is not the intelligence of the machines, but the ambiguity of the systems those machines are expected to operate.

Closing

Systems do not usually fail when they stop running.

They fail when people stop understanding them.

What This Means Now

Semantic Rot is not a dramatic event. It is an accumulation of small decisions, each locally reasonable, each slightly eroding the precision of the conceptual model. A word shortened here. An identifier added there. A responsibility quietly redistributed across one more layer of abstraction.

The system keeps running. The numbers keep resolving. The processes keep completing. Humans keep adapting, filling the gaps with context and institutional memory.

But the meaning of the system - the shared understanding of what it does and why - slowly fades.

Agentic AI will not paper over those gaps. It will find them, and it will stop. The organisations that have quietly tolerated Semantic Rot for years will find it surfacing, clearly and expensively, at exactly the moment they are trying to automate.

The cure is not a new framework or a better identifier scheme. It is the discipline to ask, repeatedly and without embarrassment: what does this actually mean?

That question is harder than it sounds. But it is the one that matters most.
