It Can.
For all my sins, I'm often told the worst is that I am a Chelsea fan. The people who tell me this are currently mainlining schadenfreude given Chelsea's recent performances; at the time of writing, we are on a six-match losing streak, the likes of which hasn't been seen since before I was born. "You have to laugh, or else you'll cry" is becoming an all too familiar adage...
In the pursuit of that laughter, I'm reminded of one of the all-time great press conference quotes from the inimitable Mick McCarthy during his brief, tumultuous stint as Blackpool manager in the 2022-23 season:
"In terms of results Mick, that's one win in seventeen, it can't go on like this can it?" — Reporter
"... it can." — Mick McCarthy
If I were a Blackpool fan, I'd probably find this answer immensely frustrating rather than the hilariously deadpan response that it is. When facing the seemingly insurmountable, resignation, fatalism, even nihilism often feel like the only appropriate responses. This experience is not isolated to the highs and lows of sport, and nowhere is that truer than in my second worst sin: working in data consultancy as a platform architect.
What do Supreme and Data Leaders have in common?
An intoxicating, inexhaustible appetite for hype. Every year a New™️ idea drops into the data ecosystem that threatens to underpin 99% of conference talks — Gartner tells us this is so.
The very nature of that hype cycle attempts to push us towards a label. For every centralised data platform there is a decentralised one; for every "fabric" there is a "mesh". Sometimes those concepts are preached dogmatically; at other times the phrases are cobbled together into some sort of best-of-all-worlds bingo card. "We are making the world a better place through our federated, self-service, hybrid, mesh-adjacent, fabric-enabled, product-oriented data platform." Pick four concepts, put them on a slide, call it a target operating model.
None of these ideas are wrong on their own terms. The trouble starts when the work of actually improving how an org handles data gets quietly swapped out for the work of looking like the kind of org that handles data well. The two are not the same thing, and the gap between them is where most platform time, money, and sanity go to die.
The honest version of continual improvement — the thing buried underneath twenty years of DevOps sloganeering and genuinely useful DORA research — says you improve from where you are, at the rate your org can absorb, in the direction your org actually needs to go. Which means there is no universal endpoint, no mission accomplished, no "completed it mate". There is only the next responsible move, evaluated against the org you've got rather than the one in the slide deck.
This is where the platform team lives.
Not at the destination, not even particularly close to it, but somewhere on the journey: usually with a strategic direction that changes faster than the planning cycle, teams that weren't picked for the system being asked of them, and a set of approval gauntlets designed for a world where releases happened twice a year. The interesting question isn't "how do we get to the target state?"; truthfully, the org may never get there. The interesting question is what's worth building right now, given that the target state will have moved by the time we finish.
"It can't go on like this, can it?"
It can.
And so the job becomes less about marching toward an architecture and more about building one that can absorb the next reorg, the next change of CDO/CTO, the next pivot in strategic direction, without it costing everything. Which sounds modest until you try to do it. Because before you can build anything adaptive, you have to understand the thing that constrains every platform decision more than any technology choice ever will.
The System Is the Org Chart
In 1967, Melvin Conway submitted a paper to the Harvard Business Review entitled "How Do Committees Invent?". They rejected it. The central claim — that organisations which design systems are constrained to produce designs which are copies of the communication structures of those organisations — was rejected on the basis that the thesis lacked sufficient proof. Nearly sixty years later and with time on its side, it's still the most reliably rediscovered truth in software engineering. Every generation encounters it fresh and acts surprised.
Here's what Conway's Law means in practice, stripped of the academic framing: your platform will look like your org chart whether you want it to or not. Not because anyone decided it should. Because the people building it can only coordinate in the ways the organisation allows them to coordinate. If you have three teams, you'll get a three-part system. If those teams don't talk to each other, the system won't talk to itself either. If sign-off lives with a central architecture board, your platform will have a centralised bottleneck built into it, regardless of what the target-state diagram says.
This isn't a bug. It's physics. And pretending otherwise is how you end up with a beautifully drawn "federated, domain-driven data mesh" that in practice is three teams firing off messages into a Teams-shaped communication chasm, because that's the only coordination mechanism they actually have.
The football analogy is almost irresistible.
Every struggling club has a "philosophy" — a style of play, a formation, a tactical identity they're recruiting toward. But the actual football they play is determined by the players they've got, the relationships between them, and how much the manager's instructions survive first contact with the opposition. Ask me how I know.
Data platforms are the same. A "federated" architecture requires federated teams — domain teams with genuine ownership, genuine autonomy, and genuine technical capability. If you don't have that — and most organisations don't — then "federated" just means "fragmented with extra steps." A "centralised" platform requires a central team with enough capacity, context, and authority to actually own the thing. If the central team is three people drowning in tickets, "centralised" just means "bottleneck."
Conway's Law doesn't tell you which architecture to pick. It tells you that the architecture you pick doesn't matter nearly as much as the org structure you already have. The platform team that ignores this spends its energy fighting the org. The one that understands it spends its energy designing for the org — working with the communication structures that exist, not the ones they wish existed, and building the platform that those structures can actually sustain.
This is where the hype-cycle thinking from the first section becomes actively dangerous.
"Data mesh" is a perfectly coherent idea if your org has strong domain ownership and mature engineering practices in each domain. For the rest of the world — which is most of the world — importing the vocabulary without importing the preconditions just creates a new set of labels for the same old dysfunction. The platform team ends up maintaining the fiction of domain ownership while quietly centralising everything through the back door, because that's what the org actually needs to function.
The honest move is to look at the org you've got and ask: what platform architecture can this organisation actually operate? Not aspire to operate. Not operate if we hire ten more people and reorganise twice. Operate, today, with the teams, skills, and communication patterns that exist right now. Start there. Build something that works. Then — and this is the key — build it in a way that can evolve when the org evolves, rather than building for a future org that may never arrive.
The Terraform Is the Architecture
Here's a provocation: for most data platform engagements, the Terraform is the architecture. Not the diagrams. Not the Confluence pages. Not the slides with the hexagons and the arrows. The Terraform. Because the Terraform is (usually, rogue overprivileged Platform Engineers notwithstanding) the thing that actually gets applied to the cloud, and the cloud is the thing that actually runs the data. Everything else is commentary.
This matters because of a gap that exists on almost every project I've worked on. There is the architecture as designed — usually a clean diagram produced during discovery, agreed by an architecture review board, and filed somewhere nobody looks at again. And then there is the architecture as implemented — the actual Terraform, the actual resource configuration, the actual networking topology that emerged from six months of "we had to make some pragmatic decisions." These two things are rarely the same, and the distance between them grows with every sprint.
The platform team lives in that gap. And the IaC layer is where the gap becomes visible, because code doesn't lie. The diagram can say "hub-spoke networking with private endpoints." The Terraform will tell you whether that's actually true, or whether someone opened up the firewall three months ago to unblock a demo and nobody closed it.
This is why I have a strong view on how Terraform should be written for platform work, and it comes down to a principle that sounds almost offensively simple: KISS over DRY.
KISS All The Time. DRY, Occasionally.
In software engineering, DRY — Don't Repeat Yourself — is treated as a near-universal good. And in application code, it usually is. If you're writing the same function in three places, extract it. But Terraform isn't application code. It's infrastructure definition. And the failure mode of DRY in Terraform is different from the failure mode of DRY in Python.
Here's what happens when you over-DRY your Terraform. You build a module that abstracts a Databricks workspace. You parameterise everything: networking, compute, governance, storage, identity. The module is beautiful. It handles twelve different configurations through a web of conditional logic and nested variable maps. The senior engineer who built it can navigate it in their sleep.
Then that senior engineer leaves the project. Or the engagement ends and the client inherits the codebase. And the person who picks it up — a competent engineer, but not the person who built it — spends three days trying to understand why changing a workspace name requires editing a YAML file that feeds a locals block that feeds a variable map that feeds a module that conditionally creates a resource that references another module. They don't understand it. They can't safely change it. They're afraid to touch it. So they work around it — they create a new resource outside the module, manually, and now you have drift. The abstraction that was supposed to create consistency has created fragility.
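To make that chain concrete, here is a deliberately compressed sketch of the kind of layout being described. Every file name, key, and variable in it (environments.yaml, the workspaces and defaults keys, ./modules/workspace, enable_private) is hypothetical, invented for illustration rather than lifted from a real codebase.

```hcl
# main.tf (hypothetical). Layer 1: the YAML is read in a locals block.
locals {
  raw = yamldecode(file("${path.module}/environments.yaml"))

  # Layer 2: per-workspace settings are merged over shared defaults.
  workspaces = {
    for key, cfg in local.raw.workspaces :
    key => merge(local.raw.defaults, cfg)
  }
}

# Layer 3: the merged map fans out into a module via for_each.
module "workspace" {
  source   = "./modules/workspace" # hypothetical module path
  for_each = local.workspaces

  name           = each.value.name
  enable_private = try(each.value.private_networking, false)
}

# Layer 4, inside modules/workspace (not shown): resources are created
# conditionally, e.g. count = var.enable_private ? 1 : 0, and some of
# them reference yet another module.
```

Nothing here is wrong in isolation. The cost is that the workspace name now lives in a YAML file, and finding out where it ends up means reading every layer in order.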
Now — in fairness to the DRY advocates, and I have been one — duplication has its own failure mode, and it's not a small one. If you've copy-pasted a resource block across three environments and a security requirement changes — a new tagging policy, a private endpoint configuration, a compliance constraint that didn't exist when you wrote it — you now have to find and update it in three places. Someone misses one, and you've got drift anyway. The well-designed module, for all its navigational complexity, at least gives you a single place to make that change. There are also patterns where DRY is almost inarguably correct even on KISS terms: provider configurations, backend state, default tags, naming conventions — things with a high cost of inconsistency and a low cost of abstraction. A shared module with a stable, narrow interface that handles your provider block and mandatory tags isn't clever engineering. It's just good hygiene.
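As a rough illustration of what a "stable, narrow interface" might look like, here is a sketch of a mandatory-tags module and one caller. The module path, variable names, tag keys, and resource names are all assumptions made for the example; the resource group is simply a convenient Azure resource that accepts tags.

```hcl
# --- modules/mandatory_tags/main.tf (hypothetical path) ---

variable "environment" {
  type        = string
  description = "Deployment environment, e.g. dev, test, prod."
}

variable "owner" {
  type        = string
  description = "Team accountable for the resource."
}

output "tags" {
  description = "Tags every resource is expected to carry."
  value = {
    environment = var.environment
    owner       = var.owner
    managed_by  = "terraform"
  }
}

# --- environments/dev/main.tf (hypothetical caller) ---

module "tags" {
  source      = "../../modules/mandatory_tags"
  environment = "dev"
  owner       = "platform-team"
}

resource "azurerm_resource_group" "data_dev" {
  name     = "rg-data-dev"
  location = "uksouth"
  tags     = module.tags.tags
}
```

Two inputs, one output, no conditionals. When the tagging policy changes, the change happens in one place, and anyone reading the caller can see exactly what the module contributes.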
So this isn't "never abstract." It's a claim about where the default should sit. The question isn't should we DRY this — sometimes you obviously should. The question is what happens when the person who didn't write this needs to change it, and if the answer is "they'll need to understand four layers of indirection first," the abstraction is probably not paying for itself. Abstract when the cost of duplication clearly exceeds the cost of indirection. When it's a close call, err on the side of the person who inherits it being able to read what's in front of them.
What if that next person is a certain agent? For privacy's sake, let's call it Claude C. No, that's too obvious; let's say C. Code.
An LLM can trace the dependency chain through four levels of nested modules. What it cannot do is tell you whether those four levels should exist. That judgment (should this complexity be here?) is the one that matters, and it's irreducibly human. No context window, however large, will substitute for an engineer who looks at a codebase and says "this is over-abstracted for the team that will inherit it." And if you're banking on AI tooling to paper over that gap indefinitely, it's worth noting that the era of heavily subsidised AI-assisted development is already ending; building your maintenance strategy around a tool whose pricing model is in flux is its own kind of fragility.
KISS over DRY says: write the Terraform so that the person who inherits it can understand it. If that means repeating a resource block across three environments with slightly different values, do it. If that means having a flat module structure where you can see every resource without traversing four levels of nesting, do it. If that means pushing your YAML transformation to a single, visible locals block rather than distributing it across modules, do it. The goal isn't elegant code. The goal is code that can be safely changed by someone who didn't write it.
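For contrast, a minimal sketch of the flat version, using the Databricks workspace from earlier. It assumes the azurerm provider is configured elsewhere and that the resource groups already exist; every name and value is hypothetical.

```hcl
# Three environments, three blocks. The duplication is deliberate:
# each workspace can be read and changed without leaving this file.

resource "azurerm_databricks_workspace" "dev" {
  name                = "dbw-data-dev"
  resource_group_name = "rg-data-dev"
  location            = "uksouth"
  sku                 = "standard"
}

resource "azurerm_databricks_workspace" "test" {
  name                = "dbw-data-test"
  resource_group_name = "rg-data-test"
  location            = "uksouth"
  sku                 = "standard"
}

resource "azurerm_databricks_workspace" "prod" {
  name                = "dbw-data-prod"
  resource_group_name = "rg-data-prod"
  location            = "uksouth"
  sku                 = "premium" # the one setting that differs by design
}
```

Renaming a workspace is now a one-line change in the block that carries the name, and the difference between environments is visible at a glance rather than hidden behind a conditional.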
That may not sound exciting to you. That may not be the elegant, deeply abstracted, endlessly reusable module architecture you imagined when you started the engagement. But to butcher and reinterpret Mike Bassett:
If you can fill the unforgiving sprint
With sixty seconds' worth of CI run,
Yours is the platform and everything that's in it,
And — which is more — you'll pass the handover, my son!
Ladies and gentlemen, we will be deploying flat-module-f***ing-Terraform.
There's a Conway's Law dimension here too. The shape of your Terraform reflects the shape of your team. A deeply abstracted, heavily DRY codebase is designed for a team with deep shared context — everyone knows the module structure, everyone knows the variable conventions, everyone knows where to look. That team exists during the engagement. It does not exist six months after handover. Since the platform will outlive the engagement, optimise for the team that inherits it — not the one that built it.
What's Worth Building
If the hype cycle is a trap, the org chart is a constraint, and the Terraform is the truth — then what does good actually look like? Not as a target architecture. Not as a set of tools. As a disposition toward building.
I think it comes down to a handful of principles that are simple to state, hard to practice, and almost never appear on conference slides.
Design for the org you have, not the one you want. This sounds like giving up. It isn't. It's the precondition for any change that actually sticks. If the organisation can sustain a centralised platform today, build a good centralised platform. If domain teams exist and have real ownership, build for federation. If — as is almost always the case — it's somewhere in between, build for that messy middle and make it explicit. The worst outcome is building for a future state that requires an org change nobody has committed to making.
Optimise for the next responsible move, not the end state. The target architecture is a fiction. Not because it's wrong, but because it's a snapshot of thinking at a point in time, and the world moves. The platform team's job is not to reach the target. It's to make the next move that improves the org's position without over-committing to a direction that might change. This is genuinely hard, because it means living with things that aren't ideal and resisting the urge to rebuild them on principle. The senior engineer who wants to refactor the entire module structure because it doesn't match the reference architecture is not wrong about the architecture — they're wrong about the cost of the change relative to the value.
Build things that can be inherited. This is the KISS-over-DRY argument generalised. Every platform artefact — every module, every pipeline, every governance pattern — should be built as if the person who inherits it is competent but has no shared context with the person who built it. Because that's exactly what happens. The engagement ends. The team changes. The platform remains. If it can only be operated by the people who built it, it's not a platform — it's a dependency.
Hire for judgment, not just execution. Fowler, Joshi, and Venkatraman recently formalised the term "Expert Generalist" for engineers with genuine depth in their domain and the ability to spot patterns across adjacent ones. In platform work, this is the difference between someone who can follow a runbook and someone who can debug a networking issue they haven't seen before by reasoning from how VNet peering actually works. Playbooks get you to competent. Judgment gets you to adaptive. And adaptive is what you need when the target state moves, the CDO changes, and the vendor deprecates the service your architecture depends on. You can't hire this from a job spec that lists twelve tools; you find it by asking people to reason through a problem they haven't rehearsed.
Make the right thing easy, not mandatory. Golden paths, not golden gates. If your standards require enforcement through approval gates and compliance checks, they're too hard to follow. The best platform standards are the ones that feel like the path of least resistance — the default module does the right thing, the default pipeline has the right checks, the default governance is built in. People follow them not because they have to, but because deviating is more effort than conforming.
None of this is glamorous. None of it makes for a good conference slide. "We built an adaptive platform that acknowledges the org chart, uses deliberately simple Terraform, and invests in people over tooling" will not get you a keynote slot. But it's the work that actually matters — the work of building something that survives contact with reality, that can be inherited by people who weren't in the room when the decisions were made, and that improves incrementally rather than aspirationally.
"It can't go on like this, can it?"
It can.
And if you're thoughtful about what you build and honest about the constraints you're building within, it can go on getting better.

Liam Dunphy is a platform architect and regrettably a Chelsea fan. He writes about the intersection of platform engineering, organisational design, and the futility of target-state architecture at youmecicd.com.