The data mesh is one of those ideas that sounds obvious once someone explains it to you. Of course data should be owned by the teams who create it. Of course infrastructure should be self-serve. Of course governance should be federated rather than centralised. And yet, building one in practice is anything but obvious.
"The hardest part of a data mesh is not the technology. It is the organisation deciding it trusts its own teams."
Why the central data team model breaks
For most enterprises, data starts as a central concern. A single team owns the pipelines, the warehouse, the dashboards. This works until it does not — which is usually when the business grows past a certain size and the central team becomes a bottleneck.
The symptoms are familiar: long queues for data requests, pipelines nobody understands, dashboards nobody trusts. The central team is overwhelmed. Domain teams are frustrated. Data quality degrades because ownership is diffuse.
The four principles
Zhamak Dehghani's original formulation gives us four principles to work from:
- Domain ownership: the team that creates the data is responsible for it as a product.
- Data as a product: apply product thinking (discoverability, usability, reliability) to data assets.
- Self-serve infrastructure: domain teams can build and operate their own pipelines without central bottlenecks.
- Federated computational governance: global standards (security, compliance, interoperability) are enforced without central control of the data itself.
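To make the principles a little more concrete, here is a minimal sketch of how a data product might describe itself and how a global rule set might be checked against it. Everything in it is an assumption for illustration: the `DataProduct` fields, the classification labels and the `retention_days` rule are hypothetical and not part of Dehghani's formulation or any AWS service.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical descriptor a domain team publishes for each data product."""
    name: str
    owner_team: str            # domain ownership: who is accountable
    description: str           # discoverability: what a consumer reads first
    schema_version: str        # usability: which shape consumers can expect
    freshness_slo_hours: int   # reliability: how stale the data may become
    classification: str        # input to governance, e.g. "internal" or "pii"
    tags: dict = field(default_factory=dict)


# Illustrative global standards; a real governance group would define its own.
ALLOWED_CLASSIFICATIONS = {"public", "internal", "pii"}


def governance_violations(product: DataProduct) -> list[str]:
    """Return the global rules this product breaks (empty list means compliant)."""
    violations = []
    if not product.owner_team.strip():
        violations.append("missing owner_team")
    if not product.description.strip():
        violations.append("missing description")
    if product.classification not in ALLOWED_CLASSIFICATIONS:
        violations.append(f"unknown classification: {product.classification}")
    if product.classification == "pii" and "retention_days" not in product.tags:
        violations.append("pii products must declare retention_days")
    return violations


orders = DataProduct(
    name="orders_daily",
    owner_team="checkout",
    description="One row per order per day",
    schema_version="1.2.0",
    freshness_slo_hours=24,
    classification="pii",
)
print(governance_violations(orders))
```

The point of the sketch is the split of responsibilities: the domain team fills in the descriptor, while the rules live in one shared function that never touches the data itself.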
What the AWS implementation looks like
The reference architecture I use most often separates the data plane (where data lives and is processed) from the governance plane (where policies are defined and enforced). Domain teams have full autonomy over their data plane. The governance plane is shared but lightweight — it sets the rules, it does not own the data.
- Each domain owns its own AWS account or, at minimum, a dedicated S3 prefix
- Lake Formation manages fine-grained access across domain boundaries
- AWS Glue Data Catalog is the shared discovery layer: every domain registers its data products here
- Data products are versioned and documented; breaking changes require a deprecation period
- A shared observability layer (CloudWatch + custom dashboards) surfaces data quality metrics across domains
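As a rough illustration of the cross-domain access point in the list above, the sketch below builds the arguments for a Lake Formation `GrantPermissions` request that lets a consumer account read one table in a producer domain's catalog. The helper name `cross_domain_grant`, the account IDs and the database and table names are all hypothetical. The function only builds a dictionary, so it runs without AWS credentials.

```python
def cross_domain_grant(
    producer_account_id: str,
    database: str,
    table: str,
    consumer_account_id: str,
    permissions: tuple = ("SELECT", "DESCRIBE"),
) -> dict:
    """Build keyword arguments for Lake Formation's GrantPermissions call.

    The producer domain owns the catalog entry; the consumer account is the
    principal receiving read access. No grant option is passed on, so the
    consumer cannot re-share the table.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": consumer_account_id},
        "Resource": {
            "Table": {
                "CatalogId": producer_account_id,
                "DatabaseName": database,
                "Name": table,
            }
        },
        "Permissions": list(permissions),
        "PermissionsWithGrantOption": [],
    }


# Placeholder account IDs and names, for illustration only.
request = cross_domain_grant(
    producer_account_id="111111111111",
    database="sales",
    table="orders_daily",
    consumer_account_id="222222222222",
)
print(request["Permissions"])
```

With boto3 installed and credentials for the producer account, the dictionary could be passed on as `boto3.client("lakeformation").grant_permissions(**request)`. Cross-account sharing generally needs further setup on the consumer side, which this sketch leaves out.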
The organisational reality
None of this works without the organisational change that precedes it. Domain teams need to accept accountability for data quality — which means being measured on it. Central data teams need to transition from ownership to enablement — which is a genuine identity shift for many people.
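Being measured on data quality presupposes an agreed, computable definition of it. The following is a minimal sketch of one possible definition, combining a completeness ratio with a freshness check. The function name, the choice of metrics and the thresholds are assumptions for illustration; each organisation would pick its own.

```python
from datetime import datetime, timedelta, timezone


def quality_report(rows, required_fields, last_loaded_at, freshness_slo, now=None):
    """Summarise a data product's quality as two simple, comparable numbers.

    completeness: share of rows where every required field is present and not None
    is_fresh:     whether the last load falls within the agreed freshness window
    """
    now = now or datetime.now(timezone.utc)
    total = len(rows)
    complete = sum(
        1
        for row in rows
        if all(row.get(name) is not None for name in required_fields)
    )
    return {
        "completeness": complete / total if total else 0.0,
        "is_fresh": (now - last_loaded_at) <= freshness_slo,
    }


# Hypothetical sample: one of the two rows is missing its amount.
sample_rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},
]
report = quality_report(
    rows=sample_rows,
    required_fields=["order_id", "amount"],
    last_loaded_at=datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc),
    freshness_slo=timedelta(hours=24),
    now=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(report)
```

Numbers like these could be published per domain to the shared observability layer, which gives the accountability conversation something specific to point at.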
The migrations I have seen succeed have one thing in common: a senior sponsor who understands that this is an organisational transformation, not a technology project. The AWS architecture is the easy part.