
Sovereignty Is a Control Plane Problem: Why Tenant Clusters Are Becoming the Default Pattern for Regulated Kubernetes

Tenant clusters are becoming the default design for sovereign, regulated Kubernetes. Learn why control plane jurisdiction, not region, decides whether a platform can defend its EU Data Act and DORA posture.

Zyfolks Team

Picking Frankfurt doesn’t make you sovereign. Platform teams are learning this the hard way as the EU Data Act, NIS-2, DORA, and the UK Data Use and Access Act 2025 turn what used to be a procurement checkbox into a control-plane design problem. Regulators have stopped asking where the bytes sit and started asking who can read them, who can subpoena them, and what happens to them when the operator above your stack gets a letter from a foreign court. If your answer is “we picked a region,” you’ve already lost the audit.

Why Data Residency Stopped Being Enough

The EU Data Act entered into force on January 11, 2024 and has applied since September 12, 2025, and the UK Data Use and Access Act 2025 is rolling out through 2026 with portability rules that, per the source, actually bite. NIS-2 and DORA already shape day-to-day platform decisions in regulated sectors. What changed is the scope of the question. Auditors now want to know about control planes, encryption keys, administrative access, and operational responsibility, not just the IP geolocation of a pod.

Residency claims now have to be defensible end-to-end. A SaaS vendor running EU workloads on a control plane operated out of Seattle is not sovereign, no matter how many Frankfurt nodes it owns. If you’re building a multi-tenant SaaS platform that touches EU or UK customers, the procurement questionnaire you’ll get in 2026 will go three layers deeper than the one you got in 2023. The take: residency was the warm-up. Control-plane jurisdiction is the actual exam.

The Four Properties Regulators Actually Want

The source decomposes the demand into four repeating properties: jurisdictional containment of every component that can read tenant data (including the control plane), operational autonomy from any single vendor, cryptographic and access control that keeps keys out of foreign hands, and portability so workloads can move when the geopolitics shift. None of these are satisfied by a region picker.

Each property maps to a different failure mode. Containment fails when a shared API server straddles jurisdictions. Autonomy fails when your platform depends on a hosted control plane you can’t rebuild. Cryptographic control fails when etcd is encrypted with keys held by your provider. Portability fails when workloads are wired to proprietary APIs. A useful exercise: take your current platform diagram, label each box with the legal jurisdiction of the entity that operates it, and see how many boundaries the data crosses just to serve one request. Most teams don’t like what they find.
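The labelling exercise above can be sketched in a few lines of Python. This is an illustration only: the component names, the jurisdiction codes, and the request path are hypothetical, and a real platform diagram will have more boxes and branching paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    operator_jurisdiction: str  # legal jurisdiction of the entity operating this box


def count_boundary_crossings(path):
    """Count hops where the operator's jurisdiction changes along a request path."""
    return sum(
        1
        for prev, nxt in zip(path, path[1:])
        if prev.operator_jurisdiction != nxt.operator_jurisdiction
    )


# Hypothetical path for one request: EU nodes, but a control plane,
# data store, and key service operated from the US.
request_path = [
    Component("ingress", "DE"),
    Component("api-server (hosted control plane)", "US"),
    Component("etcd", "US"),
    Component("worker node", "DE"),
    Component("key management service", "US"),
]

crossings = count_boundary_crossings(request_path)
jurisdictions = sorted({c.operator_jurisdiction for c in request_path})
print(f"{crossings} boundary crossings across jurisdictions {jurisdictions}")
```

With this made-up path the request crosses a jurisdictional boundary three times even though every node sits in Germany, which is the kind of result the exercise is meant to surface.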

Why One Big Kubernetes Cluster Doesn’t Cut It

The ecosystem is converging on Kubernetes as the substrate for sovereign platforms: CNCF backing, declarative APIs, Kyverno, Argo CD, Flux, KubeVirt, Cilium, SPIFFE/SPIRE, and the Swisscom sovereign Kubernetes reference architecture published on architecture.cncf.io all point the same direction. But the moment real sovereignty requirements meet a single shared cluster, the seams split. One control plane serves all tenants, so a jurisdictional incident on one tenant’s data plane risks affecting everyone sharing the API server, etcd, and controllers. Namespaces aren’t isolation: CRDs and admission webhook configurations are cluster-scoped, so every tenant shares them, and a misconfigured controller can leak across the cluster. The usual fallback, a full cluster per jurisdiction per environment per team, is operationally brutal.

Workload placement alone doesn’t establish sovereignty. You can park EU workloads on EU nodes all day; if the shared control plane centralizes administrative authority, policy enforcement, APIs, and controllers somewhere else, that’s where the real sovereignty boundary lives. If you’re an SRE running a regulated platform today, this is the conversation to have with your architecture group: where does the API server actually live, and who has root on the etcd backing it?

Tenant Clusters as a First-Class Sovereignty Primitive

The pattern worth learning, per the source, is the tenant cluster: a Kubernetes control plane carved out for a single isolation boundary, running on top of a shared underlying cluster. Each tenant cluster gets its own API server, controller manager, scheduler, and data store. vCluster is the open-source reference implementation called out in the source, but the architectural idea applies to anything that gives each isolation boundary its own control plane.

Four properties make this useful for sovereignty work. Independent control planes mean one tenant’s CRDs, admission webhooks, and audit logs don’t bleed into another’s, and separate upgrade cycles let you run Kubernetes 1.34 in one jurisdiction and 1.33 in another while a regulator finishes reviewing a CVE. Pluggable backing stores let tenant state live on encrypted volumes on hardware you own. Real tenant isolation, optionally paired with vNode, gVisor, or Kata Containers, gives you a runtime boundary instead of pretending namespaces are one. And because each tenant cluster exposes a conformant Kubernetes API, workloads stay portable across underlying clusters, whether hyperscaler, sovereign provider, or bare metal.

Concretely, a German sovereign AI cloud provider, Polarise, is cited in the source as running this exact shape in production with vCluster Labs through a March 2026 partnership, with shared GPU capacity underneath, a tenant cluster per customer on top, and everything under EU jurisdiction. The prediction: by late 2026, “one full cluster per jurisdiction” will look as dated as “one VM per service” looked by 2018.

Jurisdiction as a Cluster, Declared in Git

The clearest implementation of this pattern is one tenant cluster per jurisdiction, declared as Kubernetes resources and reconciled by GitOps. A custom resource describes location, backing store, and policy posture. A controller does the work. The source lays out the constraints that have to land in that resource: a node selector or topology constraint pinning every pod to nodes labelled with the right jurisdiction as a hard constraint, a backing store for the tenant cluster’s own state living in the chosen jurisdiction, audit log sinks local to that jurisdiction, and a Kyverno or OPA Gatekeeper policy bundle enforcing residency, image provenance, and SBOM requirements from inside the tenant cluster.
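As a rough sketch of what such a resource and its checks might look like, the Python below builds a manifest and tests it against the four constraints listed above. Everything named here is an assumption made for illustration: the `platform.example.com/v1alpha1` API group, the `TenantCluster` kind, the `example.com/jurisdiction` node label, and every field under `spec` are hypothetical and are not the schema of vCluster or any other real project.

```python
import json

JURISDICTION_LABEL = "example.com/jurisdiction"  # hypothetical node label
REQUIRED_RULES = {"residency", "image-provenance", "sbom"}


def tenant_cluster_manifest(name, jurisdiction):
    """Build a hypothetical TenantCluster resource for one jurisdiction."""
    return {
        "apiVersion": "platform.example.com/v1alpha1",  # hypothetical API group
        "kind": "TenantCluster",  # hypothetical kind
        "metadata": {"name": name},
        "spec": {
            "jurisdiction": jurisdiction,
            # Modelled on a Kubernetes nodeSelector, which is a hard
            # scheduling constraint rather than a preference.
            "nodeSelector": {JURISDICTION_LABEL: jurisdiction},
            "backingStore": {"jurisdiction": jurisdiction, "encrypted": True},
            "auditLogSink": {"jurisdiction": jurisdiction},
            "policyBundle": {"engine": "kyverno", "rules": sorted(REQUIRED_RULES)},
        },
    }


def residency_violations(manifest):
    """Return human-readable problems; an empty list means all four checks pass."""
    spec = manifest["spec"]
    target = spec["jurisdiction"]
    problems = []
    if spec.get("nodeSelector", {}).get(JURISDICTION_LABEL) != target:
        problems.append("node selector does not pin pods to the jurisdiction")
    if spec.get("backingStore", {}).get("jurisdiction") != target:
        problems.append("backing store is outside the jurisdiction")
    if spec.get("auditLogSink", {}).get("jurisdiction") != target:
        problems.append("audit log sink is outside the jurisdiction")
    missing = REQUIRED_RULES - set(spec.get("policyBundle", {}).get("rules", []))
    if missing:
        problems.append("policy bundle is missing: " + ", ".join(sorted(missing)))
    return problems


eu_cluster = tenant_cluster_manifest("tenant-eu", "eu")
print(json.dumps(eu_cluster, indent=2))
print(residency_violations(eu_cluster))
```

A check like this could run in CI on every pull request that touches a tenant cluster definition, so a backing store or audit sink pointed at the wrong jurisdiction is rejected before the controller ever reconciles it.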

The payoff is that adding a new jurisdiction becomes a pull request, not a cluster build. The audit trail for “why is tenant X’s data in jurisdiction Y” is a commit history, not a console screenshot. For a SaaS company serving both EU and UK customers under the EU Data Act and the Data Use and Access Act 2025, that’s the difference between a clean audit and a quarter spent reconstructing change history from CloudTrail logs. If you’re building AI agents or automation that touches regulated data across jurisdictions, this is the shape the procurement team will eventually ask you to match.

Blast Radius Math When Something Goes Wrong

Residency is the easy half of sovereignty. The harder half is what happens when something breaks — a subpoena, a misconfigured controller, a leaked credential. Tenant clusters narrow the blast radius in ways the source spells out. A CLOUD Act-style request against the operator of the underlying cluster doesn’t automatically yield a tenant cluster’s etcd contents if that backing store lives with a jurisdiction-local operator. The legal target and the technical target are decoupled by design. A compromised admission webhook in the EU tenant cluster can’t reach into the UK tenant cluster because they don’t share a control plane. Platform-wide CRD upgrades stage per tenant cluster, so version skew becomes a feature.
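The control-plane part of that argument can be modelled as a toy calculation. The sketch below assumes a deliberately simple model in which a compromised control plane exposes exactly the tenants it serves; the tenant and control-plane names are hypothetical, and the model ignores the underlying cluster, whose operator remains a separate source of exposure.

```python
def blast_radius(tenant_to_control_plane, compromised_control_plane):
    """Return the tenants served by the compromised control plane."""
    return sorted(
        tenant
        for tenant, control_plane in tenant_to_control_plane.items()
        if control_plane == compromised_control_plane
    )


# One shared cluster: every tenant sits behind the same API server and etcd.
shared_cluster = {
    "eu-bank": "shared-cp",
    "uk-insurer": "shared-cp",
    "eu-hospital": "shared-cp",
}

# Tenant clusters: each isolation boundary has its own control plane.
tenant_clusters = {
    "eu-bank": "cp-eu-bank",
    "uk-insurer": "cp-uk-insurer",
    "eu-hospital": "cp-eu-hospital",
}

print(blast_radius(shared_cluster, "shared-cp"))
print(blast_radius(tenant_clusters, "cp-eu-bank"))
```

In the shared model a single compromised admission webhook or leaked admin credential reaches all three tenants, while in the tenant cluster model the same incident reaches one.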

GPU workloads sharpen this further. They are the loudest argument for hyperscaler dependency and also the workloads most exposed under the EU AI Act’s Article 12 logging and governance requirements. Per the source, running GPU-bearing underlying clusters on sovereign bare metal (Metal3 and Ironic, Tinkerbell, or vMetal), with per-customer tenant clusters getting GPU access through Dynamic Resource Allocation, gives AI platform teams a credible answer to both “where does training run?” and “who can subpoena the weights?” The take: any sovereign AI roadmap that doesn’t have an answer at the control-plane layer is going to fail its first serious audit in 2026.

FAQ

Q: What is a tenant cluster in Kubernetes?
A: A tenant cluster is a Kubernetes control plane (its own API server, controller manager, scheduler, and backing store) carved out for a single isolation boundary and running as pods inside a shared underlying Kubernetes cluster. vCluster is the open-source implementation referenced in the source article. From the workload’s perspective it looks like a normal conformant Kubernetes cluster; from the platform’s perspective it’s a tenant-scoped control plane it can declare, audit, and move.

Q: Does running workloads in an EU region make a platform sovereign under the EU Data Act?
A: Not on its own. The source argues that workload placement doesn’t establish sovereignty when a shared Kubernetes control plane centralizes administrative authority, APIs, controllers, and policy enforcement elsewhere. Sovereignty boundaries follow the control plane and the entity operating it, not just the geolocation of the nodes.

Q: Does the tenant cluster pattern eliminate CLOUD Act exposure?
A: No. The source is explicit that the pattern reduces and partitions exposure but doesn’t erase it. If a US-headquartered company operates the underlying Kubernetes cluster, CLOUD Act exposure for that operator remains. Workloads where the operator’s jurisdiction is itself the threat model still need a sovereign operator on sovereign hardware, with the tenant cluster pattern layered on top.

Key Takeaways

  • Treat the control plane, not the region, as the primary sovereignty boundary — auditors in 2026 will follow the API server, not the IP block.
  • Stand up tenant clusters per jurisdiction with their own backing store, audit sinks, and Kyverno or OPA Gatekeeper policy bundles before regulators force you to retrofit it.
  • Pair tenant clusters with bare-metal provisioning via Metal3 and Ironic, Tinkerbell, or vMetal when hardware sovereignty, not just operator sovereignty, is in scope, especially for GPU and AI workloads under EU AI Act Article 12.
  • Don’t over-apply the pattern: each tenant cluster is a real control plane to monitor, upgrade, and back up, so reserve it for boundaries with real legal or risk weight, not free-tier customers.
  • Expect procurement questionnaires across regulated sectors to start asking explicitly about control-plane jurisdiction, etcd custody, and per-tenant upgrade cadence — platforms that can answer in Git commits will close deals faster than ones that answer in PDFs.
