I’d like to introduce a project that I’ve been working on: the PinnedDeployment Kubernetes CRD.
PinnedDeployments function a lot like Deployments: they’re a way to run some kind of service in Kubernetes. But, there’s a twist: PinnedDeployments actively support 2 concurrent versions of a service.
To understand why this is special, let’s look at an example Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
  labels:
    app: example
spec:
  replicas: 5
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: webserver
        image: nginx:1.16
        ports:
        - containerPort: 80
This deployment says “run 5 nginx pods, matching this template”. Obviously this example is a little contrived (who runs a plain nginx server?), but it’s an easy demo.
Suppose we want to update our image version. With a Deployment, we update the spec, which kicks off the Deployment update process. For a typical Deployment using the RollingUpdate strategy, the Deployment controller starts creating new pods with the new version, and terminating old pods with the old version. Under the hood, this is done by managing a pair of Kubernetes ReplicaSets.
This happens (pretty quickly!) until all pods are up to date. You can control how many replicas are updated at a time, or hit pause mid-update, but updating Deployments is mostly a “press start and watch” experience. The Deployment will stop if new pods fail to launch, but it won’t catch more delayed crashes, or application-level issues. Potentially “obvious” issues wouldn’t be caught until they’ve already been fully deployed.
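For intuition, here's a rough simulation of that old-to-new handoff, assuming every new pod becomes ready instantly. That assumption is exactly the weak spot described above: the real controller paces itself on pod readiness, and the maxSurge/maxUnavailable settings only control batch size, not application-level health.

```python
def rolling_update_steps(replicas: int, max_surge: int, max_unavailable: int):
    """Roughly simulate how a RollingUpdate Deployment shifts pods from the
    old ReplicaSet to the new one. Illustrative sketch, not the real
    controller: it assumes every new pod becomes ready immediately.
    """
    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0 or new < replicas:
        # Scale up the new ReplicaSet, staying within the surge budget.
        grow = min(replicas + max_surge - (old + new), replicas - new)
        new += grow
        # Scale down the old ReplicaSet, staying within the availability budget.
        shrink = max(0, min(old, (old + new) - (replicas - max_unavailable)))
        old -= shrink
        steps.append((old, new))
        if grow == 0 and shrink == 0:
            raise ValueError("no progress possible with these budgets")
    return steps

# (old, new) pod counts at each step for 5 replicas, maxSurge=1, maxUnavailable=0
print(rolling_update_steps(5, 1, 0))
# [(5, 0), (4, 1), (3, 2), (2, 3), (1, 4), (0, 5)]
```

Once started, the loop runs to completion on its own — which is the "press start and watch" behavior the PinnedDeployment is designed to replace with explicit control.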
You can read about Deployments in more depth in the Kubernetes docs.
Suppose we have that same example, and want to use a PinnedDeployment instead. Here’s what the PinnedDeployment would look like:
# Example PinnedDeployment
apiVersion: rollout.zeitgeistlabs.io/v1alpha1
kind: PinnedDeployment
metadata:
  name: example
spec:
  selector:
    matchLabels:
      app: example
  replicas: 5
  replicasPercentNext: 20
  replicasRoundingStrategy: "Nearest"
  templates:
    previous:
      metadata:
        labels:
          app: example
      spec:
        containers:
        - name: webserver
          image: nginx:1.16
          ports:
          - containerPort: 80
    next:
      metadata:
        labels:
          app: example
      spec:
        containers:
        - name: webserver
          image: nginx:1.17
          ports:
          - containerPort: 80
This PinnedDeployment says that we want 5 nginx pods, and want 20% of them to be the new version.
If we create this, we get 4 “previous” pods (nginx 1.16) and 1 “next” pod (nginx 1.17).
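The replica split math might look something like the following sketch. Only "Nearest" appears in the example above; "Up" and "Down" are hypothetical names for the other obvious rounding choices, and the real controller's behavior may differ.

```python
import math

def split_replicas(total: int, percent_next: int,
                   strategy: str = "Nearest") -> tuple[int, int]:
    """Split a total replica count into (previous, next) pod counts.

    A guess at the PinnedDeployment replica math; "Up" and "Down" are
    hypothetical rounding strategy names, not confirmed API values.
    """
    raw = total * percent_next / 100
    if strategy == "Nearest":
        nxt = round(raw)
    elif strategy == "Up":
        nxt = math.ceil(raw)
    elif strategy == "Down":
        nxt = math.floor(raw)
    else:
        raise ValueError(f"unknown rounding strategy: {strategy}")
    return total - nxt, nxt

print(split_replicas(5, 20))  # (4, 1): four previous pods, one next pod
```

The rounding strategy only matters when the percentage doesn't divide the replica count evenly — e.g. 30% of 5 replicas is 1.5 pods, and something has to decide which way that goes.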
To roll out more of the next version, we simply increase replicasPercentNext. To roll back (partially or completely), we reduce it.

Suppose that we have fully rolled out our new version by setting replicasPercentNext to 100.
Now, we want to try using a different image: httpd (Apache).
Here’s what it would look like:
...
  templates:
    previous:
      ...
      spec:
        containers:
        - name: webserver
          image: nginx:1.17
          ports:
          - containerPort: 80
    next:
      ...
      spec:
        containers:
        - name: webserver
          image: httpd:2.4
          ports:
          - containerPort: 80
Notice that the old next version (nginx 1.17), having been fully rolled out, is now the previous version, and the new next version features the httpd container image.
We could technically achieve the same behavior by creating 2 Deployment objects in our deployment tooling, and managing both objects ourselves.
However, there is a significant bonus to having a single object: all math regarding the replicas is abstracted.
You can point any tool at a PinnedDeployment (including the Kubernetes Horizontal Pod Autoscaler), and tell the PinnedDeployment how many replicas you want. External tools do not need to be aware of the old:new replicas breakdown.
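To make the abstraction concrete, here's a hypothetical sketch of what an external tool's view looks like: the autoscaler adjusts only the single replicas field, and the previous/next split is re-derived internally (this is an illustration, not the real controller's code).

```python
def observed_split(total_replicas: int, percent_next: int) -> tuple[int, int]:
    """Sketch: derive the previous/next pod split from the single replica
    count that an external tool (such as the HPA) manages.
    Uses "Nearest" rounding as in the example PinnedDeployment.
    """
    nxt = round(total_replicas * percent_next / 100)
    return total_replicas - nxt, nxt

# An autoscaler scaling total replicas from 5 to 10 only changes one number;
# the 20% next-version split follows along automatically.
print(observed_split(5, 20))   # (4, 1)
print(observed_split(10, 20))  # (8, 2)
```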
Observant readers will note that this ties into my recent post on autoscaling imbalances. If you create 2 Deployments, each with its own HPA, you will likely find that they don’t scale evenly.
How It Works
The internals of the PinnedDeployment controller were also heavily inspired by the upstream Deployment controller.
The controller watches for PinnedDeployment create/update events, and triggers a reconcile function when one occurs.
The reconcile function searches for all ReplicaSets that match the PinnedDeployment’s labels. It compares the previous and next pod specs in the PinnedDeployment definition, and tries to match which ReplicaSet has the previous pod spec, and which ReplicaSet has the next pod spec.
All other label-matching ReplicaSets are deleted.
Once the previous and next ReplicaSets have been identified, the controller compares the desired state (including the number of replicas for each ReplicaSet) against the actual ReplicaSets. If there is a discrepancy, the ReplicaSet is updated.
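The reconcile pass described above can be sketched roughly as follows, using plain dicts in place of Kubernetes API objects and returning action strings instead of calling the API server. The field names and the string-valued templates are simplifications for illustration, not the controller's actual types.

```python
def reconcile(pinned: dict, replica_sets: list[dict]) -> list[str]:
    """Sketch of the PinnedDeployment reconcile pass (illustrative only).

    pinned: simplified PinnedDeployment spec.
    replica_sets: all ReplicaSets matching the PinnedDeployment's labels.
    """
    actions = []
    previous_rs = next_rs = None

    # Match each label-matching ReplicaSet to the previous or next pod spec.
    for rs in replica_sets:
        if rs["template"] == pinned["templates"]["previous"]:
            previous_rs = rs
        elif rs["template"] == pinned["templates"]["next"]:
            next_rs = rs
        else:
            # Unrecognized pod spec: delete, as described above.
            actions.append(f"delete {rs['name']}")

    # Desired replica counts ("Nearest" rounding, as in the example).
    total = pinned["replicas"]
    want_next = round(total * pinned["replicasPercentNext"] / 100)
    want_previous = total - want_next

    # Create or resize each ReplicaSet to match the desired state.
    for label, rs, want in [("previous", previous_rs, want_previous),
                            ("next", next_rs, want_next)]:
        if rs is None:
            actions.append(f"create {label} with {want} replicas")
        elif rs["replicas"] != want:
            actions.append(f"scale {rs['name']} to {want}")
    return actions

example = {"replicas": 5, "replicasPercentNext": 20,
           "templates": {"previous": "nginx:1.16", "next": "nginx:1.17"}}
rs_list = [{"name": "example-abc", "template": "nginx:1.16", "replicas": 5},
           {"name": "example-def", "template": "nginx:1.17", "replicas": 0},
           {"name": "example-old", "template": "nginx:1.15", "replicas": 1}]
print(reconcile(example, rs_list))
# ['delete example-old', 'scale example-abc to 4', 'scale example-def to 1']
```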
If a PinnedDeployment is deleted, its child ReplicaSets will be automatically garbage collected.
I’ll be making a post soon on some of the experience of writing the controller and CRD. In particular, I encountered some hiccups with current CRD shortcomings, and the pod specs. [Edit: post is here.]
PinnedDeployment’s API version is currently v1alpha1, which means… the API is in its infancy. Don’t use it in production yet, but I’d love to know if you’re interested. Once some of the rough edges of the implementation are sorted, it will be upgraded to a beta API.
Don’t expect anything dramatic too soon, but I have some longer term plans of abstractions to build on top of PinnedDeployments.