I’d like to introduce a project that I’ve been working on: the PinnedDeployment Kubernetes CRD.
PinnedDeployments function a lot like Deployments: they’re a way to run some kind of service in Kubernetes. But there’s a twist: PinnedDeployments actively support 2 concurrent versions of a service.
“Regular” Deployments
To understand why this is special, let’s look at an example Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
  labels:
    app: example
spec:
  replicas: 5
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: webserver
        image: nginx:1.16
        ports:
        - containerPort: 80
This Deployment says “run 5 nginx pods matching this template”. Obviously this example is a little contrived (who runs a plain nginx server?), but it’s an easy demo.
Suppose we want to update our image version. With a Deployment, we update the spec, which kicks off the Deployment update process. For a typical Deployment using the RollingUpdate strategy, the Deployment controller starts creating new pods with the new version, and terminating old pods with the old version. Under the hood, this is done by managing a pair of Kubernetes ReplicaSets.
This happens (pretty quickly!) until all pods are up to date. You can control how many replicas are updated at a time, or hit pause mid-update, but updating Deployments is mostly a “press start and watch” experience. The Deployment will stop if new pods fail to launch, but it won’t catch delayed crashes or application-level issues. Even potentially “obvious” problems might not be caught until the new version has been fully rolled out.
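The pace of a rolling update is controlled through the Deployment’s update strategy (and pausing is a kubectl rollout pause away). As a sketch, tuning the strategy for our example might look like the following; the specific values are just illustrative:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # create at most 1 extra pod above the desired count during the update
      maxUnavailable: 1  # allow at most 1 pod to be unavailable during the update
  # selector and template omitted (same as the example above)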
You can read about Deployments in more depth in the Kubernetes docs.
PinnedDeployments
Suppose we have that same example, and want to use a PinnedDeployment instead. Here’s what the PinnedDeployment would look like:
# Example PinnedDeployment
apiVersion: rollout.zeitgeistlabs.io/v1alpha1
kind: PinnedDeployment
metadata:
  name: example
spec:
  selector:
    matchLabels:
      app: example
  replicas: 5
  replicasPercentNext: 20
  replicasRoundingStrategy: "Nearest"
  templates:
    previous:
      metadata:
        labels:
          app: example
      spec:
        containers:
        - name: webserver
          image: nginx:1.16
          ports:
          - containerPort: 80
    next:
      metadata:
        labels:
          app: example
      spec:
        containers:
        - name: webserver
          image: nginx:1.17
          ports:
          - containerPort: 80
This PinnedDeployment says that we want 5 nginx pods, with 20% of them running the new version.
If we create this, we get 4 “previous” pods (nginx 1.16) and 1 “next” pod (nginx 1.17): 20% of 5 rounds to 1 under the “Nearest” rounding strategy.
To roll out more of the next version, we simply increase the replicasPercentNext value. To roll back (partially or completely), we reduce the replicasPercentNext value.
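For example, moving to a 60/40 split in favor of the new version is a one-field change (only the relevant spec fields are shown; the templates stay the same):

spec:
  replicas: 5
  replicasPercentNext: 60   # 60% of 5 = 3 "next" pods, leaving 2 "previous" pods
  ...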
Suppose that we have fully rolled out our new version by setting replicasPercentNext to 100. Now, we want to try using a different image: httpd (Apache). Here’s what it would look like:
...
templates:
  previous:
    ...
    spec:
      containers:
      - name: webserver
        image: nginx:1.17
        ports:
        - containerPort: 80
  next:
    ...
    spec:
      containers:
      - name: webserver
        image: httpd:2.4
        ports:
        - containerPort: 80
Notice that the old next version, having been fully rolled out, is now the previous version. Our new next version features the httpd container image.
Scaling
We could technically achieve the same thing by creating 2 Deployment objects in our deployment tooling and managing both of them.
However, there is a significant bonus to having a single object: all of the replica math is abstracted away.
You can point any tool at a PinnedDeployment (including the Kubernetes Horizontal Pod Autoscaler), and tell the PinnedDeployment how many replicas you want. External tools do not need to be aware of the old:new replicas breakdown.
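As a sketch, assuming the CRD exposes the standard scale subresource (which is what the HPA talks to), an autoscaler targets a PinnedDeployment the same way it would any other scalable resource; the replica bounds and CPU target below are just illustrative:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example
spec:
  scaleTargetRef:
    apiVersion: rollout.zeitgeistlabs.io/v1alpha1
    kind: PinnedDeployment
    name: example
  minReplicas: 5
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80

The HPA only adjusts the total replica count; the PinnedDeployment controller keeps splitting that total between the previous and next templates according to replicasPercentNext.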
Observant readers will note that this ties into my recent post on autoscaling imbalances. If you create 2 Deployments, each with its own HPA, you will likely find that they don’t scale evenly.
How It Works
The internals of the PinnedDeployment controller were also heavily inspired by the upstream Deployment controller.
The controller watches for PinnedDeployment create/update events, and triggers a reconcile function when one occurs.
The reconcile function searches for all ReplicaSets that match the PinnedDeployment’s labels (e.g. app: example).
It compares the previous and next pod specs in the PinnedDeployment definition, and tries to match which ReplicaSet has the previous pod spec and which ReplicaSet has the next pod spec.
All other label-matching ReplicaSets are deleted.
Once the previous and next ReplicaSets have been identified, the controller compares the desired state (including the number of replicas for each ReplicaSet) against the actual ReplicaSets. If there is a discrepancy, the affected ReplicaSet is updated.
If a PinnedDeployment is deleted, its child ReplicaSets will be automatically garbage collected.
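That cleanup is the standard Kubernetes owner-reference mechanism: each child ReplicaSet carries an ownerReferences entry pointing back at its PinnedDeployment, so deleting the parent lets the garbage collector remove the children. A sketch of what a child ReplicaSet’s metadata might look like for our example (the name and UID are illustrative, and the exact fields are up to the controller):

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: example-5d4f8c7b9     # illustrative; actual naming is up to the controller
  labels:
    app: example
  ownerReferences:
  - apiVersion: rollout.zeitgeistlabs.io/v1alpha1
    kind: PinnedDeployment
    name: example
    uid: 0b1e2f3a-...         # UID of the parent PinnedDeployment
    controller: true
    blockOwnerDeletion: true
spec:
  replicas: 4                 # e.g. the "previous" ReplicaSet from the 80/20 example
  ...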
I’ll be making a post soon about the experience of writing the controller and CRD. In particular, I hit some hiccups with current CRD shortcomings and with the pod specs. [Edit: post is here.]
Status
PinnedDeployment’s API version is currently v1alpha1, which means… the API is in its infancy. Don’t use it in production yet, but I’d love to know if you’re interested. Once some of the rough edges of the implementation are sorted out, it will be upgraded to a beta API.
Don’t expect anything dramatic too soon, but I have some longer-term plans for abstractions to build on top of PinnedDeployments.