Canary Release Explained
What is a canary release?
The term canary release comes from early twentieth-century coal mining. Miners carried caged canaries into the mines to detect toxic gases. Because the birds are more sensitive to carbon monoxide than humans, a canary that became distressed or died was a sign of dangerous gas levels, and the miners would leave the mine immediately.
Today, a canary release tests how a new application version performs under real-world usage by exposing it to a small share of users first. If bugs appear, the new version is rolled back and the issues are fixed before another release. If it works as expected, it is scaled up until it completely replaces the old version. All of this depends on the ability to control user traffic.
Traffic control in a canary release:
In a canary release, traffic is split between the old and the new application versions. At first, the new version is deployed with 0% of the traffic directed to it while the old version handles 100%. Then a small percentage of traffic is directed to the new version and monitoring of its performance begins. If there is an issue, that traffic is redirected back to the old version and the deployment is stopped. If there is no issue, more traffic is gradually directed to the new version until the old version receives 0%.
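The progression described above can be sketched in a few lines of Python. This is only an illustration: the function name `run_canary`, the traffic schedule, and the health check are hypothetical, and in a real deployment the weights would be applied through a traffic controller and the health check would read monitoring data.

```python
from typing import Callable, List


def run_canary(steps: List[int], is_healthy: Callable[[int], bool]) -> List[int]:
    """Walk the canary through increasing traffic percentages.

    Returns the history of canary weights that were applied. The history
    ends at the last step on success, or at 0 after a rollback.
    """
    history = [0]  # new version is deployed but receives no traffic yet
    for percent in steps:
        history.append(percent)      # shift this share of traffic to the canary
        if not is_healthy(percent):  # monitoring reports a problem
            history.append(0)        # redirect all traffic back to the old version
            return history           # stop the deployment
    return history


# Hypothetical schedule and health checks, for illustration only.
schedule = [5, 25, 50, 100]
print(run_canary(schedule, lambda p: True))    # healthy at every step
print(run_canary(schedule, lambda p: p < 50))  # a problem appears at 50%
```

The first call ends with the canary at 100%, which means the old version no longer receives traffic. The second call ends at 0%, which corresponds to a rollback.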
Traffic routing is a key feature of a canary release and a large part of what makes it an effective deployment strategy. You can select the traffic directed to the new version either randomly or selectively, based on set criteria. This level of traffic control allows the developer to collect meaningful data from users’ interactions with the new version. Istio and Ambassador Edge Stack are traffic control tools that are commonly used for canary releases on Kubernetes.
Istio: This is a service mesh whose traffic routing rules can be adjusted programmatically. Because policies are enforced shortly after the instructions are received, roll-outs and roll-backs can be carried out quickly, which makes a canary release flexible.
Ambassador Edge Stack: It routes traffic between different services using a weighted round-robin scheme, a load-balancing strategy that allows for unequal traffic distribution. It collects important metrics for all the traffic it handles, which makes it easy to monitor the progress of a canary release.
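To make the idea of weighted round-robin concrete, here is a minimal Python sketch. It uses the "smooth" weighted round-robin variant and is not taken from Ambassador Edge Stack or Istio. The names `weighted_round_robin`, `stable`, and `canary` are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, Iterator


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Yield backend names in proportion to their integer weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one backend needs a positive weight")
    current = {name: 0 for name in weights}
    while True:
        # Every backend earns credit equal to its weight on each request.
        for name, weight in weights.items():
            current[name] += weight
        # The backend with the most credit serves the request...
        chosen = max(current, key=current.get)
        # ...and pays back the total, so the others catch up over time.
        current[chosen] -= total
        yield chosen


# Send roughly 10% of requests to the canary (hypothetical weights).
picker = weighted_round_robin({"stable": 9, "canary": 1})
counts = Counter(next(picker) for _ in range(10))
print(counts["stable"], counts["canary"])
```

Over any ten consecutive requests in this example, nine go to the stable version and one goes to the canary. Raising the canary's weight step by step is what shifts traffic during the release.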
Testing in a canary release:
During a canary release, we perform canary testing to evaluate how the application performs in real-world usage. One way to do this is with feature flags, which let developers separate feature enablement from code release and so create more testing dimensions. For instance, a feature can be turned on or off remotely for a specific group of users or for all users of the application. This makes it possible to measure a feature's prominence, performance, and importance to the users targeted by the change. All this data provides meaningful insight into how users interact with the new application or features.
Caution: Even though a canary release allows for application testing, it should not replace other types of tests such as unit testing, capacity testing, and A/B testing. Canary testing should be used on applications that have already passed all other tests. In short, the tests performed in a canary release should only show how the new feature, code, or configuration performs in a production environment, with the aim of increasing the developers’ confidence in the application.
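As a rough illustration of how a feature flag separates enablement from release, consider the Python sketch below. The `FeatureFlag` class, its fields, and the flag name `video-preview` are hypothetical and do not correspond to any particular feature flag product. Hashing the user ID is one common way to keep each user's experience stable between requests.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Set


@dataclass
class FeatureFlag:
    name: str
    enabled: bool = False                          # remote on/off switch for everyone
    groups: Set[str] = field(default_factory=set)  # groups that always see the feature
    percent: int = 0                               # share of other users, 0 to 100

    def is_on(self, user_id: str, user_group: str) -> bool:
        if not self.enabled:
            return False  # feature is shipped in the code but switched off
        if user_group in self.groups:
            return True   # targeted group, for example internal testers
        # Hash the user ID so the same user always lands in the same bucket.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < self.percent


flag = FeatureFlag("video-preview", enabled=True, groups={"beta"}, percent=10)
print(flag.is_on("user-1", "beta"))     # beta users always get the feature
print(flag.is_on("user-2", "regular"))  # depends on the user's hash bucket
```

Setting `enabled` to `False` turns the feature off for all users without a new deployment, which is the separation of enablement from release described above.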
A practical use case: How YouTube used a canary release
Google is one of the large technology companies known to use the canary release strategy when deploying new code and configurations. A while back, YouTube tested a new feature that displayed a video preview, rather than just a static thumbnail, when the pointer hovered over a video. This was meant to give the user more information about a video beyond the views, thumbnail, title, and description. They deployed the new version to some of their servers and directed a small subset of user traffic towards it. From this, they were able to measure two things: whether more users clicked on videos after seeing the preview, and whether it helped users find the video they wanted to watch faster. The test was a success and they eventually rolled out the new version fully.
Canary releases also have limitations. It is not easy to fully understand the capabilities of the canary version because it only handles a small subset of the traffic. As a result, you can’t be sure how it performs relative to the existing version, which is handling significantly more traffic. The decision to move ahead with the deployment is therefore based on an analysis that isn’t very comprehensive. For instance, a canary version can be fast while serving a small subset of users but show significant latency when handling all users.
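One way to see why a small traffic share limits the analysis is to compare the statistical uncertainty of the same measurement on both versions. The Python sketch below assumes a hypothetical 1% observed error rate on each version and uses the normal approximation for a 95% margin of error. The request counts are made up for illustration.

```python
import math


def margin_of_error(error_rate: float, requests: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an observed error rate.

    Uses the normal approximation to the binomial distribution, which is
    itself an assumption that weakens for very small samples.
    """
    return z * math.sqrt(error_rate * (1 - error_rate) / requests)


# Hypothetical split: the canary serves 1,000 requests, the old version 99,000.
canary = margin_of_error(0.01, 1_000)
stable = margin_of_error(0.01, 99_000)
print(f"canary: 1% +/- {canary:.2%}")
print(f"stable: 1% +/- {stable:.2%}")
```

With these numbers the canary's margin is roughly ten times wider than the old version's, so a real difference between the two versions can be hidden in the noise. The sketch covers sampling uncertainty only. It says nothing about behavior that changes under full load, such as the latency example above.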
A canary release works well when new application versions are deployed frequently, because it is easy to set up for rolling out light updates within a short period of time. The subset of users directed to the new version can be selected randomly or based on set criteria. While canary testing can be very insightful, it should not be used as a replacement for other types of tests.