The Hidden Costs of Kubernetes
The advantages of cloud-native architecture are well known by now. You get the scalability of the cloud along with the business rewards of an elastic architecture. One thing that often gets lost in the mix, however, is the productivity cost associated with a cloud-native approach.
In the US alone, over 70% of enterprises have adopted or are currently adopting cloud-native architecture, causing a surge in developers who are trying to learn the stack. More and more of the developers building applications on cloud technologies are running into the difficult world of configuration, complex networking, and container build times. These same developers are spending a growing share of their time managing, configuring, and deploying new tools instead of delivering much-needed applications. That matters because the biggest expense in any software organization is developer time.
What does this productivity loss cost the organization caught in the middle of the complexity? With developer salaries climbing into the low-to-mid six-figure range and 5 million developers in the US alone, a back-of-the-napkin estimate puts the cost at around $12 billion annually. The benefits of a cloud-native infrastructure still far outweigh the cost of lost developer productivity, but that cost is certainly far higher than it needs to be.
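One way to sanity-check a figure like this is to ask what share of developer time it implies. The sketch below is not the calculation behind the estimate; it takes the $12 billion and 5 million developer figures from the text and assumes an average salary of $120,000, a hypothetical value at the low end of six figures.

```python
# Figures quoted in the article.
US_DEVELOPERS = 5_000_000
ANNUAL_LOSS_USD = 12_000_000_000

# Assumption (not from the article): an average salary at the low end of six figures.
ASSUMED_AVG_SALARY_USD = 120_000


def loss_per_developer(total_loss_usd: float, developers: int) -> float:
    """Spread the total annual loss evenly across all developers."""
    return total_loss_usd / developers


def implied_time_fraction(total_loss_usd: float, developers: int, avg_salary_usd: float) -> float:
    """Fraction of a developer's paid time that the total loss corresponds to."""
    return loss_per_developer(total_loss_usd, developers) / avg_salary_usd


if __name__ == "__main__":
    per_dev = loss_per_developer(ANNUAL_LOSS_USD, US_DEVELOPERS)
    fraction = implied_time_fraction(ANNUAL_LOSS_USD, US_DEVELOPERS, ASSUMED_AVG_SALARY_USD)
    print(f"Loss per developer: ${per_dev:,.0f} per year")
    print(f"Implied share of paid time: {fraction:.1%}")
```

Under that salary assumption, the quoted total works out to $2,400 per developer per year, or about 2% of paid time, which is roughly one working week. A higher assumed salary would make the implied share smaller still.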
Staying on the Cutting Edge
One critical area of productivity loss is keeping up with all the changing technologies. For instance, Docker and Kubernetes have only been popular for about five years, yet they seem ancient compared to tools like Istio (launched in 2017) and Knative (2018). Keeping up with these new and developing technologies is important to ensure you have the latest security and performance updates, but it can also be a full-time job.
Cloud-native architecture is still being developed, and learning the latest technologies is a moving target. At the same time, most computer science and software engineering programs don’t delve into the heart of these technologies. At best, graduates will have limited experience working with a handful of these cloud technologies, and not in any real depth. As a result, employers bear the cost while new hires learn these technologies on the job.
And these skills can be difficult to master. Whereas a monolithic application runs on a single server and can be easy to reason about, the modern cloud stack is all about distributed applications. This means more time devoted to networking, databases, and the interaction between lots of moving parts. The barrier to entry for learning these technologies is high, as it requires a considerable investment in cloud infrastructure just to manipulate a small deployment of tens or hundreds of cloud instances.
Trading Development for Maintenance
When developers learn best practices on the job, this takes productive time away from other key activities, as there are limited hours in a day. This means developers are spending less time writing application code and more time working on the cloud-native infrastructure (think crafting YAML files for Kubernetes, tweaking and building Dockerfiles, and reading up on Istio best practices). The time and mindshare spent eat away at what businesses want their developers to be doing: writing core business code.
This leads to a sort of vicious cycle. If you don’t invest the time working on cloud-native infrastructure, you’ll miss the best practices necessary to optimize your architecture. The result is a sub-optimal architecture that is slow (costing you customers) or needs to compensate by scaling up (costing you money).
Companies are trapped in this cycle of losing money, either to their investment in cloud-native configuration or to sub-optimal architectures. The short-term cost of under-investing in architecture may be easy to ignore, but as the cloud provider invoices increase, the impact of those architecture decisions becomes clear.
Monitoring and DevOps
It’s no coincidence that along with the rise of cloud-based architecture there is also an increase in the need for DevOps. Managing cloud deployments is a full-time job and requires a different set of skills than simple development. Businesses attempt to resolve this vicious cycle by hiring more skilled DevOps resources, which is a great solution if you can afford it.
As the toolset evolves, modern DevOps practices are evolving to keep pace. A typical DevOps hire has experience with some or all of the tools necessary to deploy a cloud-native application. Many are also experienced in monitoring the state of a production deployment. Keeping track of application health metrics on a production cloud is a hard skill to learn in its own right, and one that is worth a lot in today’s job market.
DevOps is a young field but is growing fast. Many developers start their careers joining a company right out of school and are not trained in DevOps. Once they gain this experience, a large number start their own consulting practices drawing on these new, in-demand skills.
While hiring a DevOps team (or training a new one) is an easy fix, it can become the highest expense in a growing company. A typical DevOps salary well exceeds $100,000, so building an average-sized team can cost close to a million dollars annually.
Automating Best Practices
So, what do we make of all this? Clearly, the cloud-native application is not going away. As this architecture continues to become more commonplace, some of the associated costs will decrease. Schools will add more cloud-native programs, and the supply of DevOps resources will increase to keep pace with demand. But this is still several years away.
For enterprises seeking a solution today, my best advice is to decrease costs through better tooling. Better tooling greatly lowers the barrier to entry for these technologies and spreads knowledge about best practices throughout your organization simply through use of the software. By using modern tools, you automatically reap the benefits of the expertise of the people who created them. Paying your developers to learn and relearn the latest stack and monitoring tools is a far more expensive proposition.
By giving you access to expertise, good tooling also reduces the initial barriers to entry into the cloud-native world. If you are a hands-on learner, you can get up and running with a Kubernetes cluster configured by the best in the field in about ten mouse clicks. It doesn’t get easier than that!
Finally, tooling allows you to replace expensive manual processes with codified, automated processes. A lot of costs are incurred when organizations lean too heavily on manual processes, both in terms of salary as discussed above, and because manual processes are error-prone. Good tooling allows you to replace developer salaries with less costly software licenses. And because the software is testable, repeatable, and automated, it reduces the risk of manual error.
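To make "codified, testable, repeatable" concrete, here is a minimal sketch of generating a Kubernetes Deployment manifest from code instead of hand-editing YAML. It is an illustration only, not how any particular product works. The function names, the app name `web`, and the image `example.com/web:1.0` are hypothetical; the manifest fields follow the standard `apps/v1` Deployment layout.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int = 2) -> dict:
    """Build a minimal Kubernetes apps/v1 Deployment as a plain dict."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels,
            # a detail that is easy to get wrong by hand.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def check_manifest(manifest: dict) -> None:
    """Raise ValueError if the selector and pod template labels disagree."""
    spec = manifest["spec"]
    if spec["selector"]["matchLabels"] != spec["template"]["metadata"]["labels"]:
        raise ValueError("selector does not match pod template labels")


if __name__ == "__main__":
    # Hypothetical application name and image, for illustration only.
    manifest = deployment_manifest("web", "example.com/web:1.0", replicas=3)
    check_manifest(manifest)
    print(json.dumps(manifest, indent=2))
```

Because the manifest is produced by a function, the same check runs on every change, and the JSON output can be saved to a file and passed to `kubectl apply -f`, which accepts JSON as well as YAML.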
These days, the state of modern application design reminds me of my undergraduate education in mechanical engineering (later I switched to computer science). Back then we didn’t have graphing calculators or computers, so a lot of the math was manual: looking up a logarithm in a log table, a sine value in a trigonometry table, or property values in a thermodynamic chart. Many of my classmates developed reputations for quickly looking things up and knowing where to find information. This type of experience is handy, but it’s not what you’re there to learn. In fact, it gets in the way of what you want to learn.
Similarly, today’s cloud architectures are in a truly manual phase. If you want to use microservices, there’s a lot that gets in your way: learning Kubernetes, configuring Istio, managing networking policies… These things are important, but they’re not what cloud-native is all about. They are additional costs on top of a powerful, useful stack. We can do better.
Founder CEO CloudPlex.io, Inc