Kubernetes for the absolute beginner - Part I

Kubernetes (pronounced "koo-burr-NET-eez" or "kyu-burr-NET-eez") comes from a Greek word that means 'pilot' or 'helmsman.' It is an apt name, seeing as Kubernetes helps you navigate the choppy seas of containerized applications.

What does Kubernetes do? In a nutshell, it provides a platform that helps you deploy, orchestrate, and scale your containerized applications, most commonly ones packaged with Docker.

What does all that mean? What on earth are containerized applications and Docker? What is 'orchestration'? All good questions that we will answer shortly.

Containerization and Containers

So what is a container?

Well, first consider a virtual machine (VM), which is, just as the name says, a virtual server that you can connect to remotely, such as the servers you can start on Amazon Web Services (AWS) EC2 or on Microsoft's Azure platform.

Next, think of a web-based application developed and running on a VM – perhaps it includes a MySQL database, a React.js frontend, and some Java libraries, all running on the Ubuntu operating system (OS). Don't worry if you are not familiar with one of those technologies – for now, just keep in mind that an application consists of various components, services, and libraries, and it runs in an environment such as an operating system.

Now, our application is packaged into one VM, which, remember, includes our Ubuntu OS. This makes VMs really bulky, typically a few to several gigabytes in size.

The VM contains a whole operating system with all its libraries, most of which are not used by our application. If you need to recreate, back up, or scale up this application, you need to copy the entire unwieldy thing, and then wait the several minutes it takes to start in a new environment. If you want to upgrade a single component, say the React.js app, you will need to rebuild the entire machine image. Also, if two of your apps share a common dependency, upgrading it will affect both apps, even when you only need the upgrade for one of them. This is what is often described as dependency hell.

The solution to this mess is containers. A container is the next level of abstraction after virtual machines, in which each component of the whole application is individually packaged into a standalone unit. Each of these units is what's called a container. In this way, we achieve a high degree of portability (the ability to run an app in any environment that has a container runtime) by separating code and app services from the underlying infrastructure. So in our case, the MySQL server and the database within it go into one container. The React.js environment and its attendant libraries go into another container. The Java components go into a third.

But wait. How is the MySQL database 'running' on its own? Surely a database must itself run on an OS, right? Quite true.

A container such as the MySQL container actually includes the OS-level libraries and tools it needs, usually by building on top of a minimal base such as Ubuntu. Unlike a VM, it does not boot a full operating system of its own. Instead, it shares the kernel of the machine it runs on. So you can think of a container as a stack of layers, with each layer depending on the one below it. This is similar to how shipping containers are stacked on a ship or at a port, with each container's stability depending on the one below it for support. At their heart, application containers are a controlled execution environment. They allow you to define that environment from the ground up, from the base operating system libraries, to the individual versions of the libraries you want to use, to the version of your code you want to add.
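
To make the 'stack' idea a little more concrete, here is a minimal Python sketch. It models an environment as a list of layers, where each layer adds or overrides packages on top of the one below it. Everything in it (the `Layer` class, the `resolve` function, the package names and version numbers) is hypothetical and chosen only for illustration. It is not how Docker or any other container tool is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One element of the stack: a name plus the packages it adds or overrides."""
    name: str
    packages: dict = field(default_factory=dict)


def resolve(layers):
    """Flatten a stack of layers, bottom first; upper layers override lower ones."""
    environment = {}
    for layer in layers:
        environment.update(layer.packages)
    return environment


# Hypothetical package versions, used only for illustration.
base = Layer("ubuntu-base", {"libc": "2.35", "openssl": "3.0"})

# Two separate stacks that both start from the same base layer.
database_stack = [base, Layer("mysql", {"mysql-server": "8.0"})]
frontend_stack = [base, Layer("frontend", {"node": "18", "openssl": "3.1"})]

db_env = resolve(database_stack)
web_env = resolve(frontend_stack)

print("database:", db_env)
print("frontend:", web_env)
```

In this sketch the frontend stack upgrades `openssl` for itself while the database stack keeps the version from the base layer. That kind of isolation is what the single shared VM could not give us.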

Note: An important concept related to containers is 'microservices.' This refers to the creation and packaging of an application's individual components as independent services so that each one can be easily replaced, upgraded, debugged, etc. In our example, we would create one microservice for our React.js frontend, another for the MySQL database, another for the Java middleware part, and so on. Microservices and containerization complement each other well, since each service can be packaged into its own container. Read more about microservices in Microservices Mesh – Part I.

And now, Docker

You now have a pretty good idea of what containers are, right? Now, just as Amazon.com is the best-known site for online retail, Docker is the most widely used platform for creating and managing applications via containers, aka the most widely used containerization tool. You can read about Docker in much more detail in this article.

Docker is an open-source project, first released in 2013. It allows you to build, package, and run containers and container-based apps. It runs on most major Linux distributions as well as on Windows and macOS.

There are other containerization tools, such as CoreOS rkt, Mesos Containerizer, and LXC. However, according to container usage surveys, the vast majority (roughly 80%) of containerized apps run on Docker.

Now Back to Kubernetes

Now that you have a clear overview of Docker and containers, let's get back to Kubernetes.

First, a brief history: Kubernetes grew out of Google's experience running containers at scale with Borg, an internal cluster manager that dates back to the mid-2000s. Google engineers built Kubernetes on those lessons and released it in 2014 as an open-source project for use by anyone.

And where does Kubernetes come in when talking about containers? Let's go back to our containerized web application example. Assume our app is growing successfully, and we are signing up an increasing number of users every day.

Now we need to scale up our backend resources so that users navigating to our website do not run into pages that take forever to load or those annoying page and site timeouts. One solution would be to simply increase the number of containers and then employ one or more load balancers to distribute the incoming load (in the form of user requests) across our containers. This would work well, but only up to a point. When we start to get hundreds of thousands or millions of user requests, even this approach stops scaling. We would require dozens, perhaps hundreds, of load balancers, which is another headache in itself.

We would also run into problems whenever we wanted to upgrade our site or app, because the load balancers know nothing about upgrades. We would need to individually configure each load balancer, and then upgrade the containers served by that balancer. Imagine the amount of manual labor you'd have to undertake when you have 20 load balancers and 5 or 6 small updates every week.
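
As a rough illustration of the load-balancing idea described above, here is a minimal Python sketch of a round-robin balancer, one of the simplest ways to spread requests across containers. The class name `RoundRobinBalancer`, the container names, and the request strings are all made up for this example. A real load balancer also handles health checks, timeouts, and much more.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next container in turn."""

    def __init__(self, containers):
        if not containers:
            raise ValueError("at least one container is required")
        # cycle() repeats the list of containers endlessly, in order.
        self._next_container = cycle(list(containers))

    def route(self, request):
        """Return the (container, request) pair chosen for this request."""
        return next(self._next_container), request


# Hypothetical container names, fixed by hand when the balancer is created.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])

# Send nine requests and count how many each container received.
handled = Counter(balancer.route(f"GET /page/{i}")[0] for i in range(9))
print(handled)
```

Notice that the list of containers is written out by hand. Every time a container is added, removed, or upgraded, somebody has to update each balancer that points at it, which is the manual labor described above.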

What we need is a way to roll out changes to all the containers we want in one coordinated operation, and a way to easily spin up new, ready-to-go containers whenever we need them. And bonus points if this whole process can itself be automated. Coordinating containers in this way is what 'orchestration' means. Well, we're in luck, because this is exactly what Kubernetes does!
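
One way to picture this kind of automation is as a loop that compares what is running with what we asked for, and then fixes the difference. The Python sketch below is a toy model of that idea only. The `Container` class, the `reconcile` function, and the `web-N` naming scheme are hypothetical, and none of this is the actual Kubernetes API.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str
    version: str


def reconcile(running, desired_replicas, desired_version):
    """Return a list of containers that matches the desired state.

    Containers on the wrong version are dropped, surplus ones are trimmed,
    and new ones are started until the desired count is reached.
    """
    kept = [c for c in running if c.version == desired_version]
    kept = kept[:desired_replicas]
    used_names = {c.name for c in kept}

    next_id = 0
    while len(kept) < desired_replicas:
        name = f"web-{next_id}"
        next_id += 1
        if name in used_names:
            continue  # this name is already taken by a container we kept
        kept.append(Container(name, desired_version))
        used_names.add(name)
    return kept


running = [Container("web-0", "1.0"), Container("web-1", "1.0")]

# Scale from 2 to 4 replicas on the same version.
scaled = reconcile(running, desired_replicas=4, desired_version="1.0")

# Move every replica to version 1.1 with a single request.
upgraded = reconcile(scaled, desired_replicas=4, desired_version="1.1")

print([(c.name, c.version) for c in scaled])
print([(c.name, c.version) for c in upgraded])
```

The point of the sketch is that we only state the result we want (how many containers, which version) and leave the individual steps to the system, instead of reconfiguring each load balancer and container by hand.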

In the next section of this two-part article, we will delve some more into the innards of Kubernetes to understand exactly how it works. Stay tuned.