Kubernetes for the absolute beginner - Part II
In the previous part of this article, we looked at why Kubernetes is needed and defined its underlying technologies: containerization and Docker. If you are a complete newbie and are not familiar with those terms, it is advisable to read that first part before continuing.
Next, let’s look at how Kubernetes works, its main components and services, and how it is used to orchestrate, manage, and monitor containerized environments. For simplicity’s sake, we will assume we are using Docker containers, even though, as previously mentioned, Kubernetes supports several other container platforms apart from Docker.
Kubernetes Architecture and Components
First off, it is important to realize that Kubernetes works on the ‘desired state’ principle. This means that you define the desired state of your components, and it is up to the Kubernetes engine to bring them into line with this state. For instance, suppose you always want your web server running in 4 containers for load balancing purposes, and your database replicated into 3 different containers for redundancy purposes. This is your desired state. If any one of these 7 containers fails, the Kubernetes engine will detect this and automatically spin up a new container so that the desired state is maintained.
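To make the desired-state idea concrete, here is a minimal sketch in Python. It is only a simulation of the principle, not how Kubernetes is implemented: the `DESIRED` dictionary and the `reconcile` function are hypothetical names invented for this illustration, and containers are represented as plain strings.

```python
from collections import Counter

# Desired state from the example above: 4 web server containers, 3 database containers.
DESIRED = {"web": 4, "db": 3}

def reconcile(desired, running):
    """Compare running containers with the desired counts and return a corrected list.

    Anything not named in the desired state is dropped from the result.
    """
    counts = Counter(running)
    result = []
    for name, want in desired.items():
        have = counts.get(name, 0)
        if have < want:
            print(f"{name}: {have} running, starting {want - have} more")
        elif have > want:
            print(f"{name}: {have} running, stopping {have - want}")
        result.extend([name] * want)
    return result

# Simulate one of the four web containers failing.
running = ["web"] * 3 + ["db"] * 3
running = reconcile(DESIRED, running)
```

The point of the sketch is that you never tell the system "start one more web container". You only state how many there should be, and the loop works out the difference.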
Now let’s define some important Kubernetes components, as also covered in more detail in this blog post:
- When you first set up Kubernetes, you create a cluster. All other components are part of a cluster. You can also create multiple virtual clusters, called namespaces, within the same physical cluster. This is very similar to how you can create several virtual machines on the same physical server. If you don’t need, and therefore don’t explicitly define, any namespaces, then your objects are created in the default namespace, which always exists.
- Kubernetes runs on nodes, which are individual machines within the cluster. Nodes may correspond to physical machines if you run your own hardware, but more likely correspond to virtual machines running in the cloud. Nodes are where your application or service is deployed, and where the work in Kubernetes gets done. There are two types of nodes: the master node and worker nodes.
- The master node is a special node that controls all the others. On the one hand, it’s a node like any other in the cluster, which means it’s just another machine or virtual instance. On the other hand, it runs the software that controls the rest of the cluster. It sends messages to all the other nodes in the cluster to assign work to them, and they report back via an API Server on the master.
- The master node contains a component called the API Server. This API is the only endpoint for communication from the nodes to the control plane. The API Server is critically important because it is the point through which worker nodes and the master communicate about the status of pods, deployments, and all the other Kubernetes API objects.
- Worker nodes do the real work in Kubernetes. When you deploy containers or pods (to be defined shortly) in your application, you’re deploying them to be run on the worker nodes. Workers have the resources to host and run one or more containers.
- Kubernetes’ logical, not physical, unit of work is called a pod. A pod plays a role similar to that of a container in Docker: it is the smallest unit you deploy and manage. Remember we saw earlier that containers let you create isolated units of work that run independently. But to build more complex applications, such as a web server, you often need to combine multiple containers, which then run and are managed together in one pod. This is what pods are designed for: a pod lets you take multiple containers and specify how they come together to create your application. This further clarifies the relationship between Docker and Kubernetes: a Kubernetes pod usually contains one or more Docker containers, all managed as a unit. Read more about pods and containers in this other blog post.
- A Kubernetes service is a logical set of pods. Think of a service as a logical grouping of pods which provides a single IP address and DNS name through which you can access all pods within the service. With a service, it is very easy to set up and manage load balancing, and this helps a lot when you need to scale out your Kubernetes pods as we shall see shortly.
The Replication Controller, or its successor the ReplicaSet, is another key feature of Kubernetes. It is the component responsible for managing the pod lifecycle: it starts pods when the requested number of replicas is raised, or when pods go offline or are accidentally stopped, and it kills pods when the requested number is lowered, perhaps because of decreased user load. In other words, the replication controller helps achieve our desired state regarding the specified number of running pods.
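The way a service groups pods and spreads requests across them can be sketched in a few lines of Python. This is a simulation under stated assumptions, not Kubernetes code: the pod names, IP addresses, and the `select_pods` helper are hypothetical. Matching pods by labels reflects how services select pods in practice, a detail the bullets above do not spell out. The round-robin order is used only for illustration, since the real distribution strategy depends on how the cluster is configured.

```python
from itertools import cycle

# Hypothetical pods, each with labels and an IP address.
pods = [
    {"name": "web-1", "labels": {"app": "web"}, "ip": "10.0.0.1"},
    {"name": "web-2", "labels": {"app": "web"}, "ip": "10.0.0.2"},
    {"name": "db-1", "labels": {"app": "db"}, "ip": "10.0.0.3"},
]

def select_pods(pods, selector):
    """Return the pods whose labels contain every key/value pair in the selector."""
    return [
        p for p in pods
        if all(p["labels"].get(key) == value for key, value in selector.items())
    ]

# The "web" service groups every pod labelled app=web behind one entry point.
web_backends = select_pods(pods, {"app": "web"})
next_backend = cycle(web_backends)  # simple round-robin over the matching pods

# Route four incoming requests and record which pod served each one.
routed = [next(next_backend)["name"] for _ in range(4)]
print(routed)
```

If a third web pod is added to the `pods` list with the same label, it joins the rotation without any change on the caller's side, which is why services make scaling out so convenient.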
What is Kubectl?
You use the kubectl utility to communicate with the Kubernetes cluster and your pods. Kubectl is a command-line tool that you run from a shell such as Bash. For example, to list the pods in the current namespace you issue this command: kubectl get pods. To list the nodes in your cluster instead, use: kubectl get nodes. And this other command shows your cluster information: kubectl cluster-info. You can also use YAML to write definitions of almost all the objects we have discussed in this blog series. This other blog post delves some more into YAML for Kubernetes.
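As a rough illustration of what such an object definition contains, the Python sketch below builds one as a dictionary and prints it as JSON, which kubectl accepts in addition to YAML. The object is a Deployment, which manages a ReplicaSet on your behalf. The name `web`, the label `app: web`, and the image `nginx:1.25` are assumptions chosen for the example, not values from this series.

```python
import json

# Hypothetical Deployment: 4 replicas of a web server container.
# The name "web", the label "app: web", and the image are illustrative only.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 4,  # the desired number of pods
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            # Pods created from this template carry the label the selector looks for.
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [{"name": "web", "image": "nginx:1.25"}]},
        },
    },
}

manifest = json.dumps(deployment, indent=2)
print(manifest)
```

Saved to a file, output like this could be handed to the cluster with kubectl apply -f followed by the file name. The same structure written in YAML is what you will usually see in tutorials.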
Autoscaling in Kubernetes
Remember that one of the reasons for setting up Kubernetes, rather than using Docker containers directly, is Kubernetes’ ability to autoscale to meet the demands of our workload.
At the cluster level, autoscaling is achieved by setting up your cluster to increase the number of nodes as service demand increases, and to reduce the number of nodes as demand decreases. (Scaling the number of pods running on the existing nodes is handled separately, by the Horizontal Pod Autoscaler.) Keep in mind that nodes are ‘physical’ structures. We put ‘physical’ in quotes because, in many cases, they are actually virtual machines. Either way, because nodes are machines, our cloud platform must allow the Kubernetes engine to create new machines. Google Cloud has always supported this ability, and Microsoft Azure and Amazon’s EKS service recently added support for it as well. You can read a comparison of the various cloud providers’ support for Kubernetes in this article.
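Alongside adding and removing nodes, Kubernetes can also scale the number of pods. The official documentation for the Horizontal Pod Autoscaler describes the basic rule as: desired replicas = ceil(current replicas × current metric / target metric). The Python sketch below applies that rule and clamps the result between a minimum and a maximum. It is a simplified illustration: the function name `desired_replicas` and the default limits are assumptions for this example, and the real autoscaler applies further refinements that are not modelled here.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Apply the basic scaling rule, then keep the result within the allowed range."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))

# 4 pods averaging 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90, 60))

# 4 pods averaging 30% CPU against a 60% target: scale in.
print(desired_replicas(4, 30, 60))
```

When the extra pods no longer fit on the existing nodes, that is the moment node-level autoscaling comes into play and the cloud platform is asked for another machine.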
In this two-part series, we have offered only an introduction to Kubernetes. If you would like to learn more, especially about how to install and start working with Kubernetes, check out Cloudplex’s great blog and how-to series, along with the other Kubernetes tutorials. You can also go through the official Kubernetes documentation site.