Hey there, welcome to the Level Up article on Kubernetes. Before we begin, let me ask you a few questions. Do you or your team need Kubernetes for container orchestration? Do you want to learn Kubernetes but aren't sure where to begin? Are you looking to transform your organization? Do you want to simplify how you orchestrate containerized software? Then this article is the answer to all of those questions.
Kubernetes is meant to simplify things, and this article is meant to simplify Kubernetes for you! Kubernetes is a powerful open-source system, originally developed by Google, for managing containerized applications in a clustered environment. It has gained wide popularity and is becoming the standard way to deploy software in the cloud.
Learning Kubernetes is not difficult (with a good guide), and it offers great power, though the learning curve can be a little steep. So let us learn Kubernetes in a simplified way. This article covers Kubernetes' basic concepts, its architecture, the problems it solves, and more.
Kubernetes is a system for running and coordinating applications across numerous machines. It manages the complete lifecycle of containerized applications and services, using methods that foster predictability, scalability, and high availability.
A Kubernetes user is free to decide and define how applications should run and communicate. The user can also scale services up or down, perform rolling updates, switch traffic between different application versions, and more. Kubernetes offers interfaces and platform primitives for defining and managing applications.
Let us start with the hardware side. Understand one thing: Kubernetes itself doesn't demand any particular hardware; rather, the functioning cluster it manages is built out of whatever machines you give it.
What is a node in Kubernetes? A node is the smallest unit of computing hardware in Kubernetes: a single machine in a cluster. A node doesn't necessarily need to be a physical machine; it can be either physical or virtual. In a data center, a node is typically a physical machine, while on Google Cloud Platform it is a virtual machine. For now, we are discussing the hardware side of Kubernetes, so think of it accordingly, but don't limit a node to being just "the hardware part".
Thinking of nodes this way adds a layer of abstraction over the machines. There is no need to worry about the characteristics of any individual machine; we can simply view each one as a set of CPU and RAM resources. These machines sit in a cluster, and their resources can be used as required. This is how you think when it comes to Kubernetes! The moment resources can be drawn freely from any machine, the system becomes flexible and dynamic: any machine can substitute for any other machine within a Kubernetes cluster.
We have already discussed nodes, right? They appear to be small, tidy processing units, each doing its job in its own little home. All this sounds perfect, but it isn't the Kubernetes way yet! That is where the cluster comes in. A cluster is a collection of nodes that pool their resources together to form one powerful machine. Because the nodes are part of a cluster, you don't need to worry about the state of any individual node; if some nodes aren't performing well, the cluster takes care of it.
The cluster is intelligent. Do you know why? When programs are deployed onto the cluster, it dynamically handles the distribution of work: it assigns tasks to individual nodes. If a node is added or removed along the way, the cluster shifts the work around as needed. The programmer doesn't need to care about which individual machine is actually running the code. Oh, I just remembered something really interesting. Do you remember the "Borg" from Star Trek? Kubernetes grew out of Google's internal cluster manager, which was named Borg after exactly that.
As mentioned, programs run on the cluster, powered by the nodes, but they don't run on specific nodes; they run dynamically. This means their data can't simply be stored in any node's file system. Why? Suppose a program saves data to a file, and later the program is relocated to another node. The next time the program needs the file, it won't be at the expected place. For this reason, the traditional local storage attached to each node is treated as a temporary cache for running programs, and any locally saved data can't be expected to persist.
So where is data stored permanently? Yes, in persistent volumes. While the cluster manages the CPU and RAM resources of all its nodes, it is not responsible for permanent data storage; instead, local or cloud drives can be attached to the cluster as persistent volumes. Think of it as plugging an external drive into the cluster. A persistent volume offers a file system that can be mounted by the cluster without being tied to any specific node.
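As a concrete sketch, requesting persistent storage typically looks like the manifest below. The claim name and the size are illustrative, not from this article:

```yaml
# A PersistentVolumeClaim: a request for storage that is not tied
# to any specific node. The cluster binds it to a matching
# PersistentVolume (a local drive, cloud disk, etc.).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim        # illustrative name
spec:
  accessModes:
    - ReadWriteOnce       # mountable read-write by a single node at a time
  resources:
    requests:
      storage: 1Gi        # ask the cluster for 1 GiB of storage
```

A pod can then mount this claim through its `volumes` section, regardless of which node the pod lands on.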
So this is the hardware part of Kubernetes. Now let us move on to the software part.
Conceptually, Kubernetes is all about software, so this is the main part of Kubernetes.
In Kubernetes, programs run inside Linux containers. Containers are a globally accepted standard, and plenty of pre-built images already exist that can be deployed on Kubernetes. And do you know what containerization is? It lets you create self-contained Linux execution environments.
A program and all of its dependencies are packed into one single file and shared on the internet, so anyone can download the container image and deploy it on their infrastructure with very little setup. Containers can also be built programmatically, which enables effective CI and CD pipelines.
Containers are capable of running multiple programs, but it is recommended to limit each container to one process, because that helps with troubleshooting and makes updates and deployments easier. It is better to have many small containers than one big one.
Kubernetes has some unique traits, and one of them is that it doesn't run containers directly. Instead, it wraps one or more containers into a higher-level structure called a pod. Any containers in the same pod share the same resources and the same local network.
The benefit is that the containers can communicate with each other easily: they remain isolated from the outside, yet are readily available to one another.
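A pod with two containers can be sketched as below. The pod name and the image choices (an nginx main container plus a helper) are illustrative assumptions, not from the article:

```yaml
# A pod wrapping two containers. Both share the pod's network
# namespace, so they can reach each other over localhost.
apiVersion: v1
kind: Pod
metadata:
  name: web-pod               # illustrative name
spec:
  containers:
    - name: app               # the main process container
      image: nginx:1.25
      ports:
        - containerPort: 80
    - name: helper            # a helper container alongside it
      image: busybox:1.36
      command: ["sh", "-c", "sleep infinity"]
```

Both containers are scheduled onto the same node and live and die together as one unit.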
Pods can be replicated in Kubernetes. Suppose an application becomes popular and a single pod can't sustain the load. Kubernetes can then be configured to deploy new replicas of the pod as required.
But replication isn't only for heavy load; a pod can be replicated under normal conditions as well. This helps with uniform load balancing and resilience against failures.
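One way to configure this automatic replication is a HorizontalPodAutoscaler, sketched below. It targets a Deployment (a concept introduced a little later in this article); the name, replica bounds, and CPU threshold are all illustrative:

```yaml
# A HorizontalPodAutoscaler: tells Kubernetes to add or remove
# pod replicas based on observed CPU usage.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-deployment      # the workload whose replicas are scaled
  minReplicas: 2              # keep at least 2 replicas even when idle
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale out above 80% average CPU
```

Keeping `minReplicas` above one is what gives you the load balancing and failure resistance under normal conditions.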
Pods are capable of holding multiple containers, but you should limit yourself to one or two where possible. The reason is that pods scale up and down as a single unit: every container in the pod must scale with it, regardless of that container's individual needs. Scaling containers that don't need it wastes resources and inflates bills.
To avoid all this, limit your pods to a few containers. If you ever come across the term "sidecars", it refers to helper containers: there is the main process container, plus perhaps a few helpers alongside it.
As you may have noticed, pods are the basic unit in Kubernetes, but they usually aren't launched directly on a cluster. Instead, they are managed by one more layer of abstraction: the deployment. Its main purpose is to declare how many replicas of a pod should be running at a time.
When a deployment is added to the cluster, it spins up the requested number of pods and monitors them. If a pod dies, the deployment re-creates it.
The fun part is that with deployments, there is no need to deal with pods manually. You declare the desired state of the system, and everything is managed automatically.
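A minimal deployment manifest can look like the sketch below. The names, labels, and image are illustrative assumptions:

```yaml
# A Deployment declaring the desired state: "keep 3 replicas of
# this pod running". Kubernetes creates the pods, monitors them,
# and re-creates any that die.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 3                 # the declared number of pod replicas
  selector:
    matchLabels:
      app: web                # which pods this deployment manages
  template:                   # the pod template to replicate
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: app
          image: nginx:1.25
          ports:
            - containerPort: 80
```

Change `replicas` and re-apply the manifest, and Kubernetes converges the cluster to the new desired state on its own.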
We have discussed all the basic concepts of Kubernetes. Using them, you can create a cluster of nodes. Once the cluster is made, it is time to launch deployments of pods on the cluster. But how will you allow external traffic to your application? We haven’t discussed this yet.
By design, Kubernetes isolates pods from the outside world. To communicate with a service running inside a pod, an outsider needs an open channel into the cluster. That channel of communication is known as "ingress".
There are numerous ways to add ingress to a cluster, the most common being an Ingress controller or a load balancer. The differences between these methods are more technical than we need right now.
For now, understand that ingress is essential for experimenting with Kubernetes: even if you get everything else right, without ingress your application will be unreachable!
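For a flavor of what this looks like, here is a sketch of an Ingress resource. The host, service name, and port are illustrative, and an ingress controller (for example ingress-nginx) must be installed in the cluster for it to take effect:

```yaml
# An Ingress routing outside HTTP traffic to a Service inside
# the cluster.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: example.com             # illustrative hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service   # a Service fronting the pods
                port:
                  number: 80
```

Requests arriving for `example.com` are forwarded to the named Service, which in turn load-balances across the matching pods.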
Having discussed how Kubernetes deploys things, it is worth understanding why Kubernetes matters.
Containers are often compared to virtual machines, but they are far more lightweight while remaining scalable and isolated. Containers can be linked together, given security policies, limited in resource utilization, and so on. If your application infrastructure resembles the image shared below, then container orchestration is necessary.
It might be an Nginx/Apache + PHP/Python/Ruby/Node.js app running on a few containers, communicating with a replicated database. Container orchestration will manage all of this for you, so you don't have to do everything yourself.
Consider that your application keeps growing: you keep adding features and functionality, and at some point you realize it has become a huge monolith.
Now the application is too vast to manage, and it eats up your CPU and RAM. So you finally decide to split it into smaller chunks, each with a specific task. Now your infrastructure looks like this:
You also need a caching layer and a queue system for better asynchronous performance. And now there are new challenges: service discovery, load balancing, health checks, storage management, auto-scaling, and so on.
Who comes to your help under such hectic circumstances? Yes, container orchestration will be your savior, because it is extremely powerful and solves most of these challenges.
The main players are Kubernetes, AWS ECS, and Docker Swarm, and among these, Kubernetes is the most popular, with the largest community. Kubernetes solves all the major issues described above. It is portable: it runs on most cloud providers, on bare metal, and in hybrid combinations of these. It is also configurable and modular, and it provides features like auto-placement, auto-restart, auto-replication, and auto-healing of containers.
Most important of all, Kubernetes has an active online community. Its members meet up online as well as in person, in major cities around the world. The international conference "KubeCon" has proved a huge success, and there is an official Slack group for Kubernetes. Major cloud providers like Google Cloud Platform, AWS, Azure, and DigitalOcean also offer their own support channels. So what are you waiting for?
Learning Kubernetes is not as difficult as you might think. I recommend the Kubernetes Video Tutorial Series, which offers stepwise, detailed Kubernetes learning. Level Up also offers a Learn Kubernetes from a DevOps Guru video course. So let us begin!