A Minimalistic Guide to Kata Containers

Recently I discovered an interesting project called Kata Containers. It is an open source project hosted by the OpenStack Foundation. Kata Containers is the merger of Hyper.sh runV and Intel’s Clear Containers.

Kata Containers provide the isolation guarantees of a virtual machine with the speed and ease of use of containers. As shown in the image below, virtual machines in the top left provide the strictest form of isolation, but they are slow to boot and their size on disk ranges from 500MB to several GBs. Containers, in the bottom right, are fast and nimble, but they don’t provide the same level of isolation. Kata Containers offer the best of both worlds: the speed of containers and the security and isolation guarantees of a virtual machine.

katacontainers-vs-containers-vs-vms

Containers face the shared-kernel problem: when multiple containers run on a single host and one of them is exploited, the attacker can potentially gain access to all the other containers on that host.

Kata Containers are highly optimised virtual machines that run the end-user application in a container. In essence, there is a one-to-one mapping between container and virtual machine, as shown below. These virtual machines are lightweight and optimised, so you don’t pay the huge cost of running traditional virtual machines.

katacontainers-architecture

The main difference between containers and Kata Containers is that containers rely on software virtualisation provided by the kernel, whereas Kata Containers rely on hardware virtualisation. Containers for different workloads share the same OS kernel, which leads to security and privacy concerns. Kata Containers address this need to run disparate workloads securely. They are fast to boot because the virtual machines use a trimmed-down version of the OS that is only responsible for booting the VM and handing over control to the container.

Kata Containers is an OCI-compatible runtime, which means you can use it with container orchestration platforms like Kubernetes. The image below shows how Kata Containers works with Kubernetes.

kata-containers-architecture
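
One way this integration can look in practice is sketched below, assuming a cluster whose nodes already have Kata Containers installed and whose container runtime (containerd or CRI-O) exposes it under a handler named kata. The handler name, the Pod name, and the image are assumptions made for illustration; the handler must match whatever your installation actually configures.

```yaml
# RuntimeClass that maps a cluster-level name to the runtime handler.
# "kata" as the handler name is an assumption; it depends on the node setup.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata
---
# Pod that opts in to the Kata runtime through runtimeClassName.
# The Pod name and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-workload
spec:
  runtimeClassName: kata
  containers:
    - name: app
      image: nginx
```

Pods that do not set runtimeClassName keep using the cluster's default runtime, so isolated and regular workloads can run side by side.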

Kubernetes Tip: How to access Pod metadata in containers running inside the pod?

Today, I faced a requirement where a running container needed to access its Pod’s metadata. For my use case, the running container needed to know the namespace it belonged to. After spending some time on Google, I learnt about the Kubernetes Downward API, which exposes Pod information to a running container in the form of environment variables.
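
A minimal sketch of this approach is shown below. The Pod name, image, and the environment variable name POD_NAMESPACE are placeholders chosen for illustration; the fieldRef with fieldPath metadata.namespace is the part that asks the Downward API for the namespace.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api                            # placeholder Pod name
spec:
  containers:
    - image: com.shekhargulati/api     # placeholder image
      name: api
      env:
        # Populated by Kubernetes with the namespace the Pod runs in.
        - name: POD_NAMESPACE          # variable name is an arbitrary choice
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
```

Inside the container, the application can then read POD_NAMESPACE like any other environment variable.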


The Kubernetes Guide: Part 1: Learn Kubernetes by deploying a real-world application on it

This is the guide I wish I had when I was starting my Kubernetes journey. Kubernetes is a complex technology with many new concepts that take time to get your head around. In this guide, we will take an incremental approach to deploying applications on Kubernetes. We will cover the what and why of Kubernetes, and then we will learn how to deploy a real-world application on it. We will first run the application locally, then using Docker containers, and finally on Kubernetes. The guide will also cover Kubernetes architecture and important Kubernetes concepts like Pods, Services, and Deployments.

In this guide, we will cover the following topics:

  1. What is Kubernetes?
  2. The real reasons you need Kubernetes
  3. Kubernetes Architecture
  4. Deploying a real-world application on Kubernetes

What is Kubernetes?

Kubernetes is a platform for managing application containers across multiple hosts. It abstracts away the underlying hardware infrastructure and acts as a distributed operating system for your cluster.

Kubernetes is Greek for helmsman or pilot (the person holding the ship’s steering wheel).

kubernetes-os

Kubernetes plays three important roles:

  1. Referee
    • Kubernetes allocates and manages access to fixed resources using built-in resource abstractions like Persistent Volume Claims, Resource Quotas, Services, etc.
    • Kubernetes provides an abstracted control plane for scheduling, prioritizing, and running processes.
    • Kubernetes provides a sandboxed environment so that applications do not interfere with each other.
    • Kubernetes allows users to specify memory and CPU constraints for an application. It ensures that applications stay within their limits.
    • Kubernetes provides a communication mechanism so that services can talk to each other if required.
  2. Illusionist
    • Kubernetes gives the illusion of a single infinite compute resource by abstracting away the hardware infrastructure.
    • Kubernetes provides the illusion that you need not care about the underlying infrastructure. It can run on bare metal, in a data centre, on the public cloud, or even on a hybrid cloud.
    • Kubernetes gives the illusion that applications need not care about where they will be running.
  3. Glue
    • Kubernetes provides common abstractions like Services, Ingress, auto scaling, rolling deployments, volume management, etc.
    • Kubernetes comes with security primitives like Namespaces and RBAC that applications can use transparently.
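
As an illustration of the Referee role, the memory and CPU constraints mentioned above are declared per container in the Pod spec. The sketch below uses a placeholder Pod name and image, and the numbers are arbitrary example values rather than recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api                            # placeholder Pod name
spec:
  containers:
    - image: com.shekhargulati/api     # placeholder image
      name: api
      resources:
        # What the scheduler reserves for the container when placing the Pod.
        requests:
          memory: "256Mi"
          cpu: "250m"
        # Upper bounds enforced at runtime.
        limits:
          memory: "512Mi"
          cpu: "500m"
```

A container that tries to use more CPU than its limit is throttled, while one that exceeds its memory limit can be terminated.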

I learnt about the three roles (Referee, Illusionist, and Glue) from the book Operating Systems: Principles and Practice by Thomas Anderson and Michael Dahlin.


Kubernetes Tip: How to refer to one environment variable in another environment variable declaration?

In Kubernetes, one way to pass configurable data to containers is to use environment variables. Below is a Pod definition that uses two environment variables.

apiVersion: v1
kind: Pod
metadata:
  name: api
spec:
  containers:
    - image: com.shekhargulati/api
      name: api
      env:
        - name: DATABASE_NAME
          value: "mydb"
        - name: DATASOURCE_URL
          value: "jdbc:mysql://mysql:3306/mydb"
      ports:
        - containerPort: 8080

As you can see in the above Pod definition, we are using the database name mydb twice. Wouldn’t it be awesome if we could use DATABASE_NAME in DATASOURCE_URL?

Kubernetes supports this use case by providing the $(VAR) syntax, as shown below.

apiVersion: v1
kind: Pod
metadata:
  name: api
spec:
  containers:
    - image: com.shekhargulati/api
      name: api
      env:
        - name: DATABASE_NAME
          value: "mydb"
        - name: DATASOURCE_URL
          value: "jdbc:mysql://mysql:3306/$(DATABASE_NAME)"
      ports:
        - containerPort: 8080
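
Two details of the $(VAR) syntax are worth keeping in mind. To the best of my understanding of the Kubernetes API, a reference is only expanded if the variable it names is defined earlier in the same env list (otherwise the text is left unchanged), and a doubled dollar sign escapes the reference. The sketch below illustrates both; DATABASE_NAME_TEMPLATE is a hypothetical variable added only for this example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api
spec:
  containers:
    - image: com.shekhargulati/api
      name: api
      env:
        # Defined first so that later entries can reference it.
        - name: DATABASE_NAME
          value: "mydb"
        # Expanded to jdbc:mysql://mysql:3306/mydb
        - name: DATASOURCE_URL
          value: "jdbc:mysql://mysql:3306/$(DATABASE_NAME)"
        # $$ escapes the reference, so the value stays the literal text $(DATABASE_NAME).
        # Hypothetical variable, for illustration only.
        - name: DATABASE_NAME_TEMPLATE
          value: "$$(DATABASE_NAME)"
```

If DATASOURCE_URL were listed before DATABASE_NAME, the container would see the unexpanded string, so the order of entries matters.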