Issue #25: 10 Reads, A Handcrafted Weekly Newsletter For Software Developers

The time to read this newsletter is 150 minutes.

The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn and relearn. — Alvin Toffler

  1. The Hard Truth About Innovative Cultures: 20 mins read. This post answers a question that I was struggling to find the right answer. If there is one post that you should read this week then It should be this one. From the post:

    > A tolerance for failure requires an intolerance for incompetence. A willingness to experiment requires rigorous discipline. Psychological safety requires comfort with brutal candor. Collaboration must be balanced with a individual accountability. And flatness requires strong leadership. Innovative cultures are paradoxical. Unless the tensions created by this paradox are carefully managed, attempts to create an innovative culture will fail.

  2. When AWS Autoscale Doesn’t: 15 mins read. This post by folks at Segment share valuable lessons on AWS autoscaling. The key points for me in the post are:
    1. AWS autoscaling for ECS follows the formula new_task_count = current_task_count * ( actual_metric_value / target_metric_value ). The ratio actual_metric_value/target_metric_value limit the magnitude of scale out event. To overcome this, you either have to reduce the target value leading to over scale all the time or use custom CloudWatch metric
    2. The default cool down time for scale out event is 3 minutes and cooldown for scale in event is 5 minutes
  3. Multiply your time by asking 4 questions about the stuff on your to-do list: 10 mins read. This post won’t tell you how to magically make each day 38 hours long (we’re still working on that). But by assessing our tasks in terms of their significance, we can free up more time tomorrow.

  4. Dotfile madness: 10 mins read. I just counted my home directory has more than 30 hidden directories. The post makes a valid argument against proliferation of dot files and dot directories. The author writes:

    > Avoid creating files or directories of any kind in your user’s $HOME directory in order to store your configuration or data. This practice is bizarre at best and it is time to end it. I am sorry to say that many (if not most) programs are guilty of doing this while there are significantly better places that can be used for storing per-user program data.

  5. Life of a SQL query: 15 mins read. What happens when you run a SQL statement? We follow a Postgres query transformation by transformation as a query is processed and results are returned.

  6. Splitting Up a Codebase into Microservices and Artifacts: 10 mins read. This is the first post that you should read if you are thinking about Microservices. I like the way this post first talked about using module boundary to split the code base. If module boundaries are not enough then you should think about Microservices. In my opinion, you should choose Microservices 1) to scale engineering organization 2) the real need for your polyglot environment depending on your business problem.

  7. Golang Datastructures: Trees: 20 mins read. This is an awesome read even if you can’t comprehend Golang. This beautifully written post explains how to implement a simple DOM tree in Golang. It shows implementation of breadth first search and depth first search algorithms to implement find functionality. I thoroughly enjoyed this post.

  8. Deploying Python ML Models with Flask, Docker and Kubernetes: 30 mins read. This is an extensive tutorial that shows how to deploy Python Flask applications on Kubernetes. It covers how to deploy Machine Learning (ML) models into production environments by exposing them as RESTful API Microservices hosted from within Docker containers, that are in-turn deployed to a cloud environment.

  9. A Minimalistic Guide to Kata Containers: 5 mins read. This is a short post that I wrote about Kata Containers. Kata Containers provide the best of containers and virtual machines. Read the post to learn more.

  10. Building a Better Profanity Detection Library with scikit-learn: 15 mins read. This post covers how you can write your own profanity filter using machine learning. The author starts by giving reasons why he didn’t use existing profanity libraries and then he goes over the steps required to create your own profanity detection library.

https://www.youtube.com/watch?v=6oPj-DW09DU

A Minimalistic Guide to Kata Containers

Recently I discovered an interesting project called Kata Containers. It is an open source project hosted by OpenStack foundation. Kata Containers is the merger of Hyper.sh runV and Intel’s Clear Containers.

Kata Containers provide the isolations guarantee of a virtual machine and speed and ease of use of containers. As shown in the image below, virtual machines in the top left provide the strictest form of isolation but they are slow to boot up and their size on disk range from 500MB to GBs. On the other hand, containers in the bottom right are fast and nimble but they don’t provide the strictest form of isolation. Kata Containers are best of both worlds. They provide the speed of containers and security and isolation guarantees of virtual machine.

katacontainers-vs-containers-vs-vms

Containers face shared kernel problem, where if on a single host you have multiple containers, if one of those containers gets exploited, you can potentially have access to all the other containers on that host.

Kata Containers are highly optimised virtual machines that run the end user application in a container. So in essence, there is a one-to-one mapping between container and virtual machine as shown below. These virtual machines are lightweight and optimised so you don’t pay the huge cost of running traditional virtual Machines.

katacontainers-architecture

The main difference between containers and kata containers is that containers rely on software virtualisation provided by kernel where as Kata containers rely on hardware virtualisation. Containers for different workloads share the same OS kernel which leads to security and privacy concerns. Kata Containers are addressing this need of securely running disparate workloads. They are fast to boot as the virtual machines use a trimmed down version of OS that’s only responsible for booting the VM and handling over the control to the container.

Kata containers are OCI compatible runtime which means you can use them with container orchestration platforms like Kubernetes. The below image shows how Kata Containers will work with Kubernetes.

kata-containers-architecture

Using Java Flight Recorder to Profile Spring Boot applications

Few months back I had to do performance optimisation of a low latency application. The tool that helped me a lot was Java Flight Recorder. Today, I had to do some similar work and I completely forgot how I was able to launch flight recorder GUI. In this short post, I will show you the process that I followed to create flight recorder recordings.

From the official documentation,

Java Flight Recorder (JFR) is a tool for collecting diagnostic and profiling data about a running Java application. It is integrated into the Java Virtual Machine (JVM) and causes almost no performance overhead, so it can be used even in heavily loaded production environments. When default settings are used, both internal testing and customer feedback indicate that performance impact is less than one percent. For some applications, it can be significantly lower. However, for short-running applications (which are not the kind of applications running in production environments), relative startup and warmup times can be larger, which might impact the performance by more than one percent. JFR collects data about the JVM as well as the Java application running on it.

Continue reading “Using Java Flight Recorder to Profile Spring Boot applications”

Gradle Tips

Over the last few years I have started using Gradle as my primary build tool for JVM based projects. Before using Gradle I was an Apache Maven user. Gradle takes best from both Apache Maven and Apache Ant providing you best of both worlds. Gradle borrows flexibility from Ant and convention over configuration, dependency management and plugins from Maven. Gradle treats task as first class citizen just like Ant.

A Gradle build has three distinct phases – initialization, configuration, and execution. The initialization phase determine which all projects will take part in the build process and create a Project instance for each of the project. During configuration phase, it execute build scripts of all the project that are taking part in build process. Finally, during the execution phase all the tasks configured during the configuration phase are executed.

In this post, I will list down tips that I have learnt over last few years.

Continue reading “Gradle Tips”

Using Jenkins Config File Provider Plugin to allow Jenkins slave to access Maven’s global settings.xml

This week I had to write a Jenkins pipeline script (Jenkinsfile) that involved publish build artifacts to Nexus. The project is a Java Maven based Spring Boot application. The build script uses Maven Nexus plugin to publish build artifacts to Nexus. The build pipeline was executed on slaves running on OpenShift container platform. The build pipeline was failing when deploying to Nexus. The build was failing with Return code is: 401, ReasonPhrase: Unauthorized.

It was clear that issue is related to Nexus credentials not available to Maven. The Nexus credentials were available in Maven global settings.xml that resides under ~/.m2/settings.xml. In our case, settings.xml was available on the Jenkins master. As build pipeline ran on slave it does not had access to Jenkins settings.xml.

Continue reading “Using Jenkins Config File Provider Plugin to allow Jenkins slave to access Maven’s global settings.xml”

Kubernetes Tip: How to access Pod metadata in containers running inside the pod?

Today, I faced a requirement where a running container need to access Pod’s metadata. For my usecase, the running container need to know the namespace it belonged to. After spending time on Google, I learnt about Kubernetes Downward API that exposes Pod information to running container in the form of environment variables.

Continue reading “Kubernetes Tip: How to access Pod metadata in containers running inside the pod?”

The Kubernetes Guide For Java Developers: Part 1: Learn Kubernetes by deploying a real-world application on it

This is the guide I wish I had when I was starting my Kubernetes journey. Kubernetes is a complex technology with many new concepts that takes time to get your head around. In this guide, we will take an incremental approach to deploying applications on Kubernetes. We will cover what and why of Kubernetes and then we will learn how to deploy a real-world application on Kubernetes. We will first run application locally, then using Docker containers, and finally on Kubernetes. The guide will also cover Kubernetes architecture and important Kubernetes concepts like Pods, Services, Deployment.

In this guide, we will cover following topics:

  1. What is Kubernetes?
  2. The real reasons you need Kubernetes
  3. Kubernetes Architecture
  4. Deploying a real world application on Kubernetes

What is Kubernetes?

Kubernetes is a platform for managing application containers across multiple hosts. It abstracts away the underlying hardware infrastructure and acts as a distributed operating system for your cluster.

Kubernetes is a greek for Helmsman or Pilot (the person holding the ship’s steering wheels)

kubernetes-os

Kubernetes play three important roles:

  1. Referee
    • Kubernetes allocates and manages access to fixed resources using build in resource abstractions like Persistent Volume Claims, Resource Quotas, Services etc
    • Kubernetes provides an abstracted control plane for scheduling, prioritizing, and running processes.
    • Kubernetes provides a sandboxed environment so that applications do not interfere with each other.
    • Kubernetes allows users to specify the memory and CPU constraints on the application. It will ensure application remain in their limits.
    • Kubernetes provides communication mechanism so that services can talk among each other if required.
  2. Illusionist
    • Kubernetes gives the illusion of single infinite compute resource by abstracting away the hardware infrastructure.
    • Kubernetes provides the illusion that you need not care about underlying infrastructure. It can run on a bare metal, in data centre, on the public cloud, or even hybrid cloud.
    • Kubernetes gives the illusion that applications need not care about where they will be running.
  3. Glue
    • Kubernetes provides common abstractions like Services, Ingress, auto scaling, rolling deployment , volume management, etc.
    • Kubernetes comes with security primitives like Namespaces, RBAC that applications can use transparently

I learnt about the three roles – Referee, Illusionist, and Glue from the book Operating Systems Principles and Practices by Thomas Anderson and Michael Dahlin

Continue reading “The Kubernetes Guide For Java Developers: Part 1: Learn Kubernetes by deploying a real-world application on it”