Using Testcontainers with a Kotlin Spring Boot App

In this quick post I will show you how to use Testcontainers with a Kotlin Spring Boot app. Testcontainers is a Java library that you can use to write integration tests that work with real third-party services such as databases, message queues like Kafka, or any other service that can be packaged as a Docker container image.

Most developers use in-memory databases or fakes to test code with external dependencies, but to test against the real stack you need to use real services. This week I faced an issue where one of my tests was failing because a MySQL function was unavailable in H2. I was using an in-memory H2 database in tests, while my application used MySQL in production. So, my valid MySQL query did not work when run in a test. This made me think of replacing H2 with a MySQL database for tests.

I was aware of Testcontainers but had never used it in any application, so this was the first time I used it. For the most part I liked it. I don’t have to work around the limitations of H2, and I can hope I will not discover any issues caused by differences between H2 and MySQL. Testcontainers is not perfect, though. It makes tests slow. My build time has increased from 6 minutes to 9 minutes just because of Testcontainers, so I am relying more on my CI server to run the complete test suite.

Testcontainers can do much more than running databases. You can use Testcontainers for:

  • Stream processing solutions like Apache Kafka and Apache Pulsar
  • AWS Localstack
  • Selenium tests
  • Chaos tests

My Thoughts On Monorepo

A monorepo is a software development strategy where a single version control repository holds the source code for multiple projects, libraries, and applications, irrespective of their programming language. Organizations using a monorepo strategy also often use a common build tool (like Bazel, Pants, or Buck) to manage all the source code. Some popular examples of organizations that employ a monorepo strategy are Google, Facebook, Twitter, Microsoft, and Uber.

Before we start, let me give some context on my background so that you can better understand my thoughts on Monorepo.

I head technology at an IT services organization. Most of the products that I build use a Microservices architecture and have multiple frontends (web and mobile). The biggest product that I recently built had close to 30 microservices, one web client written in React, and a native mobile app built using React Native. These numbers are nowhere near the numbers big product companies have shared.

I prefer Macroservices over Microservices. I think most products don’t need more than 10 microservices. 

Writing scripts in Java 11 and beyond

Many Java users hate it at times for being too verbose. Many of us have started using other languages like Kotlin or Scala for their terseness and expressiveness. One of the features that Java programmers like about these modern languages is their ability to do quick experimentation. There are times you do not want to create a project in your IDE, configure a build tool, create a package structure, and then write the code. There is a lot of yak shaving involved before you can automate that one thing that is nagging you. You just want to write a script and move on.

Before JDK 11, the Java language did not give you the means to achieve this. Since Java 11, you can write single-source-file scripts to automate tasks. You do not have to compile the Java file before executing it. You write your Java file and then directly execute it with the java launcher.
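To make this concrete, here is a minimal sketch of such a script. The class name CountFiles and the task it performs (counting the entries in a directory) are made up for illustration; only the single-file launch mechanism comes from Java 11 itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical example script: counts the entries directly inside a directory.
class CountFiles {

    // Returns the number of files and sub-directories directly inside dir.
    static long countEntries(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.count();
        } catch (IOException e) {
            // Re-throw unchecked so callers need no throws clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Use the first argument as the directory, or the current directory if none is given.
        Path dir = Path.of(args.length > 0 ? args[0] : ".");
        System.out.println(dir.toAbsolutePath().normalize()
                + " contains " + countEntries(dir) + " entries");
    }
}
```

Saved as CountFiles.java, this should run with `java CountFiles.java some/dir`, with no separate javac step and no build tool.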

Sending Outlook Calendar Invite using Java Mail API

Today I had to implement functionality for sending an Outlook Calendar invite. The app backend was written in Java. As it turns out, the simple task of sending a calendar invite is much more complicated than I expected. You have to construct the message with all the right parameters set, or else your calendar invite will not behave as you expect. I wasted an hour or two figuring out why the RSVP buttons were not appearing in the invite. As it turned out, one missing parameter caused the issue. Also, I wanted to send the calendar invite to both Outlook and Google Calendar.
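To give a feel for the kind of parameters involved, here is a rough sketch that builds the iCalendar text of a meeting request using only the JDK. It is not the code from this post, and the excerpt does not say which parameter was missing. The class name, method name, and PRODID value are placeholders. METHOD:REQUEST and the RSVP=TRUE attendee parameter are standard iCalendar (RFC 5545 and RFC 5546) fields that calendar clients typically look at when deciding whether to treat a message as an invitation.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: builds the text/calendar body of a meeting request.
class CalendarInviteSketch {

    // iCalendar UTC date-time form, e.g. 20240115T100000Z.
    private static final DateTimeFormatter UTC_STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd'T'HHmmss'Z'");

    static String utc(ZonedDateTime time) {
        return time.withZoneSameInstant(ZoneOffset.UTC).format(UTC_STAMP);
    }

    // Simplification: values are not escaped and long lines are not folded,
    // both of which RFC 5545 requires for arbitrary input.
    static String buildRequest(String uid, String organizerEmail, String attendeeEmail,
                               String summary, ZonedDateTime start, ZonedDateTime end,
                               ZonedDateTime createdAt) {
        return String.join("\r\n",
                "BEGIN:VCALENDAR",
                "PRODID:-//Example//Invite Sketch//EN", // placeholder product id
                "VERSION:2.0",
                "METHOD:REQUEST",                        // marks this as an invitation
                "BEGIN:VEVENT",
                "UID:" + uid,
                "DTSTAMP:" + utc(createdAt),
                "DTSTART:" + utc(start),
                "DTEND:" + utc(end),
                "SUMMARY:" + summary,
                "ORGANIZER:mailto:" + organizerEmail,
                "ATTENDEE;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE:mailto:"
                        + attendeeEmail,
                "END:VEVENT",
                "END:VCALENDAR") + "\r\n";
    }
}
```

The mail part that carries this text would typically be sent with a text/calendar content type whose method parameter matches the METHOD line. How that part is attached depends on the mail library in use.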

Writing Bencode Parser in Kotlin

This week I decided to write some Kotlin for fun. The best way to learn something while having fun is to build something with it. So, I decided to write a Bencode parser. From Wikipedia[1]:

Bencode is the encoding used by the peer-to-peer file sharing system BitTorrent for storing and transmitting loosely structured data.

Bencode supports four data types: Strings, Integers, Lists, and Dictionaries.

Strings are encoded as <string length encoded in base ten ASCII>:<string data>. So, spam becomes 4:spam

Integers are encoded as i<integer encoded in base ten ASCII>e. So, positive 3 becomes i3e and negative 3 becomes i-3e

Lists are encoded as l<bencoded values>e. So, the list [spam, eggs] becomes l4:spam4:eggse

Finally, dictionaries are encoded as d<bencoded string><bencoded element>e. So, the dictionary { "cow" => "moo", "spam" => "eggs" } becomes d3:cow3:moo4:spam4:eggse. Dictionary values can be of any of the supported types.

I took the above examples from the BitTorrent specification document[2].

Now that we understand Bencode, let’s start building the parser.
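As a quick check of the four rules, here is a compact decoder sketch. It is written in Java so it runs as a single file, and it is not the Kotlin parser this post sets out to build. The class and method names are made up. It also assumes the input is plain ASCII text held in a String, whereas real Bencode strings are byte strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical recursive-descent decoder for the four Bencode types.
class BencodeSketch {
    private final String input;
    private int pos = 0;

    private BencodeSketch(String input) { this.input = input; }

    // Decodes one value and rejects trailing data.
    static Object decode(String input) {
        BencodeSketch parser = new BencodeSketch(input);
        Object value = parser.parseValue();
        if (parser.pos != input.length()) {
            throw new IllegalArgumentException("Trailing data at index " + parser.pos);
        }
        return value;
    }

    private char peek() {
        if (pos >= input.length()) throw new IllegalArgumentException("Unexpected end of input");
        return input.charAt(pos);
    }

    private Object parseValue() {
        char c = peek();
        if (c == 'i') return parseInteger();
        if (c == 'l') return parseList();
        if (c == 'd') return parseDictionary();
        if (Character.isDigit(c)) return parseString();
        throw new IllegalArgumentException("Unexpected '" + c + "' at index " + pos);
    }

    // i<digits>e
    private Long parseInteger() {
        int end = input.indexOf('e', pos);
        if (end < 0) throw new IllegalArgumentException("Unterminated integer");
        long number = Long.parseLong(input.substring(pos + 1, end));
        pos = end + 1;
        return number;
    }

    // <length>:<data>
    private String parseString() {
        int colon = input.indexOf(':', pos);
        if (colon < 0) throw new IllegalArgumentException("Missing ':' in string");
        int length = Integer.parseInt(input.substring(pos, colon));
        int start = colon + 1;
        if (start + length > input.length()) throw new IllegalArgumentException("Truncated string");
        pos = start + length;
        return input.substring(start, pos);
    }

    // l<values>e
    private List<Object> parseList() {
        pos++; // skip 'l'
        List<Object> items = new ArrayList<>();
        while (peek() != 'e') items.add(parseValue());
        pos++; // skip 'e'
        return items;
    }

    // d<string><value>...e
    private Map<String, Object> parseDictionary() {
        pos++; // skip 'd'
        Map<String, Object> entries = new LinkedHashMap<>();
        while (peek() != 'e') {
            if (!Character.isDigit(peek())) {
                throw new IllegalArgumentException("Dictionary key must be a string");
            }
            String key = parseString();
            entries.put(key, parseValue());
        }
        pos++; // skip 'e'
        return entries;
    }

    public static void main(String[] args) {
        System.out.println(decode("d3:cow3:moo4:spam4:eggse"));
    }
}
```

Integers come back as Long, lists as List, and dictionaries as an insertion-ordered Map, so nested values fall out of the recursion without any extra handling.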

Paper Summary: Monarch: Google’s Planet-Scale In-Memory Time Series Database

This week I read the Monarch paper by Google engineers. The paper covers in detail the design decisions involved in building Monarch. Monarch, as the title of the paper suggests, is an in-memory time series database. It is used by Google’s monitoring system, which monitors most of the Google web properties like Gmail, YouTube, and Google Maps.

Every second, the system ingests terabytes of time series data into memory and serves millions of queries.

These are some very big numbers. Most of us do not have to deal with such large volumes of data in our day-to-day work. Reading this paper can help us understand how engineers building such systems make design decisions and tradeoffs.

usql: The Universal command-line interface for SQL databases

Recently I discovered a useful utility called usql while trying to find a better CLI for MySQL. I personally prefer psql, the PostgreSQL command-line tool, so I was trying to find a similar tool for MySQL. During my search for a psql-like MySQL CLI I stumbled upon usql, a universal command-line interface for SQL databases. I like playing with CLI tools as they play a big role in improving the developer experience.

I discovered that there is another popular project in Microsoft world with the same name. It is called U-SQL. U-SQL is the new big data query language of the Azure Data Lake Analytics service.

Microsoft’s Distributed Application Runtime (Dapr)

We are living in a world where every other day we see a new technical innovation. For most of us mere mortals it is not easy to quickly figure out if a technology is worth our time. Making sense of new technical innovations is one of the core aspects of my current role, so by taking the pain of writing this series I will help myself as well.

In this post, I will cover Dapr. Dapr stands for distributed application runtime.

What does distributed application runtime mean?

These days most of us are developing distributed systems. If you are building an application that uses Microservices architecture then you are building a distributed system.

A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system.

A distributed application runtime provides facilities that you can use to build distributed systems. These facilities include:

  1. State management. Most applications need to talk to a datastore to store state. Common examples include PostgreSQL and Redis.
  2. Pub/sub. For communication between different components and services.
  3. Service-to-service communication. This also includes retries and circuit breaking.
  4. Observability. To bring visibility into systems.
  5. Secret management. For storing passwords and keys.
  6. Many others

One important thing to note is that these are application level concerns. Distributed application runtime does not concern itself with infrastructure or network level concerns.

Present vs Future Focussed People

The text below is from the book Hell Yeah or No. It gives a good mental model to reason about how one lives their life. I don’t think one has to be entirely in one list or the other. You will be more aligned to one list than the other. Also, I don’t think one is better than the other.

Present-focused people

  • Pursue pleasure, excitement, and novelty
  • Focus on immediate gratification
  • Especially appreciate life, nature, and the people around them
  • Are playful, impulsive, and sensual
  • Avoid anything boring, difficult, or repetitive
  • Get fully immersed in the moment and lose track of time
  • Are more likely to use drugs and alcohol
  • Are better at helping others than helping themselves

Future-focused people

  • Delay gratification
  • Are driven with self-discipline because they vividly see their future goals
  • Tend to live in their minds, picturing other selves, scenarios, and possible futures
  • Especially love their work
  • Exercise, invest, and go for preventative health exams
  • Are better at helping themselves, but worse at helping others
  • Are more likely to be successful in their careers, but often at the expense of personal relationships, which require a present focus

Creating Visualization for Organization Entity Chart with Multiple Parents using dagre-d3

Last week I had to create a visualisation for an organization entity chart. The complexity was that an entity can be created by merging two entities. So, I was looking for a solution that gives the flexibility to have two parent nodes. I started with D3.js but quickly figured out that tree charts in D3.js can’t have two parent nodes. After a lot of googling and trying many different libraries I decided to create the chart using Graphviz. The only problem was that I wanted to do it on the client side.

Below is the chart that I wanted to create. As you can see below, Dummy Org ABC Limited has two parents: Dummy Org India Ltd and Dummy Org East Africa DMCC. Similarly, Dummy Org Devs Project has two parents: Dummy Org Infrastructure Limited and Dummy Org Infrastructure & Dummy Org Project.
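A chart where a node can have two parents is a directed graph rather than a tree, which is why a graph description format like Graphviz DOT fits it. As a rough illustration, and not the client-side code this post is about, the sketch below turns a child-to-parents map into DOT text. The class and method names are made up, and only the first relationship described above is used as sample data.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: renders a child -> parents map as a Graphviz DOT digraph.
class OrgChartDotSketch {

    // Escapes embedded double quotes so the name is a valid quoted DOT identifier.
    private static String quote(String name) {
        return "\"" + name.replace("\"", "\\\"") + "\"";
    }

    // Emits one "parent -> child" edge per parent, so a child may have any number of parents.
    static String toDot(Map<String, List<String>> parentsByChild) {
        StringBuilder dot = new StringBuilder("digraph org {\n");
        for (Map.Entry<String, List<String>> entry : parentsByChild.entrySet()) {
            for (String parent : entry.getValue()) {
                dot.append("  ").append(quote(parent))
                   .append(" -> ").append(quote(entry.getKey())).append(";\n");
            }
        }
        return dot.append("}\n").toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> parentsByChild = new LinkedHashMap<>();
        parentsByChild.put("Dummy Org ABC Limited",
                List.of("Dummy Org India Ltd", "Dummy Org East Africa DMCC"));
        System.out.print(toDot(parentsByChild));
    }
}
```

Because each parent simply contributes one more edge, a merged entity needs no special case in the data model.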

