Tag Archives: unit-testing

Paper Summary: Simple Testing Can Prevent Most Failures

Today evening, I decided to read paper Simple Testing Can Prevent Most Failures: An analysis of Production Failures in Distributed Data-intensive systems.

This paper asks an important question

Why widely used distributed data systems designed for high availability like Cassandra, Redis, Hadoop, HBase, and HDFS. experience failures and what can be done to increase their resiliency?

We have to answer this question keeping in mind that these systems are developed by some of the best software developers in the world following good software development practices and are intensely tested.

These days most of us are building distributed systems. We can apply the findings shared in this post to build systems that are more resilient to failure.

The paper shares:

Most of the catastrophic system failures are result of incorrect handling of non-fatal errors explicitly signalled in the software

This falls into 1) empty error handling blocks, or error blocks with just log statement 2) the error handing aborts the clusters on an overly-general exception 3) the error handling code contains expressions like “FIXME” or “TODO” in the comments.

Most of the developers are guilty of doing all the three above mentioned. Developers are good at finding that something will go wrong but they don’t know what to do when something goes wrong. I looked at the error handling code in one of my projects and I found the same behaviour. I had written TODO comments or caught general exceptions. These are considered to be bad practices but still most of us end up doing.

Overall, we found that the developers are good at anticipating possible errors. In all but one case, the errors were checked by the developers. The only case where developers did not check the error was an unchecked error system call return in Redis.

Another important point mentioned in the paper is

We found that 74% of the failures are deterministic in that they are guaranteed to manifest with an appropriate input sequence, that almost all failures are guaranteed to manifest on no more than three nodes, and that 77% of the failures can be reproduced by a unit test.

Most popular open source projects use unit testing so it could be surprising that the existing tests were not good enough to catch these bugs. Part of this has to do with the fact that these bugs or failure situations happens when a sequence of events happen. The good part is that sequence is deterministic. As a software developer, I could relate to the fact that most of us are not good at thinking through all the permutation and combinations. So, even though we write unit tests they do not cover all scenarios. I think code coverage tools and mutation testing can help here.

It is now universally agreed that unit testing helps reduce bugs in software. Last few years, I have worked with few big enterprises and I can attest most of their code didn’t had unit tests and even if parts of the code had unit tests those tests were useless. So, even though open source projects that we use are getting better through unit testing most of the code that an average developer writes has a long way to go. One thing that we can learn from this paper is to start write high quality tests.

The paper mentions specific events where most of the bugs happen. Some of these events are:

  1. Starting up services
  2. Unreachable nodes
  3. Configuration changes
  4. Adding a node

If you are building distributed application, then you can try to test your application for these events. If you are building applications that uses Microservices based architecture then these are interesting events for your application as well. For example, if you call a service that is not available how your system behaves.

As per the paper, these mature open-source systems has mature logging.

76% of the failures print explicit failure related messages.

Paper mentions three reasons why that is the case:

  1. First, since distributed systems are more complex, and harder to debug, developers likely pay more attention to logging.
  2. Second, the horizontal scalability of these systems makes the performance overhead of outputing log message less critical.
  3. Third, communicating through message-passing provides natural points to log messages; for example, if two nodes cannot communicate with each other because of a network problem, both have the opportunity to log the error.

Authors of the paper built a static analysis tool called Aspirator for locating these bug patterns.

If Aspirator had been used and the captured bugs fixed, 33% of the Cassandra, HBase, HDFS, and MapReduce’s catastrophic failures we studied could have been prevented.

Overall, I enjoyed reading this paper. I found it easy to read and applicable to all software developers.

Angular 4 Karma Loading CSS in Unit Testing

Today, while writing unit test for one of my Angular 4 application I wanted to load bootstrap css. My components uses Bootstrap for styling. The problem is when test run they don’t load CSS so they are not rendered correctly during unit tests. Angular 4 relies on Karma as test runner for unit tests. To load your external CSS file like bootstrap, you have to add files option to karma.conf.js. After making the change, run your test cases again.

module.exports = function (config) {
    basePath: '',
    frameworks: ['jasmine', '@angular/cli'],
    singleRun: false,
    files: [

Software Development 101: Validate your assumptions

In my last assignment, I was asked to mentor a software development team as part of the Dojo program. I am not a big believer in Training initiatives, but because Dojo program has a different format I decided to take up the assignment. The Dojo program involves working on a real project with the team, helping them embrace good software development practices, solving team’s real problems, and finally delivering a quality software. In this post, I want to talk about a lesson that I had shared with the team — the lesson which I named it as Validate your assumptions. Continue reading

Using Spring Boot @SpyBean

Today, a colleague asked me to help him write a REST API integration test. We use Spring’s MockMvc API to test the REST API. The application uses MongoDB with Spring Data MongoDB. The application uses both MongoTemplate and Mongo based repositories for working with MongoDB. To make tests work independent of MongoDB, we mock Spring MongoDB repository interfaces.

Continue reading

Caitie McCaffrey Talk: The Verification of a Distributed System

This is a great video that explains importance of developer testing in writing robust distributed system. She talked about unit testing, integration testing, and verification language that can be used to verify a system. Unit testing is the first thing a developer can do to make sure distributed system is correct. She shared a number of anecdotes in the talk that make this talk easy to understand.

Unit testing ngModel in Angular 4

Angular 4 is a good choice for building modern web applications. It comes bundled with Test utilities that makes it easy to write good quality test cases quite easily. Angular official testing guide does a good job explaining how to get started with testing Angular components. One thing that it does not cover is how to test components that use two way data binding using ngModel directive. Last week I had to write test case for a component that uses data binding so I end up learning a lot about testing such components. In this blog, I will share my learnings with you.

Continue reading