shekhargulati

Issue #18: 10 Reads, A Handcrafted Weekly Newsletter for Humans

The total time to read this newsletter is 130 minutes.

Fortune favors the prepared mind. – Louis Pasteur

Three Sales Mistakes Software Engineers Make: 15 mins read. This post by the PipelineDB folks talks about three sales mistakes software engineers make. I myself find it difficult when I have to take part in any sales initiative. The truth is we all have to sell. Sales does not always mean selling a product. It could be as simple as sharing your idea with an audience. It requires social skills that most software engineers lack. The three mistakes are:
1. Building a product before validating the market for it. This is part of the lean philosophy. I don’t think you will always have an audience with which you can validate your idea. So, in some cases I think it makes sense to build a functional MVP and drive from there. The MVP should not take more than 3 months.
2. Talking instead of listening. The key message here is to listen to your audience and ask open-ended questions. Some examples of questions you can ask:
  1. How do you think about this problem?
  2. To what extent is this a priority?
  3. Why are you interested in this topic?
  4. ... etc.
3. Mistaking interest for demand. Until you get money in your account, your work is not done. IBM salespeople use BANT (Budget, Authority, Need, Timeline) to qualify sales leads:
  1. Do they have enough budget to purchase the product?
  2. Do they have the authority to make the purchase?
  3. Do they need your product?
  4. Will the transaction be completed in a timeline that is acceptable to you?
Amazon’s HQ2 Spectacle Isn’t Just Shameful—It Should Be Illegal: 20 mins read. This is yet another story of corporations getting things the way they want. They take billions of dollars in subsidies from the government and give back far less. This is true in all parts of the world. In India, we have seen loans worth crores of rupees given to corporations. When the time comes to repay, the system allows them to get away easily. All governments are hand in glove with corporations. We just don’t matter.
Cloud Computing without Containers: 15 mins read. This looks interesting. I thought containers were the best we could do. But, as mentioned in this post, there are other possibilities like Isolates that can provide more efficient and economical alternatives. This post does a good job of comparing Isolates against AWS Lambda, which uses containers underneath. I will dig deeper into it. Overall, a great post by the Cloudflare folks.
Things I learned from working at Shopify: 10 mins read. This is a great post by Budi Tanrim, a software engineer at Shopify. In this post, he talks about why he left an amazing job at Shopify to go back to Indonesia. A few points from his post that resonated with me:
1. Come with a learning mindset: I often go on consulting assignments, and I always have a tendency to come up with solutions before understanding the problem statement well enough. As he mentions in his post, first try to understand the context and then think about the solution.
2. Be comfortable with being uncomfortable: When life events do not go as planned, don’t get uncomfortable. If you get stressed, things will get worse. It is always better to take a step back and think about the situation again.
3. Prepare before presenting your work: This is essential if you want to make an impact. Many times the right words don’t come out at the right time, so preparation helps a lot.
4. Make a decision log to have firmer decisions: I also started doing this, but I am not consistent. I agree with Budi that it is essential to document your decisions. The time you spend today documenting a decision will serve you tomorrow when you might need to explain your rationale.
What would a message-oriented programming language look like?: 10 mins read. The author thinks the answer is Erlang. I have not worked with Erlang, but I have used Akka with Scala in the past. Erlang, like Akka, is based on the actor model, and in the actor model different objects communicate with each other through message passing. So, maybe the answer is any language that supports the actor model.
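To make the message-passing idea concrete, here is a minimal sketch of an actor in plain Java. It is only an illustration under my own assumptions: `CounterActor`, `Increment`, and `Stop` are hypothetical names, and this is not how Erlang or Akka implement actors. The point it shows is that the actor's state is private and the only way to affect it is to put a message in its mailbox.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical minimal actor: private state, a mailbox, and a single worker
// thread that processes one message at a time.
final class CounterActor {
    // Messages this actor understands (names are illustrative).
    interface Message {}

    static final class Increment implements Message {
        final int amount;
        Increment(int amount) { this.amount = amount; }
    }

    static final class Stop implements Message {}

    private final BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>();
    private final Thread worker;
    private int total; // touched only by the worker thread while it runs

    CounterActor() {
        worker = new Thread(this::run, "counter-actor");
        worker.start();
    }

    // Callers never touch the state directly; they only send messages.
    void send(Message message) {
        mailbox.add(message);
    }

    private void run() {
        try {
            while (true) {
                Message message = mailbox.take(); // blocks until a message arrives
                if (message instanceof Stop) {
                    return;
                }
                if (message instanceof Increment) {
                    total += ((Increment) message).amount;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Waits for the worker to finish, then returns the final state.
    int awaitTotal() {
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total;
    }
}
```

Sending `Increment(2)`, `Increment(3)`, and then `Stop` leaves the actor with a total of 5. Because only the worker thread reads the mailbox, no locks are needed around `total`.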
Lessons from the data lake, part 1: Architectural decisions: 15 mins read. This post by AutoTrader engineers goes over the architectural decisions they made while building their own data lake. They started with an on-premise solution but soon faced a series of operational issues. As the author writes, "Cluster computing on-premises is hard and expensive, cloud is easier." After failing with the on-premise solution, they decided to build a new one using Amazon S3 and Apache Spark, delivered through AWS EMR. They used Terraform for provisioning cloud resources. To impose structure on the lake, they confined data to five ‘zones’ (in practice, five S3 buckets) named transient, raw, refined, user and trusted. They used Apache Avro to achieve schema on read.
There’s Seldom Any Traffic on the High Road: 5 mins read. Another meaningful post on Farnam Street. This post makes an important point about not reacting when someone behaves rudely towards you. As the author writes, "She was being rude. Yes. But that wasn’t the best version of her." I see the value of learning this skill. Making enemies is expensive. Sometimes you don’t even realize how expensive.
Peeking under the hood of redesigned Gmail: 15 mins read. This post does a good analysis of the performance issues with the new Gmail interface. Using the tools available in Google Chrome, the author was able to find possible reasons for Gmail’s bad performance. It is sad that the Gmail team does not use the facilities provided by Google’s own browser. I recommend reading this article, as you can apply the same lessons to your own website.
Dealing with significant Postgres database bloat — what are your options? 15 mins read. When data is updated or deleted in Postgres, new data is written. The old data then needs to be vacuumed. That unvacuumed data is known as bloat. Here’s a look at how you can deal with it.
Scalability Worst Practices: 10 mins read. This is an old blog post published in 2008, but the worst practices it lists are still applicable today. So, give it a read.

Thinking about software system in terms of reliability, scalability, and maintainability

A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable. – Leslie Lamport

For the last six months I was building a pricing engine for a client. The application was built using multiple components:

We had a change data capture pipeline built using AWS Kinesis that read data from IBM DB2 and wrote it to PostgreSQL, keeping that database in sync with changes happening in the source system
We were storing denormalised documents in AWS ElastiCache, i.e. Redis
We had a batch job that did a one-time load of the PostgreSQL database
We had a near cache that helped us process our worst requests in a few hundred milliseconds

When you build a system out of multiple independent components, you have to keep in mind that you are building a data system that needs to provide certain guarantees. In our case, we had to guarantee:

AWS ElastiCache, i.e. Redis, will be updated with changes happening in the source system in less than 30 seconds
The near cache will be invalidated and updated with the latest data so that clients accessing the system get consistent results. A global cache like Redis is easier to keep in sync than a near cache. We came up with a novel way to keep the near cache in sync with the global cache.
Data will not be lost by our change data capture pipeline. If processing of a message fails, we retry the message
There will be times when the different data components, i.e. PostgreSQL, Redis, and the near cache, have different state. But eventually they should become consistent
There will be a mechanism to observe the state of the system at any point in time

Like it or not, the systems we are building are becoming more and more distributed. This means there are many more ways they can fail. To build software systems that meet their end goal, we should keep the following three concerns in mind. These should be defined as clearly as possible so that every team member keeps them in mind while building software systems.

Reliability
Scalability
Maintainability

Continue reading “Thinking about software system in terms of reliability, scalability, and maintainability”

The Compound Effect

We are all looking for a quick way to earn money, lose weight, build relationships, get a promotion at work, or become successful in life. I have failed numerous times in my efforts to achieve my goals. We all give up too quickly. Failing is not an issue if you fail after giving your best. Most of the time we fail because we don’t try hard enough. We give up too soon.

Continue reading “The Compound Effect”

Taking Java heap dump programmatically

The following code snippet can be used to take a heap dump of a Java program programmatically.

import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

import javax.management.MBeanServer;

import com.sun.management.HotSpotDiagnosticMXBean;

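// Static helper; declared abstract only so that it cannot be instantiated.
public abstract class HeapDumper {

    // Object name under which HotSpot JVMs register the diagnostic MXBean.
    private static final String HOTSPOT_BEAN_NAME = "com.sun.management:type=HotSpotDiagnostic";
    private static final HotSpotDiagnosticMXBean HOT_SPOT_DIAGNOSTIC_MX_BEAN = getHotspotDiagnosticMxBean();

    // Looks up a proxy for the HotSpot diagnostic bean on the platform MBean server.
    private static HotSpotDiagnosticMXBean getHotspotDiagnosticMxBean() {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            return ManagementFactory.newPlatformMXBeanProxy(
                    server, HOTSPOT_BEAN_NAME, HotSpotDiagnosticMXBean.class);
        } catch (IOException error) {
            throw new RuntimeException("failed getting Hotspot Diagnostic MX bean", error);
        }
    }

    // Writes a heap dump to the given file.
    // live = true dumps only live objects, i.e. objects that are still reachable.
    public static void createHeapDump(File file, boolean live) {
        try {
            HOT_SPOT_DIAGNOSTIC_MX_BEAN.dumpHeap(file.getAbsolutePath(), live);
        } catch (IOException e) {
            // dumpHeap fails with an IOException if, for example, the file already exists
            throw new RuntimeException(e);
        }
    }
}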
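As a usage illustration, the sketch below writes a dump into a fresh temporary directory. It assumes a HotSpot-based JVM. To keep it self-contained it obtains the bean through `ManagementFactory.getPlatformMXBean` rather than the `HeapDumper` class above; `HeapDumpDemo` and the file name `sample.hprof` are names I made up for the example. The target file must not already exist, and recent JDKs also insist that its name ends in `.hprof`.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: dumps the heap of the current JVM to a temp file.
final class HeapDumpDemo {

    // Returns the path of the dump that was written.
    static Path dumpToTempFile(boolean live) {
        try {
            // A fresh directory guarantees that the target file does not exist yet.
            Path dir = Files.createTempDirectory("heap-dump-demo");
            Path target = dir.resolve("sample.hprof");
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(target.toAbsolutePath().toString(), live);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dump = dumpToTempFile(true);
        System.out.println("Heap dump written to " + dump
                + " (" + dump.toFile().length() + " bytes)");
    }
}
```

The resulting file can be opened with a heap analysis tool such as Eclipse MAT or VisualVM.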

Issue #17: 10 Reads, A Handcrafted Weekly Newsletter for Humans

Database Isolation Levels And Their Effects On Performance And Scalability: 10 mins read. It is always good to refresh the foundational concepts. This post covers database isolation levels:
1. Read Uncommitted: This level allows one transaction to read uncommitted data written by another transaction. This isolation level allows dirty reads.
2. Read Committed: This level allows one transaction to read data committed by another transaction. PostgreSQL uses READ COMMITTED as the default isolation level.
3. Repeatable Read: This level guarantees that, within a transaction, repeated reads return the same data that was committed when the transaction started. MySQL uses this level as its default.
4. Serializable: This isolation level ensures that all transactions occur in a completely isolated fashion, meaning as if all transactions in the system were executed serially, one after the other.
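In Java, these four levels correspond to constants on `java.sql.Connection`. The sketch below pairs each level with its JDBC constant and with the read anomalies that the JDBC documentation says can still occur at that level. The `IsolationLevel` enum itself is my own illustration, not part of JDBC, and real databases are free to be stricter than this table.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Illustrative enum (not part of JDBC): one entry per isolation level.
enum IsolationLevel {
    //                 JDBC constant                          dirty  non-repeatable  phantom
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED, true,  true,           true),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED,     false, true,           true),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ,   false, false,          true),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE,         false, false,          false);

    private final int jdbcConstant;
    private final boolean dirtyReads;
    private final boolean nonRepeatableReads;
    private final boolean phantomReads;

    IsolationLevel(int jdbcConstant, boolean dirtyReads,
                   boolean nonRepeatableReads, boolean phantomReads) {
        this.jdbcConstant = jdbcConstant;
        this.dirtyReads = dirtyReads;
        this.nonRepeatableReads = nonRepeatableReads;
        this.phantomReads = phantomReads;
    }

    int jdbcConstant() { return jdbcConstant; }
    boolean allowsDirtyReads() { return dirtyReads; }
    boolean allowsNonRepeatableReads() { return nonRepeatableReads; }
    boolean allowsPhantomReads() { return phantomReads; }

    // Asks the driver to use this level for the given connection.
    void applyTo(Connection connection) throws SQLException {
        connection.setTransactionIsolation(jdbcConstant);
    }
}
```

With an open connection, `IsolationLevel.SERIALIZABLE.applyTo(connection)` requests the strictest level. Whether the driver honours it depends on the database.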
How Sharding Works: 20 mins read. Database sharding is a complex topic to master. There are so many databases, and they all handle sharding differently. This post gives a good introduction to sharding and the different ways it is implemented by different databases.
1. Algorithmic sharding: This is implemented on the client side using an algorithm like hash(key) % number of servers in the database cluster
2. Dynamic sharding: This is implemented using a locator service. Clients call the locator service and it tells them which node to talk to.
3. Entity groups: This approach stores related entities in the same partition to provide additional capabilities within a single partition. This is a popular approach to sharding a relational database.
4. Hierarchical keys and Column-oriented databases
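The algorithmic approach from point 1 is small enough to sketch. This is a minimal illustration, assuming string keys and a fixed number of shards; `ShardRouter` is a hypothetical name, and `String.hashCode` stands in for whatever hash function a real client would use.

```java
// Hypothetical client-side router implementing hash(key) % number of shards.
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // Returns a shard index in the range [0, shardCount).
    int shardFor(String key) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), shardCount);
    }
}
```

The same key always maps to the same shard as long as the shard count stays fixed. The weakness of this scheme is that changing the number of servers remaps most keys.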
How One Website Exploited Amazon S3 to Outrank Everyone on Google: 20 mins read.
The interesting ideas in Datasette: 15 mins read.
More than 9 million broken links on Wikipedia are now rescued: 10 mins read.
Kubernetes for personal projects? No thanks!: 10 mins read. The article goes over the reasons why you shouldn’t run a Kubernetes cluster for a small project. I agree with the point. Your goal should be to build the application rather than fight with infrastructure. I have found Docker Compose based deployment sufficient for my side projects. I provision a Docker machine on AWS and then deploy containers using Docker Compose. It works great when you are small. I think the same argument applies to microservices versus monolithic applications. Don’t use a microservices architecture for small projects.
Rate limiting for distributed systems with Redis and Lua: 15 mins read. This post explains how you can implement API rate limiting in your application using Redis and Lua scripts. It covers two use cases for API rate limiting: 1) rate limiting upstream clients and rejecting calls above the limit, and 2) rate limiting downstream clients to ensure that they stay within the allowed calls per second. It uses the Token Bucket and Leaky Bucket algorithms to meet these use cases.
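To show the token bucket idea on its own, here is a single-process, in-memory sketch in Java. It is not the article's implementation, which keeps the state in Redis and runs the logic in Lua scripts. `TokenBucket` and its parameters are hypothetical names, and the caller passes in the current time so that the behaviour is easy to reason about.

```java
// Hypothetical in-memory token bucket: each call costs one token, and tokens
// refill at a steady rate up to a fixed capacity.
final class TokenBucket {
    private final long capacity;
    private final double refillPerSecond;
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double refillPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity; // start with a full bucket
        this.lastRefillNanos = nowNanos;
    }

    // Returns true if the call is allowed, false if it should be rejected.
    synchronized boolean tryAcquire(long nowNanos) {
        // Add the tokens earned since the last call, capped at capacity.
        long elapsedNanos = Math.max(0L, nowNanos - lastRefillNanos);
        double earned = (elapsedNanos / 1_000_000_000.0) * refillPerSecond;
        tokens = Math.min(capacity, tokens + earned);
        lastRefillNanos = Math.max(lastRefillNanos, nowNanos);

        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In production code the caller would pass `System.nanoTime()`. In a distributed setup the token count and timestamp would live in a shared store, and the refill-and-take step would have to run atomically there.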
A brief history of High Availability: 20 mins read. This article covers the history of how databases have evolved to support availability and consistency. It covers the Active-Passive, Active-Active, and Multi-Active approaches to designing highly available database clusters.

Two-phase commit protocol

Welcome to the fourth post in the distributed systems series. In the last post, we covered ACID transactions. ACID transactions guarantee

Atomicity: Either all the operations succeed or none do
Consistency: System moves from one consistent state to another at the successful completion of a transaction
Isolation: Concurrent transactions do not interfere with each other
Durability: After successful completion of a transaction all changes made by the transaction persist even in the case of a system failure

If the database is running on a single machine, it is easier to guarantee ACID semantics than in a distributed database. The following are the reasons you would want to run a database in a distributed fashion:

To be fault-tolerant
To handle more reads and writes

Let’s assume ours is a read-intensive application and our single-machine database is not able to scale to our demand. One solution for scaling reads is replication. The most common replication topology is a single master with multiple slaves. All the writes go to the master and reads are performed on the slaves. Data from the master is replicated to the slaves synchronously or asynchronously. In this post, we will assume synchronous replication.

Continue reading “Two-phase commit protocol”

The Minimalistic Guide to ACID Transactions

Welcome to the third post of the distributed systems series. So far in this series, we have looked at service discovery and the CAP theorem. Before we move along in our distributed systems learning journey, I thought it would be useful to refresh our understanding of ACID transactions. ACID transactions are at the heart of relational databases. Knowledge of ACID transactions is useful when building distributed applications.

Understanding ACID transactions

A transaction is a sequence of operations that form a single logical unit of work. Transactions are executed on a shared database system to perform a higher-level function. An example of a higher-level function is transferring money from one account to another. Transactions represent a basic unit of change in the database. A transaction is either executed in its entirety or not at all.

ACID (Atomicity, Consistency, Isolation, and Durability) refers to a set of properties that a database transaction should guarantee even in the event of errors, power failure, etc. The canonical example of an ACID transaction is the transfer of funds from one bank account to another. In a single fund transfer transaction, you have to check the account balance, debit one account, and credit another account. ACID properties guarantee that either the money transfer from one account to the other occurs correctly and permanently, or in case of failure both accounts keep their initial state. It would be unacceptable if one account was debited but the other account was not credited.
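The all-or-nothing behaviour of the transfer can be sketched without a database. The example below is only an illustration of atomicity, using an in-memory map and a snapshot for rollback in place of a real transaction log. `Ledger` and its methods are hypothetical names, and nothing here is durable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory ledger: a transfer either applies both the debit and
// the credit, or leaves every balance exactly as it was.
final class Ledger {
    private final Map<String, Long> balances = new HashMap<>();

    synchronized void open(String account, long balance) {
        balances.put(account, balance);
    }

    synchronized long balanceOf(String account) {
        return balances.get(account);
    }

    // synchronized gives a crude form of isolation: one transfer at a time.
    synchronized boolean transfer(String from, String to, long amount) {
        Map<String, Long> snapshot = new HashMap<>(balances); // state to roll back to
        try {
            long fromBalance = balances.get(from);       // fails if the account is unknown
            if (amount <= 0 || fromBalance < amount) {
                throw new IllegalStateException("invalid amount or insufficient funds");
            }
            balances.put(from, fromBalance - amount);    // debit
            long toBalance = balances.get(to);           // may fail after the debit
            balances.put(to, toBalance + amount);        // credit
            return true;                                 // "commit"
        } catch (RuntimeException e) {
            balances.clear();                            // "rollback": undo the debit
            balances.putAll(snapshot);
            return false;
        }
    }
}
```

If the destination account does not exist, the failure happens after the debit has been applied, and the rollback restores the source account's balance. With a real database, the same shape is usually expressed through JDBC by turning off auto-commit and calling `commit()` or `rollback()` on the connection.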

Database transactions are motivated by two independent requirements:

Concurrent database access: Multiple clients can access the system at the same time. This is achieved by the Isolation property of ACID transactions.
Resiliency to system failures: The system remains in a consistent state in case of a system failure. This is provided by the Atomicity, Consistency, and Durability properties of ACID transactions.

Continue reading “The Minimalistic Guide to ACID Transactions”