Curiosity is, in great and generous minds, the first passion and the last – Samuel Johnson
Monorepos: Please don’t!: 20 mins read. In this post, Matt Klein explains why the monorepo approach does not deliver the benefits most often cited by its proponents. His recommendation is to go with a polyrepo structure. The post makes four valid arguments:
Organizations that use a monorepo spend considerable engineering resources on building tools to work with it. Most organisations don’t have that luxury.
A monorepo makes it difficult to open source internal projects because everything shares a single commit history.
Most version control systems are not meant to be used for large monolithic repositories. Microsoft has done some work on this as part of its Git VFS project, but it has some rough edges.
The last interesting point the post makes is: “The frank reality is that, at scale, how well an organization does with code sharing, collaboration, tight coupling, etc. is a direct result of engineering culture and leadership, and has nothing to do with whether a monorepo or a polyrepo is used.”
The main drawback of the polyrepo approach is that it creates a culture where different teams own different parts of the code, and few people in the organization are aware of the big picture. This point is beautifully put by Adam Jacob in his post, Monorepo: please do!.
My take on this is somewhere in between. For example, if you are building a web application, I like to keep all backend microservices in one repository and the front-end application in another. Taking either approach too far does not work.
He who learns but does not think, is lost! He who thinks but does not learn is in great danger — Confucius
Stop Learning Frameworks: 15 mins read. This post makes a great point: we should focus more on gaining a deep understanding of software development fundamentals than on learning new frameworks. Frameworks come and go, and most of the time the knowledge you gain is not portable. Focussing on core software development topics like algorithms, unit testing, design patterns, HTTP, DDD, etc. will give you a better return on your time. If you encounter a new technology at work, you will learn it anyhow.
Another post along similar lines that I read this week is Stack is irrelevant. The author writes, I’ve started with almost no knowledge about the stack and been up to speed in less than 3 months or so. This is why I always tell people to pick up technology and language agnostic problem solving skills because those are the only skills transferable across stacks.
Analyzing Hacker News book suggestions in Python: 15 mins read. A quick tutorial that teaches you how to do simple text processing in Python. It covers how to find the top book recommendations in an HN thread. The approach used is simple, and most programmers can easily follow it.
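To give a flavour of this kind of text processing, here is a minimal sketch that counts how many comments mention each title from a known list. It is an illustration under stated assumptions, not the tutorial's actual method: the function names, the sample comments, and the idea of matching against a fixed list of titles are all hypothetical.

```python
import re
from collections import Counter


def count_title_mentions(comments, titles):
    """Count how many comments mention each known title (case-insensitive)."""
    counts = Counter()
    for comment in comments:
        text = comment.lower()
        for title in titles:
            # Word-boundary match so that "Dune" does not match "dunes".
            pattern = r"\b" + re.escape(title.lower()) + r"\b"
            if re.search(pattern, text):
                counts[title] += 1
    return counts


def top_titles(comments, titles, n=3):
    """Return the n most frequently mentioned titles with their counts."""
    return count_title_mentions(comments, titles).most_common(n)


# Hypothetical sample data standing in for comments scraped from a thread.
sample_comments = [
    "I loved SICP and The Pragmatic Programmer",
    "sicp changed how I think about programs",
    "Read The Pragmatic Programmer twice",
]
sample_titles = ["SICP", "The Pragmatic Programmer", "Dune"]
mention_counts = count_title_mentions(sample_comments, sample_titles)
```

Each title is counted at most once per comment, so one enthusiastic commenter cannot inflate a book's rank.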
The Major Features in Postgres 11: 20 mins read. This is a 24-page PDF slide deck covering the important features introduced in Postgres 11. My favourite features from the list are:
How to Get Over Productivity Guilt: 10 mins read. Just the kind of post that you need to end your year. I feel productivity guilt on many days, and I am trying to get better at handling it. Otherwise, it kills all your happiness and you end up doing nothing. Prioritize the essential tasks that you need to do and get them done. Life will be good. Don’t overstress yourself.
The business case for serverless: 10 mins read. I spent a few months building an application using Serverless on the AWS stack. I enjoyed the Serverless experience, but I found that the pace of development goes down because you are not able to do end-to-end testing on your local machine. This post does a good job of making a business case for Serverless. The main point the author makes is that Serverless increases developer velocity. The author concludes the post by writing, “One day, complexity will grow past a breaking point and development velocity will begin to decline irreversibly, and so the ultimate job of the founder is to push that day off as long as humanly possible. The best way to do that is to keep your ball of mud to the minimum possible size— serverless is the most powerful tool ever developed to do exactly that.”
Another interesting post on Serverless that I read this week is by Amazon’s Tim Bray: Serverless Everything. He proposes an interesting way to look at Serverless using a data plane and control plane analogy. Give it a read, it is not that long. He writes, “The Amazon MQ service, which is a managed version of the excellent Apache ActiveMQ open-source message broker. To make this usable by AWS customers, we had to write a bunch of software to create, deploy, configure, start, stop, and delete message brokers. In this sort of scenario, ActiveMQ itself is called the ‘data plane’ and the management software we wrote is called the ‘control plane’. The control plane’s APIs are RESTful and, in Amazon MQ, its implementation is entirely serverless, based on Lambda, API Gateway, and DynamoDB.”
Benchmark PostgreSQL With Linux HugePages: 15 mins read. PostgreSQL is my first choice when it comes to RDBMS. This post covers how we can configure Linux HugePages to improve the performance of a Postgres database. The author concludes the post by writing, “One of my key recommendations is that we must keep Transparent HugePages off. You will see the biggest performance gains when the database fits into the shared buffer with HugePages enabled. Deciding on the size of huge page to use requires a bit of trial and error, but this can potentially lead to a significant TPS gain where the database size is large but remains small enough to fit in the shared buffer.”
How we built Globoplay’s API Gateway using GraphQL: 15 mins read. This post starts with the reasons why Globo’s team decided to use GraphQL instead of REST for building their API. The main reason outlined in the post is the ease with which you can support different requirements for different devices. After covering the why, the post talks about how you can get started with GraphQL. The author writes, “In mid 2018, we had two backends for frontends (BFF) doing very similar tasks: One for web, and another for iOS, android and TV. As much as I love the ‘backend for frontend’ idea (and how cool it sounds), we could not keep the current architecture. Not only because of the reasons I just said, but because each BFF was serving slightly different content to its clients while the business team started to ask for something new: Ubiquity among all clients.” The author continues, “the more I reviewed everything we needed to support, the more GraphQL started to make sense. While TVs need a big program poster, mobiles need a small one. We need to show exactly the same video duration among all clients. TVs should provide detailed information about each program, but iOS and android could show only a poster + program title.”
Envoy Proxy at Reddit: 20 mins read. The post goes into depth on why Reddit moved to Envoy for service-to-service communication. What I like in this post is how they incorporated a new technology step by step rather than taking a Big Bang approach. It is a well-written post, so you will end up learning a lot about how big sites like Reddit introduce new technologies into their ecosystem and how their architecture evolves over time. The post outlines three requirements the Reddit team had for their service mesh choice. Envoy and its ecosystem fit all of these requirements.
Performance: Avoid adding a performance bottleneck at all costs. Any performance losses at the proxy level need to be offset by considerable feature gains. The two biggest considerations here were resource utilization and latency impact. Our mesh approach accounts for a sidecar proxy on every host, so we wanted the solution to be one that we were comfortable running on every host and at every hop in the network.
Features: The biggest differentiator among the options was the possibility of L7 Thrift support in the proxy. Thrift is our main inter-service RPC protocol and without first-class support for the behavior control we want in a service mesh, it wouldn’t make sense to switch to something that would just be providing the same basic TCP load balancing we’re getting out of HAProxy. We’ll address this in the next section.
Integrations and Extensibility: Being able to contribute or request integrations and possibly extend out-of-the-box functionality was also a core requirement. The network proxy needed to be able to evolve with Reddit’s service needs and developer feature requests.
Going Head-to-Head: Scylla vs Amazon DynamoDB: 30 mins read. This post compares ScyllaDB with Amazon DynamoDB. The author writes, “Scylla is a drop-in replacement for Cassandra, implemented from scratch in C++. Cassandra itself was a reimplementation of concepts from the Dynamo paper. So, in a way, Scylla is the ‘granddaughter’ of Dynamo. That means this is a family fight, where a younger generation rises to challenge an older one. It was inevitable for us to compare ourselves against our ‘grandfather,’ and perfectly in keeping with the traditions of Greek mythology behind our name.”
The conclusion from the post says it all:
DynamoDB failed to achieve the required SLA multiple times, especially during the population phase.
DynamoDB has 3x-4x the latency of Scylla, even under ideal conditions
DynamoDB is 7x more expensive than Scylla
Dynamo was extremely inefficient in a real-life Zipfian distribution. You’d have to buy 3x your capacity, making it 20x more expensive than Scylla
Scylla demonstrated up to 20x better throughput in the hot-partition test with better latency numbers
Last but not least, Scylla provides you freedom of choice with no cloud vendor lock-in (as Scylla can be run on various cloud vendors, or even on-premises).
I will end this newsletter with a short video, The Electronic Coach. It shows how Donald Knuth built a mainframe program that helped a basketball team win 11 out of 14 matches. This is an early example of a computer being used in data-driven decision making. In case you don’t know Donald Knuth, he is one of the greatest computer scientists. His book The Art of Computer Programming is included by American Scientist on its list of books that shaped the last century of science.
The total estimated time to read this newsletter is 190 minutes.
The secret of getting ahead is to get started – Mark Twain
Facial recognition: It’s time for action: 30 mins read. This is a post by Microsoft on the need for government regulation and responsible industry measures to address advancing facial recognition technology. This is a welcome step by Microsoft, and it shows that they are on the right side of the issue. The post lays out the potential dangers of facial recognition if it is not regulated. The three main problems outlined in the post are:
Certain uses of facial recognition increase the risk of decisions and, more generally, outcomes that are biased and, in some cases, in violation of laws prohibiting discrimination
Intrusion into people’s privacy
The use of facial recognition technology by a government for mass surveillance can encroach on democratic freedoms
Microsoft has also defined six principles that they are adopting to address the concerns. These are 1) Fairness 2) Transparency 3) Accountability 4) Non-discrimination 5) Notice and consent 6) Lawful surveillance.
Why You Should Never, Ever Use Quora: 15 mins read. I personally have not used Quora for many years. I find it full of gossip and useless questions and answers. The author makes a good point about Quora’s lack of intent to make knowledge accessible. Quora does not provide any API or data export tool. They have explicitly forbidden the Internet Archive from indexing their web site. Also, you are forced to log in before you can see the full answer text. Moreover, they are having trouble making money, so you never know if they will exist a few years down the line. This makes it even more important that they allow their data to be shared.
The Swiss Army Knife of Hashmaps: 30 mins read. This post covers a new HashMap implementation called Hashbrown, which is based on Google’s SwissTable. The post starts from HashMap basics, covering hashes and different implementations using linear probing and Robin Hood hashing, before finally getting to Hashbrown. It gives you a good understanding of how a HashMap works.
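As a small illustration of the linear probing idea mentioned above, here is a toy open-addressing map. It is a sketch for intuition only and has nothing to do with how Hashbrown or SwissTable are actually implemented; the class name, the half-full growth rule, and the lack of deletion support are all simplifying assumptions.

```python
class LinearProbingMap:
    """Toy open-addressing hash map using linear probing (no deletion)."""

    def __init__(self, capacity=8):
        # Each slot is either None (empty) or a (key, value) pair.
        self._slots = [None] * capacity
        self._size = 0

    def _probe(self, key):
        """Return the index holding `key`, or the first empty slot on its probe path."""
        i = hash(key) % len(self._slots)
        while self._slots[i] is not None and self._slots[i][0] != key:
            i = (i + 1) % len(self._slots)  # step to the next slot, wrapping around
        return i

    def put(self, key, value):
        # Grow before the table would be more than half full, so a probe
        # always reaches an empty slot and the loop in _probe terminates.
        if (self._size + 1) * 2 > len(self._slots):
            self._resize(len(self._slots) * 2)
        i = self._probe(key)
        if self._slots[i] is None:
            self._size += 1
        self._slots[i] = (key, value)

    def get(self, key, default=None):
        slot = self._slots[self._probe(key)]
        return slot[1] if slot is not None else default

    def __len__(self):
        return self._size

    def _resize(self, capacity):
        """Re-insert every entry into a larger table."""
        entries = [slot for slot in self._slots if slot is not None]
        self._slots = [None] * capacity
        for key, value in entries:
            self._slots[self._probe(key)] = (key, value)


squares = LinearProbingMap()
for n in range(100):
    squares.put(n, n * n)
squares.put(7, "overwritten")
```

On a collision the map simply walks forward to the next free slot, which keeps lookups cache-friendly but makes performance sensitive to how full the table is.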
Why you need both rituals and routines to power your workday: 10 mins read. The post covers the importance of routines and rituals in making the most of the day. A routine is a series of regularly followed actions. Rituals, on the other hand, are those symbolic actions performed at key moments that help us move through the day smoothly.
How Pinterest runs Kafka at scale: 10 mins read. This post talks about how Pinterest uses Kafka. Pinterest has one of the largest Kafka deployments in the cloud. Their Kafka deployment runs in three AWS regions, and they use MirrorMaker to transport data among the three regions. They created and open sourced DoctorKafka, a Kafka operations automation service that performs partition reassignment during broker failures. They use d2.2xlarge instances for brokers.
Go has key design elements required for building distributed systems
Go’s concurrency model is relatively easy to implement
It is easy to get started and fun to write
Serverless Tip: Don’t overpay when waiting on remote API calls: 15 mins read. I believe that, as a good software developer, you should not shy away from validating your assumptions. This post goes in depth on how the author did a detailed analysis to test his hypothesis. The author writes, “My hypothesis was that by lowering the memory configuration, that the execution of the Lambda function would be slower and perhaps not as cost effective.” He then carried out experiments to test it. As it turns out, functions that make remote API calls can be broken down into small, asynchronous components with low memory settings. We get the same performance and significantly reduce our costs, especially at scale.
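The arithmetic behind that conclusion can be sketched in a few lines. This assumes a simplified billing model where cost is proportional to configured memory multiplied by execution time; the price used below is a made-up placeholder, not a real AWS rate, and per-request fees and duration rounding are ignored.

```python
def invocation_cost(memory_mb, duration_ms, price_per_gb_second):
    """Cost of one invocation under a simplified memory x duration billing model."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * price_per_gb_second


# Hypothetical price, used only to make the comparison concrete.
PLACEHOLDER_PRICE = 0.00002

# If a function spends its time waiting on a remote API, its duration barely
# changes with memory, so the cost scales down with the memory setting.
cost_at_1024_mb = invocation_cost(1024, 1000, PLACEHOLDER_PRICE)
cost_at_128_mb = invocation_cost(128, 1000, PLACEHOLDER_PRICE)
savings_ratio = cost_at_1024_mb / cost_at_128_mb
```

Under these assumptions, the same one-second wait costs eight times less at 128 MB than at 1024 MB. If lowering memory also made the function slower, the longer duration would eat into that saving, which is exactly what the author set out to measure.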
Our learnings from adopting GraphQL: 20 mins read. In this post, the Netflix Marketing Technology team shares what they learned in adopting GraphQL. The author writes, “We have been running GraphQL on NodeJS for about 6 months, and it has proven to significantly increase our development velocity and overall page load performance.”
Your Intuition Is Wrong, Unless These 3 Conditions Are Met: 10 mins read. Daniel Kahneman, author of Thinking, Fast and Slow, explains why most intuitions are wrong. I loved how he compared two definitions of intuition and explained why the second is better than the first. The first definition is “knowing without knowing how you know.” The second definition is “thinking that you know without knowing why you do.”
Bonus: I will end this week’s newsletter with a great talk on decision making.
For software developers this is good, as they can expect the same code to work in both Google Chrome and Microsoft’s new browser. This means less code to maintain and possibly fewer cross-browser bugs.
For the web as a whole, it might not be great news. With Microsoft adopting Chromium, Google becomes more powerful. The only other viable choices are Firefox and Safari. The Firefox team also wrote a post on this topic and highlighted some of the dangers that lie ahead of us.
Faster and simpler with the command line: deep-comparing two 5GB JSON files 3X faster by ditching the code: 15 mins read. This post shows the power of jq, a command-line utility for working with JSON. It can pretty-print JSON, or you can use it to manipulate JSON. The team at Genius was facing an issue where they wanted to compare two big JSON files (5GB in size). They used jq to convert both JSON files to a single format. Then, using diff, they were able to find the differences between the two files. This is a good post that shows how to use a tool effectively to solve a problem.
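The same normalize-then-diff idea can be sketched in Python with only the standard library. This is an analogue for small inputs, not the jq and diff pipeline from the post: it loads both documents fully into memory, so it would not suit 5GB files, and the function names are hypothetical. Only object key order is normalized; array order still counts as a difference.

```python
import difflib
import json


def canonical_lines(doc):
    """Serialize a parsed JSON value with sorted keys, one element per line."""
    return json.dumps(doc, sort_keys=True, indent=1).splitlines()


def json_diff(a_text, b_text):
    """Return the added and removed lines between two JSON documents."""
    a = canonical_lines(json.loads(a_text))
    b = canonical_lines(json.loads(b_text))
    diff = difflib.unified_diff(a, b, lineterm="", n=0)
    # Drop the file and hunk headers, keeping only the changed lines.
    return [line for line in diff if not line.startswith(("---", "+++", "@@"))]


same = json_diff('{"a": 1, "b": 2}', '{"b": 2, "a": 1}')
changed = json_diff('{"a": 1}', '{"a": 2}')
```

Because both sides are written out in one canonical format first, differences in key order or whitespace disappear and only real changes in the data remain.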
The deepest problem with deep learning: 30 mins read. The author raises an important point: deep learning is not a panacea. It solves certain problems but is not suitable for every problem. I am not into AI, but I still feel we need to understand the limitations of a technology.
Software Sprawl, The Golden Path, And Scaling Teams with Agency: 15 mins read. I enjoyed reading this post. It covers an interesting challenge that high-performing engineering teams face. Such teams have the autonomy and mastery to use the best tools possible for the job. This usually means multiple programming languages, multiple databases, different messaging systems, etc. This works great when the organization is small, but sooner or later it overwhelms your operations team. Charity lays down a five-step process to control software sprawl. Read the post to learn more about this topic.
The Forgotten History of OOP: 30 mins read. I have heard this multiple times in the last couple of years, in many different posts and talks. Each time I hear it, I start to wonder whether the Actor model is object orientation done right. As mentioned in this post, Alan Kay (who coined the term object-oriented) never thought that object orientation meant class-based inheritance and polymorphism. For Alan, the key ideas in OOP are:
Cache warming: Agility for a stateful service: 15 mins read. This post talks about the Cache Warmer utility built by Netflix. Caching plays a big role in almost all serious applications and is often the only way to achieve low response times. There are times when you need to spin up a new cache cluster. In this post, Netflix engineers explain the multiple approaches they used to build the Cache Warmer utility. I also built a cache warmer recently in one of my projects. It is good to learn how industry stalwarts are doing it, as this gives us new ideas that we can apply in our own systems.
You Should Build your Next App on a Boring Stack: 10 mins read. The author summed it up well: “The best stack is the one that works the best for you. One you know well enough to have your gut feeling point you in the right direction 90% of the time an error occurs. Because that is what will help you build reliable, working software. And in turn, earn your customer’s loyalty.”
Most of your design choices will be driven by what your product does and who is using it.
Focus on figuring out what people need, and try to come up with a solution to their problem, even if it has a lot of manual steps.
Then think about ways to automate, spend your time coding and destroying, and use third parties where it makes sense.
Don’t scale but always think, code, and plan for scaling.
Build your system step by step, don’t address system design issues based on features that are not mature yet, and finally always try to find the best trade-off between the time you will spend and the gain in performance, money, and lowered risk.
React Native at Picnic: 10 mins read. The team at Picnic shares why they decided to use React Native instead of a native mobile app or a PWA.
An important requirement for the new application was that there should be no device or operating system lock-in
Needs to operate well under uncertain networking conditions. Hence, offline support is very important
We decided to use TypeScript instead of Flow
For state persistence, we use redux combined with redux-persist for offline support
On the UI-side, we use styled components for styling and storybook to document our UI components. Snapshots are automatically generated for each story by using StoryShots and React Native Storybook Loader.
Any argument about syntax, we defer to Prettier.
Finally, as the cherry on the cake, we use husky to run pre-commit and pre-push hooks that verify that all code that we check in is up to the standards that we have set for ourselves.
How to Enjoy Life: 15 mins read. I could relate to each and every word written by the author. Life is all about enjoying the small things. Most of the time it is just the way we look at something. The author writes, “The habit of taking even mild pleasure in such tasks would be life-changing, because most of what we do during a typical day isn’t done for enjoyment’s sake: laundry, exercise, office work, dishes, dusting. We do these things because they make life better in some less immediate sense; they’re rewarding, but not necessarily as you do them.”
The total time to read this newsletter is 130 minutes.
Fortune favors the prepared mind. — Louis Pasteur
Three Sales Mistakes Software Engineers Make: 15 mins read. This post by the PipelineDB folks talks about three sales mistakes that software engineers make. I myself find it difficult when I have to take part in any sales initiative. The truth is we all have to sell. Sales does not always mean selling a product. It could be as simple as sharing your idea with an audience. It requires social skills that most software engineers lack. The three mistakes are:
Building a product before validating the market for it. This is part of the lean philosophy. I don’t think it is always feasible to have an audience with which you can validate your idea. So, I think in some cases it makes sense to build a functional MVP and drive from there. The MVP should not take more than 3 months.
Talking instead of listening. The key message here is to listen to your audience and ask open-ended questions. Some examples of questions you can ask:
How do you think about this problem?
To what extent is this a priority?
Why are you interested in this topic?
Mistaking interest for demand. Until you get money in your account, your work is not done. IBM salespeople use BANT (Budget, Authority, Need, Timeline) to qualify sales leads:
Do they have enough budget to purchase the product?
Do they have the authority to make the purchase?
Do they need your product?
Will the transaction be completed in a timeline that is acceptable to you?
Amazon’s HQ2 Spectacle Isn’t Just Shameful—It Should Be Illegal: 20 mins read. This is yet another story of corporations getting things the way they want. They take billions of dollars in subsidies from the government and return far less. This is true in all parts of the world. In India, we have seen loans worth crores of rupees given to corporations. When the time comes to repay, the system allows them to get away easily. All governments are hand in glove with corporations. We just don’t matter.
Cloud Computing without Containers: 15 mins read. This looks interesting. I thought containers were the best we could do. But, as mentioned in this post, there are other possibilities like Isolates that can provide more efficient and economical alternatives. This post does a good job of comparing Isolates against AWS Lambda, which uses containers underneath. I will dig deeper into it. Overall, a great post by the Cloudflare folks.
Things I learned from working at Shopify: 10 mins read. This is a great post by Budi Tanrim, a software engineer at Shopify. In this post, he talks about why he left an amazing job at Shopify to go back to Indonesia. A few points from his post that resonated with me:
Come with a learning mindset: I often go on consulting assignments, and I always have a tendency to come up with solutions before understanding the problem statement well enough. As he mentions in his post, try to first understand the context and then think about the solution.
Be comfortable with being uncomfortable: When life events do not go as planned, don’t get uncomfortable. If you get stressed, things will only get worse. It is always better to take a step back and think about the situation again.
Prepare before presenting your work: This is essential if you want to make an impact. Many times the right words don’t come out at the right time, so preparation helps a lot.
Make decision log to have firmer decision: I also started doing this, but I am not consistent. I agree with Budi that it is essential to document your decisions. The time you spend today documenting a decision will serve you tomorrow when you need to explain your rationale.
What would a message-oriented programming language look like?: 10 mins read. The author thinks the answer is Erlang. I have not worked with Erlang, but I have used Akka with Scala in the past. Erlang, like Akka, is based on the Actor model, and when you work with the Actor model, different objects communicate with each other using message passing. So maybe the answer is any language that has support for the Actor model.
Lessons from the data lake, part 1: Architectural decisions: 15 mins read. This post by AutoTrader engineers goes over the architectural decisions they took in building their own data lake. They started with an on-premise solution but soon faced a series of operational issues. The author writes, “Cluster computing on-premises is hard and expensive, cloud is easier.” After failing with the on-premise solution, they decided to build a new one using Amazon S3 and Apache Spark delivered through AWS EMR. They used Terraform for provisioning cloud resources. To impose structure on their lake, they confined data to five ‘zones’ – in practice, five S3 buckets – named transient, raw, refined, user and trusted. They used Apache Avro to achieve schema on read.
There’s Seldom Any Traffic on the High Road: 5 mins read. Another meaningful post on Farnam Street. This post makes an important point about not reacting when someone behaves rudely towards you. As the author writes, “She was being rude. Yes. But that wasn’t the best version of her.” I see the value of learning this skill. Making enemies is expensive. Sometimes you don’t even realize how expensive.
Peeking under the hood of redesigned Gmail: 15 mins read. This post does a good analysis of the performance issues with the new Gmail interface. Using the tools available in Google Chrome, the author was able to find possible reasons for Gmail’s bad performance. It is sad that the Gmail team does not use the facilities provided by Google’s own browser. I recommend reading this article, as you can apply the same lessons to your own website.
Read Uncommitted: This level allows one transaction to read uncommitted data written by another transaction. This isolation level allows dirty reads.
Read Committed: This level allows one transaction to read data committed by another transaction. PostgreSQL uses READ COMMITTED as the default isolation level.
Repeatable Read: This level ensures that during a transaction you are guaranteed to read the same data that were committed when the transaction was started even if you make multiple read calls. MySQL uses this level as default.
Serializable: This isolation level ensures that all transactions occur in a completely isolated fashion, meaning as if all transactions in the system were executed serially, one after the other.
How Sharding Works: 20 mins read. Database sharding is a complex topic to master. There are so many databases, and they all handle sharding differently. This post gives a good introduction to sharding and the different ways it is implemented by different databases.
Algorithmic sharding: This is implemented at the client side using an algorithm like hash(key) % number of servers in the database cluster
Dynamic sharding: This is implemented using a locator service. Clients make call to locator service and it tells them which node to talk to.
Entity groups: This approach stores related entities in the same partition to provide additional capabilities within a single partition. This is a popular approach to shard a relational database.
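The algorithmic approach from the list above, hash(key) % number of servers, can be sketched as follows. The function names and sample keys are hypothetical, and a stable hash from hashlib is used because Python’s built-in hash() for strings is randomized per process and so is not suitable for routing.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_keys(keys, old_shards, new_shards):
    """Count how many keys land on a different shard after resizing the cluster."""
    return sum(1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards))


# Hypothetical keys, used only to show the effect of adding one server.
sample_keys = [f"user:{n}" for n in range(1000)]
moved = moved_keys(sample_keys, old_shards=4, new_shards=5)
```

The second function shows the main weakness of plain modulo sharding: going from 4 to 5 servers relocates most keys, which is one reason a locator service or a scheme such as consistent hashing is often preferred when the cluster size changes.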
Kubernetes for personal projects? No thanks!: 10 mins read. The article goes over the reasons why you shouldn’t run a Kubernetes cluster for a small project. I agree with the point. Your goal should be to build the application rather than to fight with infrastructure. I have found a Docker Compose based deployment sufficient for my side projects. I provision a Docker machine on AWS and then deploy containers using Docker Compose. It works great when you are small. I think the same argument applies to Microservices versus Monolithic applications: don’t use a Microservices architecture for small projects.
Rate limiting for distributed systems with Redis and Lua: 15 mins read. This post explains how you can implement API rate limiting in your application, using Redis and Lua scripts. It covers two use cases for API rate limiting: 1) rate limiting upstream clients and rejecting calls above the limit, and 2) rate limiting downstream clients to ensure that they stay within the allowed calls per second. It uses the Token Bucket and Leaky Bucket algorithms to meet these use cases.
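For intuition, here is a minimal in-memory sketch of the Token Bucket algorithm. It is not the Redis and Lua implementation from the post: it works only inside a single process, and the class name, parameters, and injectable clock are assumptions made for illustration. The post’s approach exists precisely so that the same check can run atomically across many application servers.

```python
import time


class TokenBucket:
    """Single-process token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        """Return True and consume tokens if the call is within the limit."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the time that has passed, never exceeding the capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Drive the bucket with a fake clock: 1 token per second, bursts of up to 2.
fake_now = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0
after_one_second = [bucket.allow(), bucket.allow()]
```

The capacity controls how large a burst is tolerated, while the rate sets the sustained calls per second. A call that finds the bucket empty is rejected rather than queued.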
A brief history of High Availability: 20 mins read. This article covers the history of how databases have evolved to support availability and consistency. It covers Active-Passive, Active-Active, and Multi-Active approaches to design available database clusters.
Here are 10 reads I thought were worth sharing this week. The total time to read this newsletter is 165 minutes. This week has stories on writing, remote code execution on Facebook servers, the Peter principle, Java 11 ZGC, Serverless patterns, PostgreSQL fast column creation, and a few more.
Leadership is nature’s way of removing morons from the productive flow. – Dilbert