The time to read this newsletter is 175 minutes.
Good friends, good books, and a sleepy conscience: this is the ideal life – Mark Twain
The time to read this newsletter is 180 minutes.
Wealth is the ability to fully experience life. — Henry David Thoreau
- Don’t get clever with login forms: 10 mins read. This post points to a valid concern related to cleverness of login forms. Author through a set of examples explain why clever login forms end up confusing users. Another example of clever login experience that author does not cover is https://login.microsoftonline.com . I agree with author recommendations for login page:
- Have a dedicated page for login
- Expose all required fields
- Keep all fields on one page
- Don’t get fancy.
- Why Google Needed a Graph Serving System: 30 mins read. In this post, author shares his story of building a distributed graph database that can answer queries with relationship. The post goes over various Graph based systems developed at Google and why Google failed to build a distributed Graph database that does not suffer from depth join problem. This post highlights an interesting point related to Google’s struggle to build innovative solution because of their internal politics. Building a distribued graph database that does not suffer from depth join problem is a herculean task. Dgraph an open source database developed by the author along with others in community is trying to build such a system.
- You probably don’t need a single-page application: 10 mins read. I agree in entirety with the author that best solution to build web application is somewhere in middle i.e. building hybrid apps. Build SPA only for parts where you need rich interaction and keep most other pages server rendered.
- Google wants Cloud Services Platform to Borg your datacenter: 20 mins read. This post gives insight into why Google made the move to build and open source Kubernetes. Google knew they are going to have a hard time beating AWS and Azure. So, they built and released Kubernetes and hoped it becomes a successful project with big community. This means cloud just became an implementation detail and most big enterprises started considering Kubernetes as a choice of softwaere to build a modern hybrid datacenter. Google’s Cloud Service Platform(CSP) will give enterprises a hardened Kubernetes, Istio, Knative software distribution. CSP is going to be a game changer for Google Cloud. Also, many OpenShift users might consider going for CSP. Interesting time ahead!
- Four Techniques Serverless Platforms Use to Balance Performance and Cost: 30 mins read. This is the best article I have read on Serverless. It starts by helping reader understand architecture of Serverless platform and then it talks about elephant in the room — cold start problem associated with Serverless platforms. The article covers four techniques that is employed by different Serverless platforms to overcome cold start issue. The techniques mentioned in the post are following:
- Function resource sharing
- Function resource pooling
- Function prefetching
- Function prewarming.
- Lessons from 6 software rewrite stories: 20 mins read. Another amazing read for this week. This post through real examples explain when it is fine to rewrite software. If you are building software for long, you will have come across advice by Joel Spolsky that rewriting software is the single worst strategic mistake that a software company can make. The post author tells the other side of the story in this post. The key take away from the post is
- Once you’ve learned enough that there’s a certain distance between the current version of your product and the best version of that product you can imagine, then the right approach is not to replace your software with a new version, but to build something new next to it — without throwing away what you have.
- How to build a distributed throttling system with Nginx + Lua + Redis: 15 mins read. This post covers how to build API rate limiting system with Nginx, Lua, and Redis. Instructions mentioned in the post are clear and to the point.
- Monte Carlo Simulation with Python: 20 mins read. The post explains Monte Carlo simulation using a simple but realisitic example. As per wikipedia,
- Monte Carlo methods are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. Their essential idea is using randomness to solve problems that might be deterministic in principle. They are often used in physical and mathematical problems and are most useful when it is difficult or impossible to use other approaches. Monte Carlo methods are mainly used in three problem classes: optimization, numerical integration, and generating draws from a probability distribution.
- 5 Ways To Process Feedback At Work Without Triggering A Stress Response: 10 mins read. This post covers an important aspect of professional life — taking feedback. The author suggests following:
- Keep an open mind about receiving feedback. Focus on how your work can be improved with some extra perspective.
- Don’t respond right away, take a few seconds to really process the feedback. You can assess rationally and logically, without undue emotion.
- Make sure you understand the feedback. In cases where you don’t, ask questions! The feedback giver should be happy to discuss specific points deeper to help clarify their suggestions.
- Be humble and gracious! Let them know you appreciate that they gave their time and energy to help make you more successful.
- Don’t let constructive criticism go in one ear and out the other. Take what you hear, implement it, and follow-up.
- How to Organize your Monolith Before Breaking it into Services: 15 mins read. This post talks about an intermediary stage between monolithic and microservices – a monolithic organized by domain without the entanglement or fragility of our original codebase. I agree with author in entirety that we should start with monolithic and modularise applications based on sub domains by applying DDD principles. If required in future, we can easily make these subdomain functional modules to services. It is great to read post like this as they provide valuable information that is usually missing in most posts found on the web.
The time to read this newsletter is 150 minutes.
The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn and relearn. — Alvin Toffler
- The Hard Truth About Innovative Cultures: 20 mins read. This post answers a question that I was struggling to find the right answer. If there is one post that you should read this week then It should be this one. From the post:
> A tolerance for failure requires an intolerance for incompetence. A willingness to experiment requires rigorous discipline. Psychological safety requires comfort with brutal candor. Collaboration must be balanced with a individual accountability. And flatness requires strong leadership. Innovative cultures are paradoxical. Unless the tensions created by this paradox are carefully managed, attempts to create an innovative culture will fail.
When AWS Autoscale Doesn’t: 15 mins read. This post by folks at Segment share valuable lessons on AWS autoscaling. The key points for me in the post are:
- AWS autoscaling for ECS follows the formula
new_task_count = current_task_count * ( actual_metric_value / target_metric_value ). The ratio
actual_metric_value/target_metric_value limit the magnitude of scale out event. To overcome this, you either have to reduce the target value leading to over scale all the time or use custom CloudWatch metric
- The default cool down time for scale out event is 3 minutes and cooldown for scale in event is 5 minutes
- Multiply your time by asking 4 questions about the stuff on your to-do list: 10 mins read. This post won’t tell you how to magically make each day 38 hours long (we’re still working on that). But by assessing our tasks in terms of their significance, we can free up more time tomorrow.
Dotfile madness: 10 mins read. I just counted my home directory has more than 30 hidden directories. The post makes a valid argument against proliferation of dot files and dot directories. The author writes:
> Avoid creating files or directories of any kind in your user’s $HOME directory in order to store your configuration or data. This practice is bizarre at best and it is time to end it. I am sorry to say that many (if not most) programs are guilty of doing this while there are significantly better places that can be used for storing per-user program data.
Life of a SQL query: 15 mins read. What happens when you run a SQL statement? We follow a Postgres query transformation by transformation as a query is processed and results are returned.
Splitting Up a Codebase into Microservices and Artifacts: 10 mins read. This is the first post that you should read if you are thinking about Microservices. I like the way this post first talked about using module boundary to split the code base. If module boundaries are not enough then you should think about Microservices. In my opinion, you should choose Microservices 1) to scale engineering organization 2) the real need for your polyglot environment depending on your business problem.
Golang Datastructures: Trees: 20 mins read. This is an awesome read even if you can’t comprehend Golang. This beautifully written post explains how to implement a simple DOM tree in Golang. It shows implementation of breadth first search and depth first search algorithms to implement find functionality. I thoroughly enjoyed this post.
Deploying Python ML Models with Flask, Docker and Kubernetes: 30 mins read. This is an extensive tutorial that shows how to deploy Python Flask applications on Kubernetes. It covers how to deploy Machine Learning (ML) models into production environments by exposing them as RESTful API Microservices hosted from within Docker containers, that are in-turn deployed to a cloud environment.
A Minimalistic Guide to Kata Containers: 5 mins read. This is a short post that I wrote about Kata Containers. Kata Containers provide the best of containers and virtual machines. Read the post to learn more.
Building a Better Profanity Detection Library with scikit-learn: 15 mins read. This post covers how you can write your own profanity filter using machine learning. The author starts by giving reasons why he didn’t use existing profanity libraries and then he goes over the steps required to create your own profanity detection library.
The time to read this newsletter is 155 minutes.
I read a book one day and my whole life was changed – Orhan Pamuk
The time to read this newsletter is 165 minutes.
If we encounter a man of rare intellect, we should ask him what books he reads. – Ralph Waldo Emerson
The time to read this newsletter is 150 minutes.
Curiosity is, in great and generous minds, the first passion and the last – Samuel Johnson
- Monorepos: Please don’t!: 20 mins read. In this post, Matt Klein gives reasons to why monorepo approach does not provide benefits most often cited by monorepo proponents. His recommendation is to go with polyrepo structure. The post makes four valid arguments:
- Organizations that use monorepo spend considerable engineering resources on building tools to work with monorepos. Most organisations don’t have such luxury.
- Monorepo makes it difficult to open source internal projects as you have single commit history
- Most VCS are not meant to be used for large monolithic repositories. There is some work done by Microsoft as part of its git VFS project but it has some rough edges.
- The last interesting point that post makes is The frank reality is that, at scale, how well an organization does with code sharing, collaboration, tight coupling, etc. is a direct result of engineering culture and leadership, and has nothing to do with whether a monorepo or a polyrepo is used.
The main drawback of polyrepo approach is that it creates a culture where different teams own different parts of the code. There are few people in organization aware of the big picture. This point is beautifully put by Adam Jacob in his post – Monorepo: please do!.
My take on this is somewhere in between. For example, if you are building a web application then I like to keep all backend Microservices in one repository and front-end application in another repo. I think the best is somewhere in between the both approaches. Taking either too far does not work.
The time to read this newsletter is 165 minutes.
He who learns but does not think, is lost! He who thinks but does not learn is in great danger — Confucius
- Stop Learning Frameworks: 15 mins read. This post makes a great point that we should focus more on gaining deep understanding of software development fundamentals than learning new frameworks. Frameworks come and go and most of the time knowledge you gain is not portable. Focussing on core software development topics like algorithms, unit testing, design patterns, HTTP, DDD, etc will give you better return for your time. If you encounter a new technology at work then you will learn it anyhow.
Another post on similar lines that I read this week said Stack is irrelevant. Author writes,
I’ve started with almost no knowledge about the stack and been up to speed in less than 3 months or so. This is why I always tell people to pick up technology and language agnostic problem solving skills because those are the only skills transferable across stacks.
- Analyzing Hacker News book suggestions in Python: 15 mins read. A quick tutorial that will teach you how to do simple text processing in Python. This tutorial covers how to find top book recommendations in an HN thread. The approach used in simple and most programmers can easily follow it.
- The Major Features in Postgres 11: 20 mins read. This is 24 page PDF slidedeck covering important features introduced in Postgres 11. My favourite features from the list are:
- Partitioning table by hash
- Parallel hash joins
- Finer-Grained access control
- Materialized views vs. Rollup tables in Postgres: 15 mins read. I loved this post for being clear and to the point. It teaches when to use Materialized view and rollup tables. I could see myself using it in near future. Great post!
- How to Get Over Productivity Guilt: 10 mins read. Just the kind of post that you need to end your year. I feel productivity guilt on many days. I am trying to be better at it. Else, it kills all your happiness and you end up doing nothing. Prioritize the essentials tasks that you need to do and get them done. Life will be good. Don’t over stress yourself.
- The business case for serverless: 10 mins read. I spent few months on building an application using Serverless. I uses AWS stack to build the application.I enjoyed Serverless experience but I found that pace of development goes down as you are not able to do end-to-end testing on your local machine. This post does a good job in making a business case for Serverless. The main point author makes is that Serverless increases developer velocity. Author concludes the post by writing
One day, complexity will grow past a breaking point and development velocity will begin to decline irreversibly, and so the ultimate job of the founder is to push that day off as long as humanly possible. The best way to do that is to keep your ball of mud to the minimum possible size— serverless is the most powerful tool ever developed to do exactly that.
Another interesting post on Serverless that I read this week is by Amazon ‘s Tim Bray — Serverless Everything. He proposed an interesting way to look at Serverless using data plane and control plane analogy. Give it a read, it is not that long.
The Amazon MQ service, which is a managed version of the excellent Apache ActiveMQ open-source message broker. To make this usable by AWS customers, we had to write a bunch of software to create, deploy, configure, start, stop, and delete message brokers. In this sort of scenario, ActiveMQ itself is called the “data plane” and the management software we wrote is called the “control plane”. The control plane’s APIs are RESTful and, in Amazon MQ, its implementation is entirely serverless, based on Lambda, API Gateway, and DynamoDB.
- Benchmark PostgreSQL With Linux HugePages: 15 mins read. PostgreSQL is my first choice when it comes to RDBMS. This post covers how we can configure Linux HugePages configuration to improve performance of Postgres database. Author concludes the post by writing
One of my key recommendations is that we must keep Transparent HugePages off. You will see the biggest performance gains when the database fits into the shared buffer with HugePages enabled. Deciding on the size of huge page to use requires a bit of trial and error, but this can potentially lead to a significant TPS gain where the database size is large but remains small enough to fit in the shared buffer.
- How we built Globoplay’s API Gateway using GraphQL: 15 mins read. This blog start with the reason why Globo’s team decided to use GraphQL instead of REST for building API. The main reason outlined in the post for choosing GraphQL is the ease with which you can support different requirements for different devices. After covering the why GraphQL, the post talks about how you can get started with it.
In mid 2018, we had two backends for frontends (BFF) doing very similar tasks: One for web, and another for iOS, android and TV. As much as I love the “backend for frontend” idea (and how cool it sounds), we could not keep the current architecture. Not only because of the reasons I just said, but because each BFF was serving slightly different content to its clients while the business team started to ask for something new: Ubiquity among all clients. the more I reviewed everything we needed to support, the more GraphQL started to make sense. While TVs need a big program poster, mobiles need a small one. We need to show exactly the same video duration among all clients. TVs should provide detailed information about each program, but iOS and android could show only a poster + program title.
- Envoy Proxy at Reddit: 20 mins read. The post goes into depth on why Reddit moved to Envoy for service to service communication. What I like in this post is how they incorporated a new technology in a step by step manner rather than going the Big Bang approach. It is a well written post so you will end up learning a lot about how big sites like Reddit introduce new technologies in their ecosystem and how they architecture evolve over time. The post outlines three requirements Reddit team had from their service mesh choice. Envoy and its ecosystem fit all of these requirements.
- Performance: Avoid adding a performance bottleneck at all costs. Any performance losses at the proxy level need to be offset by considerable feature gains. The two biggest considerations here were resource utilization and latency impact. Our mesh approach accounts for a sidecar proxy on every host, so we wanted the solution to be one that we were comfortable running on every host and at every hop in the network.
- Features: The biggest differentiator among the options was the possibility of L7 Thrift support in the proxy. Thrift is our main inter-service RPC protocol and without first-class support for the behavior control we want in a service mesh, it wouldn’t make sense to switch to something that would just be providing the same basic TCP load balancing we’re getting out of HAProxy. We’ll address this in the next section.
- Integrations and Extensibility: Being able to contribute or request integrations and possibly extend out-of-the-box functionality was also a core requirement. The network proxy needed to be able to evolve with Reddit’s service needs and developer feature requests.
- Going Head-to-Head: Scylla vs Amazon DynamoDB: 30 mins read. This post compares ScyllaDB with Amazon DynamoDB. Author writes, Scylla is a drop-in replacement for Cassandra, implemented from scratch in C++. Cassandra itself was a reimplementation of concepts from the Dynamo paper. So, in a way, Scylla is the “granddaughter” of Dynamo. That means this is a family fight, where a younger generation rises to challenge an older one. It was inevitable for us to compare ourselves against our “grandfather,” and perfectly in keeping with the traditions of Greek mythology behind our name.
The conclusion from the post says it all
- DynamoDB failed to achieve the required SLA multiple times, especially during the population phase.
- DynamoDB has 3x-4x the latency of Scylla, even under ideal conditions
- DynamoDB is 7x more expensive than Scylla
- Dynamo was extremely inefficient in a real-life Zipfian distribution. You’d have to buy 3x your capacity, making it 20x more expensive than Scylla
- Scylla demonstrated up to 20x better throughput in the hot-partition test with better latency numbers
- Last but not least, Scylla provides you freedom of choice with no cloud vendor lock-in (as Scylla can be run on various cloud vendors, or even on-premises).
I will end this newsletter with a short video — The Electronic Coach. This short video shows how Donald Knuth build a mainframe program that helped a basketball team win 11 out of 14 matches. This is an early example of computer used in data driven decision making. In case you don’t know Donald Knuth, he is one of the greatest computer scientist. His book The Art of Computer Programming is included by American Scientist on its list of books that shaped the last century of science.