My Notes on GitLab Postgres Schema Design

I spent some time going over the Postgres schema of GitLab. GitLab is an alternative to GitHub. It is an open source DevOps platform, so you can self-host it.

My motivation for studying the schema of a big project like GitLab was to compare it against the schemas I am designing and to learn some best practices from their schema definition. I can safely say I learnt a lot.

I am aware that best practices are sometimes context-dependent, so you should not apply them blindly.

The GitLab schema file, structure.sql [1], is more than 34,000 lines long. GitLab is a monolithic Ruby on Rails application, and the usual way to track the database schema in a Rails project is the schema.rb file. The reason the GitLab team decided to adopt structure.sql instead is explained in one of the issues [2] in their issue tracker:

Now what keeps us from using those features is the use of schema.rb. This can only contain standard migrations (using the Rails DSL), which aim to keep the schema file database system neutral and abstract away from specific SQL. This in turn means we are not able to use extended PostgreSQL features that are reflected in schema. Some examples include triggers, postgres partitioning, materialized views and many other great features.

In order to leverage those features, we should consider using a plain SQL schema file (structure.sql) instead of a ruby/rails standard schema schema.rb.

The change would entail switching config.active_record.schema_format = :sql and regenerate the schema in SQL. Possibly, some build steps would have to be adjusted, too.

Now, let’s go over the things I learnt from the GitLab Postgres schema.

Below are some of the tweets from people about this article. If you find this article useful, please share it and tag me @shekhargulati.


Improve Git Monorepo Performance

Today, I was exploring the source code of the GitLab project and noticed that the git status command was slow. GitLab is an open source alternative to GitHub.

Below is the output of the git status command:

 time git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
git status  0.20s user 1.13s system 88% cpu 1.502 total

The total at the end of the last line is the elapsed (wall-clock) time, in seconds, that the command took to complete.

The same was the case for the git add command.

time git add .
git add .  0.21s user 1.11s system 115% cpu 1.146 total

So both commands took more than a second to finish.

These commands are slow because they need to search the entire worktree looking for changes. When the worktree is very large, Git needs to do a lot of work.
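To get a feel for why the cost grows with the size of the worktree, here is a minimal Python sketch. It is not how Git is implemented; it only mimics the general idea of visiting every file under the worktree and asking the filesystem for its metadata. The function name scan_worktree is my own, chosen for illustration.

```python
import os


def scan_worktree(root):
    """Visit every file under root (skipping .git) and lstat it.

    Returns the number of files examined. This is a simplified
    stand-in for a full worktree scan, not Git's actual algorithm.
    """
    examined = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            # One metadata lookup per file: this is the work that
            # adds up when the worktree has many files.
            os.lstat(os.path.join(dirpath, name))
            examined += 1
    return examined
```

Calling scan_worktree on the root of a checkout returns the number of files it had to look at. The number of filesystem calls grows linearly with that count, which is why a very large repository makes every full scan noticeably slower.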


Why I like gRPC?

I have started using gRPC for service-to-service communication between microservices, and I like it so far.

I still prefer to expose APIs to the external world (browser, mobile, or third-party) using either REST or GraphQL. I am aware that you can use gRPC in mobile apps and grpc-web in web frontends, but I have not used gRPC for those use cases yet.

Earlier, I used REST (JSON over HTTP) and/or some form of event-driven communication for service-to-service communication. Both work, but the programming model leaves much to be desired.

gRPC is a modern, efficient, HTTP/2-based inter-process communication style developed by Google. It is heavily used at Google and at other major tech companies such as Square, Lyft, Netflix, CockroachLabs, and Salesforce.

As shown in the picture below, gRPC builds on top of HTTP/2 and SSL as an efficient and secure transport layer. It uses Protocol Buffers for defining API contracts and for efficient serialization. The gRPC core provides the framework for efficient service-to-service communication. gRPC tooling generates the clients and servers that are used by the application tier.

Now that we understand the gRPC basics, I will share my reasons for preferring gRPC for service-to-service communication.
