Software systems are complex, and there are many ways to manage that complexity. One approach is to break the software into many small pieces stored in many repositories. I once thought this was the way to go. With inexperience, this is natural: most of the tech chatter we hear today is about microservices, distributed systems, and the various problems of scaling software (Docker, Kubernetes, etc.). However, most of us are not Google, and many of us are solving problems where one or two cloud servers can easily handle the device/user load. In the early days of product development, time is much better spent solving the user’s problems as simply as possible and doing things that don’t scale. Even creating too many directories in your source code early on can be a mistake, especially in typed languages that are easy to refactor later. If the product is successful, there will be time and resources later to figure out how to make it scale.
Some systems are inherently distributed. Two examples that come to mind are browser front-ends and IoT systems. The computing happens in physically different locations, so there is no way to avoid being distributed. There may also be systems where you need to blend vastly different technologies written in different languages (machine learning components, for example). In these cases, distributed technologies such as Protobuf and Nats.io are very useful. There are also cases where you want to make some code public/OSS and keep some private; there, multiple repositories may make sense. Large organizations and massive scale bring challenges that lend themselves to splitting things up. However, there are also large projects (such as the Linux kernel) that have done very well as a monolithic project in a monorepo. Does your project have 28 million lines of code?
Working with technologies (such as Go, Rust, and Elm) that don’t require complex build/distribution tooling brings some of the joy back into programming. The best solution to many problems is to make things simpler, not to add layers of complexity and tooling.
Below are my notes on these topics:
Monolith vs Microservices
- To misquote a wise man: “Some people, when faced with a coupling problem, think ‘I know, I’ll use microservices!’. They now have two problems.” (can’t find the source of this quote)
- From the Gophers Slack:
- Sometimes I say microservices are a way to turn Conway’s Law from a liability to an asset
- Generalizing: microservices are a solution to organizational problems, not technical problems
Monorepo vs Polyrepo
- Google: “Why Google Stores Billions of Lines of Code in a Single Repository” (https://dl.acm.org/doi/pdf/10.1145/2854146)
- advocates polyrepo:
- DigitalOcean:
- Many of the excellent tools in the Go ecosystem work even better when used in a monorepo. For example, gorename is incredibly useful. Just recently, I was able to use it to rename an identifier from foo.FooDB to foo.DB in every single Go source file in the entire repository. The ability to refactor en-masse in this fashion is invaluable. Any changes made will be compiled and tested across all internal Go projects.
- each team is responsible for their own deployment
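
The repository-wide rename mentioned in the DigitalOcean note (foo.FooDB to foo.DB) can be illustrated with a small Go sketch. This is not how gorename works internally: gorename is type-aware, while the code below is a purely syntactic approximation built on the standard go/ast package, applied to a single source string. The function name renameSelector and the import path example.com/monorepo/foo are hypothetical and exist only for this illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

// renameSelector rewrites every selector expression pkg.from to pkg.to in src.
// Assumption: this is a syntactic sketch only. It does no type checking, so it
// cannot tell a package named pkg apart from a local variable with that name.
func renameSelector(src, pkg, from, to string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}

	// Walk the syntax tree and rename matching selectors in place.
	ast.Inspect(file, func(n ast.Node) bool {
		sel, ok := n.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		if x, ok := sel.X.(*ast.Ident); ok && x.Name == pkg && sel.Sel.Name == from {
			sel.Sel.Name = to
		}
		return true
	})

	// Print the modified tree back out as gofmt-style source.
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Hypothetical file that still uses the old identifier.
	src := `package app

import "example.com/monorepo/foo"

func open() *foo.FooDB { return new(foo.FooDB) }
`
	out, err := renameSelector(src, "foo", "FooDB", "DB")
	if err != nil {
		fmt.Println("rename failed:", err)
		return
	}
	fmt.Print(out)
}
```

In a monorepo, a tool of this kind can be run over every Go file in one pass, and the compiler and tests then confirm in the same change that nothing was missed. With many repositories, the same rename becomes a series of coordinated changes.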