Why Google stores billions of lines of code in a single repository

Meta Info

Authors: Rachel Potvin, Josh Levenberg (Google)

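Google chose to stick with a single central repository because of advantages such as unified versioning, extensive code sharing and reuse, simplified dependency management, and atomic changes.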
The monolithic model of source code management is not for everyone. For example, it is a poor fit for organizations where large parts of the codebase are private or hidden between groups.

Piper: The distributed source-code repository
- Implemented on top of standard Google infrastructure (originally Bigtable, now Spanner)
- Relies on the Paxos algorithm to guarantee consistency across replicas
CitC (Clients in the Cloud): The workspace client
- With a cloud-based storage backend and a Linux-only FUSE file system
Critique: The code-review tool
Tricorder: Static analysis system
- Code quality, test coverage, and test results
Rosie: large-scale cleanups and code changes
1. Create a large patch (e.g., via find-and-replace across the codebase)
2. Split the large patch into smaller patches; test them independently; send for code review; commit them automatically once they pass tests and a code review
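
The two Rosie steps above can be sketched roughly as follows. This is an illustrative sketch only, not Rosie's actual implementation: the function names (`shard_by_directory`, `land_shards`), the choice to split by top-level directory, and the `run_tests` / `is_approved` callbacks are all assumptions made for the example.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def shard_by_directory(changed_files, depth=1):
    """Group changed file paths into smaller patches keyed by directory prefix.

    Hypothetical splitting rule: files sharing the first `depth` path
    components land in the same shard.
    """
    shards = defaultdict(list)
    for path in changed_files:
        parts = PurePosixPath(path).parts
        key = "/".join(parts[:depth])
        shards[key].append(path)
    return dict(shards)


def land_shards(shards, run_tests, is_approved):
    """Commit only shards that pass both tests and review.

    `run_tests(files)` and `is_approved(shard_key)` are caller-supplied
    stand-ins for the test and code-review systems.
    Returns (committed_keys, pending_keys), each sorted by shard key.
    """
    committed, pending = [], []
    for key in sorted(shards):
        if run_tests(shards[key]) and is_approved(key):
            committed.append(key)
        else:
            pending.append(key)
    return committed, pending


if __name__ == "__main__":
    changed = ["search/a.cc", "search/b.cc", "ads/x.py", "maps/tiles/t.go"]
    shards = shard_by_directory(changed)
    done, waiting = land_shards(
        shards,
        run_tests=lambda files: True,        # pretend every shard passes
        is_approved=lambda key: key != "ads",  # pretend one review is pending
    )
    print(done, waiting)
```

Each shard is tested and reviewed independently, so a failure or slow review in one part of the tree does not block the rest of the change from landing.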

Google’s monolithic software repository is used by 95% of its software developers worldwide.
The Google codebase includes
- approximately 1 billion files
- a history of 35 million commits
- 86TB of data (excluding release branches)
Over 99% of files stored in Piper are visible to all full-time Google engineers.
Over 80% of Piper users today use CitC.

Tooling investments for both development and execution
- Code-indexing system
- Automated test infrastructure
- Build infrastructure
- Code search and browsing tools
Codebase complexity
- Unnecessary dependencies → binary size bloat
Efforts invested in code health
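
To illustrate why unnecessary dependencies bloat binaries, here is a minimal sketch that sums the size of everything a target pulls in transitively. The dependency graph, target names, and sizes are made up for the example, and the helper names (`transitive_closure`, `linked_size`) are hypothetical; this is not how Google's build tooling computes it.

```python
def transitive_closure(graph, target):
    """Return the target plus every dependency reachable from it."""
    seen = {target}
    stack = [target]
    while stack:
        node = stack.pop()
        for dep in graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def linked_size(graph, sizes, target):
    """Total size of the target and all of its transitive dependencies."""
    return sum(sizes[name] for name in transitive_closure(graph, target))


if __name__ == "__main__":
    # Hypothetical targets and sizes (arbitrary units).
    graph = {
        "app": ["net", "logging"],
        "net": ["crypto"],
        "logging": ["ui_toolkit"],  # an accidental, unneeded edge
    }
    sizes = {"app": 10, "net": 20, "logging": 5, "crypto": 30, "ui_toolkit": 200}

    before = linked_size(graph, sizes, "app")
    graph["logging"] = []  # drop the unnecessary dependency
    after = linked_size(graph, sizes, "app")
    print(before, after)
```

A single stray edge deep in the graph is paid for by every binary that depends on it, which is one reason code-health work that prunes unused dependencies matters at this scale.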

Git (distributed version control systems)
- A team at Google is focused on supporting Git, which is used by Google’s Android and Chrome teams outside the main Google repository.
- Important for these teams due to external partner and open source collaborations.
- The Git community strongly prefers that developers use more, smaller repositories.
  - git clone copies all content to one’s local machine.
Mercurial
- An experimental effort

