I’m a great proponent of building our own tools. As developers, we perform many repetitive tasks, big and small. Many of these tasks might be made easier by using an appropriate tool instead.
We often concentrate too much on solving the more significant problems. But sometimes a task doesn’t even need to be a real problem for a tool to be valuable; we have to find the sweet spots in our workflow.
The best investment is in the tools of one’s own trade
– Benjamin Franklin
This is the story of why and how I built a multi-repository Git CLI helper tool, Tortuga, and what I’ve learned doing it.
Finding Your Sweet Spot
In my company, we have to deal with a multitude of Git repositories every day. Our main product consists of a core project, and up to 3 client-specific sub-projects — each in its own repository.
Currently, I have 20 repositories in my code folder that need to be fetched, pushed, and rebased regularly.
Most Git tools I use are focused on dealing with one repository at a time. Sometimes you miss something, and you might end up in rebase/merge hell.
We needed a tool to bring all our repositories up-to-date with as little friction as possible. After seeing my colleagues and myself struggle with this menial task every day, I decided to automate it.
Even though it’s only an internal tool, I treated it like a real project. I could’ve written a small shell script without much planning beforehand, and be done with it. But if a tool has to handle a certain amount of complexity and edge-cases, we should do it right.
Building a tool will consume time and focus, and we need to employ them wisely. Defining the requirements, use-cases and risks, long-term maintenance, etc., will help immensely to achieve the best possible outcome.
Thinking about our tool, these are the basic requirements that come to mind:
- Fetch, push, and rebase multiple repositories
- Parallel, not serial
- Easy distribution
- Works on all dev machines (Linux, macOS, and Windows)
Seems simple enough. But after digging a little deeper, softer and nice-to-have requirements emerged:
- Handle uncommitted changes gracefully
- Work with multiple repositories at once (multi-threading)
- Display changes / dry-run
- CI/CD considerations
- Open-Source distribution
Don’t just think about how you would use the tool.
Ask your colleagues what the tool’s purpose means to them and how they intend to use it. Only this way will every possible use-case become visible to us as the tool’s developers, and we can minimize bugs and misuse from early on.
After speaking with my colleagues, one additional use-case was revealed: “single-directory” use. What if we call the tool in a directory that is already a Git repository? If we just check the subfolders for repositories, the tool won’t do anything. But why shouldn’t we be able to update a single directory? The tool compacts multiple Git commands into a single call, so it also can save some time when used in a single repository.
Our tools should also be as safe as possible. But absolute safety is an impossible goal.
In the case of our little Git tool, safety is simple.
We need to preserve uncommitted changes by stashing and re-applying them after fetching. And we must never use `git push --force`. Even if something goes wrong, we still have the stashes.
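A minimal sketch of that safety rule in Go, with hypothetical helper names: the stash only wraps the fetch/rebase when there actually are uncommitted changes, so we never pop an empty stash:

```go
package main

import "fmt"

// planSteps returns the git invocations for a safe update. The stash
// push/pop pair is added only when the repository is dirty, so a
// `stash pop` never runs against an empty stash. This is a sketch of
// the idea, not Tortuga's actual code.
func planSteps(dirty bool) [][]string {
	steps := [][]string{
		{"fetch", "--all", "--prune"},
		{"rebase", "@{upstream}"},
	}
	if dirty {
		steps = append([][]string{{"stash", "push", "--include-untracked"}}, steps...)
		steps = append(steps, []string{"stash", "pop"})
	}
	return steps
}

func main() {
	for _, s := range planSteps(true) {
		fmt.Println("git", s)
	}
}
```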
Our job doesn’t end after the first release of our work. If the tool is good and helpful, it might be used for a long time, and become essential to certain workflows.
What if it breaks down due to an OS update? What if the developer is no longer at the company, and no one can take over due to lack of documentation, or the programming language used?
Maintainability should influence which building blocks we end up choosing. This doesn’t mean we have to use the languages and frameworks we always use. But finding common ground with your fellow developers will go a long way.
Documentation is often an afterthought. But even in small or personal projects, it can save our butt. Just because we know right now the “why” and “how” of every little detail doesn’t mean we know them in 6 months.
It doesn’t have to be a complete project documentation with its own wiki. But to document our design decisions, especially the non-obvious ones, is essential.
What Language To Use
Choosing the right language for Tortuga wasn’t easy.
Most CLI tools I write start out as a simple shell script and most likely stay that way. But one of the requirements was “being parallel” to improve performance.
Multi-threading is possible in some form with a shell script. But the complexity it takes isn’t worth it, in my opinion. Especially if other options are available.
As you might know from my other articles, Java is my “daily driver”. It’s also the most used language at my company. Seemed like a natural fit.
Just like the language itself, we have to consider the complexity of the toolchain. So Java with GraalVM, as impressive as it is, didn’t make the cut.
I haven’t used Go or Rust enough to make an informed decision. Both are great languages, capable of building awesome CLI tools. So which one to choose?
Go declares itself as simple, reliable, and efficient.
And in my opinion, that’s true. Some might say the simplicity comes at the cost of missing features and being opinionated. This is also why Go is so successful and such a great language in the first place. It’s easy to understand, even if you don’t know it by heart. A simple cross-platform toolchain can build native, dependency-free, single executables.
It ticked a lot of boxes needed to fulfill the requirements of Tortuga. But I wanted to give Rust a chance, too.
Rust declares itself as reliable and efficient. If you know Rust, you know that the word “simple” is missing for a reason.
The language is designed for absolute reliability, correctness, and performance. That’s why the learning curve is waaaaay steeper than with Go.
It’s rigorous about correctness, using a borrow checker to ensure memory safety. In the beginning, we end up “fighting” it a lot just to make our code compile.
But it also ticks a lot of boxes for Tortuga, except the simplicity box.
To better grasp the implications of my choice, I decided to build a small prototype with both languages. Starting a small project and using both languages’ tooling made me realize something: Tortuga won’t be a mission-critical tool; the risk is low.
The additional mental overhead needed for Rust’s correctness and guaranteed memory safety wasn’t worth the effort, at least for me. This doesn’t mean Rust isn’t an excellent language for CLI tools, quite the opposite.
Even though I love Rust’s ideas and concepts, I’ve chosen Go for its lower barrier to entry and better maintainability.
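The “parallel, not serial” requirement is one place where Go shines: one goroutine per repository, collected with a WaitGroup. A sketch under assumed names (the repository paths are made up, and the error handling is reduced to a failure count):

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
	"sync/atomic"
)

// fetchAll fetches every repository concurrently, one goroutine per
// repo, and returns how many of them failed.
func fetchAll(repos []string) int {
	var wg sync.WaitGroup
	var failed int32
	for _, repo := range repos {
		wg.Add(1)
		go func(repo string) {
			defer wg.Done()
			cmd := exec.Command("git", "fetch", "--all", "--prune")
			cmd.Dir = repo // run git inside this repository
			if err := cmd.Run(); err != nil {
				atomic.AddInt32(&failed, 1)
			}
		}(repo)
	}
	wg.Wait()
	return int(failed)
}

func main() {
	// Hypothetical repository paths.
	failed := fetchAll([]string{"core", "client-a", "client-b"})
	fmt.Println("failed:", failed)
}
```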
Building the tool itself wasn’t that hard, except for one thing: actually using Git.
Tortuga needs to be able to do the following Git actions:
- Get the local branch name
- Get the upstream branch name
- Count commits
- Fetch remotes
- Get the current status
- Rebase
- Push to remote
- Preserve uncommitted changes
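A plausible mapping of these actions to Git subcommands, sketched in Go. These are common equivalents, not necessarily the exact invocations Tortuga uses:

```go
package main

import "fmt"

// gitArgs maps each required action to a plausible git invocation.
// Assumed equivalents, not Tortuga's actual command table.
var gitArgs = map[string][]string{
	"local branch":    {"rev-parse", "--abbrev-ref", "HEAD"},
	"upstream branch": {"rev-parse", "--abbrev-ref", "@{upstream}"},
	"count commits":   {"rev-list", "--count", "@{upstream}..HEAD"},
	"fetch":           {"fetch", "--all", "--prune"},
	"status":          {"status", "--porcelain"},
	"rebase":          {"rebase", "@{upstream}"},
	"push":            {"push"},
	"stash":           {"stash", "push"},
}

func main() {
	for action, args := range gitArgs {
		fmt.Printf("%-16s git %v\n", action, args)
	}
}
```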
Initially, I planned to use a Go-based Git implementation to not have any dependencies.
But all the libraries I’ve found, like `go-git`, didn’t support all the required commands.
Another option was using a C library. But using C code in Go comes with a complexity I didn’t want, including having to implement things like credential handling, Git config files, etc., all by myself.
In the end, I’ve used the locally installed Git directly via `os/exec` and parsed the output.
At first, I didn’t much like relying on spawning a process and dealing with its output. But it was the simplest way of working with Git, and the tool behaves just like a user running Git directly. The GitHub CLI is built the same way, so it can’t be that wrong, can it?
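Shelling out this way boils down to running `git` with `os/exec` and parsing what it prints. A minimal, hypothetical example: detecting uncommitted changes from the machine-readable `git status --porcelain` output, where any non-empty line means a changed file:

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// parseStatus interprets `git status --porcelain` output: any
// non-whitespace content means at least one uncommitted change.
func parseStatus(out string) bool {
	return strings.TrimSpace(out) != ""
}

// isDirty reports whether the repository at the given path has
// uncommitted changes.
func isDirty(repo string) (bool, error) {
	cmd := exec.Command("git", "status", "--porcelain")
	cmd.Dir = repo
	out, err := cmd.Output()
	if err != nil {
		return false, err
	}
	return parseStatus(string(out)), nil
}

func main() {
	dirty, err := isDirty(".")
	if err != nil {
		fmt.Println("not a git repository?", err)
		return
	}
	fmt.Println("uncommitted changes:", dirty)
}
```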
The development started with the project repository hosted on our internal Bitbucket server. I could’ve put the latest build on our NAS, told my colleagues where to find it, and called it a day. But what about updates and bugfixes?
As I was nearing a first releasable version, I decided to move to GitHub with tagged releases. Even though my company is the only user of Tortuga (as far as I know), I strongly believe we should share our tools.
Now a single `make release` will build the code, run tests, and create a new release for all supported platforms: Linux (.deb file), macOS (.tar.gz binary), and Windows (zipped .exe file), for both 32-bit and 64-bit.
macOS users can also use my personal homebrew tap to always get the latest version. One of these days, I need to create a PPA for the Linux version…
It might seem silly to open-source such a small, internally used tool. But if it isn’t bound too tightly to our specific projects and doesn’t reveal any secrets, why not? Maybe someone will use it similarly or can use some code for their own tools.
Building tools can be fun, and sometimes frustrating, too. Especially if we decide to use new technology or an unknown language. It’s not the best code I’ve ever written. But I’ve learned a lot about Git, Go, and releasing stuff on GitHub.