My summer interns are learning about Git this week. Most are completely new to version control, so I put together the following notes to outline how I use Git to manage my thoughts and development process.
I’ve ramped teammates up with Git in the past, but typically started with the API. I think that was a mistake. This time, I started the conversation with a focus on process, and that seemed to work well; feedback has been that the how was easy to digest since the what and why were covered up front.
I’d like to think my approach to SCM is a happy medium between Gitflow and just committing everything directly to main. Part of what makes it a good compromise is that it’s composable; if one component seems too cumbersome for my current project, it can be forgone as I see fit.
Let’s jump in from the beginning of a project:
Create a GitHub repository
Go to GitHub and click that big green button. I never use the GitHub UI to initialize any files, and instead add any required boilerplate once the project has been created.
Tip: Consider creating a Cookiecutter when you find yourself creating the same boilerplate over and over.
Start working on the master branch
Begin by cloning the repository. I prefer to do this using SSH (you can find the SSH-based URL for the remote on the project’s GitHub page). Once I’ve done this, the master branch will be the default branch.
Next, I do the bulk of my work, remembering to commit changes in meaningful increments along the way.
Tip: Use Make recipes to auto-format and lint code. I use black, isort, and flake8 on every Python project. When working with notebooks, I use nbqa and nbstripout as well. This practice makes my code much easier to review in the future. It also helps me focus on logic rather than presentation.
Tag the first MVP as v0.1.0
Eventually, I’ll get to a point where I have something that can be demoed or used by someone else. This is an MVP, or minimum viable product. It won’t have many features, nor will it be bug-free, but that’s okay.
Make sure to update any version numbers in the code, then I create an empty
git commit --allow-empty) with a message like:
v0.1.0 New features - Initial release
Finally, go to GitHub and create a new release (in Git terminology, this is also called a tag). Name the tag “v0.1.0”. I usually create a description of the release in Markdown like this:
# v0.1.0 ## New features - Initial release
Create GitHub issues for all new features, bugs, etc.
Once there’s been a release, I find it easier to think in terms of adding features and fixing bugs. Some teams use other tools like Jira or Azure DevOps to manage tickets1, but I always create GitHub Issues for all new features and bugs so that the technical discussion can be as close to the code as possible. For those who insist, links to external ticket systems can be added to the GitHub Issue.
Create a develop branch
One principle of Gitflow that I try to emulate (at least after an initial
release) is that the master branch should always be “stable”; I follow the
controversial recommendation to create a long-lived develop branch (
checkout -b develop) from master.
For each GitHub issue, create a feature branch off of develop
Once on develop, I begin to create branches to resolve individual GitHub Issues. I’ve changed my opinion quite a bit on how to name branches, but lately my preference has been to list the general type of work (e.g., bug or feature) followed by the GitHub Issue number. For example:
| ———– | ———————————————————————————————— |
bug/1 | Fixing the bug noted in GitHub Issue #1 |
feat/51 | Adding a new feature requested in GitHub Issue #51 |
gen/7 | Updating a CI script, as noted in GitHub Issue #7 |
hotfix/24 | Making an urgent hotfix (i.e., will be merged directly into master) as noted in GitHub Issue #24 |
Create a PR for each branch back into develop
When a feature branch is ready for review, I push it to the GitHub remote and make a pull request (PR). The pull request gets based on developed (i.e., I ask to merge the branch into develop).
Ideally, projects have continuous integration (CI) pipelines that are triggered when the PR is created. The result of CI (e.g., if a test fails) informs whether I make additional changes. When working on a project with teammates, the PR may be reviewed by someone else who requests changes as well.
When I merge a PR on a solo project, I don’t use squash commits and create a merge message with the following format:
Name of the PR on GitHub (#20) Closes #19. Plus more narration about the changes, if that is warranted.
Create a release branch and PR into master
Eventually comes time for another release. On my development machine, I’ll
create a release branch named
release/v1.3.0 from develop, push it to the
remote, and create a PR with the following information:
# v1.3.0 ## New features - Added support for a cool widget (#3) - Another new thing (#10) ## General improvements and bugfixes - Some CI issue (#4) - A notable improvement to the documentation (#6)
Often a team will do some final integration testing or similar, which may result in additional modifications to this branch. Remember, as with your first MVP, to make version number updates and tie up any other housekeeping before finalizing the release.
By now, it should be clear how the above information can be put in the merge commit message as well.
Tag the merge commit
Once merged into master, make sure to tag the merge commit as a new release. I use the same information from the release PR and commit message in the release notes.
Take a pat on the back for another tidy release, and then jump back into the develop branch and work on your next round of issue-guided PRs. The fun never ends…
This seems like a ton of work. Can I skip some parts?
Yes. Some pieces here are more work than they are worth for certain projects. If you want to slim down this workflow, I’d recommend skipping the develop and release branches. Instead, you can still create pull requests but bae them on master.
I do not recommend skipping the following:
- GitHub Issues, feature branches, or PRs - These are the core components of this time of Gitflow derivative. When you bring them together, they make for a great way to track incremental improvements in your codebase.
- Release tagging - This really doesn’t take much effort, and I find it fun anyway. It’s an easy way to keep track of code that is out in the wild or being used by others; good tags represent important milestones you are most likely to roll back to.
Why is it called Benflow? Seems kind of bigheaded.
It’s not called anything. It’s just the name of the blog post. This isn’t a real “thing”, just some notes on what I do.
I don’t like this. Please don’t do it if you work with me.
I like to call these “Agile tickets” in a patronizing voice because c’mon, nobody really likes these, right? …Right? ↩