I'm a software engineer, ex-mathematician, and lifetime learner. This is my personal webpage, which includes links to things I've done throughout my entire career, many of which are about my time working on the open source Git project. Opinions are my own.
GitHub Blog, 2022
GitHub Blog, 2022
GitHub Blog, 2021
GitHub Blog, 2020
GitHub Blog, 2020
Azure DevOps Blog, 2020
GitHub Blog, 2020
Azure DevOps Blog, 2019
Azure DevOps Blog, 2019
Azure DevOps Blog, 2018
Azure DevOps Blog, 2018
Azure DevOps Blog, 2018
Azure DevOps Blog, 2018
Azure DevOps Blog, 2018
Azure DevOps Blog, 2018
Investigating Git's architecture as a database, including its data store and synchronization. A compressed version of my Git Merge talk. GitKon, October 2022.
Investigating Git's architecture as a database, including its data store, synchronization, and sharding strategies. Git Merge, September 2022.
Talking about Git, monorepos, contributing to open source, and being a principal engineer. PodRocket, February 2022.
A survey of some advanced Git features to help Git scale to the largest monorepos. GitHub Nova, October 2021.
Talking about monorepos and career trajectory. Software Engineering Daily, September 2021.
Talking about Git performance features and contributing upstream. Software at Scale, September 2021.
How can you make choices in your repository's structure to make it easier to scale? GitHub Universe, December 2020.
How can we solve scale for Git? How does Scalar solve this problem? Microsoft European Virtual Open Source Summit, June 2020.
How can we solve scale for Git? How does Scalar solve this problem? Git Merge, March 2020.
Talking about my career path, contributing to Git and open source. Software Engineering Unlocked, November 2019.
A discussion about Git, Git for Windows, and the commit-graph. Thrashing Code, September 2019.
An intro to some deep Git concepts. NCSU Software Engineering, August 2019 and January 2020.
with Johannes Schindelin, Git Merge 2018.
The `git for-each-ref` command now understands `%(is-base:<committish>)` tokens to discover which of the filtered refs is the likely base of the given commit or branch. A similar algorithm powers a new Azure DevOps feature for automatically choosing the best target branch for a new pull request.
The `git for-each-ref` command now understands `%(ahead-behind:base)` tokens to output counts for comparing commit histories. This algorithm powers GitHub's branches page.
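As a rough illustration of what ahead/behind counts mean, the sketch below computes them on a toy history using plain set differences. The `PARENTS` graph and the function names are hypothetical, and this naive approach is not how Git computes the counts; it only shows the quantity being reported.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy history: each commit maps to the list of its parents.
PARENTS: Dict[str, List[str]] = {
    "A": [],
    "B": ["A"],
    "C": ["B"],  # tip of the base branch
    "D": ["B"],
    "E": ["D"],  # tip of the topic branch
}


def reachable(tip: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every commit reachable from tip, including tip itself."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return seen


def ahead_behind(tip: str, base: str, parents: Dict[str, List[str]]) -> Tuple[int, int]:
    """Count commits only reachable from tip (ahead) and only from base (behind)."""
    tip_set = reachable(tip, parents)
    base_set = reachable(base, parents)
    return len(tip_set - base_set), len(base_set - tip_set)


print(ahead_behind("E", "C", PARENTS))  # (2, 1)
```

Here the topic tip `E` has two commits the base lacks (`D` and `E`), and the base tip `C` has one commit the topic lacks.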
By enabling this config option, your index writes will speed up dramatically at the cost of a small reduction in error detection. This will speed up `git status`, `git add`, and `git commit` commands, among others. A version of this feature was previously available in the `microsoft/git` fork and its version of Scalar. Git v2.40.0
Bundle URIs provide a way to offload object downloads to static content servers, speeding up clones and fetches while reducing load on the origin Git server. Git v2.38.0 (still under development)
A new option that lets a stack of topics have its topic branches updated automatically during a rebase. Git v2.38.0
Git's bundle format pairs refs with a pack-file of objects. It can now include a `filter` capability, allowing a bundle to bootstrap a blobless partial clone. Git v2.36.0
The sparse index reduces the index size when using cone-mode sparse-checkout patterns, creating significant performance boosts for monorepos. Git v2.32.0, v2.33.0, and v2.34.0
A new builtin helps customize Git maintenance. Users can register their repositories to be maintained in the background using `git maintenance start`. Git v2.29.0, v2.30.0, v2.31.0
A new builtin helps users manage their sparse-checkout files. A new "cone mode" improves performance for a common pattern type. Git v2.25.0
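To give a feel for why cone mode is a simpler pattern type, here is a minimal sketch of the matching rule as I understand it: a file is included if it sits anywhere under a selected directory, or directly inside one of that directory's ancestors (including the repository root). The function name `in_cone` and the example paths are hypothetical, and this is an illustration rather than Git's implementation.

```python
from pathlib import PurePosixPath
from typing import Iterable


def in_cone(path: str, cone_dirs: Iterable[str]) -> bool:
    """Return True if a file path falls inside the cone described by cone_dirs."""
    parent = PurePosixPath(path).parent
    if parent == PurePosixPath("."):
        # Files at the repository root are always included.
        return True
    for d in cone_dirs:
        cone = PurePosixPath(d)
        # Recursive match: the file lives anywhere under a selected directory.
        if cone == parent or cone in parent.parents:
            return True
        # Parent match: the file sits directly in an ancestor of a selected directory.
        if parent in cone.parents:
            return True
    return False


cone = ["src/app/ui"]
print(in_cone("src/app/ui/button/a.ts", cone))  # True
print(in_cone("src/README.md", cone))           # True
print(in_cone("src/lib/x.c", cone))             # False
```

Because every rule is a directory prefix check, membership can be decided without evaluating arbitrary glob patterns against each path.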
Write the Git commit-graph after every `git fetch` operation, ensuring that the repo is optimized for the new objects. Git v2.24.0
Create new `feature.*` config options that group other config options together. Provides recommendations for users who do not want to read every config setting. Git v2.24.0
Speeds up `git commit-graph write` by amortizing the cost across many small writes and few big writes. Git v2.23.0
Use the multi-pack-index to repack the object store incrementally in a highly-available environment. Git v2.23.0
Speeds up `git push` for developers working in a small cone of a large repo. Git v2.21.0
Significantly speeds up `git log --graph` calls when the commit-graph feature is enabled. Git v2.20.0
Verify the multi-pack-index file for corruption. Git v2.20.0
Collect and reorganize commit walking code, and improve several algorithms in the process. Git v2.20.0
Create a new file to index objects across multiple pack-files. Git v2.20.0
Verify the commit-graph file for corruption, and write the file automatically during repo maintenance. Git v2.20.0
Compute generation numbers in the commit-graph file and use them in some commit walks. Git v2.19.0
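A small sketch of how generation numbers can help a commit walk, assuming the usual definition: a root commit has generation 1, and any other commit has one more than the maximum generation of its parents. The toy `PARENTS` graph and both function names are hypothetical; the point is only that a commit with a lower generation can never reach one with a higher generation, so the walk can skip it.

```python
from typing import Dict, List, Set

# Hypothetical toy history: each commit maps to the list of its parents.
PARENTS: Dict[str, List[str]] = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["D"],
    "M": ["C", "E"],  # merge commit
}


def compute_generations(parents: Dict[str, List[str]]) -> Dict[str, int]:
    """Assign gen(c) = 1 for roots, else 1 + max(gen(p) for each parent p)."""
    gen: Dict[str, int] = {}
    for start in parents:
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            missing = [p for p in parents[commit] if p not in gen]
            if missing:
                stack.extend(missing)  # compute parents first
            else:
                gen[commit] = 1 + max((gen[p] for p in parents[commit]), default=0)
                stack.pop()
    return gen


def can_reach(src: str, target: str, parents: Dict[str, List[str]], gen: Dict[str, int]) -> bool:
    """Walk from src toward the roots, skipping commits too old to reach target."""
    if gen[src] < gen[target]:
        return False
    seen: Set[str] = set()
    stack = [src]
    while stack:
        commit = stack.pop()
        if commit == target:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        # A parent with a lower generation than target cannot reach it.
        stack.extend(p for p in parents[commit] if gen[p] >= gen[target])
    return False


GEN = compute_generations(PARENTS)
print(GEN["M"])                          # 5
print(can_reach("M", "D", PARENTS, GEN))  # True
print(can_reach("C", "D", PARENTS, GEN))  # False
```

In the last call the walk stops after one step: `B` has generation 2, which is below `D`'s generation of 3, so nothing below `B` needs to be visited.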
Create a new data structure and file format to store a compact representation of the commit history. Git v2.18.0
Significantly improve the mechanism for computing the shortest unambiguous abbreviation of an object ID. Git v2.17.0
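One way to think about the abbreviation problem, sketched under simplifying assumptions: if all object IDs are available in one sorted list, the IDs sharing the longest prefix with a given ID are its immediate neighbors, so only two comparisons are needed after a binary search. The function names, the shortened toy IDs, and the `min_len` parameter are hypothetical, and Git's real object store spans pack-files and loose objects rather than a single list.

```python
from bisect import bisect_left
from typing import List


def common_prefix_len(a: str, b: str) -> int:
    """Return the number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def shortest_unique_abbrev(oid: str, sorted_oids: List[str], min_len: int = 4) -> str:
    """Return the shortest prefix of oid that no other ID in sorted_oids shares."""
    i = bisect_left(sorted_oids, oid)
    if i == len(sorted_oids) or sorted_oids[i] != oid:
        raise ValueError(f"unknown object ID: {oid}")
    needed = min_len
    # Only the sorted neighbors can share the longest prefix with oid.
    if i > 0:
        needed = max(needed, common_prefix_len(oid, sorted_oids[i - 1]) + 1)
    if i + 1 < len(sorted_oids):
        needed = max(needed, common_prefix_len(oid, sorted_oids[i + 1]) + 1)
    return oid[:needed]


# Toy 8-character IDs standing in for full-length object IDs.
OIDS = sorted(["1a2b3c4d", "1a2b9f00", "1a7700aa", "f00dbabe"])
print(shortest_unique_abbrev("1a2b3c4d", OIDS))  # 1a2b3
print(shortest_unique_abbrev("1a7700aa", OIDS))  # 1a77
```

The first ID needs five characters because a neighbor shares the prefix `1a2b`; the second is already unique at the assumed minimum length of four.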
Includes discussions of `git for-each-ref`'s new `is-base` token and how it can help identify the base branch.
Discusses the `git sparse-checkout` command.
Discusses the `git rebase --update-refs` feature in detail and how to use it in your workflow.
Includes discussions of `scalar` in upstream Git as well as the `git rebase --update-refs` feature.
Includes announcement that the sparse index is ready for mainstream use.
This blog post from Canva discusses how their monorepo is structured, using features like `feature.manyFiles`, background maintenance, and sparse-checkout.
Discusses sparse index improvements and partial bundles.
Includes updates to the sparse index feature.
More sparse index updates, such as `git status`, `git commit`, and `git add`.
Announces the sparse index.
Includes a description of the `git maintenance` command and background maintenance.
Summarizes some of the Scalar announcement blog post, with additional details from an email interview.
Prominently discusses the sparse-checkout feature.
Features a discussion of the sparse-checkout feature.
Features a discussion of feature macros and `fetch.writeCommitGraph`.
Features a discussion of multi-pack-index repack/expire and incremental commit-graph.
Features a discussion of the incremental commit-graph feature.
I was awarded a Google Open Source Peer Bonus for my contributions to Git.
Links to BDFL.
Features a discussion of the review around the new sparse push algorithm.
Includes description of multi-pack-index feature.
John Briggs discusses several Microsoft contributions and how they improve performance for the Windows OS repository. Features commit-graph, multi-pack-index, background prefetch, and sparse push algorithm.
Features a summary of the support email thread "commit-graph is cool". Also discusses the RFC on Generation Number v2.
Includes an update on the commit-graph feature (in the "Cooking" section).
Highlighted in a developer spotlight.
A silly clicker game featuring graphs.
Source code
A collection of source code and values for functions in extremal combinatorics.
Source code