I would like to know if it is possible to do differential backups of a git repository, using git bundle like this:

First time:

    git clone --mirror https://github.com/me/myrepo
    git bundle create base.bundle --all

Every time I want to create a differential bundle:

    cd myrepo
    git fetch --all # is this necessary? (because "git bundle" doesn't check remote)
    git bundle create diff.bundle $(git rev-parse HEAD)..HEAD --all

My main question is whether the above method ensures that base.bundle and diff.bundle, when used together, contain the complete repository from its creation up to the point when diff.bundle was taken, including branches, tags, and whatever else there may be in a git repo that I'm not aware of.

1 Answer

Your first two commands work fine to create a base bundle. However, your second group of commands won't do the right thing. You do want the git fetch (--all is needed only if you have multiple remotes), but you want to create the new differential bundle with a "negative refspec" for every ref that was a positive refspec in the previous bundle. That is:

    git bundle create diff.bundle $(git rev-parse HEAD)..HEAD --all

is clearly wrong for two reasons:

  1. The inner git rev-parse HEAD resolves to the current HEAD, not to the commit that HEAD named when the previous bundle was made, so the range $(git rev-parse HEAD)..HEAD is always empty;
  2. If the previous bundle (initial or previous differential) used refs/heads/br1, refs/heads/br2, refs/tags/t1, refs/remotes/origin/r1, and refs/remotes/origin/r2 as its positive refspecs via --all, you need a negative refspec for every hash ID that those positive refspecs resolved to, not just one for HEAD.
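
To see the first problem concretely, here is a minimal sketch in a throwaway repository (the temporary directory and the demo identity are assumptions made purely for illustration). A range A..B selects commits reachable from B but not from A, so when both ends resolve to the same commit nothing is selected:

```shell
#!/bin/sh
set -eu

# Scratch repository with a single commit; nothing here touches a real repo.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git -c user.name=demo -c user.email=demo@example.invalid \
    commit --quiet --allow-empty -m "only commit"

# Both ends of the range are the current HEAD, so the range is empty.
count=$(git rev-list --count "$(git rev-parse HEAD)..HEAD")
echo "commits selected by \$(git rev-parse HEAD)..HEAD: $count"
```

The same holds no matter how many commits the repository has, because the left side is evaluated at the moment the command runs rather than at the time of the previous bundle.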

The easiest way to fix both of these is to:

  1. have the initial sequence end with git rev-parse --all with output saved somewhere;
  2. for creating a new differential bundle, use ^$hash for each hash ID listed in the saved output from the last save;
  3. after creating the new differential bundle, use git rev-parse --all again to get the positive refspec hash IDs.

So you'll end up with something along these lines:

    git clone --mirror https://github.com/me/myrepo
    cd myrepo.git    # a mirror clone is bare, so Git names the directory myrepo.git
    git bundle create $HOME/b/base.bundle --all
    git rev-parse --all > $HOME/b/hashes

followed by:

    cd myrepo.git
    git fetch    # --all if needed, but per the above there's just one remote
    git bundle create $HOME/b/diff.bundle $(sed 's/^/^/' < $HOME/b/hashes) --all
    git rev-parse --all > $HOME/b/hashes

Warning: this is entirely untested. I'm also assuming that each diff.bundle is an increment to the previous diff.bundle here, i.e., these each need to be saved separately.

(You're probably best off using real backup software anyway, but this is likely to work.)
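
Since the commands above are untested, one way to gain some confidence is a round trip in a scratch repository: bundle, add work, bundle again with the saved tips excluded, restore both bundles into an empty bare repository, and compare the refs. The sketch below is only that, a sketch; the directory names, the `commit` helper, and the demo identity are all made up for the example, and it exercises one branch and one lightweight tag rather than every kind of ref.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; every name below is illustrative.
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: commit with a fixed identity so no global config is needed.
commit() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        commit --quiet --allow-empty -m "$1"
}

# 1. Source repository with one commit, a base bundle, and the saved ref tips.
git init --quiet src
cd src
commit "first"
git bundle create "$work/base.bundle" --all
git rev-parse --all > "$work/hashes"

# 2. New work arrives; bundle everything not reachable from the saved tips.
commit "second"
git tag v1
git bundle create "$work/diff.bundle" $(sed 's/^/^/' "$work/hashes") --all
cd "$work"

# 3. Restore: apply the base bundle, then the differential one, in order.
git init --quiet --bare restored.git
for b in base.bundle diff.bundle; do
    git --git-dir=restored.git fetch --quiet "$work/$b" '+refs/*:refs/*'
done

# 4. Compare the ref tips of the source and of the restored copy.
fmt='%(objectname) %(refname)'
src_refs=$(git --git-dir=src/.git for-each-ref --format="$fmt")
restored_refs=$(git --git-dir=restored.git for-each-ref --format="$fmt")
if [ "$src_refs" = "$restored_refs" ]; then
    echo "restored refs match the source"
else
    echo "restored refs differ from the source" >&2
fi
```

The order of the fetches matters: the second bundle lists the earlier tips as prerequisites, so it can only be applied to a repository that already contains them.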

4 Comments

This is very instructive, thanks a lot! Updating $HOME/b/hashes after every fetch/bundle-create makes it an incremental backup, though. I mention this only because my question was about differential backups. It is good to know either way, and I can simply omit that step to get differential backups.
I hadn't heard this "differential backups" term before, so I assumed it was an odd translation for "incremental backups", but yes, if you just want to replace a single incremental backup over and over, that's the way to do it.
I think both incremental backup and differential backup are common terms; in fact, each has its own Wikipedia page.
Interesting. The page for differential backup mentions that Oracle "leverages a backward description", i.e., not everyone actually uses these words the same way. It looks like this terminology began to be used in the early 2010s (which puts it well after I learned these things; we used a hybrid scheme that incorporates both features but called it only "incremental" backup).
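
To make the distinction discussed in these comments concrete, here is a hedged sketch of a single helper that covers both schemes. The function name `make_bundle` and its three arguments are hypothetical; the only difference between the modes is whether the saved tips are refreshed after the bundle is written.

```shell
#!/bin/sh
set -eu

# Hypothetical helper; run it from inside the repository being backed up.
#   $1  mode: "incremental" or "differential"
#   $2  path of the bundle to write
#   $3  file holding the ref tips saved by an earlier run
make_bundle() {
    mode=$1
    out=$2
    hashes=$3
    # Exclude everything reachable from the saved tips; include all current refs.
    git bundle create "$out" $(sed 's/^/^/' "$hashes") --all
    case $mode in
        incremental)
            # The next bundle continues from this one, so every bundle must be kept.
            git rev-parse --all > "$hashes"
            ;;
        differential)
            # Leave the saved tips alone: the next bundle is again relative to the base.
            :
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}
```

With "differential", each new bundle can replace the previous one, and a restore needs only base.bundle plus the latest diff.bundle. With "incremental", a restore needs the base and every bundle made since, applied in order.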
