
I'm getting ready to open a pull request for a branch I'm working on. On GitHub, I see that I'm 5 commits ahead of master, so I'd like to squash my commits into one. I run git log to see what the previous commits are:

    git log --oneline
    4363273 Updated Order_Entry with bulk UPDATE command
    e7e0c64 Updated Order Entry module and Orders Schema
    2cff23e Merge branch 'order_schema'
    104b2ce Orders Schema
    f7d57cf Order Entry updated to handle and log responses from LC
    afa1b7b Merge pull request #18 from project/bugfix/mockvenue
    4b2c8d8 Return correct string in mock venue API

Now I think I want to squash those top 5 commits listed above (4363273-f7d57cf). So I then run:

    git rebase -i HEAD~5
    4363273 Updated Order_Entry with bulk UPDATE command
    pick 44768b2 Add script to run simulation on a strategy
    pick f82ec8d Implement mock venue
    pick f7d57cf Order Entry updated to handle and log responses from LC
    pick 4b2c8d8 Return correct string in mock venue API
    pick 104b2ce Orders Schema
    pick 4363273 Updated Order_Entry with bulk UPDATE command

Why doesn't the list of commits shown by git rebase match the first 5 commits shown by git log? In particular, why do e7e0c64 and 2cff23e appear in the git log output but not in the git rebase list?

    * 4363273 Updated Order_Entry with bulk UPDATE command
    * e7e0c64 Updated Order Entry module and Orders Schema
    |\
    | * 2cff23e Merge branch 'order_schema'
    | |\
    | | * 104b2ce Orders Schema
    | * | afa1b7b Merge pull request #18 from project/bugfix/mockvenue
    | |\ \
    | | |/
    | |/|
    | | * 4b2c8d8 Return correct string in mock venue API
    | |/
    * | f7d57cf Order Entry updated to handle and log responses from LC
    |/
    * 8ed2260 Merge pull request #17 from project/mockvenue


  • git log sorts by date (by default) but git rebase must (and therefore does) use topological order. What happens if you add --graph to your git log --oneline? (Using --graph shows a graph of commits and forces git log to sort topologically.) (You might also add --decorate which will show you branch names and such, although that tends to be more interesting when using --all than when not using it.) Commented Feb 26, 2016 at 18:49
  • @torek I posted the graph. I'm looking through it and I still don't understand why e7e0c64 and 2cff23e don't appear when I run the rebase command. Commented Feb 26, 2016 at 19:29

1 Answer


Your graph shows numerous merges. This is the source of the issue. It's worth noting here that HEAD~5 means "five steps backwards following --first-parent".


Let's cover a bit of background first. Generally speaking, you can't rebase a merge, and rebase usually doesn't try: it simply discards merge commits. Using git rebase -p will try to preserve merges, and will often succeed, but it's very difficult to use interactively (because the edit script does not have a representation for merges).

We can see more once we understand how rebase works. Suppose that we have a commit graph like this:

          B - C - D   <-- other-branch
         /
    ... - A
         \
          E - F - G   <-- your-branch

  1. Rebase takes a series of commits, and turns them into changesets / patches. That is, if you have commits E, F, and G that follow commit A, git must produce a diff from A to E, then from E to F, and finally from F to G. These represent "how to convert from the base commit" (which is currently A) "to the tip, as a sequence of modifications."

  2. Then, rebase turns to the new base commit, in this case commit D, and applies these patches in sequence. The change from A to E, as applied to D, makes a new commit E'. The change from E to F, applied to E', makes a new commit F'. The final change becomes G'.

Changing a commit to a patch (by comparing it with its parent) and then applying the patch is literally a git cherry-pick. In other words, rebase is just a series of cherry-pick operations, picking commits E, F, and G onto a new branch extending from D. The new commit graph would look like this:

          B - C - D   <-- other-branch
         /         \
    ... - A         E' - F' - G'   <-- your-branch
         \
          E - F - G   [reflog only]
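
The two steps above can be sketched as a toy model in Python. This is not how Git is implemented: here a commit is just a snapshot (a dict of path to content), the file names and contents are made up for illustration, and conflicts are not handled at all. It only shows the "diff against parent, then apply onto the new base" loop that makes rebase a series of cherry-picks.

```python
# Toy model only: a "commit" is a snapshot dict {path: content}.
# All file names and contents below are hypothetical.

def diff(parent_tree, child_tree):
    """Return the changes that turn parent_tree into child_tree."""
    changes = {}
    for path in parent_tree.keys() | child_tree.keys():
        old, new = parent_tree.get(path), child_tree.get(path)
        if old != new:
            changes[path] = new  # None means "delete this path"
    return changes

def apply(tree, changes):
    """Apply a change set to a snapshot and return the new snapshot."""
    result = dict(tree)
    for path, content in changes.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result

def rebase(old_base, commits, new_base):
    """Replay each commit's diff-from-parent onto new_base, in order."""
    replayed = []
    parent, tip = old_base, new_base
    for tree in commits:
        tip = apply(tip, diff(parent, tree))  # one "cherry-pick"
        replayed.append(tip)
        parent = tree
    return replayed

A = {"README": "v1"}
D = {"README": "v1", "other.txt": "work from B, C, D"}
E = {"README": "v1", "e.txt": "E"}
F = {"README": "v1", "e.txt": "E", "f.txt": "F"}
G = {"README": "v2", "e.txt": "E", "f.txt": "F"}

E2, F2, G2 = rebase(A, [E, F, G], D)  # stand-ins for E', F', G'
print(G2)
```

The final snapshot contains both the work that was already on other-branch and the three replayed changes, which is the point of the E' - F' - G' picture.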

If we try to do this same thing with a merge commit, we run into a problem. A merge has two (or more) parents, and cannot be cherry-picked without human assistance: you must tell git cherry-pick which parent is the one to diff against.
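
To see why the choice of parent matters, here is a small Python sketch with made-up snapshots (again a toy model, not Git's code). The same merge produces a different "patch" depending on which parent it is compared with. In real Git, the -m (--mainline) option of git cherry-pick is how you name the parent to diff against.

```python
# Toy model only: snapshots are dicts {path: content}; names are hypothetical.

def changed_paths(parent_tree, child_tree):
    """Paths whose content differs between the two snapshots."""
    paths = parent_tree.keys() | child_tree.keys()
    return sorted(p for p in paths if parent_tree.get(p) != child_tree.get(p))

# Two branches each add one file, then they are merged.
left = {"base.txt": "base", "left.txt": "L"}      # first parent of the merge
right = {"base.txt": "base", "right.txt": "R"}    # second parent of the merge
merge = {"base.txt": "base", "left.txt": "L", "right.txt": "R"}

vs_first = changed_paths(left, merge)    # what the merge added relative to 'left'
vs_second = changed_paths(right, merge)  # what the merge added relative to 'right'
print(vs_first, vs_second)
```

Neither diff is "the" change introduced by the merge, so there is no single patch to replay without being told which side is the mainline.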

What git rebase -p does is redo the merge, rather than attempt a cherry-pick. The rebase documentation has this example, in which you might rebase A onto Q:

               X
                \
             A---M---B
            /
    ---o---O---P---Q

They do not show the result but it should look like this (ideally, with the original A--M--B sequence greyed-out):

               X--------
                \       \
             A---M---B   |
            /            |
    ---o---O---P---Q     |
                    \    |
                     A'--M'--B'

Note that new commit M' has to be a merge between A' and (unchanged) commit X. This works if the merge is a normal (non-"evil") merge but is obviously a bit tricky, at the least.


Let's get back to your particular situation, where git log --graph has given the text below (which I've modified just a bit). It's a bit hard to turn sideways (the other graphs above have predecessor commits on the left, and successors on the right, while the git log --graph output has predecessors below and successors above), but I'll take a quick stab at it, by adding single-letter codes for each commit:

    H  * 4363273 Updated Order_Entry with bulk UPDATE command
    G  * e7e0c64 Updated Order Entry module and Orders Schema
       |\
    F  | * 2cff23e Merge branch 'order_schema'
       | |\
    E  | | * 104b2ce Orders Schema
    D  | * | afa1b7b Merge pull request #18 from project/bugfix/mockvenue
       | |\ \
       | | |/
       | |/|
    C  | | * 4b2c8d8 Return correct string in mock venue API
       | |/
    B  * | f7d57cf Order Entry updated to handle and log responses from LC
       |/
    A  * 8ed2260 Merge pull request #17 from project/mockvenue

Now with A as the leftmost commit and H as the rightmost:

           C---D
          /___/ \
         //__--E-F
        ///       \
       A-----B-----G--H

The direct (first-parent) line of ancestry goes from H to G, then to B, then to A. This means that HEAD~5 is a commit we can't even see (two to the left of A), and git rebase -i HEAD~5 should list all of these commits except for merges (D, F and G). That would be five commits: A, B, C, E, and H. But based on its log message, A is also a merge. We're missing information here and can't draw the complete graph (which is just as well since the compact form has a lot of lines in it).

In any case, this is in fact what's happening. The rebase command finds commits to cherry-pick by listing every commit reachable from the tip commit (HEAD, which is commit H) that is not reachable from the first excluded commit (HEAD~5, a commit two steps to the left of A, so we can't see it). It then throws out merge commits and will cherry-pick each remaining commit to build a new, linear branch.
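
The selection rule can be tried out on the lettered graph with a short Python sketch. The edges are read off the sideways drawing above, first parent first. A's own parents are not visible in the posted graph, so they are left out, and the excluded commit used here is A (that is, HEAD~3) rather than HEAD~5. Both choices are assumptions of this sketch, made to avoid guessing at commits we cannot see. The output order is alphabetical, not the order rebase would use.

```python
# Toy model of the lettered graph. Parents are listed first-parent first.
# Assumption: A's parents are unknown here, so A is treated as a root.
PARENTS = {
    "H": ["G"],
    "G": ["B", "F"],   # merge
    "F": ["D", "E"],   # merge
    "E": ["A"],
    "D": ["A", "C"],   # merge
    "C": ["A"],
    "B": ["A"],
    "A": [],
}

def first_parent_ancestor(commit, n):
    """Model of commit~n: follow the first parent n times; None if we run out."""
    for _ in range(n):
        parents = PARENTS.get(commit, [])
        if not parents:
            return None
        commit = parents[0]
    return commit

def reachable(commit):
    """Every commit reachable from 'commit' through any parent link."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen

def rebase_candidates(tip, upstream):
    """Commits reachable from tip but not from upstream, with merges dropped."""
    selected = reachable(tip) - reachable(upstream)
    return sorted(c for c in selected if len(PARENTS[c]) < 2)

print(first_parent_ancestor("H", 3))  # first-parent walk: H -> G -> B -> A
print(first_parent_ancestor("H", 5))  # falls off the visible graph
print(rebase_candidates("H", "A"))    # D, F and G are merges, so they drop out
```

With A as the excluded commit, the candidates are B, C, E and H, which matches the last four pick lines in the question. The two extra picks (44768b2 and f82ec8d) come from the part of the history below A that this model does not include.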

Whether this makes sense, and which commits you should cherry-pick, are not something anyone else can answer for you.


3 Comments

Wow. Thanks for that explanation. Really gave me some clarity. I thought I understood rebase, but clearly there is a lot more to it. So in my graph that you've alphabetized, commits A, D, F, and G don't appear when I run rebase because they're all merge commits?
Yes (sorry about delay in responding, I was off on a weekend trip).
What an answer. Perfectly explained. Thank you so much for being so helpful.
