
I am seeing strange git rebase behavior in the following example. Suppose I have this local tree:

              b1--b2--b3
             /
    a1--a2--a3

and this remote tree:

              b1--b2--b3
             /
    a1--a2--a3--a4--a5--a6

where branch a is master, and I am currently at b3. I run the following commands:

    git pull origin master master   // fixed typo
    git rebase master

After these commands, I have a tree that looks like this:

              a4'--a5'--a6'--b1--b2--b3
             /
    a1--a2--a3--a4--a5--a6

instead of:

                          b1--b2--b3
                         /
    a1--a2--a3--a4--a5--a6

Why does this happen?

  • I think "git rebase origin/master" will fix your problem, or adding the "--rebase" option to "git pull" so that it becomes "git pull --rebase". The problem is that you are doing a "git pull", which brings in a merge commit (when necessary) to your local master, and then you are rebasing against that. Commented Jun 28, 2017 at 6:36
  • I assume git pull origin mastert master is a typo / cut-and-paste error, and that you meant git pull origin master. I also assume you were on some branch other than your own local master when you ran git pull. Commented Jun 28, 2017 at 6:36
  • No, I did git pull origin master master Commented Jun 28, 2017 at 6:41
  • That's still slightly (and importantly) different than what is in your question. Commented Jun 28, 2017 at 6:44
  • "git pull origin master master" is the same as "git pull origin master" (the 2nd master is irrelevant). Well, almost the same. The auto-generated commit message for the merge commit will be slightly different, but that's just cosmetic. Commented Jun 28, 2017 at 18:54

1 Answer


I always advise people to avoid git pull until they are very familiar with Git, as it does too much, in a somewhat mysterious way, that is hard to explain. It's often (and reasonably accurately) described as equivalent to running git fetch followed by either git merge or git rebase, but the tricky part is that the arguments it passes to git merge or git rebase are not what you might expect.

That said, let's break these down and look closely at each step. I will also draw your commit graph a bit differently. (You call this a "tree" above, which is not the best terminology for two reasons. First, the commit graph is a graph, specifically a Directed Acyclic Graph or DAG. All trees are DAGs, but not all DAGs are trees. Second, Git uses the term tree to refer to "tree objects"; each commit carries a tree object.)

Your initial commit graph looks, I believe, like this:

              B1--B2--B3   <-- branch (HEAD)
             /
    A1--A2--A3   <-- master, origin/master

Note that your local branch-name master points to commit A3. Your remote-tracking branch name origin/master also points to commit A3. Your current branch, however, is named branch (I had to invent this name) and it points to commit B3.

You now run git pull origin master master, or perhaps git pull origin mastert master (it matters which of these you use, and your comment contradicts your question). This git pull command runs git fetch with a number of arguments.

The git fetch step obtains new commits

Whenever you run git fetch, including when you use git pull to run git fetch, your Git calls up another Git, probably at the URL stored under your remote name origin. Here, as run by git pull origin master master or perhaps git pull origin mastert master, this other Git has delivered several new commits. Your Git stores these new commits in your repository, and if your Git is at least version 1.8.4, your Git updates your origin/master (and maybe your origin/mastert) to point to these new commits.

Your Git also stores in the special file FETCH_HEAD the IDs of the new commits and the name(s) of the branch head(s) obtained from the remote. Hence we can add this to the graph drawing:

              B1--B2--B3   <-- branch (HEAD)
             /
    A1--A2--A3   <-- master
             \
              A4--A5--A6   <-- origin/master, FETCH_HEAD

What's in FETCH_HEAD is a bit tricky. The contents of this file include the original names, i.e., master and perhaps mastert, as seen on the other Git, without any of the origin/ renaming that your Git normally uses. There may be some lines for tags too. Some of these lines may also be annotated with not-for-merge. All of the lines contain hash IDs for specific Git objects (commits and sometimes tags). The git fetch program leaves these lines specifically for the git pull program to examine.

Next, your git pull command extracts from FETCH_HEAD the hash IDs of the newly obtained commits. That is, it searches through FETCH_HEAD for hash IDs for commits that are not labeled not-for-merge. (I assume here you are doing a merge style pull rather than a rebase style pull.) It then runs git merge with these hash IDs.

The git merge step makes a merge commit

Again, it matters now whether you ran git pull origin master master or git pull origin mastert master. If you ran the latter, and there was a branch named mastert, there will be two matching lines, and perhaps two different hash IDs. In this case, your git pull command will perform an octopus merge of these two hashes.

If not, you will get a normal merge of the one hash. I am going to assume you get a normal merge. The result is this:

              B1--B2--B3------M   <-- branch (HEAD)
             /               /
    A1--A2--A3   <-- master /
             \             /
              A4--A5-----A6   <-- origin/master, FETCH_HEAD

Note that your branch name master still points to commit A3, not to commit A6.
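The fetch and merge steps above can be tried out in a throwaway repository. This is only a minimal sketch, not what git pull literally runs internally: the directory names, the commit helper, and the one-file-per-commit layout are all invented for illustration, and it assumes a reasonably recent Git that updates origin/master during git fetch origin master.

```shell
#!/bin/sh
# Sketch: do by hand what a merge-style "git pull origin master" does.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
# Keep personal Git settings out of the demo and give commits an identity.
export HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit NAME: hypothetical helper, adds one new file and commits it.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

# The "other Git": its master holds A1--A2--A3.
git init -q "$work/remote"
cd "$work/remote"
git symbolic-ref HEAD refs/heads/master
commit A1; commit A2; commit A3

# The local clone, with B1--B2--B3 on a side branch.
git clone -q "$work/remote" "$work/local"
cd "$work/local"
git checkout -q -b branch
commit B1; commit B2; commit B3

# Meanwhile the other Git gains A4--A5--A6.
cd "$work/remote"
commit A4; commit A5; commit A6

# Step 1: fetch. FETCH_HEAD records the hash and the remote's own branch name.
cd "$work/local"
git fetch -q origin master
cat .git/FETCH_HEAD

# Step 2: merge what was fetched into the current branch.
git merge -q --no-edit FETCH_HEAD

fetch_head=$(git rev-parse FETCH_HEAD)
remote_master=$(git rev-parse origin/master)
local_master=$(git rev-parse master)
second_parent=$(git rev-parse HEAD^2)
master_subject=$(git log -1 --format=%s master)

git log --oneline --graph --decorate --all
echo "local master is still at: $master_subject"
```

The graph printed at the end should show the merge commit at the tip of branch, with the local master label left behind on A3.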

You now run git rebase master.

Rebase strips off merges

What git rebase does is copy commits. The commits it copies are those you specify. The destination for the new copies is also something you specify.

When you run:

git rebase master 

you are telling Git to copy any commits that are on your current branch (i.e., branch) that are not reachable from your branch named master, and that are not merge commits (you will copy commits reachable from the merges, but not the merges themselves). Note that this is not using origin/master, but master. Hence the commits you ask your Git to copy are B1, B2, and B3, and A4, A5, and A6 ... in some order (the actual order is somewhat difficult to predict, though the A and B groups will be copied together, as Git uses --topo-order when collecting the IDs of commits to copy).
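You can preview this selection without rebasing anything. The sketch below only approximates it with git log --no-merges and a two-dot range (the real rebase code also filters out patches already present upstream), and it builds the post-merge graph locally. The scratch branch, the commit helper, and the hand-made refs/remotes/origin/master are assumptions made to keep the example self-contained; there is no real remote here.

```shell
#!/bin/sh
# Sketch: list the commits "git rebase <upstream>" would be asked to copy.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
export HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit NAME: hypothetical helper, adds one new file and commits it.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
commit A1; commit A2; commit A3

# Stand-in for the fetched commits: build A4--A5--A6 on a scratch branch
# and point a hand-made origin/master at its tip.
git checkout -q -b scratch
commit A4; commit A5; commit A6
git update-ref refs/remotes/origin/master HEAD

# B1--B2--B3 on "branch", then the merge M that git pull produced.
git checkout -q -b branch master
commit B1; commit B2; commit B3
git merge -q --no-edit -m M origin/master

# Non-merge commits on branch that are not reachable from local master:
from_master=$(git log --no-merges --format=%s master..branch | sort | xargs)
# The same question asked against origin/master instead:
from_origin=$(git log --no-merges --format=%s origin/master..branch | sort | xargs)

echo "git rebase master        would copy: $from_master"
echo "git rebase origin/master would copy: $from_origin"
```

Against the local master, all six commits are selected; against origin/master, only the three B commits are, which is why the choice of argument matters here.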

The place you ask your Git to copy them is "after the tip of master", i.e., after A3.

If Git copies B1 first, it will generally retain B1 unchanged, because git rebase will do that if it can. If it copies A4 first, it could also try to retain A4 unchanged, for the same reason; but the mechanism that does this is foiled by this particular rebase (see comments below). Once Git has copied A4 through A6, it must copy B1 to a new commit whose parent is the A6' copy of A6 (this commit differs from the original B1 as B1's parent is A3).

(You can, however, force git rebase to copy anyway, e.g., with --no-ff or --force.)

Your outcome suggests that your Git chose to copy A4 first. This gives you:

              A4'-A5'-A6'-B1'-B2'-B3'   <-- branch
             /
    A1--A2--A3   <-- master
             \
              A4--A5--A6   <-- origin/master

Without --force or --no-ff, if git rebase were slightly smarter, it could (but doesn't) produce:

    A1--A2--A3   <-- master
             \
              A4--A5--A6   <-- origin/master
                        \
                         B1'-B2'-B3'   <-- branch (HEAD)

or (can and sometimes does) produce:

              B1--B2--B3--A4'-A5'-A6'   <-- branch (HEAD)
             /
    A1--A2--A3   <-- master
             \
              A4--A5--A6   <-- origin/master
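To see which of these shapes your own Git produces, you can run the rebase in a scratch repository. This is a minimal sketch under the same kind of assumptions as any toy reproduction: the commit helper, the scratch branch, and the hand-made origin/master ref are invented, and there is no real remote. Which group lands first, and whether any commits are reused rather than copied, may differ between Git versions, so the sketch only checks properties that hold either way.

```shell
#!/bin/sh
# Sketch: rebase a branch that contains a merge and inspect the result.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
export HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit NAME: hypothetical helper, adds one new file and commits it.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
commit A1; commit A2; commit A3

# Hand-made origin/master pointing at A6 (stand-in for a real fetch).
git checkout -q -b scratch
commit A4; commit A5; commit A6
git update-ref refs/remotes/origin/master HEAD

# The state after the pull: B1--B2--B3 plus merge commit M on "branch".
git checkout -q -b branch master
commit B1; commit B2; commit B3
git merge -q --no-edit -m M origin/master
merges_before=$(git rev-list --merges master..branch)

# The command from the question.
git rebase -q master

merges_after=$(git rev-list --merges master..branch)
subjects=$(git log --format=%s master..branch | sort | xargs)
master_subject=$(git log -1 --format=%s master)

# Compare the hashes shown here with origin/master to see what was copied.
git log --oneline --graph --decorate branch origin/master
```

After the rebase the merge commit is gone and branch is a single line of six commits on top of A3, with master itself untouched.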

5 Comments

I also prefer "git fetch" but I'm sad that it cache-busts all my remote branches, making the "--force-with-lease" option on git push pointless.
@G.SylvieDavies: you can run git fetch origin br1:origin/br1 (and, since 1.8.4, leave out the :origin/br1 part) but admittedly by this point it's really close to just resorting to git pull directly :-) It turns out there are some weird interactions with tag fetching as well: tags only get updated if you name an explicit local reference (such as the remote-tracking branch above, the :origin/br1 part).
It has to copy a4 rather than use it in-place, because nothing like a4 is present on (local) master (a3). Try for yourself: vm.bit-booster.com/bitbucket/plugins/servlet/bb_net/projects/BB/… (git clone; git checkout b; git pull origin master; git rebase master; git log --all --date-order --graph --decorate)
The logic for this mechanism is really interesting, "it will generally retain B1 unchanged, because git rebase will do that if it can". Here's the info from "git help rebase": "Note that any commits in HEAD which introduce the same textual changes as a commit in HEAD..<upstream> are omitted (i.e., a patch already accepted upstream with a different commit message or timestamp will be skipped)." I suspect it's running "git patch-id" to implement that logic.
@G.SylvieDavies: Aha, right, the reuse-instead-of-copy code path requires that the rebase code see the commit here (which it doesn't). It's not literally running git patch-id, but it is the same code: it uses git rev-list with the symmetric difference notation and the various --cherry options.
