I merged the develop branch into my feature branch, which resulted in merge conflicts. After resolving them, I committed and pushed. The problem now is that the merge and the conflict-resolution changes are in one commit, so it is hard to see what was done to resolve the conflicts. How can I get two separate commits, one for the merge and another for the conflict fixes, when there are merge conflicts?

  • Does this answer your question? A separate commit for conflict resolution with git merge Commented Apr 23, 2020 at 9:32
  • stackoverflow.com/questions/18860044/… Commented Apr 23, 2020 at 9:32
  • Unfortunately, the linked duplicate here has no accepted answer (and the one answer showing a step by step use of mergetool doesn't commit the conflict-marked files, it commits a no-op merge that preserves the original HEAD files instead). But I think this is a bad idea overall, myself. If you want to get the merge conflicts, just do the merge again. Commented Apr 23, 2020 at 19:08
  • There is no need to have a separate record of the conflicts, you can regenerate them any time you want by rerunning the merge. Commented Jun 11 at 4:25

1 Answer

If you really want to do this, you can—well, mostly. Git makes it quite difficult and I do not think it is a good idea. There are some conflicts you cannot capture this way.

I'll provide an outline for how to capture what you can, but not actual code for it. Instead, I will describe what the setup is, and what will go wrong.

Long

The issue here is this:

  • Git builds new commits from files that appear in Git's index (aka staging area).
  • Merge conflicts, with conflict markers, appear only in your work-tree.

The part that makes all of this make sense—because the above doesn't, unless and until you know this other part—is that when you're not in the middle of a conflicted merge, there are three active copies of each file.

Remember that commits act as snapshots: they have a complete copy of every file. But the snapshot copy of any given file inside a commit is stored in a special, read-only, Git-only format. It literally can't be changed and no programs other than Git can use it. Hence, when you use git checkout or git switch to select some particular commit to see and work on/with, Git must copy the files out of the commit, to a work area: your working tree or work-tree. These files are ordinary everyday files. The committed files are still there, in the current commit, so this provides two copies of each file:

  • There is the frozen one in the current commit: HEAD:README.md for instance. Run git show HEAD:path to see it.

  • And, there is the normal everyday file in README.md: use whatever viewer you like to see it, and whatever editor you like to change it.

But in between these two, Git keeps a third copy[1] of the file. This copy is in Git's index, which Git also calls its staging area. This copy is in the frozen format, but unlike a committed copy, you can replace it, wholesale, with a new copy. That's what git add does: it takes the work-tree copy, compresses it into the special Git format, and puts that copy into Git's index, ready to be committed.

  • To see the index copy, run git show :path, e.g., git show :README.md.

Normally the index copy will match either the HEAD copy (because you just checked out a commit, or just committed) or the work-tree copy (because you just git add-ed a file), or will match both other copies (git status says nothing to commit, working tree clean). But it's possible to:

  • check out some commit (all three match)
  • modify the work-tree file (HEAD and index match, work-tree doesn't)
  • git add the modified file (HEAD doesn't match, index and work-tree do)
  • modify the file some more

and now all three copies are different. There is nothing fundamentally wrong here: that's how Git works, and git add -p and git reset -p are designed to let you manipulate this kind of situation deliberately. They work by copying the index copy of the file out to a temporary file, and then let you patch this temporary file, one diff-hunk at a time, and copy it back into the index copy.
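The four steps above can be reproduced in a throwaway repository. This is only a minimal sketch: the temporary directory, the user identity, and the file contents are made up for illustration, and nothing here touches an existing repository.

```shell
#!/bin/sh
set -e

# Scratch repository; the path and identity are placeholders for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "version 1" > README.md
git add README.md
git commit -q -m "initial"      # HEAD, index, and work-tree all match

echo "version 2" > README.md
git add README.md               # index and work-tree match, HEAD differs

echo "version 3" > README.md    # now all three copies differ

git show HEAD:README.md         # committed copy: version 1
git show :README.md             # index copy:     version 2
cat README.md                   # work-tree copy: version 3
```

Running git status at this point lists README.md twice, once under "Changes to be committed" (HEAD vs. index) and once under "Changes not staged for commit" (index vs. work-tree).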

In any case, this is the normal setup, when you're not in a conflicted merge:

  • HEAD represents the current commit, and the current commit has a copy of every file that you can't change. You can change which commit is the current commit (by checking out some other commit), but you cannot change the files stored inside these commits. There's easy access to the HEAD copy of a committed file, and git status and git diff and so on will look at these copies.

  • The index stores a copy of every file. You can change these copies. Normally you do that by copying the work-tree file, as it is right now, into the index copy, using git add. Or, you change it by copying the HEAD copy back into the index, using git reset.

  • The work-tree stores a copy of every file. This copy is yours: Git only overwrites it when you tell Git to overwrite it. Git doesn't use it when you git commit: Git uses the copies that are in Git's index.

But, when you enter into the conflicted-merge state, the index has been expanded. Instead of just containing one copy of a conflicted file, it now contains three. Now, things get tricky.


[1] Technically, the index holds references, rather than actual copies, but the effect is the same, unless you start using git ls-files --stage and git update-index to delve into the low level details.


Merges with conflicts

As you've found, when you run:

    git checkout somebranch
    git merge other

sometimes Git is able to do the merge and finish on its own, and sometimes it gets some of the merge done, but spits out some CONFLICT messages and stops in the middle of the merge.

There are actually two different kinds of conflicts, which I like to call high level and low level. The ones most people encounter first, because they are the most common, are low-level conflicts. They're generated in Git's ll-merge.c code, where ll stands for "low level", hence the name.

Merging, in Git, uses a pretty standard three-way merge algorithm. Git actually uses a recursive variant as the default; you can disable it using git merge -s resolve, but there is rarely any reason to do that. Any three-way merge needs three input files: a common (shared) merge base version, a left-side or local or --ours version, and a right-side or remote or --theirs version. The merge simply compares the base to both left and right. This produces a set of changes to be made. The merge combines the changes: if the left side fixes the spelling of a word on line 42, take that change; if the right side deletes line 79, take that change too.

Conflicts—or more specifically, low level conflicts—occur when the left and right side try to make different changes to the same region of a single file. Here Git simply does not know whether to take the left side change, the right side change, both, or neither. So it stops the merge with a conflict (after going on to merge whatever else it can merge on its own).
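To see the low-level three-way merge in isolation, git merge-file can be run directly on three plain files. The sketch below uses invented file names and contents: first the two sides change different regions and the merge combines them, then they change the same line and a conflict results.

```shell
#!/bin/sh
set -e

# Scratch directory; all file names and contents are made up for this demo.
work=$(mktemp -d)
cd "$work"

printf '%s\n' one two three four five six seven > base.txt
printf '%s\n' one TWO three four five six seven > ours.txt    # left side edits line 2
printf '%s\n' one two three four five seven     > theirs.txt  # right side deletes line 6

# Argument order is: current file, base file, other file.
# -p prints the result to stdout instead of overwriting ours.txt.
git merge-file -p ours.txt base.txt theirs.txt > merged.txt
cat merged.txt            # both changes are taken: TWO is kept, six is gone

# Now the right side edits the same line the left side edited.
printf '%s\n' one 2 three four five six seven > theirs2.txt

# merge-file exits non-zero when conflicts remain, so keep `set -e` from aborting.
git merge-file -p ours.txt base.txt theirs2.txt > conflicted.txt || true
cat conflicted.txt        # contains <<<<<<< / ======= / >>>>>>> markers around line 2
```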

High level conflicts occur when there are whole-file changes. That is, the left side change might include the direction: rename README.md to README.rst. If the right side didn't rename README.md, or did rename it but to README.rst too, that's OK. But what if the right side says rename README.md to README.html: how should Git combine these changes?

Again, Git just gives up and declares a conflict. This time, though, it's a high level conflict.

In both cases, what Git does in Git's index is simple: it just keeps all the copies. To be able to tell the three different README.md files apart—assuming no complicated rename conflicts—it simply numbers the files in the index:

  • git show :1:README.md shows you the merge base version;
  • git show :2:README.md shows you the --ours version; and
  • git show :3:README.md shows you the --theirs version.

Git writes out a new README.md work-tree copy with conflict markers, but the original three inputs are still there, in the index. Your job, as the person completing the merge, is not necessarily to fix up the work-tree copy. Git doesn't need that copy: that one is for you. Git needs the final version, in Git's index.

The index numbers above are slot numbers, and the final copy goes into slot zero, which erases the other three slots. Your job is to come up with the correct README.md and put that in slot zero.

One easy way to do this is to edit the work-tree README.md, complete with its conflict markers, until you have the correct merged result. Then, you write this file back to the work-tree and run git add README.md. That copies README.md from your work-tree into the index, as usual: the copy goes into slot zero, erasing the other three slots.

The existence of the other three slot entries—the :1:README.md, :2:README.md, and/or :3:README.md—is what marks the file as conflicted. Now that they're all gone, the file is no longer conflicted.
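The slot behavior can be watched directly with git ls-files --stage. The following is a sketch in a scratch repository; the branch names main and other, the identity, and the file contents are assumptions for the demo only.

```shell
#!/bin/sh
set -e

# Scratch repository; names and contents are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > README.md
git add README.md
git commit -q -m "base"
git branch -M main

git checkout -q -b other
echo "hello from other" > README.md
git commit -q -am "other edit"

git checkout -q main
echo "hello from main" > README.md
git commit -q -am "main edit"

# Both sides changed the same line, so the merge stops with a conflict
# and exits non-zero.
git merge other >/dev/null 2>&1 || true

git ls-files --stage README.md   # three entries, in slots 1, 2 and 3
git show :1:README.md            # merge base: hello
git show :2:README.md            # ours:       hello from main
git show :3:README.md            # theirs:     hello from other

# Resolve: write the wanted content and git add it into slot zero.
echo "hello from both" > README.md
git add README.md
git ls-files --stage README.md   # a single entry, in slot 0
```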

You can use any procedure you like to put the correct file into slot zero. That's all Git really cares about: that the correct file go into slot zero, and the other three slots get removed. A fancy tool, as invoked by git mergetool, might be convenient for you, but in the end, it works by copying the final result into slot zero and erasing the other slots. Git does not care about your work-tree file at all; Git just needs its index fixed-up.

When you get a high level conflict, such as a rename/rename conflict or a modify/delete conflict, Git records this in Git's index too—but this time, it's recorded by the fact that there are some slots that aren't occupied. Remember, the slots go with the file source: merge base = slot 1, ours = 2, theirs = 3. So if the merge base had README.md, we have README.rst, and they have README.html, what you end up with is:

  • :1:README.md exists, but :2: and :3: don't
  • :2:README.rst exists, but :1: and :3: don't
  • :3:README.html exists, but :1: and :2: don't

Your job is to remove all three of these and put something in some slot zero. It doesn't have to be named README.md or README.rst or whatever: perhaps you might create a slot zero file named README.who-knows.
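A rename/rename conflict of this shape can be set up in a scratch repository to inspect the partly filled slots. This is a sketch with made-up contents and branch names; keeping README.rst at the end is an arbitrary choice for the demo, not a recommendation.

```shell
#!/bin/sh
set -e

# Scratch repository; names and contents are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line %s\n' 1 2 3 4 5 > README.md
git add README.md
git commit -q -m "base"
git branch -M main

git checkout -q -b other
git mv README.md README.html
git commit -q -m "rename to html"

git checkout -q main
git mv README.md README.rst
git commit -q -m "rename to rst"

# Each side renamed the same file differently: a high level conflict.
git merge other >/dev/null 2>&1 || true

# One entry per name, each in a different slot (stage:path, sorted by path).
slots=$(git ls-files --stage | awk '{print $3 ":" $4}' | tr '\n' ' ')
echo "$slots"

# Resolve by clearing the unwanted names and adding the one to keep.
git rm -q README.md README.html
git add README.rst
git ls-files --stage             # only README.rst remains, in slot 0
```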

Your new merge commit, when you make it, will consist of whatever files are in slot zero. You cannot make a commit until all higher-numbered staging slots have been cleared out. So you must resolve each conflicted file yourself: only then can you run git merge --continue or git commit to make the final merge commit result.

You can simply run git add on all the conflicted files. If the work-tree has a low-level-conflicted README.md in it, with conflict markers, that copies the work-tree version into index slot zero and erases the other three slots. If that was the only conflict, you're now good to commit. The problem is that you've lost all three input files: you'll have to re-merge later, and resolve the conflicts. But you can just use git add on each file, and then commit.

This doesn't work very well with high-level conflicts: if there is a rename/rename conflict, which name should you use? If there is a modify/delete conflict, do you keep the modified file, or do you keep the deletion?

Whatever you choose here, you have resolved that conflict. The merge commit will store, as its new snapshot, whatever you put in the slot zero index entries.

If you have committed the conflict-marked files and later want the original conflicts back, the only ways to get them are to re-perform the merge or to have saved the merge-conflict data (input files and/or index) beforehand. It's not clear which of these is easier: both have lots of potential issues. The one with the fewest traps, I think, is to use git merge-file, which runs a low-level merge on three input files.

Conclusion

So you could, for each low-level conflicted file:

  1. Extract the three copies of the file somewhere. (Note: git checkout-index has options to do this. This is how git mergetool provides the three copies to your merge tool.)
  2. git add the conflicted file from the work-tree, to resolve the conflict, taking the marked-up version as the correct resolution.
  3. Run git merge --continue to commit the merge.
  4. Use git merge-file on the files saved in step 1, to re-create the conflicts.
  5. Resolve the conflicts by hand.
  6. git add the resulting files, to copy them to Git's index.
  7. Make a new commit.
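For a single low-level conflict, the seven steps might look roughly like the sketch below. It is an illustration under several assumptions, not a tested workflow: it runs in a scratch repository with invented names, uses git show instead of git checkout-index to save the three inputs, uses git commit --no-edit in place of git merge --continue so that no editor opens, and simulates the hand resolution with a plain echo.

```shell
#!/bin/sh
set -e

# Scratch repository with one conflicting file; all names are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > README.md
git add README.md
git commit -q -m "base"
git branch -M main
git checkout -q -b other
echo "hello from other" > README.md
git commit -q -am "other edit"
git checkout -q main
echo "hello from main" > README.md
git commit -q -am "main edit"
git merge other >/dev/null 2>&1 || true    # stops with a conflict

# Step 1: save the three inputs somewhere outside the repository.
saved=$(mktemp -d)
git show :1:README.md > "$saved/base"
git show :2:README.md > "$saved/ours"
git show :3:README.md > "$saved/theirs"

# Steps 2-3: commit the merge with the conflict markers left in place.
git add README.md
git commit -q --no-edit

# Step 4: re-create the conflict from the saved inputs. The marker labels
# will be the saved file paths rather than branch names.
git merge-file -p "$saved/ours" "$saved/base" "$saved/theirs" > README.md || true

# Step 5: resolve by hand (simulated here).
echo "hello from both" > README.md

# Steps 6-7: record the resolution as its own commit.
git add README.md
git commit -q -m "Resolve merge conflicts in README.md"

git log --oneline
```

Note that the merge commit in this history contains a README.md full of conflict markers, which is exactly the trap for other tools and people described in the next paragraph.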

This is a lot of work to do something that Git doesn't do, and it doesn't handle high-level conflicts well. Other Git tools will assume that the commit contains the correct resolution, so you're laying traps for other people who will assume the tool knows the right thing. And it's not clear to me, at least, why you want to do this—why anyone would want to do this—when you can find the same conflicts later by running:

    git checkout <hash>
    git merge <hash>

where the two hash values are the hash IDs for the two commits that somebranch and other identified at the time you ran the original git merge command. Those two hash values are easy to find from the merge commit itself: they are its first and second parent respectively. Hence if $M contains the merge hash ID:

git rev-parse $M^1 $M^2 

shows you the two hash IDs you need to repeat the merge to re-obtain the conflicts. The only thing missing here is any options you supplied to the git merge command. Git does not save them (I think it should)—but you can manually save them in your log message, if nothing else.
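Put together, re-obtaining the conflicts from an existing merge commit could look like this sketch. The scratch repository, its contents, and the variable names first, second, and conflicts are assumptions for illustration; it also assumes the original merge used no special options and that rerere is not enabled.

```shell
#!/bin/sh
set -e

# Scratch repository containing an already-resolved conflicted merge.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > README.md
git add README.md
git commit -q -m "base"
git branch -M main
git checkout -q -b other
echo "hello from other" > README.md
git commit -q -am "other edit"
git checkout -q main
echo "hello from main" > README.md
git commit -q -am "main edit"
git merge other >/dev/null 2>&1 || true
echo "hello from both" > README.md
git add README.md
git commit -q --no-edit                    # the resolved merge commit

# Later: recover the conflicts from the merge commit alone.
M=$(git rev-parse HEAD)
first=$(git rev-parse "$M^1")
second=$(git rev-parse "$M^2")

git checkout -q "$first"                   # detached HEAD at the first parent
git merge "$second" >/dev/null 2>&1 || true

# The conflicted paths are back, with markers in the work-tree.
conflicts=$(git diff --name-only --diff-filter=U)
echo "$conflicts"

# Throw the repeated merge away and return to the branch.
git merge --abort
git checkout -q main
```

Comparing the re-created conflicted files with the files in $M (for instance with git diff $M while the repeated merge is still in progress) is one way to see what the resolution changed.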


3 Comments

As for why someone might want to do this: I'm in a situation that has me contemplating this issue now. In short, I'm dealing with a very large merge that touches many areas of the code, with hundreds of conflicts which no single developer is equipped to handle alone. Yet it seems that git is set up to require any conflicts be handled in a single commit by a single developer. I am trying to find a way to collaborate with others on such a monster merge. It occurred to me to do this with a series of incremental commits but it seems I would have to fight the tool to do so. Any ideas?
@Joseph: Git badly needs some kind of collaborative merge tool-set, but it just doesn't have any. Someone will have to write one (or several).
Disappointing, but what I feared. I have expanded on this, if you care to take a crack at it: stackoverflow.com/questions/73658025/…
