29

Every time I see a conflict on something like imports or method signature changes (e.g. renames of variables) in my SCM I wonder if there is something like a language aware diff/merge method that can handle the more annoying small changes that can happen on a shared project. Is there anything out there that handles conflicts more smoothly, working in a Unix environment?

4
  • 3
    Good idea. Sounds like the concept for your next open source project :) Commented Nov 29, 2009 at 22:16
  • Well, the "low hanging fruit" cases are so easy that I still believe somebody must have thought about this before I started this question. Commented Nov 30, 2009 at 19:44
  • Seems to be a dup for stackoverflow.com/questions/523307/semantic-diff-utilities Commented Nov 30, 2009 at 19:50
  • agreed, this should probably be closed. and agreed, I've always wondered why merges couldn't be made smarter in this way. Commented Dec 1, 2009 at 21:47

6 Answers

6

I agree that it would be awesome if such a tool existed, but there are none that I'm aware of. I believe the reason is that the merge algorithm of each SCM (whether it is git, hg, bzr, svn, etc.) works on the lowest common denominator, which is simply plain text. For these SCM tools to really understand the language syntax and semantics, they would have to be able to parse the language. It seems that this is simply too big a task for any SCM: it would need to parse Java, C#, Python, Ruby, Groovy, C, C++, etc., not to mention that each of these languages has different syntax between versions (e.g. Java generics did not exist until 1.5). So the SCM would have to detect, or be configured to know, which language and which version of that language the source code is written in.

I think it is more likely that any language-dependent merge feature would be found in a 3rd party merge tool (e.g. the merge > tool setting in .gitconfig and the ui > merge setting in .hgrc). Such a tool could be configured to know that any .java files in your project are written in Java 1.6, and could then use the parsing features in the JDK to generate the AST and perform some "deep" analysis of whether the change was meaningful in the context of that language.


1 Comment

Yes, that's what I mean by "merge command". But the question is still if there is something like that.
2

I'm looking for the exact same thing. The merge tool vendors should probably address this sort of semantic, language-aware merge. If not, I'll have to become one :)

For now, as a poor man's trick, I sometimes preprocess the 3 files (base, ours, theirs) to their 'canonical form' by feeding them through Eclipse's Code Cleanup/Organize Imports/Order Members.

Although limited, this works nicely: last time it reduced the number of conflicts from ~200 to 2. I am planning to wrap this into a script and plug it into git's merge tool.
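A rough sketch of how such a wrapper could look as a Git merge driver. Everything named here is an assumption for illustration: the script name `canonical_merge.py`, the driver name `canonical`, and above all `canonicalize`, which only sorts and de-duplicates import lines as a stand-in for the Eclipse clean-up (which cannot be called this simply). It relies on Git passing the base, current and other versions via `%O %A %B`, and on `git merge-file` writing its result into the first file it is given.

```python
"""Hypothetical merge driver. Assumed registration (names are made up):

  .gitconfig:      [merge "canonical"]
                       driver = python3 canonical_merge.py %O %A %B
  .gitattributes:  *.java merge=canonical
"""
import subprocess
import sys


def canonicalize(java_source):
    """Stand-in for an IDE clean-up: sort and de-duplicate the import block."""
    lines = java_source.splitlines()
    import_idx = [i for i, line in enumerate(lines) if line.startswith("import ")]
    if not import_idx:
        return java_source
    first, last = import_idx[0], import_idx[-1]
    block = lines[first:last + 1]
    # Be conservative: if anything other than imports and blank lines sits
    # inside the block (e.g. a comment), leave the file untouched.
    if any(line.strip() and not line.startswith("import ") for line in block):
        return java_source
    imports = sorted(set(line for line in block if line.strip()))
    return "\n".join(lines[:first] + imports + lines[last + 1:]) + "\n"


def merge_driver(base_path, ours_path, theirs_path):
    """Normalise all three versions, then let git do the textual merge."""
    for path in (base_path, ours_path, theirs_path):
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(canonicalize(text))
    # git merge-file stores the result in its first argument (%A here) and
    # exits with the number of conflicts, so 0 means a clean merge.
    result = subprocess.run(["git", "merge-file", ours_path, base_path, theirs_path])
    return result.returncode


if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(merge_driver(sys.argv[1], sys.argv[2], sys.argv[3]))
```

The point of normalising first is that the textual merge then only sees differences that survive the canonical form, which is where the drop in conflicts comes from.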

I have also written a script to auto-resolve Java import conflicts. It simply keeps both sides of the imports and adds comments explaining what's going on and what to do next: 'organise imports'.
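The original script is not shown, so the following is only a guess at the idea, not the author's code. It assumes Git's default two-sided conflict markers (not the diff3 style) and treats any line starting with `import ` as an import. A conflict is resolved only when both sides consist purely of imports; everything else is left for a human.

```python
import re

# One conflict hunk in Git's default style:
#   <<<<<<< ours-label / ours / ======= / theirs / >>>>>>> theirs-label
CONFLICT = re.compile(
    r"^<<<<<<< [^\n]*\n(.*?)^=======\n(.*?)^>>>>>>> [^\n]*\n?",
    re.DOTALL | re.MULTILINE,
)

TODO_COMMENT = "// TODO: organise imports"  # hypothetical wording


def _only_imports(text):
    """True if every non-blank line of the hunk side is an import."""
    return all(
        line.lstrip().startswith("import ")
        for line in text.splitlines()
        if line.strip()
    )


def resolve_import_conflicts(source):
    """Keep both sides of import-only conflicts; leave other conflicts alone."""

    def keep_both(match):
        ours, theirs = match.group(1), match.group(2)
        if not (_only_imports(ours) and _only_imports(theirs)):
            return match.group(0)
        candidates = [l.strip() for l in (ours + theirs).splitlines() if l.strip()]
        merged = list(dict.fromkeys(candidates))  # de-duplicate, keep order
        if not merged:
            return match.group(0)
        return "\n".join([TODO_COMMENT] + merged) + "\n"

    return CONFLICT.sub(keep_both, source)
```

Unused or ambiguous imports (say, both `java.util.List` and `java.awt.List`) still need a pass with the IDE afterwards, which is what the inserted comment is a reminder of.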

2 Comments

Concerning Java import conflicts, I'd suggest dropping the conflicting import section (or even all imports) and letting the IDE re-insert them. You obviously need to agree on a common import ordering for this to work. Sometimes the imports are not unique (e.g., java.util.List vs. java.awt.List), but the cases without an obvious solution are pretty rare. Actually, there's a better solution: remove only the conflict markers, leave any duplicated imports in place, and finally let the IDE clean it up.
Yep, that's what I did (referring to "Have also written script autoresolve java import conflicts"). Thanks for clarifying anyway
2

Mergiraf offers syntax-aware merging for a variety of languages, and typically handles conflicts of Java imports for instance.

Git can be configured to delegate merging of specific files to it, by registering it as a merge driver. The tool can also be invoked manually after encountering a conflict.

For diffing, there also exist syntax-aware tools that can be configured with Git, such as difftastic.


0

To make it easier for anyone landing on this page: this question is a dupe of http://stackoverflow.com/questions/523307/semantic-diff-utilities (this is mentioned in the comments on the question, but it's not obvious).

The current tool I am aware of (the answer to the question above) is SemanticMerge - https://www.semanticmerge.com

There is also https://www.devart.com/codecompare which is close to what you want


-1

You might want to look into having everyone on your team share the same IDE settings for things like order of imports, formatting, etc., to avoid conflicts like this from occurring in the first place.

2 Comments

This doesn't actually solve the problem. Consider for example some Java code "import a; import e;". Suppose I add "import b;" and you add "import c;", both in proper alphabetical order. When it comes time to merge, we will get a merge conflict. If we agree to put imports in alphabetical order, then there is no ambiguity about what the right merge is, but the tools generate merge conflicts because they aren't aware of coding conventions.
The most common merge failure seems to be that any addition of code conflicts with any other addition of code at the same spot. There should be a general solution for this "add all the things!" style of merging that works well enough in all syntaxes (default to including all insertions when the merge happens in a certain block or range).
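To make the "no ambiguity" point concrete, here is a small sketch of a three-way merge that treats the import section as a set rather than as text. The function name `merge_imports` is made up, and plain alphabetical ordering is assumed as the team convention (real Java conventions often group static imports or packages differently).

```python
def merge_imports(base, ours, theirs):
    """Three-way merge of import lines, treated as an unordered set.

    An import survives if either side has it, unless one side deliberately
    removed it relative to the common base.
    """
    base_set, ours_set, theirs_set = set(base), set(ours), set(theirs)
    removed = (base_set - ours_set) | (base_set - theirs_set)
    merged = (ours_set | theirs_set) - removed
    # Assumed convention: alphabetical order decides the final layout.
    return sorted(merged)


base = ["import a;", "import e;"]
ours = ["import a;", "import b;", "import e;"]    # I add b
theirs = ["import a;", "import c;", "import e;"]  # you add c

print(merge_imports(base, ours, theirs))
# ['import a;', 'import b;', 'import c;', 'import e;']
```

A line-based merge sees two different insertions at the same position and gives up; once the convention is encoded, the same inputs have exactly one sensible result. Git's built-in `union` merge driver (`merge=union` in `.gitattributes`) is a cruder take on the "include all insertions" idea: it keeps lines from both versions instead of leaving conflict markers, with no awareness of syntax or ordering.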
-1

Doesn't git rebase solve this problem? Any variable renames will be accounted for in the associated commits, and git rebase lets you stay in sync with upstream commits. As long as you rebase frequently (daily-ish?) you shouldn't be getting stupid conflicts like that, and if you are, they are probably real conflicts and not solvable by a Java grammar parser.

1 Comment

Rebase will update any pre-existing mentions of renamed variables, but not any new mentions added by commits in the topic branch. If there's been substantial renaming, that can cover quite a lot.
