If I have a file like this:

    foo bar bat hukarz foo bar bat

then I would like to be made aware that one region is identical to another region:

    foo bar bat

The reason is that I have some large text files containing identical regions, often repeated more than once. I want to clean them up.
Lingo4G and the Carrot2 engine define this as Document Overlap and Pairwise Similarity: how to identify identical text fragments in documents and return information useful for visualizing such regions.
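To make the goal concrete, here is a minimal Python sketch of the kind of report I have in mind. It is not how Carrot2 or Lingo4G work; the function name find_repeated_regions and the min_words parameter are made up for illustration. It slides a fixed-size window of words over the text and reports every window that occurs more than once, together with the word offsets where it starts.

```python
from collections import defaultdict


def find_repeated_regions(text, min_words=3):
    """Map each run of `min_words` consecutive words that occurs more
    than once to the list of word offsets where it starts."""
    words = text.split()
    positions = defaultdict(list)
    # Record the starting offset of every window of `min_words` words.
    for i in range(len(words) - min_words + 1):
        phrase = " ".join(words[i:i + min_words])
        positions[phrase].append(i)
    # Keep only the windows seen at least twice.
    return {p: idx for p, idx in positions.items() if len(idx) > 1}


sample = "foo bar bat hukarz foo bar bat"
print(find_repeated_regions(sample))
# {'foo bar bat': [0, 4]}
```

Because the window size is fixed, a longer repeated region shows up as several overlapping windows rather than as one merged region, and whitespace differences are ignored since the text is split into words first.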
I was thinking Emacs might have a mode that could use the Carrot2 engine to identify identical or similar regions in a buffer, the way Carrot2 itself does.

git-highlight. If it's a one-off situation, I'd recommend just using diffr, as in diff -u file1 file2 | diffr. Otherwise (unless there's a mode that solves this) you'd need to dig the highlighting functionality out of Emacs and write a minor mode around it.

cat myfile | sort > myfile_sorted && … (potentially with the -u parameter), and there's not much point in writing some complicated algorithm around it.
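For whole-line duplicates, the sort-based idea boils down to grouping identical lines and counting them. The full command in the comment is cut off, so the following Python sketch is only my assumption about the intent, and duplicate_lines is a hypothetical helper name. It counts each non-blank line and keeps those that appear more than once, without needing to sort the file first.

```python
from collections import Counter


def duplicate_lines(lines):
    """Return {line: count} for every non-blank line seen more than once."""
    # Strip the trailing newline so "foo\n" and a final "foo" compare equal.
    counts = Counter(line.rstrip("\n") for line in lines)
    return {line: n for line, n in counts.items() if n > 1 and line.strip()}


sample = ["foo bar bat\n", "hukarz\n", "foo bar bat\n"]
print(duplicate_lines(sample))
# {'foo bar bat': 2}
```

To run it on a real file, pass an open file object, for example duplicate_lines(open("myfile")), where myfile is the placeholder name from the comment above.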