If I have a file like this
foo bar bat hukarz foo bar bat , then I would like to be made aware that there is one region that is identical to another region
foo bar bat The reason is that I have have some large text files and I have identical regions, often more than one time. I want to clean them up.
Lingo4G and the Carrot2 engine defines this as Document Overlap and Pairwise Similarity, as in how to identify identical text fragments in documents and returning information useful for visualization of such regions.
I was thinking Emacs might have a mode where it could use the Carrot2 engine for identifying identical or similar regions in a buffer, like they do with Carrot2:
