Jason Hunter

If I have a file like this:

foo bar bat hukarz foo bar bat

then I would like to be told that one region is identical to another region:

foo bar bat 

The reason is that I have some large text files containing identical regions, often repeated more than once, and I want to clean them up.

Lingo4G and the Carrot2 engine define this as Document Overlap and Pairwise Similarity: identifying identical text fragments in documents and returning information useful for visualizing such regions.

I was wondering whether Emacs has a mode that could use the Carrot2 engine to identify identical or similar regions in a buffer, as Carrot2 itself does:

[Image: Carrot2 engine identifying identical or similar regions in a file]
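
To make the request concrete, here is a minimal Python sketch of the kind of report I mean. It is not how Carrot2 or Lingo4G work, and it is not an Emacs mode; it simply slides a fixed-size window of words over the text and records every window that occurs more than once. The function name `repeated_regions` and the `min_words` parameter are made up for illustration.

```python
from collections import defaultdict


def repeated_regions(text, min_words=3):
    """Map each run of `min_words` consecutive words that occurs more
    than once to the list of word offsets where it starts.

    Hypothetical helper for illustration only: fixed-size windows,
    exact matches, whitespace-separated words.
    """
    words = text.split()
    seen = defaultdict(list)
    # Slide a window of min_words words across the text.
    for i in range(len(words) - min_words + 1):
        phrase = " ".join(words[i:i + min_words])
        seen[phrase].append(i)
    # Keep only the phrases that were seen at more than one offset.
    return {p: starts for p, starts in seen.items() if len(starts) > 1}


sample = "foo bar bat hukarz foo bar bat"
print(repeated_regions(sample))  # {'foo bar bat': [0, 4]}
```

For the sample file above, this reports that `foo bar bat` starts at word 0 and again at word 4. A real tool would also need to merge overlapping windows into longer regions and handle near-identical text, which this sketch does not attempt.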
