Questions tagged [text-processing]
55 questions
-1 votes
3 answers
440 views
Is there a text distance (or string similarity) algorithm which accounts for the distance between characters?
I'm interested in finding a text distance (or string similarity) algorithm which computes a greater distance (or lower similarity) when characters are further apart. For example, I want the distance ...
-3 votes
1 answer
164 views
Applying a file diff to a new file [closed]
Suppose I have files a.txt, b.txt and c.txt. a.txt: "Hello, I like cake." b.txt: "Hello, I like turtles." c.txt: "go away, I don't like you". I suspect the difference between a.txt and b.txt is ...
0 votes
1 answer
274 views
Convert RTF to HTML when it's saved to the database or when it's rendered?
Users have the ability to enter and save text in a rich text editor which is eventually stored in a database and then rendered on a site. Is it better to convert the RTF to HTML when it's stored to ...
3 votes
1 answer
263 views
Integrating TeX into a Java desktop application
Looking to integrate TeX equations in a TeX-agnostic fashion, suitable for either ConTeXt or LaTeX, into a Java-based desktop Markdown editor. The possibilities are numerous, but I'm not sure what ...
2 votes
0 answers
62 views
How to test generated text
I am creating a text generation algorithm for my master's research. I have a dialogue between two people and I would like to simulate one part of the conversation with naturally generated text (not ...
4 votes
2 answers
647 views
Database structure for word co-occurrence frequencies in a large corpus
I would like to store the frequencies with which words co-occur with each other over a variety of contexts in a large (> 1 billion tokens) text corpus. I need to store the word pair, the type of co-...
0 votes
1 answer
90 views
How to implement tracking of changes in text documents à la MS-Word/Apple Pages
I want to implement tracking of changes in plain-text documents, in a way similar to how it works in MS Word or Apple Pages. What I am unsure of is the data model and how to store it. Goal: The ...
0 votes
2 answers
13k views
Name and code to space between lines/paragraphs
I’m looking for the name of a technique, and possibly the code that would help me implement it in Python. I have been working on a text-based Python journaling application. When I want to review my ...