replaced http://apple.stackexchange.com/ with https://apple.stackexchange.com/

edited Apr 13, 2017 at 12:45

Stack Overflow has related questions under PDF-parsing covering tools such as PDFBox and Apache Tika, which uses PDFBox for PDF parsing. The Ruby examples linked below extract text from a PDF. This kind of code only works robustly when the source has high enough resolution, so get a scanner with sufficient resolution first and then check whether any of the software works on your scans.

Examples

https://github.com/yob/pdf-reader/tree/master/examples
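To give an idea of what such a script looks like, here is a minimal sketch that pulls the text out of a PDF and reports whether there is any text layer at all. It relies on `PDF::Reader.new`, `#pages` and `Page#text` from the pdf-reader gem; the helper names `extract_text` and `text_layer?` and the 20-character threshold are my own assumptions for illustration, not something taken from the linked examples.

```ruby
# Sketch: decide whether a PDF already carries a text layer.
# Works with any object that responds to #pages, where each page
# responds to #text (PDF::Reader from the pdf-reader gem does).

MIN_CHARS = 20 # assumed threshold, tune it for your documents

# Returns the text of every page, joined by newlines.
def extract_text(reader)
  reader.pages.map { |page| page.text.to_s }.join("\n")
end

# True when the extracted text contains at least min_chars
# non-whitespace characters.
def text_layer?(reader, min_chars: MIN_CHARS)
  extract_text(reader).gsub(/\s+/, "").length >= min_chars
end

# Only runs when called as: ruby this_script.rb some.pdf
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  require "pdf-reader" # gem install pdf-reader
  reader = PDF::Reader.new(ARGV[0])
  if text_layer?(reader)
    puts extract_text(reader)
  else
    puts "No text layer found; run OCR first."
  end
end
```

A scanned PDF without OCR will typically come back empty here, which is a quick way to sort files that still need an OCR pass from those that do not.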

SO threads

[Edit]

I am not sure whether I understand your problem now. Do you want to add an OCR layer to different kinds of material, such as random photos, screenshots, PDFs without an OCR layer and so on? I don't know the solution, but I am sure someone does, so I asked a specific question about how to do it with Automator and some OCR software:

Automator-script with an OCR-software to automatically add OCR to material?

answered Mar 10, 2013 at 18:57

hhh
