9

I want to get the orientation of a scanned document. I saw the post "Pytesseract OCR multiple config options" and tried to use --psm 0 to get the orientation.

target = pytesseract.image_to_string(text, lang='eng', boxes=False,
                                     config='--psm 0 tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz')

But I get an error:

FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/jy/np7p4twj4bx_k396hyc_bnxw0000gn/T/tess_dzgtpadd_out.txt' 

3 Answers

11

I found another way to get the orientation using pytesseract:

print(pytesseract.image_to_osd(Image.open(file_name))) 

This is the output:

Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 21.27
Script: Latin
Script confidence: 4.14

2 Comments

Can it detect the script or the font? What if the document contains different fonts?
This is a good solution, but I found that it's not very accurate. In a small experiment on 9 rotated (right, left, down) PNG document pages, it detected the rotation correctly on only 6.
8

Instead of writing a regex to extract values from the output string, pass output_type=Output.DICT to get the result as a dict.

import cv2
import pytesseract
from pytesseract import Output

# imPath is the path to the image file
im = cv2.imread(str(imPath), cv2.IMREAD_COLOR)
newdata = pytesseract.image_to_osd(im, output_type=Output.DICT)

The sample output looks as follows. Use the dict keys to access the values:

{ 'page_num': 0, 'orientation': 90, 'rotate': 270, 'orientation_conf': 1.2, 'script': 'Latin', 'script_conf': 1.11 } 
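Since the sample above reports a very low orientation_conf, it may be worth checking the confidence before acting on the rotate value. The following is only a minimal sketch: the helper name rotation_to_apply and the threshold of 2.0 are hypothetical choices for illustration, not part of pytesseract, and the threshold would need tuning on your own documents.

```python
def rotation_to_apply(osd, min_conf=2.0):
    """Return the 'rotate' angle from an OSD dict, or 0 if confidence is too low.

    `osd` is assumed to have the keys shown in the sample output above.
    `min_conf` is an arbitrary, hypothetical threshold.
    """
    if osd.get('orientation_conf', 0.0) < min_conf:
        return 0  # not confident enough, leave the image as it is
    return osd.get('rotate', 0)


# The sample dict from this answer: confidence 1.2 is below the threshold.
low = {'page_num': 0, 'orientation': 90, 'rotate': 270,
       'orientation_conf': 1.2, 'script': 'Latin', 'script_conf': 1.11}
# Values taken from the text output in the first answer.
high = {'page_num': 0, 'orientation': 270, 'rotate': 90,
        'orientation_conf': 21.27, 'script': 'Latin', 'script_conf': 4.14}

print(rotation_to_apply(low))
print(rotation_to_apply(high))
```

The returned angle can then be passed to whatever rotation routine you use. The rotation direction convention differs between libraries, so it is worth verifying on a page whose orientation you already know.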

3

@lads has already mentioned the method that finds the orientation. I have just used re to extract how many degrees the image needs to be rotated.

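import re
import cv2
import pytesseract

imPath = 'path_to_image'
im = cv2.imread(str(imPath), cv2.IMREAD_COLOR)
newdata = pytesseract.image_to_osd(im)
# Raw string so that \d is not treated as a string escape
re.search(r'(?<=Rotate: )\d+', newdata).group(0)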
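If you need more than the Rotate value, the same idea can be extended to parse every field of the text output. This is a rough sketch that assumes the output consists of "Key: value" lines like the sample in the first answer. The helper parse_osd is a hypothetical name, and the sample text is copied from that answer so the snippet runs without Tesseract installed.

```python
import re

# Sample text copied from the image_to_osd output shown in the first answer.
SAMPLE_OSD = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 21.27
Script: Latin
Script confidence: 4.14
"""


def parse_osd(osd_text):
    """Parse 'Key: value' lines into a dict (hypothetical helper)."""
    result = {}
    for line in osd_text.splitlines():
        match = re.match(r'\s*([^:]+):\s*(.+?)\s*$', line)
        if not match:
            continue  # skip blank or unexpected lines
        key, value = match.groups()
        if re.fullmatch(r'-?\d+', value):
            result[key] = int(value)
        elif re.fullmatch(r'-?\d+\.\d+', value):
            result[key] = float(value)
        else:
            result[key] = value  # e.g. the script name
    return result


info = parse_osd(SAMPLE_OSD)
print(info['Rotate'], info['Orientation confidence'])
```

The keys here are the labels exactly as they appear in the text output, which differ from the shorter keys returned with Output.DICT.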
