
Ok, so I've been trying to preprocess my image in every way I can think of, but I cannot seem to find the right settings.

This is the image: [image showing the text "1 BB"]

As you can see, the picture is already about as simple as it gets, but Tesseract still cannot recognize '1 BB' in it. Any tips?

    import numpy as np
    import pytesseract
    from PIL import Image, ImageEnhance

    # img is assumed to be a NumPy array holding the cropped image
    img = Image.fromarray(img)
    imp_arr = np.asarray(img)
    # Binarize: values below 140 become 0, everything else becomes 255
    imp_arr = (np.floor(imp_arr / 140.0) * 255.0).astype('uint8')
    img = Image.fromarray(imp_arr, mode='L')

    # Rescale in several steps with different resampling filters
    width, height = img.size
    img = img.resize((width*3, height*3), Image.BICUBIC)
    width, height = img.size
    img = img.resize((width*2, height*2), Image.HAMMING)
    width, height = img.size
    img = img.resize((int(width*0.3), int(height*0.3)), Image.BICUBIC)

    # Adjust brightness, sharpness and contrast
    img = ImageEnhance.Brightness(img).enhance(0.7)
    img = ImageEnhance.Sharpness(img).enhance(2)
    img = ImageEnhance.Contrast(img).enhance(2)

    amount = pytesseract.image_to_string(img, config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')

This is just one example of what I've tried in order to get the correct text out as a string. Sometimes it works; other times it prints gibberish. The thing is, it needs to work every single time, especially for a picture as clear as this one. Is there a simple solution to this problem? Thank you in advance.

1 Answer


After installing Tesseract OCR, Pillow and pytesseract, I saved your image as igor.png and ran the following code, which I found in the docs of pytesseract:

    #!/usr/bin/env python
    from PIL import Image
    import pytesseract

    print(pytesseract.image_to_string(Image.open("igor.png")))

It prints the expected result:

1BB 

If I correct your initial code slightly by adding the letter B to tessedit_char_whitelist, it works as well.
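For illustration, the whitelist change could look roughly like the sketch below. The helper names `build_config` and `read_amount` are made up for this example, and the defaults simply mirror the flags from the question. Note that `--psm 10` tells Tesseract to treat the image as a single character, so for a short string such as "1 BB" it may be worth trying `--psm 7` (single text line) as well; whether that helps here is untested.

```python
def build_config(whitelist="0123456789B", psm=10, oem=3):
    """Build the Tesseract config string, now with 'B' in the whitelist."""
    return f"--psm {psm} --oem {oem} -c tessedit_char_whitelist={whitelist}"


def read_amount(image_path, config=None):
    """Run OCR on an image file (hypothetical helper for this example)."""
    # Imported lazily so build_config() works without the OCR packages.
    from PIL import Image
    import pytesseract

    text = pytesseract.image_to_string(
        Image.open(image_path), config=config or build_config()
    )
    return text.strip()


print(build_config())
```

With the image saved as igor.png, `read_amount("igor.png")` would run the same OCR call as in the question, only with the extended whitelist.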


4 Comments

Literally using the same code as you did, I get the result: "5)," This doesn't make any sense :S As for the whitelist, it doesn't do much, as the gibberish still comes through. Even with the whitelist on, I get the same weird result.
@IgorMarkovic That is probably due to a different version of Tesseract. A newer one detects more accurately.
I installed Tesseract with my package manager and it gave me 3.04.01.
Thanks to both of you, I've found a way to make it work. I had v5 alpha, so I tried v4, but that didn't work either. v3.05.01, however, worked like a charm! So in my case earlier versions work and newer ones are not working as they're supposed to. I could not find v3.04.01, though; how did you install that one? One minor complication with v3.05.01 is that Tesseract somehow needs to be on the same drive as your Python file: github.com/madmaze/pytesseract/issues/50
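Since the result here depended on the installed Tesseract version, it can help to check which binary pytesseract actually picks up. A minimal sketch, assuming a pytesseract release that provides `get_tesseract_version()`; the function name `tesseract_version_string` and the optional `tesseract_cmd` argument are only illustrative:

```python
def tesseract_version_string(tesseract_cmd=None):
    """Return the version of the Tesseract binary that pytesseract uses."""
    try:
        import pytesseract
    except ImportError:
        return "pytesseract is not installed"

    # Optionally point pytesseract at a specific Tesseract executable.
    if tesseract_cmd is not None:
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd

    try:
        return str(pytesseract.get_tesseract_version())
    except pytesseract.TesseractNotFoundError:
        return "tesseract binary not found"


print(tesseract_version_string())
```

Passing the full path of a particular install as `tesseract_cmd` makes it easier to compare, say, a 3.05 and a 4.x binary on the same image.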
