Pytesseract image to text problem in Python

Question

Please check the following image:

I am using the following code to extract text from the image.

import cv2
import pytesseract

img = cv2.imread("img.png")
txt = pytesseract.image_to_string(img)

But the result differs from the text in the image. It returns:

+BuFl

But it should be:

+Bu#L

I don't know what the problem is. I am pretty new to Pytesseract.

Is there anyone who can help me to sort out the problem?

Thank you very much.

Ahmet · Accepted Answer · 2022-01-08 20:57:29Z

One way to solve this is to apply Otsu thresholding.

Unlike simple global thresholding, where you choose a fixed threshold value by hand, Otsu's method determines the threshold automatically from the image's intensity histogram.
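To make "automatically" concrete, here is a minimal pure-Python sketch of the idea behind Otsu's method: try every candidate threshold and keep the one that maximizes the between-class variance of the two resulting pixel groups. The function name `otsu_threshold` and the sample pixel values are made up for illustration, and this is not OpenCV's implementation.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold that maximizes between-class variance."""
    # Histogram of intensities 0..255
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # intensity sum of those pixels
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no pixels in the lower class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # no pixels left for the upper class
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor of 1 / total**2)
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical data: a dark cluster (40, 50) and a bright cluster (200, 210)
pixels = [40] * 60 + [50] * 40 + [200] * 30 + [210] * 20
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]  # binarize with the found threshold
print(t)
```

A hand-picked threshold that happens to fall below the dark cluster (say, 30) would turn every pixel white, whereas the computed one falls between the two clusters without any tuning.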

Code applying Otsu's threshold:

import cv2
import pytesseract

img = cv2.imread("Tqom8.png")  # Load the image
img = cv2.resize(img, (0, 0), fx=0.5, fy=0.5)  # Downscale to half size
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Convert to grayscale
# With THRESH_OTSU the threshold argument (0) is ignored and computed automatically;
# pixels above the computed threshold are set to 128
thr = cv2.threshold(gray, 0, 128, cv2.THRESH_OTSU)[1]
txt = pytesseract.image_to_string(thr, config='--psm 6')  # OCR the thresholded image
print(pytesseract.__version__)
print(txt)

Result:

0.3.8
+Bu#L
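The `--psm 6` option in the code above sets Tesseract's page segmentation mode, telling it to treat the image as a single uniform block of text. For a short string like this one, it may be worth comparing a few modes. The sketch below is one way to do that; `ocr_with_modes` is a hypothetical helper, not part of pytesseract, and it takes the OCR function as an argument so it can be tried out without Tesseract installed.

```python
def ocr_with_modes(image, ocr, modes=(6, 7, 8, 13)):
    """Run `ocr` on `image` once per page segmentation mode.

    `ocr` is assumed to behave like pytesseract.image_to_string:
    it accepts the image plus a `config` keyword and returns a string.
    Modes: 6 = uniform block of text, 7 = single text line,
    8 = single word, 13 = raw line.
    """
    results = {}
    for mode in modes:
        text = ocr(image, config=f"--psm {mode}")
        results[mode] = text.strip()  # drop surrounding whitespace and newlines
    return results
```

With the variables from the answer, this could be called as `ocr_with_modes(thr, pytesseract.image_to_string)`, which returns a dictionary mapping each mode to the text Tesseract recognized.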

Also make sure to read "Improving the quality of the output" in the Tesseract documentation.
