1

Im a beginner in Tesseract-OCR and I am doing this project in Python to recognize multiple separated characters in one image. I looked up the documentation for PyTesseract and I could not find any reference for detecting multiple characters in different positions.

enter image description here

I tried changing the configurations but I still cannot detect any characters. My idea is to scan the available characters on the image, print their bounding boxes and find the center of each bounding box, print out the character's rotation in degrees.

Can anyone help me? Thanks.

1 Answer 1

2

Maybe this is what you mean.

import cv2 import numpy as np import pytesseract import imutils img = cv2.imread("srj8n.png") cv2.imshow("original", img) # turn into gray for next processing gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV * cv2.THRESH_OTSU)[1] thresh = cv2.bitwise_not(thresh) # omit the underline kernel = np.ones((4, 4), np.uint8) erosion = cv2.erode(thresh, kernel, iterations=1) # dilate to make the line thicker kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 12)) dilation = cv2.dilate(erosion, kernel, iterations=1) # find the contour cntrs = cv2.findContours(dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) cntrs = cntrs[0] if len(cntrs) == 2 else cntrs[1] result = img.copy() for c in cntrs: # for each letter, create red rectangle x, y, w, h = cv2.boundingRect(c) cv2.rectangle(result, (x, y), (x + w, y + h), (0, 0, 255), 2) # prepare letter for OCR box = thresh[y:y + h - 2, x:x + w] box = cv2.bitwise_not(box) box = cv2.GaussianBlur(box, (3, 3), 0) # retreive the angle. For the meaning of angle, see below # https://namkeenman.wordpress.com/2015/12/18/open-cv-determine-angle-of-rotatedrect-minarearect/ rect = cv2.minAreaRect(c) angle = rect[2] # put angle below letter font = cv2.FONT_HERSHEY_SIMPLEX bottomLeftCornerOfText = (x, y+h+20) fontScale = 0.6 fontColor = (255, 0, 0) lineType = 2 cv2.putText(result, str(angle), bottomLeftCornerOfText, font, fontScale, fontColor, lineType) # do the OCR custom_config = r'-l eng --oem 3 --psm 10' text = pytesseract.image_to_string(box, config=custom_config) print("Detected :" + text + ", angle: " + str(angle)) cv2.imshow("result", result) cv2.waitKey(0) cv2.destroyAllWindows() 
Sign up to request clarification or add additional context in comments.

2 Comments

This is actually perfect! If it doesn't hinder you, would you mind writing some explanation in the code? I see you added in the first 12 lines of code of image processing by thresholding the image to black and white, after that I got lost :). I also did not understand how is it boxing every detected letter. The angle of the detected letter only computes it if it is angled to the left, for example here: imgur.com/a/rkmM2WS Thank you!
I have added comment to explain the code. Hope it helps.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.