
I'm trying to read relatively clear numbers from a screenshot, but I am running into issues getting pytesseract to read the text correctly. I have the following screenshot:

This is the resulting image for scoreboard

And I know the score (2-0) and the clock (1:42) are going to be in the exact same place.

This is the code I currently have for reading the clock time and the orange score:

    import cv2
    import imutils
    import numpy as np
    import pytesseract

    lower_orange = np.array([0, 90, 200], dtype="uint8")
    upper_orange = np.array([70, 160, 255], dtype="uint8")

    # Isolate scoreboard location on a 1080p pic
    clock = input[70:120, 920:1000]
    scoreboard = input[70:150, 800:1120]

    # greyscale
    roi_gray = cv2.cvtColor(clock, cv2.COLOR_BGR2GRAY)

    config = ("-l eng -c tessedit_char_whitelist=0123456789: --oem 1 --psm 8")
    time = pytesseract.image_to_string(roi_gray, config=config)
    print("time is " + time)

    # find the colors within the specified boundaries and apply
    # the mask
    mask_orange = cv2.inRange(scoreboard, lower_orange, upper_orange)

    # find contours in the thresholded image, then initialize the
    # list of digit locations
    cnts = cv2.findContours(mask_orange.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    locs = []

    for (i, c) in enumerate(cnts):
        # compute the bounding box of the contour, then use the
        # bounding box coordinates to derive the aspect ratio
        (x, y, w, h) = cv2.boundingRect(c)
        ar = w / float(h)

        # since score will be a fixed size of about 25 x 35,
        # we'll set the area at about 300 to be safe
        if w*h > 300:
            orange_score_img = mask_orange[y-5:y+h+5, x-5:x+w+5]
            orange_score_img = cv2.GaussianBlur(orange_score_img, (5, 5), 0)
            config = ("-l eng -c tessedit_char_whitelist=012345 --oem 1 --psm 10")
            orange_score = pytesseract.image_to_string(orange_score_img, config=config)
            print("orange_score is " + orange_score)

Here's the output:

    time is 1:42
    orange_score is

Here is the orange_score_img, after I masked out everything within my upper and lower orange bounds and applied a gaussian blur.


Yet even with pytesseract configured to look for a single character (--psm 10) and a restricted whitelist, it still returns an empty string for the score. Is there some additional image processing that I'm missing to help pytesseract read this number as 2?

  • Perhaps you need to threshold the "2" image to binary first and if needed thicken it with some morphology open. Commented Mar 20, 2020 at 2:15
  • thanks @fmw42, I took your suggestion of thickening and that helped a lot! Commented Mar 20, 2020 at 5:47

1 Answer

As per @fmw42's suggestion, I tried playing with some morphology changes. Thickening the numbers seemed to do the trick!

    kernel = np.ones((5, 5), np.uint8)
    orange_score_img = cv2.dilate(orange_score_img, kernel, iterations=1)

EDIT: The real answer, I realized, is that pytesseract does much better with black text on a white background than with white text on a black background. It read the digit perfectly once I inverted the colors:

    orange_score_img = cv2.bitwise_not(orange_score_img)

I hope this helps people when they first start out using pytesseract! Trying to tune the image to fit all my cases was incredibly frustrating, and knowing that black text on white works much better would have saved me hours.
