I am trying to detect some numbers with Tesseract in Python. Below you will find my starting image and what I can get it down to. Here is the code I used to get it there.

    import pytesseract
    import cv2
    import numpy as np

    pytesseract.pytesseract.tesseract_cmd = "C:\\Users\\choll\\AppData\\Local\\Programs\\Tesseract-OCR\\tesseract.exe"

    image = cv2.imread(r'64normalwart.png')

    # Keep only the (near-)white pixels, which make up the digits
    lower = np.array([254, 254, 254])
    upper = np.array([255, 255, 255])
    image = cv2.inRange(image, lower, upper)

    # Invert the mask so the digits are black on a white background
    image = cv2.bitwise_not(image)

    # Uses a language that should work with Minecraft text; I have tried with and without, no luck
    text = pytesseract.image_to_string(image, lang='mc')
    print(text)

    cv2.imwrite("Wartthreshnew.jpg", image)
    cv2.imshow("Image", image)
    cv2.waitKey(0)

I end up with black numbers on a white background, which seems pretty good, but Tesseract still cannot detect the numbers. I also noticed the numbers are pretty jagged, but I don't know how to fix that. Does anyone have recommendations for how I could get Tesseract to recognize these numbers?

Try `cv2.blur()` to smooth the rough edges of the numbers. It will make the image fuzzier overall, but Tesseract might have an easier time recognizing the digits.