
My goal is to pre-process an image (extracted from a video) for OCR. The text is always black, as in this example: [image]

I tried frame averaging and an HSV mask:

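    import cv2
    import numpy as np

    # Running average over frames; avg2 must be a float32 array with the same shape as frame
    cv2.accumulateWeighted(frame, avg2, 0.005)
    #res2 = cv2.convertScaleAbs(avg2)

    # Convert BGR to HSV (imgray must be a 3-channel BGR image, despite its name)
    hsv = cv2.cvtColor(imgray, cv2.COLOR_BGR2HSV)

    # Define range of black color in HSV: any hue, any saturation, value <= 127
    lower_val = np.array([0, 0, 0])
    upper_val = np.array([179, 255, 127])

    # Threshold the HSV image to get only black colors (white where the pixel is dark)
    mask = cv2.inRange(hsv, lower_val, upper_val)

    # Invert mask to get black symbols on white background
    mask_inv = cv2.bitwise_not(mask)
    cv2.imshow("Mask", mask)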

But the results are not good enough. I am looking for possible workarounds. Thanks.
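For readers who want to reproduce the idea, here is a minimal NumPy-only sketch of the two steps described above, run on synthetic frames. It assumes that the running average uses the same update rule as `cv2.accumulateWeighted` and relies on the fact that, for 8-bit images, the HSV value channel is the maximum of B, G and R, so an HSV range with full hue and saturation and V <= 127 amounts to thresholding the brightest channel. The function names `running_average` and `dark_mask`, the synthetic frames and `alpha=0.1` are illustrative assumptions, not part of the original code.

```python
import numpy as np

def running_average(frames, alpha=0.005):
    """Exponential running average: avg = (1 - alpha) * avg + alpha * frame."""
    avg = frames[0].astype(np.float32)
    for frame in frames[1:]:
        avg = (1.0 - alpha) * avg + alpha * frame.astype(np.float32)
    return np.clip(np.rint(avg), 0, 255).astype(np.uint8)

def dark_mask(bgr, max_value=127):
    """Return 255 where the brightest channel (HSV value) is <= max_value, else 0."""
    value = bgr.max(axis=2)
    return np.where(value <= max_value, 255, 0).astype(np.uint8)

# Synthetic stand-in for video frames: light background with a dark bar as "text".
rng = np.random.default_rng(0)
clean = np.full((20, 60, 3), 200, dtype=np.uint8)
clean[8:12, 10:50] = 20
frames = [
    np.clip(clean.astype(np.int16) + rng.integers(-30, 31, clean.shape), 0, 255).astype(np.uint8)
    for _ in range(50)
]

averaged = running_average(frames, alpha=0.1)  # larger alpha than 0.005 so 50 frames suffice
mask = dark_mask(averaged)                     # white where the pixel is dark
ocr_input = 255 - mask                         # black text on white background
```

Note that `ocr_input` (the inverted mask) is the image with black symbols on a white background, which is usually the form an OCR engine expects.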

1 Answer


For this type of image, where text instances cannot be separated easily, Tesseract won't give good results. Tesseract is a good option if you want to extract text from documents, paper scans, PDFs, etc., where the text instances are clear.

For your problem, I would suggest using separate text detection and text recognition models. For text detection, you can use a state-of-the-art model like the EAST text detector, which is able to locate text in difficult images. It generates bounding boxes around the text in the image, and these boxes can then be passed to a separate text recognition model, which performs the actual recognition.

For text detection: EAST or any other recent model
For text recognition: CRNN-based models
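The detect-then-recognize flow can be sketched roughly as follows. This is only a structural illustration: `fake_detector` and `fake_recognizer` are hypothetical stand-ins for a real detector (such as EAST) and a real recognizer (such as a CRNN), and the `(x, y, w, h)` box format is an assumption.

```python
import numpy as np

def crop_boxes(image, boxes):
    """Cut each (x, y, w, h) box out of the image, clamped to the image bounds."""
    height, width = image.shape[:2]
    crops = []
    for x, y, w, h in boxes:
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(width, x + w), min(height, y + h)
        if x1 > x0 and y1 > y0:  # skip boxes that fall entirely outside the image
            crops.append(image[y0:y1, x0:x1])
    return crops

def read_text(image, detector, recognizer):
    """Stage 1: detect boxes. Stage 2: run the recognizer on each cropped box."""
    boxes = detector(image)
    return [recognizer(crop) for crop in crop_boxes(image, boxes)]

# Hypothetical stand-ins; a real pipeline would call the detection and recognition models here.
def fake_detector(image):
    return [(10, 8, 40, 4), (55, 0, 20, 10)]

def fake_recognizer(crop):
    # Returns the crop size as text, just to show what each stage receives.
    return f"{crop.shape[1]}x{crop.shape[0]}"

image = np.zeros((20, 60, 3), dtype=np.uint8)
results = read_text(image, fake_detector, fake_recognizer)
```

When the text is always at a known position, the detector can simply return fixed boxes, and only the recognition stage needs a model.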

Please try to implement the above models; I am sure they will perform much better than what you are getting from Tesseract :)

BR!


1 Comment

Thx @JD95, I have already used the EAST model in the past, but in my scenario it does not help with text detection, since the text is at a fixed position. I was looking for best practices on pre-processing to maximize Tesseract's performance. Thx
