I have this set of images I want to de-noise in order to run OCR on it:

enter image description here

enter image description here

I am trying to read the 7810 from the image.

I have tried

cv2.threshold(img, 128, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.fastNlMeansDenoising(img, None, 60, 10, 20)

and some morphological operations, but none of them seem to clean up this image sufficiently.

Any recommendations on how to filter this image well enough that I could run OCR or an ML detection script such as pytesseract on it?

2 Answers

You could begin by using a median filter to remove the salt-and-pepper noise:

cv2.medianBlur(source, 3) 

Then apply Otsu thresholding as you have already done. This might not end up being the complete solution, but it should make the image easier for the text detection algorithm to work on.

1 Comment

check this out for further reference <stackoverflow.com/questions/33881175/…>

You can try using cv2.adaptiveThreshold since your image has different lighting conditions in different areas.

enter image description here

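import cv2

# Read the image as grayscale (flag 0)
image = cv2.imread("1.jpg", 0)
# Compare each pixel with the Gaussian-weighted mean of its 21x21 neighbourhood, minus 2
thresh = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 21, 2)
cv2.imshow('thresh', thresh)
cv2.waitKey(0)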

1 Comment

this answer was a good start. Didn't do everything, but cleared away a lot.
