4

I'm working on a project that lets users take photos of handwritten formulas and send them to my server. I want to keep only the mathematical symbols and remove the sheet grid.

Sample photo:

(1) Original RGB photo

(2) Blurred grayscale photo

(3) After applying adaptive threshold

Note: I expect my algorithm to deal with a sheet grid of any color.

What might the code for this look like?

3 Answers

5

Result

This is a challenging problem to generalize without knowing exactly what combination of paper, lines, and ink to expect, and what exactly the output will be used for. I thought I'd attempt it and maybe learn something.

I see two ways to approach this problem:

  1. The clever way: identify the grid, its color, orientation, and size to find the regions of the image it occupies, in order to ignore them. There are major caveats here that would need to be addressed, e.g. the page may not be photographed flat and square (warp, distortion, and rotation have to be accounted for). There will also be lines that we don't want removed.

  2. The simple way: Apply general image manipulations, knowing little about the problem other than the assumptions that the pen is always darker than the grid, and the output is to be binary (black pen / white page).

I like the second one better because it is easier to implement and generalizes better.

We first notice that the "white" of the page is actually a non-uniform shade of grey (if we convert to grayscale). OpenCV's adaptive thresholding deals with this nicely. It almost gets us there.

The code below processes the image in local windows anchored every 50 pixels to address the non-uniformity of lighting (as written, each window extends BLOCK_SIZE pixels in every direction from its anchor, so neighbouring windows overlap). In each window, we subtract the median before applying a threshold. It is a simple solution, but maybe what you need. I haven't tested it on many images, and the threshold and the pre- and post-processing may need tweaking. It will not work if input images vary significantly, or if the grid is too dark relative to the ink.

```python
import cv2
import numpy
import sys

BLOCK_SIZE = 50
THRESHOLD = 25


def preprocess(image):
    # Suppress noise, then invert so that ink becomes bright.
    image = cv2.medianBlur(image, 3)
    image = cv2.GaussianBlur(image, (3, 3), 0)
    return 255 - image


def postprocess(image):
    # Remove isolated specks left over after thresholding.
    image = cv2.medianBlur(image, 5)
    # image = cv2.medianBlur(image, 5)
    # kernel = numpy.ones((3,3), numpy.uint8)
    # image = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
    return image


def get_block_index(image_shape, yx, block_size):
    # Index arrays for a window reaching block_size pixels in every
    # direction from yx, clipped to the image bounds.
    y = numpy.arange(max(0, yx[0]-block_size), min(image_shape[0], yx[0]+block_size))
    x = numpy.arange(max(0, yx[1]-block_size), min(image_shape[1], yx[1]+block_size))
    return tuple(numpy.meshgrid(y, x))


def adaptive_median_threshold(img_in):
    # Pixels less than THRESHOLD above the window median become white (255);
    # everything else (the ink) stays black (0).
    med = numpy.median(img_in)
    img_out = numpy.zeros_like(img_in)
    img_out[img_in - med < THRESHOLD] = 255
    return img_out


def block_image_process(image, block_size):
    out_image = numpy.zeros_like(image)
    for row in range(0, image.shape[0], block_size):
        for col in range(0, image.shape[1], block_size):
            idx = (row, col)
            block_idx = get_block_index(image.shape, idx, block_size)
            out_image[block_idx] = adaptive_median_threshold(image[block_idx])
    return out_image


def process_image_file(filename):
    image_in = cv2.cvtColor(cv2.imread(filename), cv2.COLOR_BGR2GRAY)
    image_in = preprocess(image_in)
    image_out = block_image_process(image_in, BLOCK_SIZE)
    image_out = postprocess(image_out)
    # Note: the 'bin_' prefix assumes filename has no directory component.
    cv2.imwrite('bin_' + filename, image_out)


if __name__ == "__main__":
    process_image_file(sys.argv[1])
```
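
To see the rule inside `adaptive_median_threshold` in isolation, here is a small sketch on a synthetic block. The pixel values (paper 60, grid 75, ink 200 after inversion) are made up for illustration; only `THRESHOLD = 25` is taken from the code above.

```python
import numpy as np

THRESHOLD = 25  # same value as in the answer's code

# Hypothetical inverted block (ink is bright after `255 - image`):
# paper is about 60, a grid line about 75, a pen stroke about 200.
block = np.full((50, 50), 60, dtype=np.uint8)
block[:, 10] = 75          # faint grid line, 15 levels above the paper
block[20:23, 5:45] = 200   # pen stroke, 140 levels above the paper

# Paper pixels are the majority, so the median estimates the paper level.
background = np.median(block)

# Keep only pixels that exceed the background by at least THRESHOLD.
# The median is a float, so the subtraction does not wrap around in uint8.
is_ink = (block - background) >= THRESHOLD
binary = np.where(is_ink, 0, 255).astype(np.uint8)
```

The grid line sits only 15 levels above the median, so it falls below the threshold and turns white, while the stroke survives as black. This is also where the approach breaks down: if the grid is more than `THRESHOLD` levels darker than the paper, it is kept along with the ink.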

2 Comments

If I run the code with the original file in the OP then I get the error "IndexError: index 336 is out of bounds for axis 0 with size 336". The line "out_image[block_idx]=...." causes the error.
It seems that newer versions of NumPy expect block_idx to be a tuple. I've updated the answer so that get_block_index returns a tuple.
1

OpenCV has a tutorial dealing with removing a grid from an image:

"Extract horizontal and vertical lines by using morphological operations", OpenCV documentation.

3 Comments

Could you include the relevant parts of the linked resource to your answer? As is, your answer is very susceptible to link rot (i.e. if the linked resource changes or disappears, your answer is not helpful).
The only problem I have is that the fraction bar would also be removed (when extracting horizontal lines).
Is there any way to preserve fraction bars?
0

This is a pretty difficult task. I also had this problem and I discovered that the solution can't be 100% accurate. BTW, just a few days ago I saw this link. Maybe it could help.
