Originally asked on the Graphic Design site here (I don't know how to 'move' a question to another site).
- Credit to Python Tutorials for Digital Humanities - I followed a lot of his ideas, but I haven't got it working yet.
I have a bunch of scanned document images and in each image there is a variance in background colour (sometimes a wide variance).
I want to adjust all image pixels such that they are 'normalized' towards the mean pixel colour. To put it another way, I want to adjust all pixel colour (or grayscale) values so that the background (e.g. the 'white page') loses its colour gradients and variation and instead becomes as close to uniform white as possible.
I have been looking into colour gradients, vectorization and various filters, thresholding and blurring techniques, but I can't quite work out how to do what I see in my head.
Take the following example image: (ignore my red annotation)
I want to find the overall mean colour across all pixels, then move each pixel's value towards that mean to create a more 'flat' image. The reason I want to do that is because I believe it will improve the next step, which is to detect contours and edges for the purpose of text detection (and ultimately better OCR results).
So in the above example, the goal is to effectively remove the gray diagonal lines, but leave the text. I think there might be a way to automatically determine a threshold value (or have a dynamic threshold) but I am not sure exactly how to do that.
Here is another example image:
The goal for this second image would be to effectively remove the target logo and most of the other background 'document colour mess' picked up by the scanner, as per the annotation.
The overall objective is to improve text block detection to then improve OCR accuracy.
I know I can play around with filters and image enhancements using sliders in GIMP or some other graphics GUI, but one of the key points is that this process has to be automated, because doing it manually would essentially defeat the purpose and I may as well go back to manual data entry by eyeballing every document (ugh!).
This is why I have been trying to use OpenCV in Python.
The second point is that since every image is different I need to be able to determine the threshold (for final binarization prior to OCR) automatically, which is kind of what I mean by 'normalization' above.

