
Assume there are only 2 colors in an image. What's the simplest way in Python to tell whether one image has more of these 2 colors (a larger colored area) than another image from a group of similar images?

Definition of "more": the total area of the colored blocks in one picture is bigger than in the other. (Note that the colored shapes might be irregular.)

enter image description here

  • What format is your image? If you can load the images directly as a NumPy ndarray, this seems like a fairly straightforward counting operation with O(w * h) time complexity (assuming a constant number of color regions). Commented Jan 6, 2020 at 6:00
  • @tchainzzz, thank you for the comment. They are .png. Commented Jan 6, 2020 at 6:01

1 Answer


Okay, after some experimentation, I have a possible solution. You can use Pillow, a common image-loading/handling library, to convert the image to a NumPy ndarray, and then use np.unique() with return_counts=True to get your desired results. As a fun side effect, this works with an arbitrary number of colors. Here's full working code that I just tried:

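    from PIL import Image  # because for some reason, that's how you import something from Pillow
    import numpy as np

    im = Image.open("/path/to/image.png")

    # One row per pixel; for a multi-channel image the shape is (num_pixels, num_channels).
    arr = np.array(im.getdata())

    # Find the distinct rows (colors) and how many times each one occurs.
    unique_colors, counts = np.unique(arr.reshape(-1, arr.shape[1]), axis=0, return_counts=True)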

Now the unique_colors variable holds the unique colors that appear in your image, and counts holds the corresponding counts for each color in the image; that is to say, counts[i] is the number of times unique_colors[i] appears in the image for any i.

How does the unique + reshaping line work? This is borrowed from this particular answer. Basically, you flatten out your image array so that it has shape (num_pixels, num_channels), where num_channels could be 1, 3, or 4 depending on your image format (single-channel, RGB, RGBA, etc.). Now that you have a giant 2D "table" of pixels, you simply find which rows are unique (hence axis=0), and then use the return_counts keyword to return, well, the counts.
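To make the reshape step concrete, here is a minimal sketch on a made-up 2x2 RGBA array. The pixel values are hypothetical, and the array starts out as (height, width, channels) rather than coming from im.getdata(), which is an assumption made purely for illustration.

```python
import numpy as np

# A hypothetical 2x2 RGBA image: three white pixels and one red pixel.
pixels = np.array(
    [
        [[255, 255, 255, 255], [255, 0, 0, 255]],
        [[255, 255, 255, 255], [255, 255, 255, 255]],
    ],
    dtype=np.uint8,
)

# Flatten (height, width, channels) into a (num_pixels, num_channels) table.
table = pixels.reshape(-1, pixels.shape[-1])

# Each distinct row is one color; return_counts reports how often it occurs.
unique_colors, counts = np.unique(table, axis=0, return_counts=True)

for color, count in zip(unique_colors, counts):
    print(color.tolist(), int(count))
```

np.unique returns the rows in sorted order, so the red pixel is listed before the white ones.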

At this point, you have extracted the unique colors and counts of those colors for a single image. To compare multiple images, you would repeat this process on multiple images, find the colors they have in common, and then you can simply compare integers to find out which image has more of a particular color.
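The comparison step could be sketched roughly as follows. The helper names color_counts and compare_shared_colors are hypothetical, and the two small arrays are made-up single-channel data standing in for pixel tables loaded from real images.

```python
import numpy as np


def color_counts(table):
    """Map each distinct color (as a tuple) to its pixel count.

    `table` is assumed to have shape (num_pixels, num_channels).
    """
    colors, counts = np.unique(table, axis=0, return_counts=True)
    return {
        tuple(int(v) for v in color): int(n)
        for color, n in zip(colors, counts)
    }


def compare_shared_colors(table_a, table_b):
    """For each color present in both tables, report which one has more of it."""
    counts_a = color_counts(table_a)
    counts_b = color_counts(table_b)
    result = {}
    for color in counts_a.keys() & counts_b.keys():
        a, b = counts_a[color], counts_b[color]
        result[color] = "a" if a > b else "b" if b > a else "equal"
    return result


# Made-up single-channel pixel tables standing in for two loaded images.
image_a = np.array([[0], [0], [0], [255]])
image_b = np.array([[0], [255], [255], [128]])

print(compare_shared_colors(image_a, image_b))
```

Colors that appear in only one of the two images (128 here) are left out, since there is nothing to compare them against.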

For my particular image, the channel format happened to be RGBA. In any case, I would recommend printing arr.shape before the reshape step to verify that index 1 really is the channel dimension; depending on your image, you may have to change the index into arr.shape to something else. If you or anyone else knows of a more general method to find the channel index of an image obtained in this fashion, I'm all ears. For the record, I tried this on a .png image, like you specified. Hope this helps!
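One possible way to sidestep the channel-index question, offered as a sketch rather than as part of the code above, is to convert every image to a fixed mode with Pillow's Image.convert() before building the array, so the last axis always holds the channels. The helper name load_pixel_table is hypothetical, and a small in-memory grayscale image stands in for Image.open(...).

```python
import numpy as np
from PIL import Image


def load_pixel_table(image, mode="RGBA"):
    """Return a (num_pixels, num_channels) array with a known channel layout."""
    converted = image.convert(mode)
    arr = np.asarray(converted)  # shape: (height, width, channels)
    return arr.reshape(-1, arr.shape[-1])


# Stand-in for Image.open(...): a 3x2 single-channel (grayscale) image.
im = Image.new("L", (3, 2), color=128)
table = load_pixel_table(im)

print(table.shape)
```

With the mode fixed to RGBA, every image yields four values per pixel, which also makes the colors directly comparable across images.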


4 Comments

Thank you for the superb, very detailed answer! When I run it with the left image, counts returns [ 371 558 2962 7584 373 6624 16461]. The right image's counts returns [ 371 232 1558 479 373 12806 19114]. Their sums are the same: 34933. So by that, does it say they have the same areas of colors?
That only says the two images have the same total area, i.e. your summation shows that the number of pixels in each image is equal. I'd examine unique_colors as well, which tells you which colors have which counts. From there, you would compare the unique colors, find which colors are shared between the images, and then compare the counts for those colors.
Great! Actually, I measure the white: whichever image has less white has more colored area. Thanks again for sharing the solution!
Awesome! Glad to be of help. I edited the answer to be clearer about this. Just for reference, in case you or anyone else needs it: white is represented as something like [255, 255, 255] (for 3-channel RGB).
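The "less white means more colored area" idea from the comments might look roughly like this. It is a minimal sketch assuming 3-channel RGB pixel tables and an exact match on [255, 255, 255]; the helper name colored_pixel_count and the sample data are hypothetical.

```python
import numpy as np


def colored_pixel_count(table, white=(255, 255, 255)):
    """Count rows of a (num_pixels, 3) RGB table that are not pure white."""
    is_white = np.all(table == np.array(white), axis=1)
    return int(np.count_nonzero(~is_white))


# Made-up RGB pixel tables standing in for two loaded images.
image_a = np.array([[255, 255, 255], [255, 0, 0], [0, 0, 255], [255, 255, 255]])
image_b = np.array([[255, 255, 255], [255, 255, 255], [0, 0, 255], [255, 255, 255]])

area_a = colored_pixel_count(image_a)
area_b = colored_pixel_count(image_b)
print("a" if area_a > area_b else "b" if area_b > area_a else "equal")
```

Because the match is exact, near-white pixels count as colored, which may matter when an image contains more distinct colors than expected (the counts quoted above list seven).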
