19

My friend has a color image of Chinese handwriting (a photo or scan of what he wrote on a piece of white paper), and he would like me to convert it into a black-and-white binary image. Are there applications for Ubuntu that can do that?

Here is an example image:

enter image description here


5 Answers

33

What you want is called "thresholding" in image processing. It takes an image as input and outputs an image in which every pixel with a value below a given threshold is set to black and every pixel with a value above the threshold is set to white. This turns an arbitrary input image into a black-and-white one.
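To make the rule concrete, here is a minimal sketch in plain Python. The function name `threshold`, the 75% default, the tiny sample grid, and the choice to send a pixel exactly at the cutoff to black are all assumptions for illustration; real tools may treat the boundary differently.

```python
def threshold(gray, level=0.75, max_value=255):
    """Return a binary image: max_value where a pixel is above the cutoff, else 0."""
    cutoff = level * max_value  # 75% of 255 = 191.25
    return [[max_value if px > cutoff else 0 for px in row] for row in gray]


# Hypothetical 8-bit grayscale "image": dark ink on light paper.
page = [
    [250, 245, 60, 240],
    [248, 40, 35, 242],
    [251, 247, 55, 190],
]

for row in threshold(page):
    print("".join("#" if px == 0 else "." for px in row))
# ..#.
# .##.
# ..##
```

Note the last pixel (190): it is light gray, but it falls just under the 75% cutoff and turns black, which is why picking the level tends to take some trial and error.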

Generally, you want to convert to grayscale first for more predictable results, but it is possible to threshold a full-color image as well.
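A small sketch of why the grayscale step matters, assuming Rec. 709 luma weights (one common choice; a given tool may use something else). The function names and the sample colors are made up for illustration.

```python
def to_gray(rgb):
    """Collapse an (R, G, B) pixel to one brightness value using Rec. 709 luma weights."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def binarize_via_gray(pixel, cutoff=128):
    """Threshold the single gray value: the result is always 0 or 255."""
    return 255 if to_gray(pixel) > cutoff else 0


def binarize_per_channel(pixel, cutoff=128):
    """Threshold each channel separately: the result can be a saturated color."""
    return tuple(255 if c > cutoff else 0 for c in pixel)


blue_ink = (30, 40, 200)   # hypothetical blue pen stroke
paper = (250, 248, 240)    # hypothetical off-white paper

print(binarize_via_gray(blue_ink))     # 0  (black, as wanted)
print(binarize_via_gray(paper))        # 255 (white)
print(binarize_per_channel(blue_ink))  # (0, 0, 255): pure blue, not black
```

Thresholding the channels independently leaves the blue ink as pure blue, so the output is not a two-level image; going through a single gray value first avoids that.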

You can use a graphical tool such as GIMP to do this interactively (you'll find the tool through the main menu -> Colors -> Threshold), or you can use ImageMagick with something like this:

convert colored.png -threshold 75% thres_colored.png 

Running the above command on the example image produces the result shown below.

Black-and-white version of OP's image

Since thresholding is often somewhat of a trial-and-error process to get a result you're happy with, particularly if the source image is not very close to black-and-white already, I recommend the GUI approach if possible, but if that is not an option for whatever reason you can do it through the command line as well. For finer control of the output, you can use tools like color curves, levels and contrast first to isolate the light and dark portions of the image better before thresholding. (Actually, threshold can be seen as an extreme case of using the color curves tool.)

5
  • 2
    Thanks! Since a color image has three RGB channels, what does (or can) the threshold apply to in general? Commented Jan 9, 2014 at 23:00
  • @Tim It probably depends on the software, but I would expect the threshold (unless you specify per channel or for a specific channel, see e.g. ImageMagick's convert's -channel option) to be applied to some sort of "value" of the pixel, which is computed from all channels. That's the reason why I said you might want to convert to grayscale first for more predictable results. (Also see my edit.) Commented Jan 10, 2014 at 8:17
  • Thanks! Is there any documentation for what the threshold applies to in the command shown in your post? Commented Jan 10, 2014 at 18:29
  • @Tim Not really. I expected convert to take either a percentage of the maximum value (which should have been 255 per channel for 8-bit data) or a specific value, but I could only get a useful result when specifying a percentage. When you do it with a graphical tool, including GIMP, you'll generally have a histogram that shows the tonal distribution of the image; that will be a great aid in picking the proper value. Using only the command line, unless you have a specific reason to do so, is probably more trouble than it's worth, really. Commented Jan 10, 2014 at 18:31
  • 2
    As a side note, there exist other thresholding methods that don't have to rely on a hard-coded threshold level. For example, ImageMagick includes -lat which performs a local adaptive threshold, taking into account the surrounding pixels. Commented Jan 15, 2014 at 22:37
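The idea behind a local adaptive threshold can be sketched as follows. This is a generic local-mean rule, not a reimplementation of ImageMagick's -lat; the function name, the `radius` and `offset` parameters, and the sample data are assumptions for illustration.

```python
def local_adaptive_threshold(gray, radius=1, offset=10):
    """White (255) if a pixel is brighter than its neighbourhood mean minus offset, else black."""
    height, width = len(gray), len(gray[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the pixels in the (2*radius+1)-sized window, clipped at the borders.
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            row.append(255 if gray[y][x] > local_mean - offset else 0)
        out.append(row)
    return out


# Hypothetical page whose lighting fades from left to right,
# with two vertical ink strokes (columns 2 and 6).
line = [240, 220, 110, 180, 160, 140, 30, 100]
page = [line[:] for _ in range(3)]

print(local_adaptive_threshold(page)[0])
# [255, 255, 0, 255, 255, 255, 0, 255]

# A single global cutoff of 128 also blackens the shadowed paper at the right edge.
print([255 if px > 128 else 0 for px in line])
# [255, 255, 0, 255, 255, 255, 0, 0]
```

In this sample no single global cutoff separates both strokes from the paper (the bright-side ink at 110 is lighter than the shadowed paper at 100), while the local rule recovers both.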
7

You can use ImageMagick:

convert test.png -colorspace Gray gray_colorspace.png 

From here.

Here is what I got after applying to your image:

enter image description here

2
  • 12
    "Binary" comes from "bi" meaning "two", so I assume the OP wants to convert the image to pure black-and-white. Converting to grayscale yields a lot more than two levels. Commented Jan 9, 2014 at 20:57
  • Agreed! makes sense, +1. Commented Jan 9, 2014 at 21:31
7

ImageMagick convert -monochrome

The -monochrome option uses some smart dithering, and makes the output much more visible than -threshold if you intend it for human consumption:

convert -monochrome signature.png out.png 

enter image description here

Does not make much difference for such a simple image, but for larger ones, it is striking:

4
  • Unfortunately, this is not locally adaptive, so for larger images with uneven lighting it may turn large regions black. Commented Mar 1, 2022 at 3:16
  • @user202729 let me know if there's an option that does it (and possibly an input/output demo), that sounds cool. Commented Mar 1, 2022 at 7:38
  • 1
    convert has -lat (experimentally, parameters such as convert -grayscale Rec709Luma -lat 20x20-5% seem to work). I may add a demo later, after I prepare an image (this image is too simple; any method would work). Commented Mar 1, 2022 at 11:08
  • @user202729 that's awesome, thanks for the info! Commented Mar 1, 2022 at 11:10
2

You can also do this easily with the netpbm toolkit:

anytopnm inputfile | ppmtopgm | pgmtopbm > outputfile 

ppmtopgm converts to a grayscale image, pgmtopbm converts to a black-or-white image, and we then redirect the output to a file. It will be in the pbm format; if you want something more common, you'll have to add an output converter (e.g., pnmtopng or some such).
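As far as I know, pgmtopbm without -threshold does not apply a hard cutoff but dithers by error diffusion, which is also the general idea behind the -monochrome answer above. The following is a textbook Floyd-Steinberg sketch in plain Python, offered as an assumption about the technique rather than a copy of what either tool does internally; the function name and test images are made up.

```python
def floyd_steinberg(gray):
    """Dither an 8-bit grayscale image (list of rows) to pure 0/255 by error diffusion."""
    height, width = len(gray), len(gray[0])
    work = [[float(px) for px in row] for row in gray]  # working copy that accumulates error
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255 if old >= 128 else 0
            out[y][x] = new
            err = old - new  # what was lost by rounding this pixel
            # Push the rounding error onto neighbours that have not been visited yet.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


mid_gray = [[128] * 8 for _ in range(4)]  # hypothetical flat gray patch
for row in floyd_steinberg(mid_gray):
    print("".join("#" if px == 0 else "." for px in row))
```

A hard threshold would turn the flat gray patch entirely white or entirely black; dithering instead produces a mix of black and white pixels whose density approximates the original gray. That looks better to a human viewer but is usually not what you want before OCR.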

1
  • anytopnm inputfile | ppmtopgm | pgmtopbm -threshold > outputfile works so much faster than convert -threshold 50% inputfile outputfile! (And avoids the latter’s resource constraints and “security” policies.) Commented Feb 9, 2021 at 6:49
1

Otsu's method will perform better when dealing with text over a background with a slight color gradient:

https://stackoverflow.com/questions/65945662/how-do-i-convert-a-color-image-to-black-and-white-using-imagemagick/65999445#65999445
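For readers unfamiliar with it, Otsu's method picks the cutoff automatically by choosing the level that best separates the histogram into a dark and a light class (maximum between-class variance). Below is a minimal plain-Python sketch; the function name, the tie-breaking toward the lowest cutoff, and the sample values are assumptions for illustration, not the linked answer's code.

```python
def otsu_threshold(pixels, levels=256):
    """Return the cutoff t that maximises between-class variance; pixels <= t are 'dark'."""
    hist = [0] * levels
    for px in pixels:
        hist[px] += 1
    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    n_dark = 0
    sum_dark = 0
    for t in range(levels):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        n_light = total - n_dark
        if n_dark == 0:
            continue  # no dark pixels yet, nothing to separate
        if n_light == 0:
            break     # every pixel is already in the dark class
        mean_dark = sum_dark / n_dark
        mean_light = (sum_all - sum_dark) / n_light
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = n_dark * n_light * (mean_dark - mean_light) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Hypothetical flattened grayscale image: four ink pixels, five paper pixels.
samples = [30, 35, 40, 45, 200, 210, 220, 230, 240]
cutoff = otsu_threshold(samples)
print(cutoff)                                       # 45
print([255 if px > cutoff else 0 for px in samples])
# [0, 0, 0, 0, 255, 255, 255, 255, 255]
```

Because the cutoff is derived from the image's own histogram, this removes the trial and error of picking a percentage by hand, although it is still a single global value.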

According to @fmw42's comparison of multiple threshold methods, the Local Adaptive algorithm (its equivalent in ImageMagick is -lat) might work best for text over a background.

Also try combining this with the connected-components processing introduced in another answer by @fmw42 to remove stray edges and dots that might confuse any later OCR step.
