Leptonica  1.82.0
Image processing and image analysis suite
binarize.c File Reference
#include <math.h>
#include "allheaders.h"


Functions

static PIX * pixSauvolaGetThreshold (PIX *pixm, PIX *pixms, l_float32 factor, PIX **ppixsd)
 
static PIX * pixApplyLocalThreshold (PIX *pixs, PIX *pixth)
 
l_ok pixOtsuAdaptiveThreshold (PIX *pixs, l_int32 sx, l_int32 sy, l_int32 smoothx, l_int32 smoothy, l_float32 scorefract, PIX **ppixth, PIX **ppixd)
 
PIX * pixOtsuThreshOnBackgroundNorm (PIX *pixs, PIX *pixim, l_int32 sx, l_int32 sy, l_int32 thresh, l_int32 mincount, l_int32 bgval, l_int32 smoothx, l_int32 smoothy, l_float32 scorefract, l_int32 *pthresh)
 
PIX * pixMaskedThreshOnBackgroundNorm (PIX *pixs, PIX *pixim, l_int32 sx, l_int32 sy, l_int32 thresh, l_int32 mincount, l_int32 smoothx, l_int32 smoothy, l_float32 scorefract, l_int32 *pthresh)
 
l_ok pixSauvolaBinarizeTiled (PIX *pixs, l_int32 whsize, l_float32 factor, l_int32 nx, l_int32 ny, PIX **ppixth, PIX **ppixd)
 
l_ok pixSauvolaBinarize (PIX *pixs, l_int32 whsize, l_float32 factor, l_int32 addborder, PIX **ppixm, PIX **ppixsd, PIX **ppixth, PIX **ppixd)
 
PIX * pixSauvolaOnContrastNorm (PIX *pixs, l_int32 mindiff, PIX **ppixn, PIX **ppixth)
 
PIX * pixThreshOnDoubleNorm (PIX *pixs, l_int32 mindiff)
 
l_ok pixThresholdByConnComp (PIX *pixs, PIX *pixm, l_int32 start, l_int32 end, l_int32 incr, l_float32 thresh48, l_float32 threshdiff, l_int32 *pglobthresh, PIX **ppixd, l_int32 debugflag)
 
l_ok pixThresholdByHisto (PIX *pixs, l_int32 factor, l_int32 halfw, l_float32 delta, l_int32 *pthresh, PIX **ppixd, PIX **ppixhisto)
 

Detailed Description


 ===================================================================
 Image binarization algorithms are found in:
   grayquant.c:   standard, simple, general grayscale quantization
   adaptmap.c:    local adaptive; mostly gray-to-gray in preparation
                  for binarization
   binarize.c:    special binarization methods, locally adaptive and
                  global.
 ===================================================================

     Adaptive Otsu-based thresholding
         l_ok          pixOtsuAdaptiveThreshold()       8 bpp

     Otsu thresholding on adaptive background normalization
         PIX          *pixOtsuThreshOnBackgroundNorm()  8 bpp

     Masking and Otsu estimate on adaptive background normalization
         PIX          *pixMaskedThreshOnBackgroundNorm()  8 bpp

     Sauvola local thresholding
         l_ok          pixSauvolaBinarizeTiled()
         l_ok          pixSauvolaBinarize()
         static PIX   *pixSauvolaGetThreshold()
         static PIX   *pixApplyLocalThreshold()

     Sauvola binarization on contrast normalization
         PIX          *pixSauvolaOnContrastNorm()  8 bpp

     Contrast normalization followed by bg normalization and thresholding
         PIX          *pixThreshOnDoubleNorm()

     Global thresholding using connected components
         l_ok          pixThresholdByConnComp()

     Global thresholding by histogram
         l_ok          pixThresholdByHisto()

 Notes:
     (1) pixOtsuAdaptiveThreshold() computes a global threshold over each
         tile and performs the threshold operation, resulting in a
         binary image for each tile.  These are stitched into the
         final result.
     (2) pixOtsuThreshOnBackgroundNorm() and
         pixMaskedThreshOnBackgroundNorm() are binarization functions
         that use background normalization with other techniques.
     (3) Sauvola binarization computes a local threshold based on
         the local average and square average.  It takes two constants:
         the window size for the measurement at each pixel and a
         parameter that determines the amount of normalized local
         standard deviation to subtract from the local average value.
     (4) pixThresholdByConnComp() uses the numbers of 4 and 8 connected
         components at different thresholds to determine if a
         global threshold can be used (for text or line-art) and the
         value it should have.

Definition in file binarize.c.

Function Documentation

◆ pixApplyLocalThreshold()

static PIX * pixApplyLocalThreshold ( PIX *  pixs,
PIX *  pixth 
)

pixApplyLocalThreshold()

Parameters
[in]    pixs     8 bpp grayscale; not colormapped
[in]    pixth    8 bpp array of local thresholds
Returns
pixd 1 bpp, thresholded image, or NULL on error

Definition at line 803 of file binarize.c.

◆ pixMaskedThreshOnBackgroundNorm()

PIX* pixMaskedThreshOnBackgroundNorm ( PIX *  pixs,
PIX *  pixim,
l_int32  sx,
l_int32  sy,
l_int32  thresh,
l_int32  mincount,
l_int32  smoothx,
l_int32  smoothy,
l_float32  scorefract,
l_int32 *  pthresh 
)

pixMaskedThreshOnBackgroundNorm()

Parameters
[in]    pixs         8 bpp grayscale; not colormapped
[in]    pixim        [optional] 1 bpp 'image' mask; can be null
[in]    sx, sy       tile size in pixels
[in]    thresh       threshold for determining foreground
[in]    mincount     min threshold on counts in a tile
[in]    smoothx      half-width of block convolution kernel width
[in]    smoothy      half-width of block convolution kernel height
[in]    scorefract   fraction of the max Otsu score; typ. ~ 0.1
[out]   pthresh      [optional] threshold value that was used on the normalized image
Returns
pixd 1 bpp thresholded image, or NULL on error
Notes:
     (1) This begins with a standard background normalization.
         Additionally, a flexible background normalization, which
         adapts to a rapidly varying background, puts white pixels
         in the background near regions with significant
         foreground.  These white pixels are turned into
         a 1 bpp selection mask by binarization followed by dilation.
         Otsu thresholding is performed on the input image to get an
         estimate of the threshold in the non-mask regions.
         The background normalized image is thresholded with two
         different values, and the result is combined using
         the selection mask.
     (2) Note that the numbers 255 (for bgval target) and 190 (for
         thresholding on pixn) are tied together, and explicitly
         defined in this function.
     (3) See pixBackgroundNorm() for meaning and typical values
         of input parameters.  For a start, you can try:
           sx, sy = 10, 15
           thresh = 100
           mincount = 50
           smoothx, smoothy = 2

Definition at line 371 of file binarize.c.

◆ pixOtsuAdaptiveThreshold()

l_ok pixOtsuAdaptiveThreshold ( PIX *  pixs,
l_int32  sx,
l_int32  sy,
l_int32  smoothx,
l_int32  smoothy,
l_float32  scorefract,
PIX **  ppixth,
PIX **  ppixd 
)

pixOtsuAdaptiveThreshold()

Parameters
[in]    pixs               8 bpp
[in]    sx, sy             desired tile dimensions; actual size may vary
[in]    smoothx, smoothy   half-width of convolution kernel applied to threshold array; use 0 for no smoothing
[in]    scorefract         fraction of the max Otsu score; typ. 0.1; use 0.0 for standard Otsu
[out]   ppixth             [optional] array of threshold values found for each tile
[out]   ppixd              [optional] thresholded input pixs, based on the threshold array
Returns
0 if OK, 1 on error
Notes:
     (1) The Otsu method finds a single global threshold for an image.
         This function allows a locally adapted threshold to be
         found for each tile into which the image is broken up.
     (2) The array of threshold values, one for each tile, constitutes
         a highly downscaled image.  This array is optionally
         smoothed using a convolution.  The full width and height of the
         convolution kernel are (2 * smoothx + 1) and (2 * smoothy + 1).
     (3) The minimum tile dimension allowed is 16.  If such small
         tiles are used, it is recommended to use smoothing, because
         without smoothing, each small tile determines the splitting
         threshold independently.  A tile that is entirely in the
         image bg will then hallucinate fg, resulting in a very noisy
         binarization.  The smoothing should be large enough that no
         tile is only influenced by one type (fg or bg) of pixels,
         because it will force a split of its pixels.
     (4) To get a single global threshold for the entire image, use
         input values of sx and sy that are larger than the image.
         For this situation, the smoothing parameters are ignored.
     (5) The threshold values partition the image pixels into two classes:
         one whose values are less than the threshold and another
         whose values are greater than or equal to the threshold.
         This is the same use of 'threshold' as in pixThresholdToBinary().
     (6) The scorefract is the fraction of the maximum Otsu score, which
         is used to determine the range over which the histogram minimum
         is searched.  See numaSplitDistribution() for details on the
         underlying method of choosing a threshold.
     (7) This function enables a modified version of the Otsu criterion for
         splitting the distribution of pixels in each tile into a
         fg and bg part.  The modification consists of searching for
         a minimum in the histogram over a range of pixel values where
         the Otsu score is within a defined fraction, scorefract,
         of the max score.  To get the original Otsu algorithm, set
         scorefract == 0.
     (8) N.B. This method is NOT recommended for images with weak text
         and significant background noise, such as bleedthrough, because
         of the problem noted in (3) above for tiling.  Use Sauvola.
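
As a point of reference for notes (5) and (7), here is a minimal sketch of the unmodified Otsu criterion (the scorefract == 0 case) on a 256-bin histogram, written in plain C without Leptonica. The function name otsu_threshold and the histogram layout are assumptions for illustration; the scorefract search for a histogram minimum is deliberately not implemented.

```c
#include <stddef.h>

/* Hypothetical helper, not a Leptonica function.
 * Returns a threshold t in [1, 255] such that values < t form one
 * class and values >= t form the other, chosen to maximize the
 * between-class variance.  Returns -1 if no split into two
 * non-empty classes exists. */
int otsu_threshold(const unsigned long hist[256])
{
    double total = 0.0, sum_all = 0.0;
    for (int v = 0; v < 256; v++) {
        total += (double)hist[v];
        sum_all += (double)v * (double)hist[v];
    }
    if (total == 0.0)
        return -1;

    double n0 = 0.0, sum0 = 0.0, best = -1.0;
    int best_t = -1;
    for (int t = 1; t < 256; t++) {
        /* Move bin t-1 into the "less than t" class. */
        n0 += (double)hist[t - 1];
        sum0 += (double)(t - 1) * (double)hist[t - 1];
        double n1 = total - n0;
        if (n0 == 0.0 || n1 == 0.0)
            continue;
        double diff = sum0 / n0 - (sum_all - sum0) / n1;
        /* Proportional to the between-class variance. */
        double score = n0 * n1 * diff * diff;
        if (score > best) {
            best = score;
            best_t = t;
        }
    }
    return best_t;
}
```

For a histogram with equal spikes at 50 and 200, every threshold from 51 through 200 separates the two spikes equally well; this sketch returns the first such value, whereas the modified criterion in note (7) is what selects a position inside that range.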

Definition at line 157 of file binarize.c.

◆ pixOtsuThreshOnBackgroundNorm()

PIX* pixOtsuThreshOnBackgroundNorm ( PIX *  pixs,
PIX *  pixim,
l_int32  sx,
l_int32  sy,
l_int32  thresh,
l_int32  mincount,
l_int32  bgval,
l_int32  smoothx,
l_int32  smoothy,
l_float32  scorefract,
l_int32 *  pthresh 
)

pixOtsuThreshOnBackgroundNorm()

Parameters
[in]    pixs         8 bpp grayscale; not colormapped
[in]    pixim        [optional] 1 bpp 'image' mask; can be null
[in]    sx, sy       tile size in pixels
[in]    thresh       threshold for determining foreground
[in]    mincount     min threshold on counts in a tile
[in]    bgval        target bg val; typ. > 128
[in]    smoothx      half-width of block convolution kernel width
[in]    smoothy      half-width of block convolution kernel height
[in]    scorefract   fraction of the max Otsu score; typ. 0.1
[out]   pthresh      [optional] threshold value that was used on the normalized image
Returns
pixd 1 bpp thresholded image, or NULL on error
Notes:
     (1) This does background normalization followed by Otsu
         thresholding.  Otsu binarization attempts to split the
         image into two roughly equal sets of pixels, and it does
         a very poor job when there are large amounts of dark
         background.  By doing a background normalization first,
         to get the background near 255, we remove this problem.
         Then we use a modified Otsu to estimate the best global
         threshold on the normalized image.
     (2) See pixBackgroundNorm() for meaning and typical values
         of input parameters.  For a start, you can try:
           sx, sy = 10, 15
           thresh = 100
           mincount = 50
           bgval = 255
           smoothx, smoothy = 2

Definition at line 273 of file binarize.c.

◆ pixSauvolaBinarize()

l_ok pixSauvolaBinarize ( PIX *  pixs,
l_int32  whsize,
l_float32  factor,
l_int32  addborder,
PIX **  ppixm,
PIX **  ppixsd,
PIX **  ppixth,
PIX **  ppixd 
)

pixSauvolaBinarize()

Parameters
[in]    pixs        8 bpp grayscale; not colormapped
[in]    whsize      window half-width for measuring local statistics
[in]    factor      factor for reducing threshold due to variance; >= 0
[in]    addborder   1 to add border of width (whsize + 1) on all sides
[out]   ppixm       [optional] local mean values
[out]   ppixsd      [optional] local standard deviation values
[out]   ppixth      [optional] threshold values
[out]   ppixd       [optional] thresholded image
Returns
0 if OK, 1 on error
Notes:
     (1) The window width and height are 2 * whsize + 1.  The minimum
         value for whsize is 2; typically it is >= 7.
     (2) The local statistics, measured over the window, are the
         average and standard deviation.
     (3) The measurements of the mean and standard deviation are
         performed inside a border of (whsize + 1) pixels.  If pixs does
         not have these added border pixels, use addborder = 1 to add
         it here; otherwise use addborder = 0.
     (4) The Sauvola threshold is determined from the formula:
           t = m * (1 - k * (1 - s / 128))
         where:
           t = local threshold
           m = local mean
           k = factor (>= 0)   [ typ. 0.35 ]
           s = local standard deviation, which is maximized at
               127.5 when half the samples are 0 and half are 255.
     (5) The basic idea of Niblack and Sauvola binarization is that
         the local threshold should be less than the local mean,
         and the larger the variance, the closer to the mean
         it should be chosen.  Typical values for k are between
         0.2 and 0.5.
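
The formula in note (4) can be evaluated directly. The following is a minimal sketch in plain C; the function name sauvola_threshold is an assumption for illustration and is not part of the library.

```c
/* Hypothetical helper, not a Leptonica function.
 * m: local mean, s: local standard deviation, k: factor (>= 0).
 * Returns t = m * (1 - k * (1 - s / 128)). */
double sauvola_threshold(double m, double s, double k)
{
    return m * (1.0 - k * (1.0 - s / 128.0));
}
```

For m = 100 and k = 0.35, a flat window (s = 0) gives t = 65, and a window with s = 128 gives t = 100: higher local contrast pulls the threshold up toward the mean.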

Definition at line 611 of file binarize.c.

◆ pixSauvolaBinarizeTiled()

l_ok pixSauvolaBinarizeTiled ( PIX *  pixs,
l_int32  whsize,
l_float32  factor,
l_int32  nx,
l_int32  ny,
PIX **  ppixth,
PIX **  ppixd 
)

pixSauvolaBinarizeTiled()

Parameters
[in]    pixs      8 bpp grayscale; not colormapped
[in]    whsize    window half-width for measuring local statistics
[in]    factor    factor for reducing threshold due to variance; >= 0
[in]    nx, ny    subdivision into tiles; >= 1
[out]   ppixth    [optional] Sauvola threshold values
[out]   ppixd     [optional] thresholded image
Returns
0 if OK, 1 on error
Notes:
     (1) The window width and height are 2 * whsize + 1.  The minimum
         value for whsize is 2; typically it is >= 7.
     (2) For nx == ny == 1, this defaults to pixSauvolaBinarize().
     (3) Why a tiled version?
         (a) A uint32 is used for the mean value accumulator, so
             overflow can occur for an image with more than 16M pixels.
         (b) A dpix is used to accumulate mean square values, and it
             can only accommodate images with less than 2^28 pixels.
             Using tiles reduces the size of all the arrays.
         (c) Each tile can be processed independently, in parallel,
             on a multicore processor.
     (4) The Sauvola threshold is determined from the formula:
             t = m * (1 - k * (1 - s / 128))
         See pixSauvolaBinarize() for details.
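
The "16M pixels" figure in note (3a) follows from simple arithmetic: in the worst case every 8 bpp pixel is 255, so a 32-bit unsigned sum can hold at most UINT32_MAX / 255 pixels. The sketch below, in plain C, checks this; both function names are hypothetical and the code is independent of how the library actually lays out its accumulators.

```c
#include <stdint.h>

/* Hypothetical helper: largest pixel count whose 8 bpp values are
 * guaranteed to sum without overflowing a uint32 accumulator. */
uint32_t max_pixels_for_uint32_sum(void)
{
    return UINT32_MAX / 255u;   /* 16843009, roughly 16M */
}

/* Hypothetical helper: returns 1 if a w x h image (or tile) is safe
 * for a uint32 sum in the worst case, 0 otherwise. */
int sum_fits_in_uint32(uint32_t w, uint32_t h)
{
    uint64_t worst = (uint64_t)w * h * 255u;
    return worst <= UINT32_MAX;
}
```

Under this worst-case bound a 4096 x 4096 image is safe, while a 5000 x 5000 image is not unless it is split into tiles.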

Definition at line 484 of file binarize.c.

◆ pixSauvolaGetThreshold()

static PIX * pixSauvolaGetThreshold ( PIX *  pixm,
PIX *  pixms,
l_float32  factor,
PIX **  ppixsd 
)

pixSauvolaGetThreshold()

Parameters
[in]    pixm      8 bpp grayscale; not colormapped
[in]    pixms     32 bpp
[in]    factor    factor for reducing threshold due to variance; >= 0
[out]   ppixsd    [optional] local standard deviation
Returns
pixd 8 bpp, Sauvola threshold values, or NULL on error
Notes:
     (1) The Sauvola threshold is determined from the formula:
           t = m * (1 - k * (1 - s / 128))
         where:
           t = local threshold
           m = local mean
           k = factor (>= 0)   [ typ. 0.35 ]
           s = local standard deviation, which is maximized at
               127.5 when half the samples are 0 and half are 255.
     (2) See pixSauvolaBinarize() for other details.
     (3) Important definitions and relations for computing averages:
           v == pixel value
           E(p) == expected value of p == average of p over some pixel set
           S(v) == square of v == v * v
           mv == E(v) == expected pixel value == mean value
           ms == E(S(v)) == expected square of pixel values
              == mean square value
           var == variance == expected square of deviation from mean
               == E(S(v - mv)) = E(S(v) - 2 * v * mv + S(mv))
                               = E(S(v)) - S(mv)
                               = ms - mv * mv
           s == standard deviation = sqrt(var)
         So for evaluating the standard deviation in the Sauvola
         threshold, we take
           s = sqrt(ms - mv * mv)
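
The relation var = ms - mv * mv from note (3) can be checked numerically. This is a minimal sketch in plain C over a flat array of samples; the name local_variance is an assumption for illustration, and the library itself works from precomputed mean and mean-square images rather than raw samples.

```c
#include <stddef.h>

/* Hypothetical helper, not a Leptonica function.
 * Computes var = ms - mv * mv, where mv is the mean and ms is the
 * mean square of the n samples.  The standard deviation used in the
 * Sauvola threshold is s = sqrt(var). */
double local_variance(const unsigned char *v, size_t n)
{
    double sum = 0.0, sumsq = 0.0;

    if (n == 0)
        return 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += (double)v[i];
        sumsq += (double)v[i] * (double)v[i];
    }
    double mv = sum / (double)n;     /* E(v) */
    double ms = sumsq / (double)n;   /* E(S(v)) */
    double var = ms - mv * mv;
    return var < 0.0 ? 0.0 : var;    /* guard against rounding */
}
```

For samples that are half 0 and half 255, this gives mv = 127.5 and var = 127.5 * 127.5, so s = 127.5, which is the maximum quoted in note (1).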

Definition at line 721 of file binarize.c.

◆ pixSauvolaOnContrastNorm()

PIX* pixSauvolaOnContrastNorm ( PIX *  pixs,
l_int32  mindiff,
PIX **  ppixn,
PIX **  ppixth 
)

pixSauvolaOnContrastNorm()

Parameters
[in]    pixs       8 or 32 bpp
[in]    mindiff    minimum diff to accept as valid in contrast normalization. Use ~130 for noisy images.
[out]   ppixn      [optional] intermediate output from contrast normalization
[out]   ppixth     [optional] threshold array for binarization
Returns
pixd 1 bpp thresholded image, or NULL on error
Notes:
     (1) This composite operation is good for adaptively removing
         dark background.

Definition at line 864 of file binarize.c.

◆ pixThresholdByConnComp()

l_ok pixThresholdByConnComp ( PIX *  pixs,
PIX *  pixm,
l_int32  start,
l_int32  end,
l_int32  incr,
l_float32  thresh48,
l_float32  threshdiff,
l_int32 *  pglobthresh,
PIX **  ppixd,
l_int32  debugflag 
)

pixThresholdByConnComp()

Parameters
[in]    pixs               depth > 1, colormap OK
[in]    pixm               [optional] 1 bpp mask giving region to ignore by setting pixels to white; use NULL if no mask
[in]    start, end, incr   binarization threshold levels to test
[in]    thresh48           threshold on normalized difference between the numbers of 4 and 8 connected components
[in]    threshdiff         threshold on normalized difference between the number of 4 cc at successive iterations
[out]   pglobthresh        [optional] best global threshold; 0 if no threshold is found
[out]   ppixd              [optional] image thresholded to binary, or null if no threshold is found
[in]    debugflag          1 for plotted results
Returns
0 if OK, 1 on error or if no threshold is found
Notes:
     (1) This finds a global threshold based on connected components.
         Although slow, it is reasonable to use it in a situation where
         (a) the background in the image is relatively uniform, and
         (b) the result will be fed to an OCR program that accepts 1 bpp
             images and works best with easily segmented characters.
         The reason for (b) is that this selects a threshold with a
         minimum number of both broken characters and merged characters.
     (2) If the pix has color, it is converted to gray using the
         max component.
     (3) Input 0 to use default values for any of these inputs:
         start, end, incr, thresh48, threshdiff.
     (4) This approach can be understood as follows.  When the
         binarization threshold is varied, the numbers of c.c. identify
         four regimes:
         (a) For low thresholds, text is broken into small pieces, and
             the number of c.c. is large, with the 4 c.c. significantly
             exceeding the 8 c.c.
         (b) As the threshold rises toward the optimum value, the text
             characters coalesce and there is very little difference
             between the numbers of 4 and 8 c.c, which both go
             through a minimum.
         (c) Above this, the image background gets noisy because some
             pixels are thresholded to foreground, and the numbers
             of c.c. quickly increase, with the 4 c.c. significantly
             larger than the 8 c.c.
         (d) At even higher thresholds, the image background noise
             coalesces as it becomes mostly foreground, and the
             number of c.c. drops quickly.
     (5) If there is no global threshold that distinguishes foreground
         text from background (e.g., weak text over a background that
         has significant variation and/or bleedthrough), this returns 1,
         which the caller should check.
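
The regimes in note (4) suggest a simple way to picture the selection. The sketch below, in plain C, is a deliberate simplification and not the library's procedure: it assumes the 4-cc and 8-cc counts at each tested threshold are already available, ignores threshdiff, and picks the tested threshold with the fewest 4-connected components among those where the normalized 4/8 difference is below thresh48. The name pick_threshold_by_cc and the array layout are hypothetical.

```c
/* Hypothetical helper, not a Leptonica function.
 * n4[i], n8[i]: numbers of 4- and 8-connected components found at
 * threshold (start + i * incr), for i in [0, count).
 * Returns the chosen threshold, or 0 if none qualifies. */
int pick_threshold_by_cc(const int *n4, const int *n8, int count,
                         int start, int incr, double thresh48)
{
    int best = -1, best_n4 = 0;

    for (int i = 0; i < count; i++) {
        if (n4[i] <= 0)
            continue;
        /* Normalized difference between 4-cc and 8-cc counts. */
        double diff48 = (double)(n4[i] - n8[i]) / (double)n4[i];
        if (diff48 < thresh48 && (best < 0 || n4[i] < best_n4)) {
            best = i;
            best_n4 = n4[i];
        }
    }
    return best < 0 ? 0 : start + best * incr;
}
```

Returning 0 when nothing qualifies mirrors the documented behavior of pglobthresh; a caller would treat that as "no usable global threshold" as described in note (5).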

Definition at line 1013 of file binarize.c.

◆ pixThresholdByHisto()

l_ok pixThresholdByHisto ( PIX *  pixs,
l_int32  factor,
l_int32  halfw,
l_float32  delta,
l_int32 *  pthresh,
PIX **  ppixd,
PIX **  ppixhisto 
)

pixThresholdByHisto()

Parameters
[in]    pixs         gray 8 bpp, no colormap
[in]    factor       subsampling factor >= 1
[in]    halfw        half of window width for smoothing; use 0 for default
[in]    delta        relative amount to resolve peaks and valleys; in (0 ... 1]; use 0 for default
[out]   pthresh      best global threshold; 0 if no threshold is found
[out]   ppixd        [optional] thresholded 1 bpp pix
[out]   ppixhisto    [optional] rescaled histogram of gray values
Returns
0 if OK, 1 on error or if no threshold is found
Notes:
     (1) This finds a global threshold.  It is best for an image that
         has a fairly well-defined fg and bg.
     (2) If it finds a good threshold and ppixd is defined, the binarized
         image is returned in &ppixd; otherwise it returns null.
     (3) Suggest using default values for halfw and delta.
     (4) Returns 0 in pthresh if it can't find a good threshold.

Definition at line 1169 of file binarize.c.

◆ pixThreshOnDoubleNorm()

PIX* pixThreshOnDoubleNorm ( PIX *  pixs,
l_int32  mindiff 
)

pixThreshOnDoubleNorm()

Parameters
[in]    pixs       8 or 32 bpp
[in]    mindiff    minimum diff to accept as valid in contrast normalization. Use ~130 for noisy images.
Returns
pixd 1 bpp thresholded image, or NULL on error
Notes:
     (1) This composite operation is good for adaptively removing
         dark background.
     (2) The threshold for the binarization uses an estimate based
         on Otsu adaptive thresholding.

Definition at line 920 of file binarize.c.