Skip to content

Conversation

@kyaco
Copy link

@kyaco kyaco commented Apr 29, 2025

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Abstract

This is a pull request for a program that calculates grayscale morphological operations quickly by using a data structure called a sparse table.

About sparse table

1D sparse table

For the data structure, see this: https://www.geeksforgeeks.org/sparse-table/

It is possible to quickly find the minimum (maximum) value of an array within a range whose length is a power of two.

Let's call the nth row of the sparse table the "row of depth n".
The nth row contains the minimum (maximum) values ​​in consecutive subsequences of length $2^n$ of the original array.

sparse tabl(min)
depth0(original array) 1 4 9 2 5 0 5 7 3
depth1 1 4 2 2 0 0 5 3 -
depth2 1 2 0 0 0 0 - - -
depth3 0 0 - - - - - - -

Let us denote by $F$ the operation of computing the next row from a row in a sparse table.

$st[0] = original~array$
$st[depth + 1] = F(st[depth])$

2D sparse table

For the data structure, see this: https://www.geeksforgeeks.org/2d-range-minimum-query-in-o1/

Change the order of the subscripts to:

$st[dr][dc][r][c]$

where $dr$ is the depth in the row direction, $dc$ is the depth in the column direction, $r$ is the index in the row direction, and $c$ is the index in the column direction.
$st[dr][dc]$ is a two-dimensional table that lists the minimum (maximum) values ​​within a rectangle R of height $2^{dr}$ and width $2^{dc}$ that is slid across the original array A. In other words, it is the result of a morphological operation when A is a grayscale image and R is the structuring element.

In terms of the depth (dr, dc) of a sparse table, the operation of increasing the depth by +1 in the row and col directions will be denoted as Fr and Fc, respectively.

$st[0][0] = original~image$
$st[dr+1][dc] = F_r(st[dr][dc])$
$st[dr][dc+1] = F_c(st[dr][dc])$

Sparse table morphology

Morphological operations are performed in the following steps:

  1. Prepare a set of rectangles whose width and height are powers of 2 to cover the structuring element
  2. Group the rectangles by height and width.
  3. Prepare a MAT DST to accumulate the results
  4. For each group, do the following
    1. Create st[dr][dc] that corresponds to the size of this group
    2. For each rectangle in the group, do the following
      • Shift st[dr][dc] by the position of each rectangle, and calculate min/max against DST.

At step 1, I implemented a method to find corners in a sparse table of structuring elements.
I don't know if it's the best method, but it's giving good results.

At step 4-1, st[dr][dc] is created by reusing st[dr'][dc'] of smaller size.
There are several possible reuse strategies, but I implemented a method that minimizes the total number of calculations of Fr and Fc.
This is reduced to the Rectilinear Steiner Arborescence Problem. (https://link.springer.com/article/10.1007/BF01758762).

Examples

With a structuring element of MORPH_ELLIPSE, size(5, 5)

The structuring element is decomposed into six rectangles.
The result of grouping in step 2 is as follows:

  • Size(4, 2): Point(0, 1), Point(1, 1), Point(0, 2), Point(1, 2)
  • Size(1, 4): Point(2, 0), Point(2, 1)

image

For size(4, 2), we need to calculate st[1][2], and for size(1, 4), we need to calculate st[2][0].

  1. st[0][0] =: original image
  2. st[1][0] =: Fr(st[0][0])
  3. st[1][1] =: Fc(st[1][0])
  4. st[1][2] =: Fc(st[1][1])
    • shift & min/max process for Point(0, 1), Point(1, 1), Point(0, 2), Point(1, 2)
  5. st[2][0] =: Fr(st[1][0])
    • shift & min/max process for Point(2, 0), Point(2, 1)

image

Performance measurement

  • Execution environment
    • Intel i7-8000 CPU @ 3.20GHz
    • 32GB RAM
  • image
    • lena.png(512x512), CV_8UC3
  • conditions
    • cv_RECT: MORPH_RECT, cv_erode
    • cv_CROSS: MORPH_CROSS, cv_erode
    • cv_ELLIPSE: MORPH_ELLIPSE, cv_erode
    • st_RECT: MORPH_RECT, sparse_table_erode
    • st_CROSS: MORPH_CROSS, sparse_table_erode
    • st_ELLIPSE: MORPH_ELLIPSE, sparse_table_erode

When kernel is decomposited in advance:

image

From the graph, time complexity against diameter of kernel are as follows:

  • cv_RECT and cv_CROSS: O(k)
  • cv_ELLIPSE: O(k^2)
  • st_RECT and st_CROSS: O(k^0.5)
  • st_ELLIPSE: O(k^0.76)

Including decomposition time:

image

When the kernel size approaches the same size as the target image, the kernel decomposition time becomes comparable to that of CV_RECT.

@kyaco
Copy link
Author

kyaco commented Jul 8, 2025

Hi, @asmorkalov. Please review this pr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment