2

I am writing a function to label the connected component in an image (I know there are several libraries outside, I just wanted to play with the algorithm).

To do this I label the connected regions with different labels and create an equivalence table that contain information on the labels belonging to the same connected component.

As an example if my equivalence table (vector of vector) looks something like:

1: 1,3 2: 2,3 3: 1,2,3 4: 4 

It means that in the image there are 2 different regions, one made of elements that are labelled 1,2,3 and an other made of elements labelled 4.

What is an easy and efficient way to resolve the equivalences and end up with something that looks like:

1: 1,2,3 2: 4 

that I can use to "merge" the different connected regions belonging to the same connected component?

Thanks a lot for the help!

3
  • The data representation you've given doesn't make sense with the explanation of it to me.. I might be missing something, but I don't see a region with only the element 4 unless it's a like a vin diagram and your "regions" means group related through intersection Commented Sep 20, 2012 at 21:52
  • Sorry if I have not been clear. If you explain what doesn't make sense I can try and edit the question. As you scan the image you assign temporary label to the pixels (in this case the temporary labels are 1,2,3,4). Meanwhile you build the equivalence relations. For example the first equivalence you find is that pixels temporary labelled as 1 belong to the same region of pixels labelled by 3. The second states that pixels labelled by 2 also belong to the region of pixels labelled by 3. Pixels labelled by 3 of course belong to the same region as pixel labelled 1 or 2 or 3. Pixels labelled by 4 Commented Sep 20, 2012 at 22:11
  • Pixels labelled by 4 don't belong to any other region. From this information I have to derive that there are 2 different regions, one made of pixels labelled as 1,2,3, the other one made of pixels labelled as 4. Hope this is more clear now. Commented Sep 20, 2012 at 22:12

1 Answer 1

2

You probably want a disjoint set datastructure. This yields near-constant amortized performance per operation, and can be made very efficient for the kind of task you describe.

Briefly, each equivalence class is stored as an "inverted tree": each node has a parent link, but no child links. A "query" follows the parent links until they reach the root (which can be marked by having itself as a parent). Note that every node in the same tree will yield the same root, which is the "exemplar" representing that equivalence class.

Nodes typically start out as their own equivalence class (that is, they are initialized as a tree consisting of just the root). To merge two equivalence classes, find the root of each, and make one the parent of the other.

By adding two simple optimizations (always "collapse" query paths to point directly to the root, and always choose the root of the deeper tree as merge parent), you gain "near-constant" amortized worst case performance for your operations.

2
  • Ok this seems of help. Although the implementation of such a data structure is not entirely clear to me... Commented Sep 21, 2012 at 5:57
  • Check out the link -- the Wikipedia article is relatively good. There is also a treatment on this in Introduction to Algorithms by Cormen et al, which is very thorough. Commented Sep 21, 2012 at 17:49

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.