Hoshen–Kopelman algorithm

The Hoshen–Kopelman algorithm is a simple and efficient algorithm for labeling clusters on a grid, where the grid is a regular network of cells, with the cells being either occupied or unoccupied. This algorithm is based on a well-known union-finding algorithm. The algorithm was originally described by Joseph Hoshen and Raoul Kopelman in their 1976 paper "Percolation and Cluster Distribution. I. Cluster Multiple Labeling Technique and Critical Concentration Algorithm".

Percolation theory

is the study of the behavior and statistics of clusters on lattices. Suppose we have a large square lattice where each cell can be occupied with the probability p and can be empty with the probability 1 – p. Each group of neighboring occupied cells forms a cluster. Neighbors are defined as cells having a common side but not those sharing only a corner i.e. we consider the 4-connected neighborhood that is top, bottom, left and right. Each occupied cell is independent of the status of its neighborhood. The number of clusters, the size of each cluster and their distribution are important topics in percolation theory.

Hoshen–Kopelman algorithm for cluster finding

In this algorithm, we scan through a grid looking for occupied cells and labeling them with cluster labels. The scanning process is called as Raster Scan. The algorithm begins with scanning the grid cell by cell and check if the cell is occupied or not. If the cell is occupied, then it must be labeled with a cluster label. This cluster label is decided based on the neighbors of that cell. If the cell doesn’t have any occupied neighbors then, a new label is assigned to the cell.

Union-find algorithm

This algorithm is a simple method for computing equivalence classes. Calling the function union specifies that, items x and y are members of the same equivalence class. Because equivalence relations are transitive; all the items equivalent to x are equivalent to all the items equivalent to y. Thus for any item x, there is a set of items which are all equivalent to x. This set is the equivalence class of which x is a member. A second function find returns a representative member of the equivalence class to which x belongs.

Pseudocode

During the raster scan of the grid, whenever an occupied cell is encountered, neighboring cells are scanned to check whether any of them have already been scanned. If we find already scanned neighbors, the union operation is performed, to specify that these neighboring cells are in fact members of the same equivalence class. Then thefind operation is performed to find a representative member of that equivalence class with which the current cell will be labeled.
On the other hand, if the current cell has no neighbors, it is assigned a new, previously unused, label. The entire grid is processed in this way.
Following pseudocode is referred from implementation of the same algorithm.
Raster Scan and Labeling on the Grid
largest_label = 0;
for x in 0 to n_columns
Union
void union
Find
int find

Example

Consider the following example. The dark cells in the grid in figure represent that they are occupied and the white ones are empty. So by running H–K algorithm on this input we would get the output as shown in figure with all the clusters labeled.
The algorithm processes the input grid, cell by cell, as follows: Let's say that grid is a two-dimensional array.

Starting from cell grid i.e. the first cell. The cell is occupied and it doesn't have cells to the left or above so we will label this cell with a new label say 1.
grid and grid both are unoccupied so they are not labeled.
grid is occupied so check cell to the left which is unoccupied so we increment the current label value and assign the label to the cell as 2.
grid, grid and grid are unoccupied so they are not labeled.
grid is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label 3.
grid is occupied so check cell to the left and above, only the cell to the left is occupied so assign the label of a cell on the left to this cell 3.
grid is occupied so check cell to the left and above, both the cells are occupied, so merge the two clusters and assign the cluster label of the cell above to the cell on the left and to this cell i.e. 2.
grid is occupied so check cell to the left and above, only the cell to the left is occupied so assign the label of a cell on the left to this cell 2.
grid, grid and grid are unoccupied so they are not labeled.
grid is occupied so check cell to the left and above, only cell above is occupied so assign the label of the cell above to this cell 3.
grid, grid and grid are unoccupied so they are not labeled.
grid is occupied so check cell above which is unoccupied so we increment the current label value and assign the label to the cell as 4.
grid is occupied so check cell to the left and above, only the cell to the left is occupied so assign the label of the cell on the left to this cell 4.
grid is unoccupied so it is not labeled.
grid is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label 5.
grid is occupied so check cell to the left and above, only the cell to the left is occupied so assign the label of the cell on the left to this cell 5.
grid, grid and grid are unoccupied so they are not labeled.
grid is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label 6.
grid is occupied so check cell to the left and above, both, cell to the left and above are occupied so merge the two clusters and assign the cluster label of the cell above to the cell on the left and to this cell i.e. 5..
grid is unoccupied so it is not labeled.
grid is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label 7.
gridgrid grid and grid are unoccupied so they are not labeled.
grid is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label 8.
grid is occupied so check cell to the left and above, both, cell to the left and above are occupied so merge the two clusters and assign the cluster label of the cell above to the cell on the left and to this cell i.e. 7..

Applications

Determination of Nodal Domain Area and Nodal Line Lengths
Nodal Connectivity Information
Modeling of electrical conduction

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...