Ideas for image segmentation

A well-known problem in computer vision is the identification of edges and surfaces in images. A robot roaming on the ground will appreciate being able to distinguish green grass from the water of a lake.

The robot has a camera on its head and is given a still image $I = [p_{i,j}]_{i,j=0}^{N-1}$ consisting of $N \times N$ pixels $p_{i,j} \in \{0, 1, \dots, 255\}^3$ in the rgb palette. Let $R$ be the “real world” image $R :[0,1]^2 \rightarrow [0,255]^3$ such that $p_{i,j} = R\left(\frac{2i+1}{2N}, \frac{2j+1}{2N}\right)$. We suppose that $R$ is piecewise continuous in the following sense: the set $\mathcal{C}$ of continuity points of $R$ is open, dense in $[0,1]^2$ and has a finite number of connected components. The edge points are therefore $\partial \mathcal{C}$ and the surfaces are the connected components of $\mathcal{C}$. To characterize $\mathcal{C}$, we make use of the modulus of continuity, defined as

$w_a(\delta) = \sup_{b \in [0,1]^2,\ | a - b | < \delta} |R(a) - R(b)|,$

for all $a \in [0,1]^2$.

Lemma. A point $a \in [0,1]^2$ is an edge point of $R$ if and only if

$\inf_{\delta > 0} w_a(\delta) > 0$.

This lemma gives a hint as to how we can detect edge points. We may simply compare a pixel $p_{i,j}$ to its neighbours and say that $(i,j)$ is an edge point when the maximum distance (in the rgb palette) to a neighbour is greater than a fixed threshold $\tau > 0$.
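This neighbour-threshold detector can be sketched in a few lines of NumPy (the helper name `edge_mask` and the default threshold are my own choices, not from the post):

```python
import numpy as np

def edge_mask(image, tau=30.0):
    """Flag (i, j) as an edge pixel when the largest RGB distance to one
    of its four neighbours exceeds the fixed threshold tau.
    `image` is an (N, N, 3) array; tau = 30 is an arbitrary default."""
    img = image.astype(float)
    max_dist = np.zeros(img.shape[:2])
    # Compare each pixel with its four shifted copies (up, down, left, right).
    for axis, shift in [(0, 1), (0, -1), (1, 1), (1, -1)]:
        dist = np.linalg.norm(img - np.roll(img, shift, axis=axis), axis=2)
        # np.roll wraps around the border; discard the spurious comparisons.
        if axis == 0:
            dist[0 if shift == 1 else -1, :] = 0.0
        else:
            dist[:, 0 if shift == 1 else -1] = 0.0
        max_dist = np.maximum(max_dist, dist)
    return max_dist > tau
```

On a clean two-colour image this marks exactly the pixels on either side of the colour boundary, which already hints at the first problem below: the detected edge is two pixels thick and follows the pixel grid.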

There are two problems with this approach. First, the set of edge points so obtained will be irregular and very messy. Second, this is not robust to noise and the approach may be broken by intricate texture patterns. Here, I’m going to focus on resolving the first issue: obtaining smooth edges.

Obtaining smooth edges.

(We assume that the image is simple and contains no noise or textures.) The idea is the following. As we identify surfaces, their number $k$ growing along the way, we correspondingly adjust a smooth partition of unity

$1 = \sum_{l=1}^k \phi_{l,k}.$

At each pixel $(i,j)$, we interpret $\phi_{l,k}(i,j)$ as the probability that $(i,j)$ is a point of the $l$th surface. Once we have algorithmically established a satisfactory set of probabilities, we define the $l$th surface as the set of points on which $\phi_{l,k}$ is greater than any other $\phi_{l',k}$.
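To make the bookkeeping concrete, here is a small sketch (helper names are mine, not from the post): pointwise normalisation turns $k$ positive bumps into a partition of unity, and the surfaces are then read off with an argmax.

```python
import numpy as np

def partition_of_unity(bumps):
    # Normalise k positive bumps pointwise so they sum to 1 everywhere
    # (assumes every pixel gets positive mass from at least one bump).
    stacked = np.stack(bumps).astype(float)  # shape (k, N, N)
    return stacked / stacked.sum(axis=0)

def surfaces(phis):
    # The l-th surface: pixels where phi_l dominates every other phi_{l'}.
    return np.argmax(phis, axis=0)           # label map in {0, ..., k-1}

# Two Gaussian bumps on an 8x8 grid.
ys, xs = np.mgrid[0:8, 0:8]
b1 = np.exp(-((xs - 2) ** 2 + (ys - 2) ** 2) / 8.0)
b2 = np.exp(-((xs - 5) ** 2 + (ys - 5) ** 2) / 8.0)
phis = partition_of_unity([b1, b2])
labels = surfaces(phis)
```

Here each pixel near a bump's centre is assigned to that bump's surface, and the $\phi_{l,k}$ sum to one by construction.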

The idea of an algorithm to construct the partition of unity is the following. Let $K_\delta$ be a positive radial function, decreasing away from its centre, with bandwidth $\delta$ and such that its maximum also decreases with $\delta$; and let $\tau > 0$ be a fixed threshold, to be adjusted. Pick a pixel $(i,j)$ and find the largest ball $B_\delta$ of radius $\delta$ around $(i,j)$ such that the $d_2$ distance between $p_{i,j}$ and every other pixel of $B_\delta$ is less than $\tau$. If $\delta$ is not too small, we spawn a new surface and let $\phi_{1,1}$ be the kernel $K_{\delta}$ centered at $(i,j)$. We then grow this surface by fitting other balls in the same way at all the points of $B_\delta$, replacing $\phi_{1,1}$ by the maximum of itself and the kernels corresponding to these balls. We maintain (weakly) decreasing radii for the balls as we move away from $(i,j)$, and the height of $\phi_{1,1}$ correspondingly gets smaller. Once we have explored the entire connected component of $(i,j)$, we start again, this time from an unexplored point, to obtain $\phi_{2,2}$. If no points were left unexplored, we can start again from a point at which $\phi_{1,1}$ attains a minimum. We do this a fixed number of times or until $\sum_{l=1}^k \phi_{l,k}$ gets close to being constant.
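The first step, finding the largest admissible ball around a seed pixel, can be sketched as follows (a naive linear scan over integer radii; `largest_ball` and its interface are my own, not from the post):

```python
import numpy as np

def largest_ball(img, i, j, tau):
    # Largest integer radius delta such that every pixel of the ball
    # B_delta around (i, j) lies within colour distance tau of p_{i,j}.
    # Naive linear scan over radii; a sketch, not an optimised routine.
    N = img.shape[0]
    img = img.astype(float)
    ys, xs = np.mgrid[0:N, 0:N]
    pix_d2 = (ys - i) ** 2 + (xs - j) ** 2
    colour_dist = np.linalg.norm(img - img[i, j], axis=2)
    delta = 0
    while delta <= N:  # delta > N means the ball covers the whole image
        inside = pix_d2 <= (delta + 1) ** 2
        if (colour_dist[inside] >= tau).any():
            break
        delta += 1
    return delta
```

On a clean two-colour image, the radius returned at a seed pixel is simply its distance to the colour boundary, so seeds deep inside a surface spawn wide, tall kernels while seeds near an edge spawn narrow, short ones.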

Note that if the $\phi_{l,k}$ are translates of some radial kernel $K_\delta$ for a fixed $\delta > 0$, then the surfaces are Voronoi cells and the edges form a Voronoi diagram.
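This observation can be checked numerically: with translates of one Gaussian kernel (one possible choice of $K_\delta$; helper names are mine), the argmax label of a pixel coincides with its nearest centre, i.e. with nearest-centre Voronoi assignment.

```python
import numpy as np

def kernel_labels(N, centres, delta=2.0):
    # argmax over translates of a single radial kernel K_delta (a Gaussian).
    ys, xs = np.mgrid[0:N, 0:N]
    phis = [np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * delta ** 2))
            for cy, cx in centres]
    return np.argmax(np.stack(phis), axis=0)

def voronoi_labels(N, centres):
    # Plain nearest-centre (Voronoi) assignment.
    ys, xs = np.mgrid[0:N, 0:N]
    d2 = [(xs - cx) ** 2 + (ys - cy) ** 2 for cy, cx in centres]
    return np.argmin(np.stack(d2), axis=0)

centres = [(3, 3), (10, 12), (13, 4)]
same = (kernel_labels(16, centres) == voronoi_labels(16, centres)).all()
```

The two label maps agree because the kernel value is a strictly decreasing function of the squared distance to the centre, so maximising the kernel is the same as minimising the distance.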

The post will be updated in the near future with a Python implementation of the algorithm. I already know that the modulus of continuity approach yields decent results (but with irregular and broken edges); we will see whether this modification works better.

Update: I didn’t have the time to program and experiment with this. It will have to wait until I approach image segmentation more seriously.