# WEEK 1: Danyiar Nurgaliev, Timur Gatanov, and Daniel J. Needleman, "Automated Identification of Microtubules in Cellular Electron Tomography." Methods in Cell Biology, 2010, 97.

Robin Kirkpatrick, Week 1 Commentary

## Introduction

Computer-automated image segmentation can be a difficult problem, requiring a careful combination of prefiltering and the selection of optimization procedures and parameters. Although challenging, automated analysis is highly desirable when handling large data sets, as manual segmentation is time consuming. In this paper, the authors present their computer-automated approach for segmenting microtubules from cellular electron tomography data taken from frozen cells. This technique renders a 3D image, from which the authors attempt to segment each microtubule in 3D.

## Successful Image Segmentation

Their successful approach involved two steps: preprocessing and tracking. Preprocessing finds sets of points likely to be part of a microtubule wall, whereas the "tracking" step links the resulting data into segments that likely correspond to the same microtubule. An example of an original image is shown below in Figure 1.

Figure 1 - Example of an XY slice of the original electron tomography data. Black spots correspond to ribosomes, and the long white structures with black outlines are microtubules.

Note that the microtubules are bright in comparison to the background (i.e., they hold a high pixel value), whereas their boundaries are dark (low pixel value).

*Preprocessing*

To remove noise, the initial image was low-pass filtered using a Gaussian kernel. The authors then employed a method to determine which pixels most likely lie on a microtubule wall. Their assumption was as follows. Microtubules are approximately (locally) linear structures that appear white with dark outlines corresponding to their walls. A dark line in 2D is a local intensity minimum in the direction perpendicular to the line, and approximately constant in the direction of the line. Consequently, the first and second derivatives taken along the line should be near zero (plus or minus noise, which differentiation amplifies), while perpendicular to the line the first derivative vanishes at the wall's center (with the second derivative large and positive there) and is largest at the wall's edges (where the second derivative passes through zero).

It should be noted that numerical differentiation amplifies noise; it is therefore critical to prefilter the image. Mathematically, we can define the output image, Io, in terms of the convolution kernel, G, and the initial image, I:

<math>{Io(x,y) = I(x,y) * G(x,y)}</math>

Following directly from calculus, the derivatives can be computed by convolving the image with the derivatives of the convolution kernel (which have closed analytical forms for a Gaussian kernel), as shown below, rather than computing the convolution followed by a numerical derivative.

<math>{[\frac{\partial Io(x,y)}{\partial x}, \frac{\partial Io(x,y)}{\partial y}] = [I(x,y) * G_x(x,y), I(x,y) * G_y(x,y)]}</math>

where <math>{G_x(x,y)}</math> and <math>{G_y(x,y)}</math> are the partial derivatives of the convolution kernel, which are trivial to compute analytically for a Gaussian kernel.

Similarly, a matrix of second derivatives can be computed:

<math>{\left( \begin{array}{cc} Io_{xx} & Io_{xy} \\ Io_{xy} & Io_{yy} \end{array} \right) = \left( \begin{array}{cc} (I * G_{xx}) & (I * G_{xy}) \\ (I * G_{xy}) & (I * G_{yy}) \end{array} \right)}</math>
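As a concrete illustration, these derivative images can be obtained with `scipy.ndimage.gaussian_filter`, whose `order` argument convolves the original image with the corresponding Gaussian derivative per axis. This is a minimal sketch; the synthetic image, σ value, and library choice are my own assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Synthetic test image (an assumption, for illustration): a dark
# vertical band on a bright background, mimicking a microtubule wall.
img = np.full((64, 64), 200.0)
img[:, 30:33] = 50.0

sigma = 2.0  # kernel width; sets the scale of structures kept

# Derivatives of the smoothed image, computed by convolving the
# ORIGINAL image with derivatives of the Gaussian kernel; `order`
# selects the derivative order per axis (axis 0 = y/rows, axis 1 = x/cols).
Ix  = gaussian_filter(img, sigma, order=(0, 1))
Iy  = gaussian_filter(img, sigma, order=(1, 0))
Ixx = gaussian_filter(img, sigma, order=(0, 2))
Ixy = gaussian_filter(img, sigma, order=(1, 1))
Iyy = gaussian_filter(img, sigma, order=(2, 0))

# At the wall centre the cross-line profile is concave up (Ixx > 0),
# while the image is constant along the line, so Iy is ~ 0 there.
```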

Conveniently, the width of the convolution kernel determines the width of the structures that remain in the resulting mask. The authors note that the intensity can be expanded in the vicinity of the current pixel using a Taylor series. Setting the derivative of this expansion to zero (shown here for an angle of 0, i.e., along the x axis) gives the location of the nearest extremum:

<math>{[x = -(I_x/I_{xx})]} </math>

Therefore, if I_xx > 0 (concave up) and the magnitude of the ratio above is less than half a pixel, the current pixel is considered to lie at a local minimum. This analysis extends straightforwardly to any given angle with respect to the x axis. The result is a binary mask that serves as an initial guess of the locations of microtubule edges. However, at this stage the method still produces many false positives, because the algorithm finds locally linear structures that may not represent a microtubule wall. To search for extended lines, the authors convolve this mask with a binary kernel consisting of a long run of 1's oriented along the guessed direction of the line (i.e., orthogonal to the original angle used). This is followed by thresholding and a dilation operation (to reduce the shortening the convolution induces near line ends). These steps are repeated for a large number of different angles. The authors then note that a microtubule is always bounded by two lines, so walls form pairs at the same orientation; the current mask is therefore convolved with pairs of lines at a set spacing and thresholded once more.
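The sub-pixel minimum test and the extended-line convolution for a single angle (angle 0, i.e., vertical lines) can be sketched as follows. The kernel length, thresholds, and synthetic image here are my own assumptions, not the paper's parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, convolve, binary_dilation

def wall_candidates(img, sigma=2.0):
    """Mark pixels within half a pixel of a perpendicular intensity
    minimum: Ixx > 0 and |-Ix/Ixx| < 0.5 (shown for angle 0)."""
    Ix = gaussian_filter(img, sigma, order=(0, 1))
    Ixx = gaussian_filter(img, sigma, order=(0, 2))
    with np.errstate(divide="ignore", invalid="ignore"):
        offset = -Ix / Ixx
    return (Ixx > 0) & (np.abs(offset) < 0.5)

def extended_lines(mask, length=9, min_hits=7):
    """Keep only extended responses: convolve the binary mask with a
    run of ones along the guessed line direction (vertical here),
    threshold, then dilate to undo the shortening at line ends."""
    counts = convolve(mask.astype(float), np.ones((length, 1)), mode="constant")
    return binary_dilation(counts >= min_hits, np.ones((3, 1), dtype=bool))

# Hypothetical test image: one dark vertical wall (columns 30-32).
img = np.full((64, 64), 200.0)
img[:, 30:33] = 50.0

lines = extended_lines(wall_candidates(img))
```

In practice one would repeat this over many angles and OR the results, as the text describes.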

*Tracking*

To discern global properties of the microtubules in the image (length, number, etc.), segments belonging to the same microtubule must be linked. The preprocessing step has some issues, as it yields false breaks, false fusions (especially near clusters of microtubules), and so on. To circumvent this, the authors use the method of active contours. Briefly, they seek a contour <math>{\vec x(l)}</math> that minimizes an energy function E, discussed below. Because exhaustive global search is laborious, the authors employ simulated annealing (SA), a Monte Carlo technique intended to find the global minimum in far fewer moves than an exhaustive search. SA allows random moves uphill in energy, which lets the solution navigate a complex energy landscape and avoid being trapped in local wells on its way to the global minimum. The algorithm makes a random perturbation of the current guess and computes E(x_perturbed). If E(x_perturbed) < E(x_guess), the perturbed solution is kept as the current guess; otherwise it is still kept with probability exp(-ΔE/T), allowing the algorithm to move uphill in energy. The temperature T is slowly decreased throughout the run so that eventually no further uphill moves are accepted. The energy function the authors defined is analogous to an elastic string in an external potential.
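A generic simulated-annealing loop with this acceptance rule might look like the following. This is a sketch on a toy 1D landscape; the cooling schedule, mover, and landscape are my own choices, not the authors' contour-specific movers.

```python
import math
import random

def anneal(x0, energy, perturb, t0=1.0, cooling=0.999, steps=20000, seed=0):
    """Simulated annealing: always accept downhill moves; accept an
    uphill move of size dE with probability exp(-dE / T); cool T slowly."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best, best_e = x, e
    t = t0
    for _ in range(steps):
        cand = perturb(x, rng)
        e_cand = energy(cand)
        de = e_cand - e
        if de < 0 or rng.random() < math.exp(-de / t):
            x, e = cand, e_cand          # keep the perturbed guess
            if e < best_e:
                best, best_e = x, e      # track the best state seen
        t *= cooling                      # geometric cooling schedule
    return best, best_e

# Toy landscape (my choice, not the paper's): a parabola plus ripples,
# giving many local minima; the global minimum lies near x = -0.31.
f = lambda x: x * x + 2.0 * math.sin(5.0 * x) + 2.0
best_x, best_e = anneal(5.0, f, lambda x, r: x + r.uniform(-0.5, 0.5))
```

Starting far from the minimum at x = 5, the uphill moves allow the walk to hop between the sinusoidal wells early on, before cooling freezes it into a deep basin.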

<math> {E_{contour} = E_{potential} + E_{bending} + E_{length} = \int \! U(\vec x(l))\, dl + \int \! (k/R(l))\, dl - \mu L }</math>

L is the contour length, and <math>{\mu}</math> sets the energy contribution due to length. 1/R(l) is the curvature of the contour and k is the bending modulus; high values of k restrict the amount of bending allowed in the final contour. The potential function, U, was set to Io for the preprocessed images. For the unprocessed images, <math>{ U = I(\vec x(l) - (w/2)\vec n(l)) + I(\vec x(l) + (w/2)\vec n(l))}</math>, where w is the predicted width of the microtubule, which lets the algorithm search for pairs of lines a width w apart. The perturbations used in the algorithm include shrinking and growing segments and moving a single vertex of the contour. Note that the energies discussed above are for a single contour; the algorithm instead searches for N curves. The total energy is the sum of the individual contour energies, plus an interaction energy due to contours lying near one another, and a chemical-potential energy due to the creation of additional contours. Mathematically, this can be expressed as

<math>{E_{total} = \sum_{i=1}^N E_{i,contour} + E_{interaction} + \nu N}</math>
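A discrete version of the single-contour energy makes the terms concrete: for a polyline, the potential term is sampled at segment midpoints, the bending term uses turning angles to approximate the curvature integral, and μL is subtracted. The discretisation below is my sketch, not the authors' implementation.

```python
import numpy as np

def contour_energy(pts, U, k=1.0, mu=1.0):
    """E = sum(U * ds) + k * sum(turning angles) - mu * L for a
    polyline `pts` of shape (n, 2). The turning-angle sum is a
    discrete approximation of the integral of dl / R."""
    seg = np.diff(pts, axis=0)
    ds = np.linalg.norm(seg, axis=1)
    L = ds.sum()
    mids = 0.5 * (pts[:-1] + pts[1:])        # sample U at midpoints
    e_pot = sum(U(p) * s for p, s in zip(mids, ds))
    t = seg / ds[:, None]                    # unit tangents
    cosang = np.clip((t[:-1] * t[1:]).sum(axis=1), -1.0, 1.0)
    e_bend = k * np.arccos(cosang).sum()     # sum of turning angles
    return e_pot + e_bend - mu * L

# A straight two-segment contour in a uniform potential U = 3:
# E = 3*2 (potential) + 0 (no bending) - 1*2 (length term) = 4.
straight = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
E = contour_energy(straight, U=lambda p: 3.0)
```

The total energy above would sum such terms over the N contours, plus the interaction and νN terms.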

## Final Comments

The authors compare their results with manual segmentation and find that some problems remain with the current methodology. These include missed microtubules, lengths determined to be shorter than what is actually present, and long microtubules identified as a large number of short ones.

The reader wonders whether the form of the energy functional used here would be useful for trajectory linking of distance-time data (an analogous problem). Typical techniques minimize variances and intensity fluctuations; this method seems much more robust. The reader also wonders whether the gaps could easily be filled using a closing (or analogous) morphological operation, applied either pre- or post-processing in the direction of the microtubule (i.e., orthogonal to the chosen angle). The parameters that work best for the optimization (mu, nu, etc.) depend strongly on the regional properties of the image. The authors are currently working on improving their technique by using machine learning to train their algorithm to handle a larger number of cases. The reader also wonders whether the likelihood model could be verified using the given training sets; that would be illuminating and might yield a better understanding of which energy functions should be used.