Monday, September 1, 2014

Computational Image Analysis is a Centerpiece of Cell Biology

For most biologists, especially cell biologists, it doesn't take long to realize how important visual processing is for interpreting scientific data. Likely in the first few weeks of an undergraduate molecular biology lab, young scientists learn to differentiate a "good" gel from a "bad" one by the look of the bands. At a slightly more expert level, we are trained to discriminate the "look" of healthy cells from problematic cells under a microscope. The most seasoned scientists develop the uncanny skill of scanning hundreds of fluorescently stained cells and mentally working out a) whether the experiment ran correctly, and b) which "representative" field of view can be used in a publication. Typically, this skill is explained as intuition, and most scientists can quickly tick off a list of instances when such a process has led to novel discoveries and countless publications.

The success of visual intuition is one of the most amazing aspects of the human brain. In fact, the performance of human/animal brains in these tasks is far superior to that of modern computers, famously illustrated by the Google "cat recognition problem". A recent publication in Science from IBM Research replicated neural architecture in a computer chip, and not surprisingly the test case for its performance was image recognition. Given this trajectory of computational and engineering efforts, it won't be long until the algorithms of visual intuition are more rigorously quantified (and more broadly exploited).

In the cell biology field, an excellent tool to help scientists grasp computational image analysis is the CellProfiler freeware developed by MIT/Broad Institute. This program offers fairly powerful analysis tools (for which most companies were charging thousands of dollars), and is also structured to maximize the learning potential of novice to intermediate users. Most importantly, widespread access is removing the mystery of image analysis for a whole generation of scientists.

One of the first reactions when a biologist applies computational image analysis to a data set is disbelief. Even after the kinks of the routine are ironed out, and the analysis is performed on a "known" data set (for example, one that was previously published), the results often won't look right to the user. A typical case is one where an experimental condition promotes the expression of a target protein. In data from population averages (e.g. Western blots), it is obvious that there is a 5X enhancement of expression after treatment. Subsequently looking at microscope images, one can clearly see the difference between control and experiment groups, often highlighted by a prototypical example in a figure of the publication. However, upon doing the image analysis on hundreds of cells, a common result is that there is tremendous heterogeneity in the sample. An average 5X enhancement may result from a 20X enhancement in a small subset of cells and no change in a surprisingly large percentage of them. The opposite case is also often true: a large phenotypic change is caused by a small shift in response across the entire population of cells. These types of results, while not contradictory to the previous experimental data, often make scientists uneasy.
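To make the arithmetic behind that heterogeneity concrete, here is a minimal sketch in Python. It assumes a deliberately simplified two-state population in which each cell either responds at 20X or does not respond at all. The function name `mixed_population` and the cell counts are made up for illustration and do not come from any real data set or from CellProfiler.

```python
from statistics import mean, median


def mixed_population(n_cells, responder_fraction, responder_fold, baseline=1.0):
    """Per-cell fold changes for a hypothetical two-state population.

    A fraction of the cells respond with `responder_fold`; the rest stay
    at `baseline` (no change). Real single-cell data is far messier.
    """
    n_responders = round(n_cells * responder_fraction)
    return [responder_fold] * n_responders + [baseline] * (n_cells - n_responders)


# A responder fraction f gives a 5X average when 20*f + 1*(1 - f) = 5, so f = 4/19.
heterogeneous = mixed_population(n_cells=1900, responder_fraction=4 / 19,
                                 responder_fold=20.0)

# The picture a population average suggests: every cell goes up 5X.
uniform = [5.0] * 1900

for label, cells in (("heterogeneous", heterogeneous), ("uniform", uniform)):
    unchanged = sum(1 for fold in cells if fold == 1.0) / len(cells)
    print(f"{label}: mean={mean(cells):.1f}X, median={median(cells):.1f}X, "
          f"unchanged={unchanged:.0%}")
```

Under these assumptions both populations have the same 5X mean, which is all a Western blot reports. In the heterogeneous case, though, the median cell shows no change at all and roughly four out of five cells are unaffected.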

As computational methods become more commonplace in understanding and engineering biological systems, it is important to embrace the messiness of single-cell data. While it may initially feel counterproductive compared with more traditional intuitive methods ("compound A causes translocation of TF Y" is "cleaner" than "A increases the likelihood of translocation in X% of cells by P to Q fold"), the reality is that our intuition operates by the same logic as (well-crafted) computational methods. The power of combining human and machine analysis of cell data should be to improve efficiency in discovery. Such an approach has (slowly) started to take root in histopathology, and will surely find many more applications in the life sciences.