Identifying nuclear phenotypes using semi-supervised metric learning.
Singh S., Janoos F., Pécot T., Caserta E., Leone G., Rittscher J., Machiraju R.
In systems-based approaches for studying processes such as cancer and development, identifying and characterizing individual cells within a tissue is the first step towards understanding the large-scale effects that emerge from the interactions between cells. To this end, nuclear morphology is an important phenotype for characterizing the physiological and differentiated state of a cell. This study focuses on using nuclear morphology to identify cellular phenotypes in thick tissue sections imaged using 3D fluorescence microscopy. The limited label information, the heterogeneous feature set describing a nucleus, and the existence of subpopulations within cell types make this a difficult learning problem. To address these issues, a technique is presented to learn a distance metric from labeled data; the metric is locally adaptive to account for heterogeneity in the data. Additionally, a label propagation technique is used to improve the quality of the learned metric by expanding the training set using unlabeled data. Results are presented on images of tumor stroma in breast cancer, where the framework is used to identify fibroblasts, macrophages and endothelial cells, three major stromal cell types involved in carcinogenesis.
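The two-stage idea in the abstract (expand the training set by label propagation, then learn a metric from the expanded set) can be illustrated with a minimal sketch. This is not the authors' method: it assumes scikit-learn's `LabelSpreading` as a stand-in for the label propagation step and `NeighborhoodComponentsAnalysis` as a stand-in for the metric learner, which learns a single global metric rather than a locally adaptive one. The function names, the `confidence` threshold, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import NeighborhoodComponentsAnalysis
from sklearn.semi_supervised import LabelSpreading


def make_demo_data(n_labeled_per_class=10, seed=0):
    """Synthetic stand-in for nuclear features; -1 marks unlabeled samples."""
    X, y_true = make_blobs(n_samples=300, centers=3, n_features=5,
                           random_state=seed)
    rng = np.random.default_rng(seed)
    y = np.full(y_true.shape, -1)
    for c in np.unique(y_true):
        idx = rng.choice(np.flatnonzero(y_true == c),
                         size=n_labeled_per_class, replace=False)
        y[idx] = c
    return X, y


def expand_labels_and_learn_metric(X, y, confidence=0.9, random_state=0):
    """Propagate labels, keep confident ones, then fit a global metric.

    Assumption: `confidence` is an illustrative threshold, not a value
    taken from the paper.
    """
    # Step 1: spread the few known labels over a k-nearest-neighbor graph.
    propagator = LabelSpreading(kernel="knn", n_neighbors=7)
    propagator.fit(X, y)
    probs = propagator.label_distributions_
    predicted = propagator.classes_[probs.argmax(axis=1)]

    # Step 2: keep original labels plus confidently propagated ones.
    labeled = y != -1
    keep = labeled | (probs.max(axis=1) >= confidence)
    y_expanded = np.where(labeled, y, predicted)

    # Step 3: learn a linear map; Euclidean distance after the map is the
    # learned (global, not locally adaptive) metric.
    nca = NeighborhoodComponentsAnalysis(random_state=random_state)
    nca.fit(X[keep], y_expanded[keep])
    return nca, keep, y_expanded
```

Calling `nca.transform(X)` on the returned model maps samples into a space where plain Euclidean distance plays the role of the learned metric. A locally adaptive variant, as described in the abstract, would need more than this single linear map.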