PgmNr Y3169: Classifying Microscopy Images with Deep Learning.

Authors:
Oren Kraus 1,3 ; Jimmy Ba 1 ; Charles Boone 2,3,4 ; Brenda Andrews 2,3,4 ; Brendan Frey 1,3,4


Institutes:
1) Electrical and Computer Engineering, University of Toronto, Toronto, Ontario, Canada; 2) Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; 3) Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada; 4) Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada.


Keyword: Informatics/Computational Biology

Abstract:

High-content screening (HCS) technologies have enabled large-scale imaging experiments for studying cell biology and for drug screening. These systems produce hundreds of thousands of microscopy images per day, and their utility depends on automated image analysis. Recently, deep learning approaches that learn feature representations directly from pixel intensity values have dominated object recognition challenges. The recognition tasks in these challenges typically consist of a single centered object, and existing models are not directly applicable to microscopy datasets. Here we develop an approach that combines deep convolutional neural networks (CNNs) with multiple instance learning (MIL) to classify and segment microscopy images using only whole-image-level annotations. MIL is a framework that enables supervised learning models to train on datasets that have labels only for sets (bags) of data points.
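As a toy illustration of the MIL setting described above, the sketch below (not part of the study) generates bags of instance scores and derives bag-level labels; only the bag labels would be visible to the learner. The bag sizes, scores, and threshold are invented for illustration.

```python
# Toy MIL setup: a bag is labeled positive if any of its (hidden) instances is
# positive, but only the bag-level label is available for supervision.
import numpy as np

rng = np.random.default_rng(0)

# Each "bag" is a set of instance scores (e.g., per-cell predictions).
bags = [rng.random(size=rng.integers(3, 8)) for _ in range(5)]
bag_labels = [int((instances > 0.9).any()) for instances in bags]

for instances, label in zip(bags, bag_labels):
    print(f"bag of {len(instances)} instances -> bag label {label}")
```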

We introduce a new neural network architecture that uses MIL to simultaneously classify and segment microscopy images containing populations of cells. Building supervised classifiers based on segmented single cells remains time-consuming and difficult for researchers. Combining CNNs with MIL enables training classifiers on whole microscopy images, even images containing mixed populations, using whole-image-level labels. We base our approach on the similarity between the aggregation function used in MIL and the pooling layers used in CNNs. We show that MIL CNNs trained end-to-end outperform several previous methods on both mammalian and yeast datasets without requiring any segmentation steps. On a publicly available drug screen of MCF-7 breast cancer cells (Broad Bioimage Benchmark Collection, image set BBBC021v1), we achieve 97% accuracy at predicting the mechanism of action of different treatments. On a yeast protein localization dataset, we achieve 96% accuracy at predicting localization for proteins that localize to a single subcellular compartment.
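The sketch below illustrates the general idea in PyTorch: a fully convolutional classifier produces per-location (instance) class probabilities, and a global MIL pooling layer aggregates them into whole-image (bag) predictions that can be trained against image-level labels. The layer sizes, channel counts, and the normalized log-sum-exp pooling used here are illustrative assumptions, not the authors' published architecture.

```python
# Minimal sketch of a CNN with a global MIL pooling layer (illustrative only).
import math

import torch
import torch.nn as nn


class MILConvNet(nn.Module):
    """Fully convolutional classifier followed by a global MIL aggregation."""

    def __init__(self, in_channels=2, num_classes=10, r=5.0):
        super().__init__()
        # Feature extractor: operates on whole images of arbitrary size and
        # outputs a grid of class logits, one vector per spatial region (instance).
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=5, stride=2, padding=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, num_classes, kernel_size=1),
        )
        self.r = r  # sharpness of the smooth-max pooling

    def forward(self, x):
        # Instance-level class probabilities at every spatial location.
        instance_probs = torch.sigmoid(self.features(x))   # (B, C, H', W')
        flat = instance_probs.flatten(start_dim=2)          # (B, C, H'*W')
        n = flat.size(2)
        # Global MIL aggregation: normalized log-sum-exp pooling, a smooth
        # interpolation between mean pooling (small r) and max pooling (large r).
        bag_probs = (torch.logsumexp(self.r * flat, dim=2) - math.log(n)) / self.r
        return bag_probs, instance_probs


# Train with whole-image (bag) labels only; the instance probability map can
# later be thresholded to obtain a coarse segmentation.
model = MILConvNet(in_channels=2, num_classes=10)
images = torch.randn(4, 2, 256, 256)                # batch of 2-channel images
bag_probs, instance_map = model(images)
loss = nn.functional.binary_cross_entropy(bag_probs, torch.rand(4, 10))
print(bag_probs.shape, instance_map.shape, loss.item())
```

The design choice to make the classifier fully convolutional is what allows whole-image training: the same network that yields image-level predictions also yields a spatial map of instance-level probabilities as a by-product.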