Carpenter Builds Open Source Imaging Software




Best Practices Winner: The Broad Institute of MIT and Harvard
Project: CELLPROFILER
Category: IT & Informatics

By Kevin Davies

July 21, 2009
Anne Carpenter trained as a traditional cell biologist specializing in microscopy with no intention of writing image analysis software. “It wasn’t until I needed software to do something that existing commercial software couldn’t do that I became interested in writing software myself,” says Carpenter. The genesis of CellProfiler was “completely out of necessity.”

During her postdoc with David Sabatini at the Whitehead Institute, Carpenter found that the commercial software bundled with automated microscopes was good at measuring certain cell types but of little help in measuring the size of Drosophila cells. She came across some promising algorithms while doing a literature search, but had no way of implementing them. “So I sent an email to the MIT computer science department asking if anyone could help out for a couple of hours a week.” A student named Thouis Jones agreed to help, and soon made the project the subject of his Ph.D.

The satisfaction of developing useful software for the cell biology community persuaded Carpenter to abandon her postdoc project and focus on CellProfiler software development, training and implementation. “It became much more compelling to help dozens of other people working on image analysis for their projects versus doing my own,” she says.

One of those grateful beta testers was Scott Floyd, a cell biologist and physician at Beth Israel Deaconess Hospital. Floyd was screening for genes involved in cellular response to DNA damage in the search for drugs that could protect cancer patients against the side effects of radiation. He could recognize telltale increases in the speckled appearance of cell nuclei by eye, but struck out using commercial software.

The software Carpenter built—CellProfiler—made its free open source debut in December 2005, and was detailed in Genome Biology in 2006. In January 2007, Jones and Carpenter established the Imaging Platform group at the Broad Institute, focusing on new algorithms and data analysis methods. From here, Carpenter can help dozens of researchers working on clinically relevant projects. “Everything we develop becomes open source, and the easiest way to get that out to the public is to put it into the CellProfiler interface.”

Profiler Packages
Identifying specific cell shapes or morphologies by manual inspection is tedious and error-prone. By contrast, CellProfiler’s point-and-click interface and modular structure allow operators, even computational novices, to customize the workflow to a particular experiment. Researchers can build a “pipeline” of modules, each performing a set function on the images. These processing steps might be followed by measurements for each cell or for an entire image, such as size, location, and shape or the intensity and texture of the staining pattern within cells.
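The pipeline idea can be illustrated with a few lines of Python. This is only a rough sketch of the general pattern, not CellProfiler's actual code or interface: the module names (`threshold`, `measure_area`, `measure_mean_intensity`), the `run_pipeline` helper, and the tiny test image are all hypothetical, and a real pipeline would identify individual cells rather than a single foreground region.

```python
from typing import Callable, Dict, List

Image = List[List[int]]            # grayscale image as rows of pixel values
State = Dict[str, object]          # data handed from one module to the next
Module = Callable[[State], State]  # a module reads the state and adds to it


def threshold(level: int) -> Module:
    """Hypothetical module: mark pixels brighter than `level` as foreground."""
    def run(state: State) -> State:
        image = state["image"]
        state["mask"] = [[pixel > level for pixel in row] for row in image]
        return state
    return run


def measure_area(state: State) -> State:
    """Hypothetical module: count the foreground pixels."""
    state["area"] = sum(sum(row) for row in state["mask"])
    return state


def measure_mean_intensity(state: State) -> State:
    """Hypothetical module: average brightness of the foreground pixels."""
    values = [
        pixel
        for image_row, mask_row in zip(state["image"], state["mask"])
        for pixel, is_foreground in zip(image_row, mask_row)
        if is_foreground
    ]
    state["mean_intensity"] = sum(values) / len(values) if values else 0.0
    return state


def run_pipeline(image: Image, modules: List[Module]) -> State:
    """Run each module in order, passing the accumulated state along."""
    state: State = {"image": image}
    for module in modules:
        state = module(state)
    return state


# A made-up 3x3 image: three bright pixels on a dark background.
image = [
    [0, 120, 130],
    [0, 110, 0],
    [0, 0, 0],
]
pipeline = [threshold(100), measure_area, measure_mean_intensity]
result = run_pipeline(image, pipeline)
print(result["area"], result["mean_intensity"])  # 3 120.0
```

Because each module only reads from and writes to the shared state, swapping the threshold level or adding another measurement means editing the list, not the modules themselves, which is the kind of customization the modular design is meant to allow.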

Carpenter’s team of computer scientists and biologists helps Broad colleagues test hundreds of thousands of samples to understand gene function and identify drug candidates. Her group operates “like a faculty research lab at any academic institution, but we are unique in having a very strong technology focus, and secondly, in being extraordinarily more collaborative than a typical faculty lab.”

CellProfiler comes into its own in the high-throughput analysis of images from robotic fluorescent light microscopes, such as those offered by companies like Cellomics, GE Healthcare, and PerkinElmer, essentially turning images into numbers. The software’s strength lies in its flexibility and sophistication, which allow “accurate and rich measurements coming out of the cells.” But Carpenter says the commercial packages still excel in their prepackaged convenience, and her team will recommend using commercial software when collaborators are screening a simple phenotype. “We only get involved when people are stumped on their project.”

Maturity Level
Although CellProfiler has been gaining admirers for a few years, Carpenter entered it in Bio•IT World’s Best Practices competition only once she was satisfied that the program had reached a certain level of maturity and popularity. Signs of maturity include the fact that the software was downloaded 300 times per month in 2008, has been downloaded some 9000 times in total since its introduction, and has amassed more than 100 citations.

Perhaps most important was “the killer application,” CellProfiler Analyst, which was submitted for publication in late 2008 and published in Proceedings of the National Academy of Sciences in early 2009. This tool takes the measurements that CellProfiler produces and uses machine learning to sort cells. Says Carpenter: “You don’t need to know anything about machine learning to use the software. It really just looks like a video game.”

“We knew that would be a slam dunk popular tool for using CellProfiler data,” she says. “Previously, if a biologist had a tough phenotype, they’d need six months writing a new algorithm. Here, provided we can find the cells in the image, we can use this machine learning. It typically takes a biologist anywhere from 1 hour to 1 day of scoring cells by eye, and the computer has learned what they’re looking for. So pretty much any phenotype we come across, we can score in a day.”
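The workflow Carpenter describes, in which a biologist scores some cells by eye and the computer learns to score the rest, can be sketched in Python. The article does not say which learning method CellProfiler Analyst uses, so this example substitutes a simple nearest-centroid classifier purely for illustration. The two features (a speckle count and a texture score) and every number are invented.

```python
from math import dist
from typing import Dict, List, Sequence


def train_centroids(
    examples: Dict[str, List[Sequence[float]]]
) -> Dict[str, List[float]]:
    """Average the feature vectors of the hand-scored cells for each label."""
    centroids = {}
    for label, rows in examples.items():
        count = len(rows)
        centroids[label] = [sum(column) / count for column in zip(*rows)]
    return centroids


def classify(cell: Sequence[float], centroids: Dict[str, List[float]]) -> str:
    """Assign the label whose centroid is closest to the cell's features."""
    return min(centroids, key=lambda label: dist(cell, centroids[label]))


# Invented measurements per cell: [speckle count, texture score].
# These stand in for cells a biologist has already scored by eye.
scored_by_eye = {
    "positive": [[9.0, 0.8], [11.0, 0.9]],
    "negative": [[2.0, 0.1], [4.0, 0.3]],
}
centroids = train_centroids(scored_by_eye)

# Cells nobody has looked at yet.
unscored = [[10.0, 0.7], [3.0, 0.2], [8.0, 0.9]]
labels = [classify(cell, centroids) for cell in unscored]
print(labels)  # ['positive', 'negative', 'positive']
```

The point of the sketch is the division of labor: the biologist supplies labeled examples, and no phenotype-specific algorithm has to be written.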

CellProfiler has won many dedicated fans over the past few years. Michael Yaffe (Floyd’s boss) calls CellProfiler “an indispensable component of a large-scale high-throughput screen” that “adds an entirely new dimension to analysis, leading to generation of a robust and novel dataset that will be extraordinarily useful for years to come.”

Another satisfied user is John McLaughlin, who runs a screening facility at Rigel Pharmaceuticals producing thousands of images weekly, and hasn’t looked back since trying CellProfiler two years ago. “It had everything I needed,” he says. McLaughlin likes the underlying Matlab platform, and its compatibility with a compute cluster, which is not found with all commercial packages. “My goal is to find drugs to cure disease, not learn (yet another) computer language,” says McLaughlin.

Carpenter’s team is currently involved in numerous wide-ranging collaborations, from studying the genetic underpinnings of breast cancer with Eric Lander’s group to improving the analysis of neuronal cell types, which she calls “challenging for the best algorithms.” Other projects involve screening potential drugs for infectious diseases including tuberculosis in human cells, and whole-organism analysis of the nematode worm to develop novel antibiotics. On the technology side, her team is working to enable CellProfiler to do movie analysis and 3-D image analysis. “Right now, it’s fairly impractical to collect large sets of 3-D images, but as that becomes more practical, we’ll work on algorithms to study those images.” 


