Deep-Learning Model Predicts Cell Development with High Accuracy
MIT engineers have developed a deep-learning model that predicts the behavior of individual cells during fruit fly embryonic development with about 90 percent accuracy. The approach could eventually help forecast organ formation and identify early markers of disease.
During the initial stages of development, tissues and organs emerge through the shifting, splitting, and growth of thousands of cells. MIT engineers have now developed a method to predict, minute by minute, how individual cells will fold, divide, and rearrange during a fruit fly's earliest stage of growth. The deep-learning model could eventually be used to forecast the development of more complex tissues, organs, and organisms, and could help scientists identify cellular patterns linked to early-onset diseases such as asthma and cancer.
Described in a study published in the journal Nature Methods, the model learns and then predicts how specific geometric properties of individual cells change as a fruit fly develops. It tracks properties such as a cell's position and whether it is in contact with a neighboring cell at a given moment.
The model was applied to videos of developing fruit fly embryos, each of which starts as a cluster of approximately 5,000 cells. The researchers found that the model could predict with 90 percent accuracy how each of these 5,000 cells would fold, shift, and rearrange, minute by minute, during the initial hour of development. During this period, known as gastrulation, the embryo transforms from a smooth, uniform shape into more defined structures and features.
"This very initial phase is known as gastrulation, which takes place over roughly one hour, when individual cells are rearranging on a time scale of minutes," explains study author Ming Guo, an associate professor of mechanical engineering at MIT. "By accurately modeling this early period, we can start to uncover how local cell interactions give rise to global tissues and organisms."
The researchers aim to apply the model to cell-by-cell development in other species, such as zebrafish and mice, to identify patterns that are common across species. The method might also reveal early patterns of disease such as asthma by showing how asthma-prone lung tissue initially develops, a process that is not yet understood.
"Asthmatic tissues show different cell dynamics when imaged live," notes co-author and MIT graduate student Haiqian Yang. "We envision that our model could capture these subtle dynamical differences and provide a more comprehensive representation of tissue behavior, potentially improving diagnostics or drug-screening assays."
The study's co-authors include Markus Buehler, the McAfee Professor of Engineering in MIT’s Department of Civil and Environmental Engineering; George Roy and Tomer Stern of the University of Michigan; and Anh Nguyen and Dapeng Bi of Northeastern University.
A developing Drosophila embryo was recorded using light sheet microscopy. The embryo is segmented, tracked, and reconstructed. Cell boundaries show how individual cells fold, divide, and rearrange. (Credit: MultiCell authors)
Points and Foams: A Dual-Graph Approach
Scientists typically model embryo development either as a point cloud, where each point represents a cell moving over time, or as a "foam," where cells are depicted as bubbles shifting and sliding against each other. Rather than choosing between these, Guo and Yang adopted a combined approach.
"There’s a debate about whether to model as a point cloud or a foam," Yang states. "But both are essentially different ways of modeling the same underlying graph, which is an elegant way to represent living tissues. By combining these as one graph, we can highlight more structural information, like how cells are connected to each other as they rearrange over time."
Central to the new model is a "dual-graph" structure that represents a developing embryo as both moving points and bubbles. This dual representation allows researchers to capture more detailed geometric properties of individual cells, such as nucleus location, contact with neighboring cells, and whether a cell is folding or dividing at a given moment.
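One way to picture the dual-graph idea is as a single data structure that stores each cell as a point and each cell-to-cell contact as an edge. The Python sketch below is purely illustrative and is not the MultiCell implementation: the names Cell, TissueGraph, add_contact, and lost_contacts are hypothetical, and the four-cell tissue is toy data.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set, Tuple


@dataclass
class Cell:
    """Point view: one cell, located by its nucleus position."""
    cell_id: int
    nucleus_xyz: Tuple[float, float, float]


@dataclass
class TissueGraph:
    """Hypothetical container holding both views of the same tissue."""
    cells: Dict[int, Cell] = field(default_factory=dict)
    # Foam view: each shared boundary is an unordered pair of cell ids.
    contacts: Set[FrozenSet[int]] = field(default_factory=set)

    def add_cell(self, cell_id: int, nucleus_xyz: Tuple[float, float, float]) -> None:
        self.cells[cell_id] = Cell(cell_id, nucleus_xyz)

    def add_contact(self, a: int, b: int) -> None:
        if a == b or a not in self.cells or b not in self.cells:
            raise ValueError("a contact needs two distinct, known cells")
        self.contacts.add(frozenset((a, b)))

    def neighbors(self, cell_id: int) -> Set[int]:
        """All cells that currently share a boundary with cell_id."""
        return {other for pair in self.contacts if cell_id in pair
                for other in pair if other != cell_id}


def lost_contacts(before: TissueGraph, after: TissueGraph) -> Set[FrozenSet[int]]:
    """Contacts present at one time point but absent at the next."""
    return before.contacts - after.contacts


def build(contact_pairs):
    """Toy helper: four cells in a row plus the given contacts."""
    graph = TissueGraph()
    for i in range(4):
        graph.add_cell(i, (float(i), 0.0, 0.0))
    for a, b in contact_pairs:
        graph.add_contact(a, b)
    return graph


# Two consecutive time points: cells 0 and 2 detach, cells 1 and 3 touch.
t0 = build([(0, 1), (1, 2), (2, 3), (0, 2)])
t1 = build([(0, 1), (1, 2), (2, 3), (1, 3)])
print(sorted(t0.neighbors(2)))
print(lost_contacts(t0, t1))
```

Storing contacts as unordered pairs means a rearrangement between two time points shows up as a simple set difference, which is the kind of structural information a point cloud alone would not carry.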
As a proof of principle, the team trained the model to "learn" how individual cells change during fruit fly gastrulation. "The overall shape of the fruit fly at this stage is roughly an ellipsoid, but there are gigantic dynamics going on at the surface during gastrulation," Guo explains. "It goes from entirely smooth to forming a number of folds at different angles. And we want to predict all of those dynamics, moment to moment, and cell by cell."
Precision Prediction
For this study, the researchers used high-quality videos of fruit fly gastrulation provided by their collaborators at the University of Michigan. These one-hour recordings capture developing fruit flies at single-cell resolution, with detailed labels of individual cell edges and nuclei. Data of this kind is exceptionally rare and valuable.
"These videos are of extremely high quality," Yang emphasizes. "This data is very rare, where you get submicron resolution of the whole 3D volume at a pretty fast frame rate."
The team trained the model on data from three of four fruit fly embryo videos so that it could learn how individual cells interact and change during development. When tested on a fruit fly video it had not seen during training, the model accurately predicted the minute-by-minute changes of most of the embryo’s 5,000 cells.
Specifically, the model predicted properties of individual cells, such as whether they would fold, divide, or continue sharing an edge with a neighboring cell, with approximately 90 percent accuracy.
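To make the accuracy figure concrete, the sketch below shows one plausible way to score such predictions: treat each cell-to-cell contact as a yes/no question (will this shared edge still exist at the next time point?) and report the fraction answered correctly. The article does not say how the 90 percent figure was computed, so the metric, the function name edge_accuracy, and the toy labels are all assumptions.

```python
from typing import Dict, Tuple

Edge = Tuple[int, int]  # a pair of cell ids sharing a boundary


def edge_accuracy(predicted: Dict[Edge, bool], observed: Dict[Edge, bool]) -> float:
    """Fraction of contacts whose predicted fate matches the observed one.

    Both dicts map a contact to True (edge persists) or False (edge is lost).
    """
    if not observed:
        raise ValueError("no contacts to score")
    if predicted.keys() != observed.keys():
        raise ValueError("predictions and observations must cover the same contacts")
    correct = sum(predicted[edge] == observed[edge] for edge in observed)
    return correct / len(observed)


# Toy data: ten contacts, with the prediction wrong on exactly one of them.
observed = {(i, i + 1): (i % 3 != 0) for i in range(10)}
predicted = dict(observed)
predicted[(4, 5)] = not observed[(4, 5)]

score = edge_accuracy(predicted, observed)
print(f"{score:.0%}")  # prints "90%"
```

The same pattern would apply to the other yes/no properties mentioned above, such as whether a cell folds or divides, by swapping the keys from contacts to cell ids.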
"We end up predicting not only whether these things will happen, but also when," Guo adds. "For instance, will this cell detach from this cell seven minutes from now, or eight? We can tell when that will happen."
The team believes that, in principle, this new model and its dual-graph approach could predict cell-by-cell development in other multicellular systems, including more complex species and even certain human tissues and organs. The primary limiting factor remains the availability of high-quality video data.
"From the model perspective, I think it’s ready," Guo asserts. "The real bottleneck is the data. If we have good quality data of specific tissues, the model could be directly applied to predict the development of many more structures."
This work received support, in part, from the U.S. National Institutes of Health.
The paper, titled “MultiCell: geometric learning in multicellular development,” was published in Nature Methods.