Can AI Build Virtual Cells? Scientists Are Modeling the Smallest Units of Life

AI researchers are pioneering virtual cell models, aiming to revolutionize biology by simulating cellular functions, with significant progress and ongoing debates about feasibility.

Editor | Luobo Pi

If Stephen Quake's plan succeeds, future biologists may no longer need to spend extensive time manipulating pipettes.

“Our goal,” he says, “is to create computational tools that shift cell biology from ‘90% reliance on experiments and 10% on computation’ to ‘90% reliance on computation and 10% on experiments.’”

Stephen Quake is the chief scientist at the Chan Zuckerberg Initiative (CZI) and one of the leading researchers in the virtual cell project.

These virtual cells are based on AI models capable of generating information that currently takes weeks of wet lab experiments, such as tumor cell responses to specific drugs.

“This will become a powerful tool for understanding the root causes of diseases,” Quake says. He expects that scientists will then use experiments mainly to verify the models' predictions.

The research on creating virtual cells is still in its early stages, but the idea has already attracted intense interest from academic and industrial labs worldwide. CZI, a nonprofit organization developing open datasets and tools, plans to invest hundreds of millions of dollars over the next decade to create virtual cells.

Google DeepMind CEO Demis Hassabis also mentioned earlier this year that his company has a virtual cell project.

“This is a daunting task,” says Jan Ellenberg, a molecular biologist at the Swedish Sahlgrenska Academy, who is also co-leading the Alpha Cell project, set to launch in 2026. “What can be done now is to start pioneering projects to demonstrate the feasibility of this approach in principle.”

However, some scientists argue that although developing virtual cells is a significant long-term goal in biology, currently it is mostly hype with few concrete results or clear pathways.

“It’s mainly used as a rallying and fundraising mechanism, and it’s effective,” says Anshul Kundaje, a computational biologist at Stanford University. “Investors are pouring huge sums into this field.”

Errors in Machines

For decades, biologists have used computer simulations to model cell behavior.

In 2012, scientists created the first comprehensive model of a cell, simulating the internal workings of Mycoplasma genitalium, a bacterium with only 525 genes.

But these early studies “often aimed to build a complete mechanistic model of a cell,” explains Silvana Konermann, a computational biologist at Stanford’s Palo Alto Research Center.

In contrast, current efforts benefit from advances in AI, enabling scientists to develop complex data representations. “Building models that can learn from data is revolutionary,” Quake states.

Early virtual cell models mainly focused on a single data type: sequencing data of messenger RNA molecules in individual cells, essentially catalogs of gene activity and snapshots of cell states.

These data form the basis for creating “cell atlases” of humans and other organisms, revealing previously underappreciated diversity. Researchers are generating large-scale single-cell sequencing datasets to support their virtual cell models. CZI plans to release sequencing data for one billion cells, expanding its database beyond 100 million cells. In February, the Arc Institute released sequencing data from 100 million cancer cells treated with hundreds of drugs.

Hani Goodarzi, a systems biologist at Arc, emphasizes that single-cell sequencing data is highly significant because it allows biological models to generate data like large language models do.

Competitive Cell Manufacturing

Researchers have begun developing single-cell AI models using these datasets. Arc recently released a model called “State,” their first virtual cell. They also launched a $175,000 prize challenge to use such models to predict how human stem cells respond to genetic modifications.

Many scientists say these models’ predictive capabilities are still insufficient for practical applications. “They perform poorly,” Kundaje remarks, as some models are being benchmarked with new datasets.

Some believe that virtual cells need to incorporate other data types, such as optical and electron microscopy images, which show how cellular components interact and change over time. “We need to go beyond single-cell sequencing data,” Ellenberg states.

One challenge in developing virtual cells is that the concept varies among researchers. “I think there’s no clear definition of a virtual cell,” says Jonah Cool, head of CZI’s 1-billion-cell project.

Harvard cell biologist Tim Mitchison recognized this at a CZI workshop aimed at planning the construction of virtual cells. “There’s almost no consensus among participants,” he notes. “I am more optimistic about the prospects,” he adds, believing that in the near future, AI models targeting specific cell types or gene regulation will become feasible.

Quake admits that freeing biologists from the lab will take a long time, but he says there is plenty of time to get there.

“Biologists are not ready, and neither are these models,” he concludes.

Related content: https://www.nature.com/articles/d41586-025-02011-0
