Researchers at Helmholtz Munich, the Technical University of Munich, and the Stowers Institute for Medical Research released RegVelo, a deep-learning framework that predicts developmental trajectories and identifies the regulatory interactions driving cell fate decisions. The validation matters: predictions were tested experimentally via CRISPR/Cas9 knockouts and Perturb-seq across cell-cycle progression, pancreatic endocrinogenesis, hematopoiesis, and zebrafish neural crest differentiation, and the tool recovered all known terminal cell states across the four systems. Concrete numbers from the cell-cycle data: cross-boundary correctness of 0.864 (out of 1.0), velocity consistency of 0.873, and a Spearman correlation of 0.683 against FUCCI ground-truth scores. Published as a bioRxiv preprint, peer review pending.
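To make the Spearman figure concrete, here is a minimal sketch of how a rank correlation between a model-derived per-cell score and a reference score can be computed. The function names and the toy numbers are illustrative assumptions, not taken from the preprint, and the announcement does not say which RegVelo output was compared against the FUCCI scores.

```python
import math

def average_ranks(values):
    # Rank values from 1..n; tied values share the mean of their positions.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    # Plain Pearson correlation of two equal-length sequences.
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data: a predicted per-cell ordering vs. a reference score.
predicted = [0.10, 0.35, 0.20, 0.80, 0.60]
reference = [0.05, 0.30, 0.40, 0.90, 0.70]
rho = spearman(predicted, reference)
```

Because only the ranks matter, a value near 1 means the model orders cells much as the reference does, even if the two scores sit on different scales.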

RegVelo combines two existing single-cell analysis techniques and learns the joint model end-to-end. RNA velocity (La Manno et al., 2018) infers developmental direction from the balance of unspliced to spliced mRNA in scRNA-seq data: an excess of unspliced transcripts suggests a gene is being switched on, a deficit suggests it is being switched off, and together these signals indicate which way a cell is moving in state space. Gene regulatory network inference identifies who regulates whom in transcription factor cascades. Both are useful alone, but run separately they can produce different and sometimes contradictory predictions. RegVelo's contribution is a neural network that encodes scRNA-seq data, runs it through a decoder producing cell-gene-specific latent time, and jointly infers both velocity and the regulatory network in one pass. The output: for any cell, predict the next state, the genes driving the transition, and what happens when you perturb a specific regulator. The Perturb-seq validation is the gold standard: actually knock out the predicted regulator with CRISPR, measure the result, and compare against RegVelo's pre-experiment prediction. The first author is Weixu Wang; the co-senior authors are Fabian J. Theis (Helmholtz Munich, TU Munich) and Tatjana Sauka-Spengler (Stowers Institute). Theis's lab has been one of the leading single-cell ML groups for a decade (Scanpy in 2018, scVelo in 2020, CellRank since), so the result isn't a one-off.
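The coupling between velocity and regulation can be illustrated with a toy model. The sketch below is not RegVelo's architecture: it uses the standard splicing kinetics from the RNA velocity literature (du/dt = alpha - beta*u, ds/dt = beta*u - gamma*s) and assumes, purely for illustration, that each gene's transcription rate alpha is a softplus of a weighted sum of regulator abundances. The weight matrix `W`, the helper `knock_out`, and all numbers are hypothetical.

```python
import math

def softplus(x):
    # Smooth, non-negative activation log(1 + exp(x)), written to avoid overflow.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def grn_velocity(u, s, W, b, beta, gamma):
    """Toy per-cell velocity with a GRN-dependent transcription rate.

    u, s    : unspliced / spliced abundances, one value per gene
    W[i][j] : hypothetical weight of regulator j on target i
    b       : per-gene basal transcription bias
    beta    : per-gene splicing rates; gamma: per-gene degradation rates
    Returns (du/dt, ds/dt) as two lists.
    """
    n = len(s)
    alpha = [softplus(sum(W[i][j] * s[j] for j in range(n)) + b[i]) for i in range(n)]
    du = [alpha[i] - beta[i] * u[i] for i in range(n)]
    ds = [beta[i] * u[i] - gamma[i] * s[i] for i in range(n)]
    return du, ds

def knock_out(W, regulator):
    # In-silico knockout: zero every outgoing edge of one regulator (its column).
    return [[0.0 if j == regulator else w for j, w in enumerate(row)] for row in W]

# Three genes; gene 0 activates gene 1, nothing else is wired (assumed toy network).
W = [[0.0, 0.0, 0.0],
     [2.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
b = [0.0, 0.0, 0.0]
beta = [1.0, 1.0, 1.0]
gamma = [1.0, 1.0, 1.0]
u = [0.3, 0.4, 0.1]
s = [1.0, 0.5, 0.2]

du_wt, ds_wt = grn_velocity(u, s, W, b, beta, gamma)
du_ko, ds_ko = grn_velocity(u, s, knock_out(W, 0), b, beta, gamma)
```

In this toy setup, removing gene 0's outgoing edges lowers the unspliced velocity of gene 1 and leaves the other genes untouched. The spliced velocity is unchanged at that instant because it depends only on the current `u` and `s`; the effect would show up in spliced counts once the system is integrated forward in time. Comparing the perturbed and unperturbed velocity fields is the general idea behind predicting a knockout before running the experiment.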

The single-cell-ML field has been building toward this exact integration for roughly five years. scVI (Yosef lab, 2018) was one of the first major deep-learning models for scRNA-seq, including batch correction. cellxgene and the Human Cell Atlas built the data infrastructure. RNA velocity arrived as a separate track in 2018. Gene regulatory networks have been inferred with shallower methods (GENIE3, ARACNe). RegVelo is the synthesis: one model, learned end-to-end, with experimentally validated predictions across four cell systems. The pattern matters because cell fate prediction is the upstream question for most regenerative medicine, drug discovery, and developmental biology: knowing which gene to perturb to push a cell from one fate to another is what's actually buildable as therapy downstream. CoCoGraph (#814) and FINGERS-7B (#808) are companions in the same broader thread: biology becoming AI-tractable not just at the molecule level (CoCoGraph) or the diagnostic level (FINGERS-7B) but at the cell-fate-decision level (RegVelo). The Theis-lab/Stowers/TU Munich collaboration matters because it's not a vendor product; it's the academic single-cell-ML community shipping its current best joint inference.

RegVelo is currently a bioRxiv preprint; peer review is pending. Code availability is not specified in the announcement. The Theis lab usually ships open source (Scanpy and scVelo are widely used), so expect a release if peer review completes cleanly. For working biologists: the CRISPR/Perturb-seq validation across four test systems is the strong signal. RegVelo's predictions held up against the experimental gold standard, not just against in-silico held-out test sets. For builders watching bio-ML: the joint-inference pattern (combining established techniques end-to-end via a deep learning backbone rather than running them separately and stitching the outputs) is the architectural lesson, and it'll get copied across other single-cell modalities. For the broader audience: this is what "AI for biology" looks like when it's serious. That means specific institutional labs, named techniques being unified, experimental validation against established gold standards, and no breathless claims about curing disease, just measurable improvements on the upstream prediction problem that makes the downstream therapies possible.