Systematic Reconstruction of Molecular Pathway Signatures Using Scalable Single-Cell Perturbation Screens

In functional genomics, researchers have long sought to infer causal regulatory relationships from observational data. Although modern technologies can measure diverse molecular modalities, this inference remains challenging. In particular, identifying and quantifying the downstream effectors of signaling pathway regulators is a key focus of genomic research. Genome editing tools such as CRISPR have enabled massively parallel screening, and Perturb-seq, which pairs CRISPR perturbations with single-cell RNA sequencing (scRNA-seq), allows causal inference through genetic perturbation. However, existing Perturb-seq applications have focused primarily on resting cells and may therefore miss context-dependent gene functions.

To address this issue, the researchers developed a scalable Perturb-seq workflow that combines combinatorial indexing and next-generation sequencing to systematically identify targets of signaling regulators in diverse biological contexts. With this approach, they could quantify heterogeneity in perturbation efficiency and infer changes in signaling pathway activation in vivo and in situ.

Source of the Paper

This paper was co-authored by Longda Jiang, Carol Dalgarno, Efthymia Papalexi, Isabella Mascio, Hans-Hermann Wessels, Huiyoung Yun, Nika Iremadze, Gila Lithwick-Yanai, Doron Lipson, and Rahul Satija, among others. The authors are affiliated with multiple research institutions, including the New York Genome Center, the Center for Genomics and Systems Biology at New York University, and Ultima Genomics. The paper was published in March 2025 in the journal Nature Cell Biology, with the DOI 10.1038/s41556-025-01622-z.

Research Process and Results

Research Process

  1. Experimental Design and Cell Culture
    The researchers selected six cancer cell lines of different origins (A549, MCF7, HT29, HAP1, BXPC3, and K562) and expressed the CRISPRi dCas9-KRAB-MeCP2 cassette in each. To study the activity of different signaling pathways, they subjected each cell line to five different stimuli: IFNβ, IFNγ, TGFβ, TNF, and insulin. For each signaling pathway, 44 to 61 known regulators were selected, and three independent single guide RNAs (sgRNAs) were designed for each gene.

  2. Perturb-seq Experiment
    The researchers used the Parse Biosciences EverCode Whole Transcriptome Mega Kit for single-cell RNA sequencing, relying on combinatorial indexing to improve the scalability and cost-effectiveness of the experiment. In total, 2.6 million cells were sequenced. Combinatorial Parse barcodes identified the cell line and stimulation condition of each cell, while sgRNA barcodes identified its genetic perturbation.

  3. Data Analysis and Algorithm Development
    To address the technical and biological heterogeneity in Perturb-seq data, researchers developed a computational framework called MixScale. MixScale quantifies the perturbation strength in each cell, optimizing the identification of differentially expressed genes (DEGs). MixScale first estimates a “perturbation vector” for each cell and then quantifies the degree of perturbation through scalar projection. Additionally, researchers introduced a weighted multivariate regression (WMVReg) method to further improve the robustness of DEG identification.

  4. Extraction and Validation of Signaling Pathway Signatures
    The researchers identified conserved perturbation programs across different cell lines and signaling pathways using a MultiCCA decomposition method. These programs reflect changes in gene expression downstream of specific regulators. The researchers also validated these signaling pathway signatures using external datasets, including IFNβ-stimulated monocytes, IFNγ-stimulated PBMCs, and TGFβ-stimulated ovarian cancer cell lines.
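
The perturbation-scoring idea in step 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the MixScale implementation: the perturbation vector is assumed to be the difference between the mean expression of perturbed cells and that of control cells, each cell's score is its scalar projection onto that vector (rescaled so a cell at the perturbed mean scores 1), and a plain least-squares fit on the continuous score stands in for the weighted regression step. The function names and the simulated data are hypothetical.

```python
import numpy as np

def perturbation_scores(expr, is_perturbed):
    """Score each cell by projecting it onto a perturbation vector.

    expr: (cells, genes) array of normalized expression.
    is_perturbed: boolean array, True for cells carrying a targeting sgRNA.
    Assumption: the perturbation vector is mean(perturbed) - mean(control).
    """
    control_mean = expr[~is_perturbed].mean(axis=0)
    direction = expr[is_perturbed].mean(axis=0) - control_mean
    length = np.linalg.norm(direction)
    # Scalar projection of each centered cell onto the perturbation vector.
    projection = (expr - control_mean) @ direction / length
    # Divide by the vector length so a cell at the perturbed mean scores 1.
    return projection / length

def regress_on_scores(expr, scores):
    """Per-gene least-squares slope of expression on the perturbation score.

    Assumption: ordinary least squares with the continuous score replacing
    a binary perturbed/control label.
    """
    design = np.column_stack([np.ones_like(scores), scores])
    coef, *_ = np.linalg.lstsq(design, expr, rcond=None)
    return coef[1]  # one slope per gene

# Simulated data (hypothetical): 200 control and 200 perturbed cells, 50 genes.
rng = np.random.default_rng(0)
n_control, n_perturbed, n_genes = 200, 200, 50
true_effect = np.zeros(n_genes)
true_effect[:5] = -2.0  # the first five genes are repressed by the knockdown

# Heterogeneous knockdown efficiency: perturbed cells respond with strength 0..1.
strength = np.concatenate([np.zeros(n_control), rng.uniform(0, 1, n_perturbed)])
is_perturbed = np.arange(n_control + n_perturbed) >= n_control
noise = rng.normal(0, 0.5, (n_control + n_perturbed, n_genes))
expr = noise + np.outer(strength, true_effect)

scores = perturbation_scores(expr, is_perturbed)
slopes = regress_on_scores(expr, scores)
```

In this toy setting the scores track the simulated knockdown strength of each cell, and the fitted slopes are clearly negative only for the five genes that were given a true effect. Treating the perturbation as a continuous quantity rather than a binary label is what lets weakly perturbed cells contribute less to the estimate.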

Key Results

  1. Effectiveness of the MixScale Framework
    MixScale can quantify graded responses in CRISPRi perturbation data, which is especially useful when perturbation efficiency varies from cell to cell. With MixScale, the researchers identified DEGs more accurately and maintained high statistical power even with low cell numbers.

  2. Conservation and Specificity of Signaling Pathway Signatures
    The researchers found that regulators within the same signaling pathway target highly overlapping sets of downstream genes, while the degree of conservation across cell lines depends on the pathway. For example, responses to the IFNγ and IFNβ pathways were conserved across multiple cell lines, while the TGFβ and insulin signaling pathways showed significant cell-type specificity.

  3. Application of Signaling Pathway Signatures
    Researchers successfully inferred IFNβ signaling activation in COVID-19 patients and identified TNF signaling pathway activation in non-immune cells in Crohn’s disease. Additionally, using spatial transcriptomics, researchers identified spatial activation patterns of the TGFβ signaling pathway in a mouse colon injury model.
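
One simple way to apply a pathway signature to a new dataset is to score each cell by the average standardized expression of the signature genes. The sketch below assumes exactly that; it is not necessarily the scoring procedure used in the paper, and the gene names, signature, and simulated data are placeholders.

```python
import numpy as np

def signature_score(expr, gene_names, signature):
    """Mean z-scored expression of the signature genes, per cell.

    expr: (cells, genes) array of normalized expression.
    gene_names: list of gene names matching the columns of expr.
    signature: list of gene names making up the pathway signature.
    """
    idx = [gene_names.index(g) for g in signature if g in gene_names]
    if not idx:
        raise ValueError("no signature genes found in the dataset")
    mean = expr.mean(axis=0)
    std = expr.std(axis=0)
    std[std == 0] = 1.0  # leave constant genes unscaled
    z = (expr - mean) / std
    return z[:, idx].mean(axis=1)

# Simulated data (hypothetical): 300 cells, 100 genes, the first 100 cells
# stimulated so that the ten signature genes are induced.
rng = np.random.default_rng(1)
gene_names = [f"gene_{i}" for i in range(100)]
signature = gene_names[:10]
expr = rng.normal(0, 1, (300, 100))
stimulated = np.arange(300) < 100
expr[stimulated, :10] += 1.5

scores = signature_score(expr, gene_names, signature)
```

Stimulated cells receive higher scores than unstimulated ones in this toy example. Because each gene is standardized across the dataset, the scores are relative: they rank cells within one dataset and are not directly comparable between datasets.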

Conclusions and Significance

This study systematically reconstructed molecular signatures of multiple signaling pathways by developing a scalable Perturb-seq workflow and the MixScale computational framework. These signatures not only extend existing gene sets but also accurately infer signaling pathway activation in diverse biological contexts. The study provides new tools and methods for understanding the regulatory mechanisms of signaling pathways and lays the foundation for future functional genomics research.

Research Highlights

  1. Scalable Perturb-seq Workflow: By combining combinatorial indexing and next-generation sequencing, researchers were able to systematically identify targets of signaling regulators in large-scale experiments.
  2. MixScale Computational Framework: MixScale quantifies the heterogeneity of cell perturbation efficiency and optimizes DEG identification, improving statistical power.
  3. Conservation and Specificity of Signaling Pathway Signatures: Researchers identified conserved perturbation programs across different cell lines and signaling pathways and validated the application value of these signatures in diverse biological contexts.
  4. Application Prospects: The study provides new tools for understanding signaling pathway activation in diseases, particularly demonstrating its broad application potential in COVID-19 and Crohn’s disease.

Other Valuable Information

Researchers also noted that future studies could apply this framework to other biological processes and cell types, incorporating multimodal data such as chromatin accessibility and protein levels to further enrich the understanding of signal transduction mechanisms. Additionally, the application of combinatorial perturbation technology will provide new perspectives for exploring interactions between regulators within and across pathways.

This paper, through innovative experimental design and computational methods, provides important tools and insights for the field of functional genomics, demonstrating the potential of Perturb-seq technology in understanding complex biological systems.