Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational resources and sharing restrictions on raw data. Here we introduce a deep learning strategy for mapping query datasets on top of a reference called single-cell architectural surgery (scArches). scArches uses transfer learning and parameter optimization to enable efficient, decentralized, iterative reference building and contextualization of new datasets with existing references without sharing raw data. Using examples from mouse brain, pancreas, immune and whole-organism atlases, we show that scArches preserves biological state information while removing batch effects, despite using four orders of magnitude fewer parameters than de novo integration. scArches generalizes to multimodal reference mapping, allowing imputation of missing modalities. Finally, scArches retains coronavirus disease 2019 (COVID-19) disease variation when mapping to a healthy reference, enabling the discovery of disease-specific cell states. scArches will facilitate collaborative projects by enabling iterative construction, updating, sharing and efficient use of reference atlases.

Large single-cell reference atlases 1, 2, 3, 4 comprising millions 5 of cells across tissues, organs, developmental stages and conditions are now routinely generated by consortia such as the Human Cell Atlas 6. These references help to understand the cellular heterogeneity that constitutes natural and inter-individual variation, aging, environmental influences and disease. Reference atlases provide an opportunity to radically change how we currently analyze single-cell datasets: by learning from the appropriate reference, we could automate annotation of new datasets and easily perform comparative analyses across tissues, species and disease conditions.

Learning from a reference atlas requires mapping a query dataset to this reference to generate a joint embedding. Yet query datasets and reference atlases typically comprise data generated in different laboratories with different experimental protocols and thus contain batch effects. Data-integration methods are typically used to overcome these batch effects in reference construction 7. This requires access to all relevant datasets, which can be hindered by legal restrictions on data sharing. Furthermore, contextualizing a single dataset requires rerunning the full integration pipeline, presupposing both computational expertise and resources. Finally, traditional data-integration methods consider any perturbation between datasets that affects most cells as a technical batch effect, but biological perturbations may also affect most cells.
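The transfer-learning idea behind scArches, adapting a trained reference model to a query dataset without retraining it on the raw reference data, can be illustrated with a toy sketch. This is not the scArches implementation. It assumes a linear autoencoder obtained by SVD as a stand-in for the trained reference model, synthetic data, and a single gene-wise offset vector (`query_offset`, a hypothetical name) as the only query-specific parameters. All reference weights stay frozen and only the offset is optimized on the query.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_latent, n_cells = 30, 3, 500

# Toy "reference" data (synthetic assumption): low-rank structure plus noise.
factors = rng.normal(size=(n_latent, n_genes))
x_ref = rng.normal(size=(n_cells, n_latent)) @ factors
x_ref += 0.1 * rng.normal(size=x_ref.shape)

# Stand-in for a trained reference model: a linear autoencoder from SVD.
_, _, vt = np.linalg.svd(x_ref, full_matrices=False)
encoder = vt[:n_latent].T  # (n_genes, n_latent), frozen
decoder = vt[:n_latent]    # (n_latent, n_genes), frozen
frozen_snapshot = (encoder.copy(), decoder.copy())

# Toy "query" data: same latent structure plus a gene-wise batch shift.
batch_shift = 2.0 * rng.normal(size=n_genes)
x_query = rng.normal(size=(n_cells, n_latent)) @ factors + batch_shift
x_query += 0.1 * rng.normal(size=x_query.shape)


def reconstruct(x, offset):
    """Frozen encoder/decoder plus a query-specific offset."""
    return x @ encoder @ decoder + offset


def loss(x, offset):
    """Mean squared reconstruction error per cell."""
    return float(np.mean(np.sum((x - reconstruct(x, offset)) ** 2, axis=1)))


# The only trainable parameters: one offset per gene for the query batch.
query_offset = np.zeros(n_genes)
loss_before = loss(x_query, query_offset)

learning_rate = 0.1
for _ in range(200):
    residual = x_query - reconstruct(x_query, query_offset)
    grad = -2.0 * residual.mean(axis=0)  # gradient of the loss w.r.t. the offset
    query_offset -= learning_rate * grad

loss_after = loss(x_query, query_offset)
n_frozen = encoder.size + decoder.size
n_trainable = query_offset.size

print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
print(f"frozen parameters: {n_frozen}, trainable parameters: {n_trainable}")
```

In this sketch the reference data never enters the adaptation loop: only the frozen weights and the query data are needed, which mirrors the data-sharing motivation in the text. The offset absorbs only the part of the shift that the frozen model cannot reconstruct, so the example shows the freeze-and-adapt pattern and makes no claim of full batch correction.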