Deep distributed computing to reconstruct extremely large lineage trees

  • 1.

    Zou, Q., Wan, S., Zeng, X. & Ma, Z. S. Reconstructing evolutionary trees in parallel for massive sequences. BMC Syst. Biol. 11, 100 (2017).

  • 2.

    Mora, C., Tittensor, D. P., Adl, S., Simpson, A. G. & Worm, B. How many species are there on Earth and in the ocean? PLoS Biol. 9, e1001127 (2011).

  • 3.

    Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).

  • 4.

    Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823–826 (2013).

  • 5.

    Kalhor, R. et al. Developmental barcoding of whole mouse via homing CRISPR. Science 361, eaat9804 (2018).

  • 6.

    Chan, M. M. et al. Molecular recording of mammalian embryogenesis. Nature 570, 77–82 (2019).

  • 7.

    Bowling, S. et al. An engineered CRISPR–Cas9 mouse line for simultaneous readout of lineage histories and gene expression profiles in single cells. Cell 181, 1410–1422 (2020).

  • 8.

    Salvador-Martinez, I., Grillo, M., Averof, M. & Telford, M. J. Is it possible to reconstruct an accurate cell lineage using CRISPR recorders? eLife 8, e40292 (2019).

  • 9.

    McKenna, A. et al. Whole-organism lineage tracing by combinatorial and cumulative genome editing. Science 353, aaf7907 (2016).

  • 10.

    Raj, B. et al. Simultaneous single-cell profiling of lineages and cell types in the vertebrate brain. Nat. Biotechnol. 36, 442–450 (2018).

  • 11.

    Spanjaard, B. et al. Simultaneous lineage tracing and cell-type identification using CRISPR–Cas9-induced genetic scars. Nat. Biotechnol. 36, 469–473 (2018).

  • 12.

    Alemany, A., Florescu, M., Baron, C. S., Peterson-Maduro, J. & van Oudenaarden, A. Whole-organism clone tracing using single-cell sequencing. Nature 556, 108–112 (2018).

  • 13.

    Quinn, J. J. et al. Single-cell lineages reveal the rates, routes, and drivers of metastasis in cancer xenografts. Science 371, eabc1944 (2021).

  • 14.

    Simeonov, K. P. et al. Single-cell lineage and transcriptome reconstruction of metastatic cancer reveals selection of aggressive hybrid EMT states. Cancer Cell 39, 1150–1162.e9 (2021).

  • 15.

    Cao, J. et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357, 661–667 (2017).

  • 16.

    Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).

  • 17.

    Cao, J. et al. A human cell atlas of fetal gene expression. Science 370, eaba7721 (2020).

  • 18.

    Sender, R., Fuchs, S. & Milo, R. Revised estimates for the number of human and bacteria cells in the body. PLoS Biol. 14, e1002533 (2016).

  • 19.

    Barbera, P. et al. EPA-ng: massively parallel evolutionary placement of genetic sequences. Syst. Biol. 68, 365–369 (2019).

  • 20.

    Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).

  • 21.

    Simonsen, M., Mailund, T. & Pedersen, C. N. S. in International Workshop on Algorithms in Bioinformatics 113–122 (Springer, 2008).

  • 22.

    Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).

  • 23.

    Robinson, D. F. & Foulds, L. R. Comparison of phylogenetic trees. Math. Biosci. 53, 131–147 (1981).

  • 24.

    Yarza, P. et al. The All-Species Living Tree project: a 16S rRNA-based phylogenetic tree of all sequenced type strains. Syst. Appl. Microbiol. 31, 241–250 (2008).

  • 25.

    Parks, D. H. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat. Biotechnol. 36, 996–1004 (2018).

  • 26.

    Frieda, K. L. et al. Synthetic recording and in situ readout of lineage information in single cells. Nature 541, 107–111 (2017).

  • 27.

    Jones, M. G. et al. Inference of single-cell phylogenies from lineage tracing data using Cassiopeia. Genome Biol. 21, 92 (2020).

  • 28.

    Nishida, K. et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science 353, aaf8729 (2016).

  • 29.

    Hwang, B. et al. Lineage tracing using a Cas9-deaminase barcoding system targeting endogenous L1 elements. Nat. Commun. 10, 1234 (2019).

  • 30.

    Grünewald, J. et al. A dual-deaminase CRISPR base editor enables concurrent adenine and cytosine editing. Nat. Biotechnol. 38, 861–864 (2020).

  • 31.

    Zhang, X. et al. Dual base editor catalyzes both cytosine and adenine base conversions in human cells. Nat. Biotechnol. 38, 856–860 (2020).

  • 32.

    Sakata, R. C. et al. Base editors for simultaneous introduction of C-to-T and A-to-G mutations. Nat. Biotechnol. 38, 865–869 (2020).

  • 33.

    Du, Z., Santella, A., He, F., Tiongson, M. & Bao, Z. De novo inference of systems-level mechanistic models of development from live-imaging-based phenotype analysis. Cell 156, 359–372 (2014).

  • 34.

    Ciccarelli, F. D. et al. Toward automatic reconstruction of a highly resolved tree of life. Science 311, 1283–1287 (2006).

  • 35.

    Brown, C. T. et al. Unusual biology across a group comprising more than 15% of domain Bacteria. Nature 523, 208–211 (2015).

  • 36.

    Poe, S. & Swofford, D. L. Taxon sampling revisited. Nature 398, 299–300 (1999).

  • 37.

    Chow, K. K. et al. Imaging cell lineage with a synthetic digital recording system. Science 372, eabb3099 (2021).

  • 38.

    Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).

  • 39.

    Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).

  • 40.

    Yu, M. K. et al. DDOT: a Swiss army knife for investigating data-driven biological ontologies. Cell Syst. 8, 267–273 (2019).

  • 41.

    Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).

  • 42.

    Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).

  • 43.

    Shen, W., Le, S., Li, Y. & Hu, F. SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS ONE 11, e0163962 (2016).

  • 44.

    Schliep, K. P. phangorn: phylogenetic analysis in R. Bioinformatics 27, 592–593 (2011).

  • 45.

    Li, W. & Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).

  • 46.

    Madeira, F. et al. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 47, W636–W641 (2019).

  • 47.

    Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).

  • 48.

    Baum, B. R. PHYLIP: phylogeny inference package. Version 3.2. Quarterly Review of Biology 64, 539–541 (1989).

  • 49.

    Zhao, L., Liu, Z., Levy, S. F. & Wu, S. Bartender: a fast and accurate clustering algorithm to count barcode reads. Bioinformatics 34, 739–747 (2018).

  • 50.

    Satopaa, V., Albrecht, J., Irwin, D. & Raghavan, B. in 2011 31st International Conference on Distributed Computing Systems Workshops 166–171 (IEEE, 2011).

  • 51.

    Levenshtein, V. I. in Soviet Physics Doklady, Vol. 10 707–710 (Doklady Akademii Nauk SSSR, 1966).

  • 52.

    Brunner, E. & Munzel, U. The nonparametric Behrens–Fisher problem: asymptotic theory and a small-sample approximation. Biom. J. 42, 17–25 (2000).
