Preloader

CoLoRd: compressing long reads | Nature Methods

  • Jain, M. et al. Nanopore sequencing and assembly of a human genome with ultra-long reads. Nat. Biotechnol. 36, 338–345 (2018).

    CAS 
    Article 

    Google Scholar 

  • Stancu, M. C. et al. Mapping and phasing of structural variation in patient genomes using Nanopore sequencing. Nat. Commun. 8, 1326 (2017).

    Article 

    Google Scholar 

  • Jones, D. C., Ruzzo, W. L., Peng, X. & Katze, M. G. Compression of next-generation sequencing reads aided by highly efficient de novo assembly. Nucleic Acids Res. 40, e171 (2012).

    CAS 
    Article 

    Google Scholar 

  • Bonfield, J. K. & Mahoney, M. V. Compression of FASTQ and SAM format sequencing data. PloS ONE 8, e59190 (2013).

    CAS 
    Article 

    Google Scholar 

  • Roguski, Ł. & Deorowicz, S. DSRC 2: industry-oriented compression of FASTQ files. Bioinformatics 30, 2213–2215 (2014).

    CAS 
    Article 

    Google Scholar 

  • Grabowski., S., Deorowicz, S. & Roguski, Ł. Disk-based compression of data from genome sequencing. Bioinformatics 31, 1389–1395 (2015).

    CAS 
    Article 

    Google Scholar 

  • Roguski, Ł., Ochoa, I., Hernaez, M. & Deorowicz, S. FaStore: a space-saving solution for raw sequencing data. Bioinformatics 34, 2748–2756 (2018).

    CAS 
    Article 

    Google Scholar 

  • Liu, Y., Yu, Z., Dinger, M. E. & Li, J. Index suffix–prefix overlaps by (w, k) -minimizer to generate long contigs for reads compression. Bioinformatics 35, 2066–2074 (2018).

    Article 

    Google Scholar 

  • Chandak, S., Tatwawadi, K., Ochoa, I., Hernaez, M. & Weissman, T. SPRING: a next-generation compressor for FASTQ data. Bioinformatics 35, 2674–2676 (2018).

    Article 

    Google Scholar 

  • Dufort y Álvarez., G. et al. ENANO: encoder for NANOpore FASTQ files. Bioinformatics 36, 4506–4507 (2020).

    Article 

    Google Scholar 

  • Nicolae, M., Pathak, S. & Rajasekaran, S. LFQC: a lossless compression algorithm for FASTQ files. Bioinformatics 31, 3276–3281 (2015).

    CAS 
    Article 

    Google Scholar 

  • Myers, E. The fragment assembly string graph. Bioinformatics 21, 79–85 (2005).

    Google Scholar 

  • Li, H. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32, 2103–2110 (2016).

    CAS 
    Article 

    Google Scholar 

  • Koren, S., Walenz, B. P., Berlin, K., Miller, J. R. & Phillippy, A. M. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).

    CAS 
    Article 

    Google Scholar 

  • Dufort y Álvarez., G. et al. RENANO: a REference-based compressor for NANOpore FASTQ files. Bioinformatics 37, 4862–4864 (2021).

    Article 

    Google Scholar 

  • Nurk, S. et al. The complete sequence of a human genome. Preprint at bioRxiv https://doi.org/10.1101/2021.05.26.445798v1 (2021).

  • Vaser, R., Sovic, I., Nagarajan, N. & Sikic, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746 (2017).

    CAS 
    Article 

    Google Scholar 

  • Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

    CAS 
    Article 

    Google Scholar 

  • Poplin, R. et al. A universal SNP and small-indel variant caller using deep neural networks. Nat. Biotechnol. 36, 983–987 (2018).

    CAS 
    Article 

    Google Scholar 

  • Zook, J. M. et al. An open resource for accurately benchmarking small variant and reference calls. Nat. Biotechnol. 37, 561–566 (2019).

    CAS 
    Article 

    Google Scholar 

  • Deorowicz, S. FQSqueezer: k-mer-based compression of sequencing data. Sci. Rep. 10, 578 (2020).

    CAS 
    Article 

    Google Scholar 

  • Kokot, M., Długosz, M. & Deorowicz, S. KMC 3: counting and manipulating k-mer statistics. Bioinformatics 33, 2759–2761 (2017).

    CAS 
    Article 

    Google Scholar 

  • Sosić, M. & Sikić, M. Edlib: a C/C++ library for fast, exact sequence alignment using edit distance. Bioinformatics 33, 1394–1395 (2017).

    Article 

    Google Scholar 

  • Vereecke, N. et al. High quality genome assemblies of Mycoplasma bovis using a taxon-specific Bonito basecaller for MinION and Flongle long-read Nanopore sequencing. BMC Bioinformatics 21, 517 (2020).

    CAS 
    Article 

    Google Scholar 

  • Depledge, D. P. et al. Direct RNA sequencing on Nanopore arrays redefines the transcriptional complexity of a viral pathogen. Nat. Commun. 10, 754 (2019).

    CAS 
    Article 

    Google Scholar 

  • Charalampous, T. et al. Nanopore metagenomics enables rapid clinical diagnosis of bacterial lower respiratory infection. Nat. Biotechnol. 7, 783–792 (2019).

    Article 

    Google Scholar 

  • Deschamps, S. et al. A chromosome-scale assembly of the sorghum genome using Nanopore sequencing and optical mapping. Nat. Commun. 9, 4844 (2018).

    Article 

    Google Scholar 

  • Kim, K. et al. Long-read, whole-genome shotgun sequence data for five model organisms. Sci. Data 1, 140045 (2014).

    CAS 
    Article 

    Google Scholar 

  • Hon, T. et al. Highly accurate long-read HiFi sequencing data for five complex genomes. Sci. Data 7, 399 (2020).

    CAS 
    Article 

    Google Scholar 

  • Cheng, H., Concepcion, G. T., Feng, X. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).

    CAS 
    Article 

    Google Scholar 

  • Murigneux, V. et al. Comparison of long-read methods for sequencing and assembly of a plant genome. GigaScience 9, giaa146 (2020).

    Article 

    Google Scholar 

  • Wenger, A. M. et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 37, 1155–1162 (2019).

    CAS 
    Article 

    Google Scholar 

  • Source link