  • 1
    Online Resource
    Oxford University Press (OUP); 2017
    In: Bioinformatics, Oxford University Press (OUP), Vol. 33, No. 23 (2017-12-01), p. 3740-3748
    Abstract:
    Motivation: Metagenomic shotgun sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, corresponding software tools suffer from long runtimes, large memory requirements, or low accuracy.
    Results: We introduce MetaCache, a novel software tool for read classification using the big data technique minhashing. Our approach performs context-aware classification of reads by computing representative subsamples of k-mers within both probed reads and locally constrained regions of the reference genomes. As a result, MetaCache consumes significantly less memory than the state-of-the-art read classifiers Kraken and CLARK while achieving highly competitive sensitivity and precision at comparable speed. For example, using NCBI RefSeq draft and completed genomes with a total length of around 140 billion bases as reference, MetaCache's database consumes only 62 GB of memory, while both Kraken and CLARK fail to construct their respective databases on a workstation with 512 GB RAM. Our experimental results further show that classification accuracy continuously improves as the amount of utilized reference genome data increases.
    Availability and implementation: MetaCache is open source software written in C++ and can be downloaded at http://github.com/muellan/metacache.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    Type of Medium: Online Resource
    ISSN: 1367-4803, 1367-4811
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2017
    ZDB-ID: 1468345-3
    SSG: 12