Computational Analysis of High Throughput Sequencing Data

  • Protocol
Bioinformatics for Omics Data

Part of the book series: Methods in Molecular Biology (MIMB, volume 719)

Abstract

The advent of High Throughput Sequencing (HTS) methods opens new opportunities for the analysis of genomes and transcriptomes. While sequencing a whole mammalian genome took several years at the turn of this century, today it is only a matter of weeks. The race towards the thousand-dollar genome is fueled by the ethically challenging idea of personalized genomic medicine. Beyond that, these methods allow new and interesting insights into many areas, such as the discovery of novel noncoding RNA classes, structural variants, and alternative splice sites, to name a few. Meanwhile, several HTS methods have been brought to market. Here, an overview of the technologies and the bioinformatics analysis of HTS data is given.

Acknowledgments

Many thanks go to Maribel Hernandez Rosales, Dulce Palafox, Ishaan Gupta, Sven Findeis, Dominic Rose, and Jörg Hackermüller for fruitful discussions and for proofreading the manuscript. This publication is supported by LIFE-Leipzig Research Center for Civilization Diseases, Universitaet Leipzig. This project was funded by the European Social Fund and the Free State of Saxony.

Author information

Corresponding author

Correspondence to Steve Hoffmann.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Hoffmann, S. (2011). Computational Analysis of High Throughput Sequencing Data. In: Mayer, B. (eds) Bioinformatics for Omics Data. Methods in Molecular Biology, vol 719. Humana Press. https://doi.org/10.1007/978-1-61779-027-0_9

  • DOI: https://doi.org/10.1007/978-1-61779-027-0_9

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-61779-026-3

  • Online ISBN: 978-1-61779-027-0

  • eBook Packages: Springer Protocols
