Computational Identification of Related Proteins

BLAST, PSI-BLAST, and Other Tools

  • Protocol

Part of the book series: Springer Protocols Handbooks (SPH)

Abstract

Molecular sequences that share a high degree of similarity are often thought to have evolved from common ancestral genes. Closely related protein sequences will presumably correspond to similar three-dimensional structures and conserved biological functions (although the reverse is not necessarily true: similar structures and conserved functions do not imply that the corresponding protein sequences will be similar; reviewed in ref. 1). These assumptions provide the basis for computational gene annotation. Typically, the first step in characterizing a novel gene is to compare its sequence against known sequences in available databases and to predict its origin and function by copying the annotation of those previously characterized sequences. This approach has been highly successful and is probably the only practical method applicable to large-scale annotation efforts at present. It should be pointed out, however, that this practice has its limitations (and is also unsatisfactory from the more theoretical perspective of those who wish to determine structure and function from primary sequence; for a provocative editorial on this subject, see ref. 2). The intrinsic problems of transitive propagation of historical annotation errors have been discussed elsewhere (3) and are all too familiar to any biologist who has looked into the databases only to find puzzling annotations that make no sense in light of current knowledge.
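
The annotation-by-similarity step described above can be illustrated with a minimal sketch. It assumes search results in the 12-column tab-separated BLAST format (query ID, subject ID, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, bit score). The function names, the E-value cutoff, and the example identifiers are hypothetical and chosen only for illustration; they are not taken from the protocol itself.

```python
from typing import Dict, Iterable, Optional, Tuple


def best_hits(tabular_lines: Iterable[str],
              max_evalue: float = 1e-10) -> Dict[str, Tuple[str, float]]:
    """Return the lowest E-value hit per query from 12-column tabular output.

    The cutoff `max_evalue` is an illustrative assumption, not a recommendation.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])  # column 11 holds the E-value
        if evalue > max_evalue:
            continue  # hit too weak to support annotation transfer
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


def transfer_annotation(hits: Dict[str, Tuple[str, float]],
                        known: Dict[str, str]
                        ) -> Dict[str, Tuple[Optional[str], str, float]]:
    """Copy the annotation of each best hit to its query.

    The source subject and E-value are kept alongside the copied annotation
    so that the provenance of a transferred annotation can be re-examined.
    """
    return {query: (known.get(subject), subject, evalue)
            for query, (subject, evalue) in hits.items()}


if __name__ == "__main__":
    # Hypothetical search results and annotations, for illustration only.
    lines = [
        "novel1\tsubjA\t62.5\t200\t75\t0\t1\t200\t1\t200\t1e-80\t290",
        "novel1\tsubjB\t30.0\t150\t105\t2\t10\t160\t5\t155\t1e-12\t60",
        "novel2\tsubjB\t25.0\t80\t60\t1\t1\t80\t1\t80\t0.5\t30",
    ]
    known = {"subjA": "example kinase", "subjB": "example transporter"}
    print(transfer_annotation(best_hits(lines), known))
```

Because the annotation is copied unchanged from the database entry, any error in that entry is copied as well, which is the transitive propagation problem the abstract warns about; recording the source hit with each transferred annotation at least makes such cases traceable.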

References

  1. Weir, M., Swindells, M., and Overington, J. (2001) Insights into protein function through large-scale computational analysis of sequence and structure. Trends Biotechnol. 19, S61–S66.

  2. Konopka, A. K. (2003) Selected dreams and nightmares about computational biology. Comp. Biol. & Chem. 27, 91–92.

  3. Brendel, V. (2002) Integration of data management and analysis for genome research. In Schubert, S., Reusch, B., and Jesse, N. (eds.), “Informatik bewegt”. Lecture Notes in Informatics (LNI)—Proceedings P-20, 10–21.

  4. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.

  5. Altschul, S. F., Madden, T. L., Schäffer, A. A., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.

  6. Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J., and Wheeler, D. L. (2003) GenBank. Nucleic Acids Res. 31, 23–27.

  7. Westbrook, J., Feng, Z., Chen, L., Yang, H., and Berman, H. M. (2003) The Protein Data Bank and structural genomics. Nucleic Acids Res. 31, 489–491.

  8. Higgins, D. G., Thompson, J. D., and Gibson, T. J. (1996) Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266, 383–402.

  9. Kumar, S., Tamura, K., and Nei, M. (1994) MEGA: Molecular Evolutionary Genetics Analysis software for microcomputers. Comput. Appl. Biosci. 10, 189–191.

  10. Felsenstein, J. (1989) PHYLIP – Phylogeny Inference Package (Version 3.2). Cladistics 5, 164–166.

  11. Vogt, G., Etzold, T., and Argos, P. (1995) An assessment of amino acid exchange matrices in aligning protein sequences: the twilight zone revisited. J. Mol. Biol. 249, 816–831.

  12. Henikoff, S. and Henikoff, J. G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89, 10,915–10,919.

  13. Dayhoff, M. O., Schwartz, R. M., and Orcutt, B. C. (1978) A model of evolutionary change in proteins. In: (Dayhoff, M. O., ed.) Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, Washington, DC: pp. 345–362.

  14. Altschul, S. F., Boguski, M. S., Gish, W., and Wootton, J. C. (1994) Issues in searching molecular sequence databases. Nat. Genet. 6, 119–129.

  15. Rost, B. (2002) Enzyme function less conserved than anticipated. J. Mol. Biol. 318, 595–608.

  16. Xing, L. and Brendel, V. (2001) Multi-query sequence BLAST output examination with MuSeq Box. Bioinformatics 17, 744–745.

  17. Worley, K. C., Wiese, B. A., and Smith, R. F. (1995) BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res. 5, 173–184.

  18. Brinkman, F. S., Wan, I., Hancock, R. E., Rose, A. M., and Jones, S. J. (2001) PhyloBLAST: facilitating phylogenetic analysis of BLAST results. Bioinformatics 17, 385–387.

  19. Paquola, A. C., Machado, A. A., Reis, E. M., Da Silva, A. M., and Verjovski-Almeida, S. (2003) Zerg: a very fast BLAST parser library. Bioinformatics 19, 1035–1036.

  20. Altschul, S. F. and Koonin, E. V. (1998) Iterated profile searches with PSI-BLAST – a tool for discovery in protein databases. Trends Biochem. Sci. 23, 444–447.

  21. Jones, D. T. and Swindells, M. B. (2002) Getting the most from PSI-BLAST. Trends Biochem. Sci. 27, 161–164.

  22. Mitsuuchi, Y., Johnson, S. W., Sonoda, G., Tanno, S., Golemis, E. A., and Testa, J. R. (1999) Identification of a chromosome 3p14.3-21.1 gene, APPL, encoding an adaptor molecule that interacts with the oncoprotein-serine/threonine kinase AKT2. Oncogene 18, 4891–4898.

  23. Miaczynska, M., Christoforidis, S., Giner, A., et al. (2004) APPL proteins link Rab5 to nuclear signal transduction via an endosomal compartment. Cell 116, 445–456.

  24. Peter, B. J., Kent, H. M., Mills, I. G., et al. (2004) BAR domains as sensors of membrane curvature: the amphiphysin BAR structure. Science 303, 495–499.

  25. Lipman, D. J. and Pearson, W. R. (1985) Rapid and sensitive protein similarity searches. Science 227, 1435–1441.

  26. Pearson, W. R. and Lipman, D. J. (1988) Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444–2448.

  27. Smith, T. and Waterman, M. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.

  28. Usuka, J., Zhu, W., and Brendel, V. (2000) Optimal spliced alignment of homologous cDNA to a genomic DNA template. Bioinformatics 16, 203–211.

  29. Kent, W. J. (2002) BLAT – the BLAST-like alignment tool. Genome Res. 12, 656–664.

  30. Pertsemlidis, A. and Fondon, J. W. 3rd. (2001) Having a BLAST with bioinformatics (and avoiding BLASTphemy). Genome Biol. 2, reviews 2002.1–2002.10.

Copyright information

© 2005 Humana Press Inc., Totowa, NJ

About this protocol

Cite this protocol

Dong, Q., Brendel, V. (2005). Computational Identification of Related Proteins. In: Walker, J.M. (ed.) The Proteomics Protocols Handbook. Springer Protocols Handbooks. Humana Press. https://doi.org/10.1007/978-1-59259-890-8_51

  • DOI: https://doi.org/10.1007/978-1-59259-890-8_51

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-58829-343-5

  • Online ISBN: 978-1-59259-890-8

  • eBook Packages: Springer Protocols
