Computational Identification of Related Proteins

Dong, Qunfeng; Brendel, Volker

doi:10.1007/978-1-59259-890-8_51

BLAST, PSI-BLAST, and Other Tools

Protocol

Part of the book series: Springer Protocols Handbooks (SPH)

Abstract

Molecular sequences that share a high degree of similarity are often thought to have evolved from common ancestral genes. Closely related protein sequences will presumably correspond to similar three-dimensional structures and conserved biological functions (although the reverse is not necessarily true: similar structures and conserved functions do not imply that the corresponding protein sequences will be similar; reviewed in ref. 1). These assumptions provide the basis for computational gene annotation. Typically, the first step in characterizing a novel gene is to compare its sequence against known sequences in available databases and to predict its origin and function by copying the annotation of those previously characterized sequences. This approach has been highly successful and is probably the only practical method applicable to large-scale annotation efforts at present. This practice has limitations, however, and is also unsatisfactory from the more theoretical perspective of those who wish to determine structure and function from primary sequence (for a provocative editorial on this subject, see ref. 2). The intrinsic problems of transitive propagation of historical annotation errors have been discussed elsewhere (ref. 3) and are all too familiar to any biologist who has looked into the databases only to find puzzling annotations that make no sense in light of current knowledge.
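
The annotation-by-similarity step described above can be illustrated with a minimal sketch. It is not the authors' protocol: it assumes search results are already available as 12-column tab-separated lines in the style of BLAST tabular output (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score). The function name `transfer_annotation`, the E-value cutoff of 1e-10, and all sequence identifiers and annotation strings are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple


def transfer_annotation(
    hits: Iterable[str],
    annotations: Dict[str, str],
    max_evalue: float = 1e-10,  # assumed cutoff, not a recommendation
) -> Dict[str, Tuple[str, str, float]]:
    """Give each query the annotation of its lowest-E-value annotated hit."""
    best: Dict[str, Tuple[str, float]] = {}
    for line in hits:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        # Ignore weak hits and hits that carry no annotation to copy.
        if evalue > max_evalue or subject not in annotations:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    # Map query -> (source sequence, copied annotation, E-value).
    return {q: (s, annotations[s], e) for q, (s, e) in best.items()}


# Hypothetical search results and database annotations.
hits = [
    "novel1\tdbseq_A\t62.5\t200\t75\t0\t1\t200\t5\t204\t1e-80\t290",
    "novel1\tdbseq_B\t35.0\t180\t117\t2\t10\t189\t3\t180\t2e-12\t65.1",
    "novel2\tdbseq_B\t28.0\t90\t65\t1\t1\t90\t1\t89\t0.5\t30.2",
]
annotations = {
    "dbseq_A": "serine/threonine kinase",
    "dbseq_B": "hypothetical protein",
}

result = transfer_annotation(hits, annotations)
print(result)
```

In this toy input, `novel1` inherits the annotation of `dbseq_A`, while `novel2` stays unannotated because its only hit is above the cutoff. The sketch also shows why annotation errors propagate: whatever label `dbseq_A` carries is copied verbatim, and the returned source identifier is the only trace of where it came from.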

References

1. Weir, M., Swindells, M., and Overington, J. (2001) Insights into protein function through large-scale computational analysis of sequence and structure. Trends Biotechnol. 19, S61–S66.
2. Konopka, A. K. (2003) Selected dreams and nightmares about computational biology. Comp. Biol. & Chem. 27, 91–92.
3. Brendel, V. (2002) Integration of data management and analysis for genome research. In Schubert, S., Reusch, B., and Jesse, N. (eds.), “Informatik bewegt”. Lecture Notes in Informatics (LNI) - Proceedings P-20, 10–21.
4. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
5. Altschul, S. F., Madden, T. L., Schäffer, A. A., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
6. Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J., and Wheeler, D. L. (2003) GenBank. Nucleic Acids Res. 31, 23–27.
7. Westbrook, J., Feng, Z., Chen, L., Yang, H., and Berman, H. M. (2003) The Protein Data Bank and structural genomics. Nucleic Acids Res. 31, 489–491.
8. Higgins, D. G., Thompson, J. D., and Gibson, T. J. (1996) Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266, 383–402.
9. Kumar, S., Tamura, K., and Nei, M. (1994) MEGA: Molecular Evolutionary Genetics Analysis software for microcomputers. Comput. Appl. Biosci. 10, 189–191.
10. Felsenstein, J. (1989) PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics 5, 164–166.
11. Vogt, G., Etzold, T., and Argos, P. (1995) An assessment of amino acid exchange matrices in aligning protein sequences: the twilight zone revisited. J. Mol. Biol. 249, 816–831.
12. Henikoff, S. and Henikoff, J. G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89, 10,915–10,919.
13. Dayhoff, M. O., Schwartz, R. M., and Orcutt, B. C. (1978) A model of evolutionary change in proteins. In: (Dayhoff, M. O., ed.) Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, Washington, DC: pp. 345–362.
14. Altschul, S. F., Boguski, M. S., Gish, W., and Wootton, J. C. (1994) Issues in searching molecular sequence databases. Nat. Genet. 6, 119–129.
15. Rost, B. (2002) Enzyme function less conserved than anticipated. J. Mol. Biol. 318, 595–608.
16. Xing, L. and Brendel, V. (2001) Multi-query sequence BLAST output examination with MuSeq Box. Bioinformatics 17, 744–745.
17. Worley, K. C., Wiese, B. A., and Smith, R. F. (1995) BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res. 5, 173–184.
18. Brinkman, F. S., Wan, I., Hancock, R. E., Rose, A. M., and Jones, S. J. (2001) PhyloBLAST: facilitating phylogenetic analysis of BLAST results. Bioinformatics 17, 385–387.
19. Paquola, A. C., Machado, A. A., Reis, E. M., Da Silva, A. M., and Verjovski-Almeida, S. (2003) Zerg: a very fast BLAST parser library. Bioinformatics 19, 1035–1036.
20. Altschul, S. F. and Koonin, E. V. (1998) Iterated profile searches with PSI-BLAST - a tool for discovery in protein databases. Trends Biochem. Sci. 23, 444–447.
21. Jones, D. T. and Swindells, M. B. (2002) Getting the most from PSI-BLAST. Trends Biochem. Sci. 27, 161–164.
22. Mitsuuchi, Y., Johnson, S. W., Sonoda, G., Tanno, S., Golemis, E. A., and Testa, J. R. (1999) Identification of a chromosome 3p14.3-21.1 gene, APPL, encoding an adaptor molecule that interacts with the oncoprotein-serine/threonine kinase AKT2. Oncogene 18, 4891–4898.
23. Miaczynska, M., Christoforidis, S., Giner, A., et al. (2004) APPL proteins link Rab5 to nuclear signal transduction via an endosomal compartment. Cell 116, 445–456.
24. Peter, B. J., Kent, H. M., Mills, I. G., et al. (2004) BAR domains as sensors of membrane curvature: the amphiphysin BAR structure. Science 303, 495–499.
25. Lipman, D. J. and Pearson, W. R. (1985) Rapid and sensitive protein similarity searches. Science 227, 1435–1441.
26. Pearson, W. R. and Lipman, D. J. (1988) Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444–2448.
27. Smith, T. and Waterman, M. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.
28. Usuka, J., Zhu, W., and Brendel, V. (2000) Optimal spliced alignment of homologous cDNA to a genomic DNA template. Bioinformatics 16, 203–211.
29. Kent, W. J. (2002) BLAT - the BLAST-like alignment tool. Genome Res. 12, 656–664.
30. Pertsemlidis, A. and Fondon, J. W. 3rd. (2001) Having a BLAST with bioinformatics (and avoiding BLASTphemy). Genome Biol. 2, reviews 2002.1–2002.10.

Author information

Authors and Affiliations

Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA
Qunfeng Dong
Department of Genetics, Development and Cell Biology, Department of Statistics, Iowa State University, Ames, IA
Volker Brendel

Editor information

Editors and Affiliations

University of Hertfordshire, Hatfield, UK
John M. Walker

Cite this protocol

Dong, Q., Brendel, V. (2005). Computational Identification of Related Proteins. In: Walker, J.M. (ed.) The Proteomics Protocols Handbook. Springer Protocols Handbooks. Humana Press. https://doi.org/10.1007/978-1-59259-890-8_51

DOI: https://doi.org/10.1007/978-1-59259-890-8_51
Publisher Name: Humana Press
Print ISBN: 978-1-58829-343-5
Online ISBN: 978-1-59259-890-8
eBook Packages: Springer Protocols