MAVL/StickWRLD for protein: visualizing protein sequence families to detect non-consensus features

William C Ray

doi:10.1093/nar/gki374

Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W315-9. doi: 10.1093/nar/gki374.

Author

William C Ray¹

Affiliation

¹ Children's Research Institute and The Department of Pediatrics, The Ohio State University, 700 Children's Drive, Columbus, OH 43205, USA. ray@biosci.ohio-state.edu

Abstract

A fundamental problem with applying Consensus, Weight-Matrix or hidden Markov models as search tools for biosequences is that the model alone cannot reveal whether the modeled sequences display dependencies between positional identities. In some instances, these dependencies are crucial for correctly accepting or rejecting other sequences as members of the family. MAVL (multiple alignment variation linker) and StickWRLD provide a web-based method to visually survey the model-training sequences to discover and characterize possible dependencies. MAVL/StickWRLD was initially introduced for nucleic acid sequences, where it makes it easy to distinguish typical DNA or RNA structural dependencies in input families, identify mixed populations of distinct subfamilies, or discover novel dependencies that result from binding interactions or other selective pressures [W. Ray (2004) Nucleic Acids Res., 32, W59-W63]. Since the announcement of MAVL/StickWRLD for nucleic acids, one of the most requested new features has been the extension of this visualization method to support protein alignments. We are pleased to report that this extension has been successful, that the basic visualization has been augmented in several ways to enhance protein viewing, and that the results with protein alignments are even more dramatic than with nucleic acid alignments. MAVL/StickWRLD can be accessed at http://www.microbial-pathogenesis.org/stickwrld/.
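
The point about per-position models hiding dependencies can be illustrated with a small sketch. The Python below is not the MAVL/StickWRLD implementation; it assumes a generic measure (observed pair frequency minus the frequency expected if the two columns were independent), and the function names, the toy alignments and the 0.2 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def column_frequencies(alignment):
    """Per-column residue frequencies, i.e. what a weight matrix records."""
    n = len(alignment)
    return [
        {res: count / n for res, count in Counter(col).items()}
        for col in zip(*alignment)
    ]


def pair_residuals(alignment, threshold=0.2):
    """Return (col_i, res_a, col_j, res_b, residual) for residue pairs whose
    observed co-occurrence differs from the independence expectation by at
    least `threshold` (an arbitrary, illustrative cutoff)."""
    n = len(alignment)
    freqs = column_frequencies(alignment)
    hits = []
    for i, j in combinations(range(len(freqs)), 2):
        observed = Counter((seq[i], seq[j]) for seq in alignment)
        for a in freqs[i]:
            for b in freqs[j]:
                expected = freqs[i][a] * freqs[j][b]
                residual = observed[(a, b)] / n - expected
                if abs(residual) >= threshold:
                    hits.append((i, a, j, b, residual))
    return hits


# Two toy "families" with identical per-column frequencies.
dependent = ["AKG", "AKG", "ELG", "ELG"]    # A always pairs with K, E with L
independent = ["AKG", "ALG", "EKG", "ELG"]  # columns 0 and 1 vary freely

print(column_frequencies(dependent) == column_frequencies(independent))
print(pair_residuals(dependent))
print(pair_residuals(independent))
```

Both toy alignments yield the same column frequencies, so a weight matrix built from either would be identical. Only the pairwise comparison separates them, which is the kind of relationship the abstract says must be surveyed in the training sequences themselves.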

MeSH terms

Adenylate Kinase / chemistry
Algorithms
Computer Graphics*
Internet
Models, Molecular*
Protein Conformation*
Protein Structure, Tertiary
Sequence Alignment / methods*
Sequence Analysis, Protein / methods*
Software*
User-Computer Interface

Substances

Adenylate Kinase