Jinjiang: Jinjiang's WebWatcher on Biology (1) -PROTEIN SECONDARY STRUCTURE

PROTEIN SECONDARY STRUCTURE

Acknowledgement: ref: http://molbiol-tools.ca/Protein_secondary_structure.htm

Precautionary Quote: "We should be quite remiss not to emphasize that despite the popularity of secondary structural prediction schemes, and the almost ritual performance of these calculations, the information available from this is of limited reliability. This is true even of the best methods now known, and much more so of the less successful methods commonly available in sequence analysis packages. Running a secondary structure prediction on a newly-determined sequence just because everyone else does so, is to be deplored, and the fact that the results of such predictions are generally ignored is insufficient justification for doing and publishing them." Arthur Lesk, 1988.

My favourite site is the Protein Sequence Analysis (PSA) Protein Structure Prediction Server (BMERC, BioMolecular Engineering Research Center, Boston University), which predicts probable secondary structures and folding classes for a given amino acid sequence. Results are available in PostScript, which requires a viewer such as Ghostview. A self-extracting version of the latter for Windows can be obtained free from the University of Wisconsin. In addition, a new private Web format is available.

YASPIN secondary structure prediction - is an HNN (Hidden Neural Network) secondary structure prediction program that uses the PSI-BLAST algorithm to produce a PSSM (Position Specific Scoring Matrix) for the input sequence, which it then uses to perform its prediction. (Reference: K. Lin et al. 2005. Bioinformatics 21: 152-159).
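To make the idea of a position specific scoring matrix a little more concrete, here is a minimal sketch in Python. It is not how PSI-BLAST builds its matrix (PSI-BLAST adds sequence weighting and substitution-matrix-based pseudocounts); it simply assumes an ungapped alignment, a flat pseudocount and a uniform background frequency. The function names `build_pssm` and `score` are my own, chosen for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Build a log-odds matrix (one dict per column) from equal-length sequences.

    Assumptions: uniform background (1/20 per residue) and a flat pseudocount.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        # Start every residue at the pseudocount so no frequency is zero.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in alignment:
            residue = seq[col].upper()
            if residue in counts:  # gaps and non-standard letters are skipped
                counts[residue] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return pssm


def score(pssm, seq):
    """Sum the column scores for a sequence; unknown residues contribute 0."""
    return sum(col.get(aa, 0.0) for col, aa in zip(pssm, seq.upper()))


if __name__ == "__main__":
    toy_alignment = ["MKL", "MKV", "MRL"]  # made-up example data
    matrix = build_pssm(toy_alignment)
    print(round(score(matrix, "MKL"), 2), round(score(matrix, "AAA"), 2))
```

Residues that are frequent in a column receive positive scores and rare ones negative scores, so a sequence resembling the alignment scores higher than an unrelated one.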

For a metasite linked to a wide range of online protein sequence analysis and structure prediction programs, I recommend PredictProtein (ROSTLAB, Technische Universität München). Also see: SCRATCH Protein Predictor (Institute for Genomics & Bioinformatics, University of California, Irvine, U.S.A.).

Several great sites for online analysis of potential membrane-spanning proteins are: (Test sequence; see Orientation of Proteins in Membranes for 268 unique α-helical membrane protein structures)

TMpred - Prediction of Trans-membrane Regions and Orientation - ISREC (Swiss Institute for Experimental Cancer Research)
TMHMM - Prediction of transmembrane helices in proteins (Center for Biological Sequence Analysis, The Technical University of Denmark)
DAS - Transmembrane Prediction Server (Stockholm University, Sweden)
SPLIT (D. Juretic, Univ. Split, Croatia) - the Transmembrane Protein Topology Prediction Server provides clear and colourful output including beta preference and modified hydrophobic moment index.
OCTOPUS - Using a novel combination of hidden Markov models and artificial neural networks, OCTOPUS predicts the correct topology for 94% of a dataset of 124 sequences with known structures. (Reference: Viklund, H. & Elofsson, A. 2008. Bioinformatics 24: 1662-1668)

Phobius - is a combined transmembrane topology and signal peptide predictor (Reference: L. Käll et al. 2004. J. Mol. Biol. 338: 1027-1036). This tool can also be accessed here and here. SPOCTOPUS will also do this.

RHYTHM - predicts the orientation of transmembrane helices in channels and membrane-coils, specifically buried versus exposed residues. (Reference: A. Rose et al. 2009. Nucl. Acids Res. 37(Web Server issue):W575-W580)

TMMOD - Hidden Markov Model for Transmembrane Protein Topology Prediction (Dept. Computer & Information Sciences, University of Delaware, U.S.A.) - on the results page click on "show posterior probabilities" to see a TMHMM-type diagram

PRED-TMR2 (C. Pasquier & S.J. Hamodrakas, Dept. Cell Biology and Biophysics, Univ. Athens, Greece) - when applied to several test sets of transmembrane proteins the system gives a perfect prediction rating of 100% by classifying all the sequences in the transmembrane class. The error rate with non-transmembrane proteins is only 2.5%.

TOPCONS - computes consensus predictions of membrane protein topology using a Hidden Markov Model (HMM) and input from five state-of-the-art topology prediction methods. (Reference: A. Bernsel et al. 2009. Nucleic Acids Res. 37(Web Server issue): W465-W468). For a batch server without BLAST runs use TOPCONSsingle.

MINNOU (Membrane protein IdeNtificatioN withOUt explicit use of hydropathy profiles and alignments) - predicts alpha-helical as well as beta-sheet transmembrane (TM) domains based on a compact representation of an amino acid residue and its environment, which consists of predicted solvent accessibility and secondary structure of each amino acid. (Reference: Cao et al. 2006. Bioinformatics 22: 303-309). A legend to help interpret the results is here.

SuperLooper - provides the first online interface for the automatic, quick and interactive search and placement of loops in proteins. (Reference: P.W. Hildebrand et al. 2009. Nucl. Acids Res. 37(Web Server issue): W571-W574)

For drawing the structure of transmembrane proteins two sites are available:

TOPO2 (S. Johns, UCSF Sequence Analysis Consulting Service, U.S.A.) - this is the best site, providing considerable control over the presentation. Extensive documentation is provided here.
RbDe (F. Campagne, Inst. Computational Biomedicine, Weill Medical College of Cornell University, New York, U.S.A.) - also permits one to prepare useful diagrams of transmembrane proteins.

TMRPres2D (TransMembrane protein Re-Presentation in 2 Dimensions tool) - this Java tool takes data from a variety of protein folding servers and creates uniform, two-dimensional, high analysis graphical images/ models of alpha-helical or beta-barrel transmembrane proteins. (Reference: I.C. Spyropoulos et al. 2004. Bioinformatics 20: 3258-3260).

Signal peptide recognition & subcellular localization:

A. Bacterial proteins

SLEP (Surface Localization Extracellular Protein) - SLEP is a pipeline for predicting the localization of bacterial proteins starting from genome sequences (FASTA-formatted). It combines the results of several tools: Glimmer, TMHMM, PRODIV-TMHMM, LipoP, PSortB.

PSORTb (Brinkman Lab, Simon Fraser Univ., Canada) - probably the most accurate bacterial protein subcellular localization predictor. Alternatively use PSORT (Univ. Tokyo, Japan) - a series of programs for the prediction of protein localization sites in cells. Choose programs specific for animal, yeast, plant or bacterial (Gram-negative or Gram-positive) proteins.

PSLpred - is an SVM-based method that predicts 5 major subcellular localizations (cytoplasm, inner membrane, outer membrane, extracellular, periplasm) of Gram-negative bacterial proteins. This method includes various SVM modules based on different features of the proteins. The hybrid approach achieved an overall accuracy of 91%, reported as the best among the existing methods for the subcellular localization of prokaryotic proteins. (Reference: M. Bhasin et al. 2005. Bioinformatics 21: 2522-2524).

CELLO subCELlular LOcalization predictive system - assigns Gram-negative proteins to the cytoplasm, inner membrane, periplasm, outer membrane or extracellular space with an overall prediction accuracy of ca. 89%. Also analyzes eukaryotic and Gram-positive proteins. (Reference: C.S. Yu et al. 2004. Protein Sci. 13: 1402-1406).

SubLoc - based on SOAP technology, this server/client suite offers a user-friendly interface for searching and predicting protein subcellular location. N.B. It does not predict membrane proteins. Available here or here. (Reference: H. Chen et al. 2006. Bioinformatics 22: 376-377).

SignalP - predicts the presence and location of signal peptide cleavage sites in Gram-positive, Gram-negative and eukaryotic proteins (Center for Biological Sequence Analysis, The Technical University of Denmark). For an example of a periplasmic protein use test sequence MalE.

Phobius - is a combined transmembrane topology and signal peptide predictor (Reference: L. Käll et al. 2004. J. Mol. Biol. 338: 1027-1036).

LipoP 1.0 (Center for Biological Sequence Analysis, The Technical University of Denmark) - predicts where signal peptidases I & II from Gram-negative bacteria will cleave a protein.

SecretomeP - produces ab initio predictions of non-classical i.e. not signal peptide triggered protein secretion. The method queries a large number of other feature prediction servers to obtain information on various post-translational and localizational aspects of the protein, which are integrated into the final secretion prediction (Reference: J.D. Bendtsen et al. 2005. BMC Microbiology 5: 58).
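SLEP above takes FASTA-formatted sequences, and pasting a malformed record is an easy way to get an unhelpful error from a web server. The following is a minimal sketch of checking a FASTA file before submission. The helper names `read_fasta` and `invalid_residues` are my own, and the check assumes that only the 20 standard amino acid letters are wanted (some servers also accept letters such as X or U).

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_fasta(text):
    """Parse FASTA text into a dict mapping record ID to its sequence."""
    chunks = {}
    current_id = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("header line without an identifier")
            # The ID is the first whitespace-delimited token of the header.
            current_id = fields[0]
            if current_id in chunks:
                raise ValueError(f"duplicate record ID: {current_id}")
            chunks[current_id] = []
        elif current_id is None:
            raise ValueError("sequence data found before the first header")
        else:
            chunks[current_id].append(line.upper())
    return {record_id: "".join(parts) for record_id, parts in chunks.items()}


def invalid_residues(sequence):
    """Return the set of characters that are not standard amino acid letters."""
    return set(sequence) - STANDARD_AMINO_ACIDS


if __name__ == "__main__":
    demo = ">p1 made-up example\nMKTAYI\nAKQR\n>p2\nMKXL\n"
    for record_id, sequence in read_fasta(demo).items():
        print(record_id, len(sequence), sorted(invalid_residues(sequence)))
```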

B. Eukaryotic proteins

Protein Prowler Subcellular Localisation Predictor - The subcellular localisation predictor is largely based on TargetP. (Reference: M. Boden & J. Hawkins. 2005. Bioinformatics 21: 2279-2286).
WoLF-PSORT (National Institute of Advanced Science and Technology, Japan)
pTARGET - This method can predict proteins targeted to nine distinct subcellular locations that include cytoplasm, endoplasmic reticulum, extracellular/secreted, Golgi, lysosomes, mitochondria, nucleus, peroxisomes and plasma membrane. A comparison with PSORT showed that pTARGET prediction rates are higher by 11–60% in 6 of the 8 locations tested. (Reference: C. Guda & S. Subramaniam. 2005. Bioinformatics 21: 3963-3969)
ProtComp (Softberry, U.S.A.) can be used to predict the subcellular localization for animal/fungal and plant proteins.

SecretomeP - produces ab initio predictions of non-classical i.e. not signal peptide triggered protein secretion. The method queries a large number of other feature prediction servers to obtain information on various post-translational and localizational aspects of the protein, which are integrated into the final secretion prediction (Reference: J.D. Bendtsen et al. 2005. BMC Microbiology 5: 58).

Other sites for secondary structure predictions include:

JPred - a consensus method for protein secondary structure prediction based upon PHD, Predator, DSC, NNSSP, Zpred and Mulpred programs (European Bioinformatics Institute, Cambridge, United Kingdom) - My favourite site. N.B. Do not forget to deselect advanced option 1 at bottom of page to obtain maximum information.

YASPIN secondary structure prediction - a Hidden Neural Network secondary structure prediction program that uses the PSI-BLAST algorithm to produce a Position Specific Scoring Matrix for the input sequence, which it then uses to perform its prediction. It was trained on 2896 structures from the PDB40 database. (Reference: K. Lin et al. 2005. Bioinformatics 21:152-159).

Network Protein Sequence @nalysis at IBCP - (Institut de Biologie et Chimie des Protéines, Lyon, France) - has DSC, GORIV, Predator, SOPMA and Hierarchical Neural Network Method plus older programs.

For a different colourful approach try PSIpred (UCL Bioinformatics Unit, Department of Computer Science, University College London, United Kingdom). For a full range of properties of your protein including hydrophobicity, alpha helix, beta-sheet plots see ProScale (ExPASy, Switzerland)
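JPred above is described as a consensus method that combines several predictors. As a rough illustration of what a consensus can look like, here is a minimal sketch of a per-residue majority vote over three-state predictions (H = helix, E = strand, C = coil). This simple vote is my own stand-in, not JPred's actual combination scheme, and the rule of falling back to coil on a tie is an assumption.

```python
from collections import Counter


def consensus(predictions, tie_state="C"):
    """Majority vote per residue over equal-length strings of H/E/C states."""
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of the same length")

    result = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        top_state, top_count = ranked[0]
        # If the two most frequent states tie, use the fallback state.
        if len(ranked) > 1 and ranked[1][1] == top_count:
            top_state = tie_state
        result.append(top_state)
    return "".join(result)


if __name__ == "__main__":
    # Made-up outputs from three hypothetical predictors for a 9-residue peptide.
    outputs = ["HHHHCCEEE", "HHHCCCEEE", "HHCCCCEEC"]
    print(consensus(outputs))
```

Positions where the predictors disagree are exactly the ones worth treating with the caution urged in the Lesk quote at the top of this page.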

Disordered states:

Many proteins contain regions that do not form well-defined structures, and the following new programs help define these regions:

RONN (Regional Order Neural Network) - (Reference: Z.R. Yang et al. 2005. Bioinformatics 21: 3369-3376).
IUPred (Reference: Z. Dosztányi et al. 2005. Bioinformatics 21: 3433-3434).
DISOPRED2 (Reference: J.J. Ward et al. 2004. J. Molec. Biol. 337: 635-645).
metaPrDOS - is a meta server to predict natively disordered regions of a protein chain from its amino acid sequence. metaPrDOS returns the disorder tendency of each residue as its prediction result. (Reference: T. Ishida & K. Kinoshita. 2008. Bioinformatics 24: 1344-1348)
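Servers that return a disorder tendency for each residue leave it to you to decide which stretches count as disordered. The sketch below shows one way to turn such a list of scores into segments. The threshold of 0.5 and the minimum run length of 3 are assumptions for illustration, not values taken from any of the servers above, and `disordered_segments` is a hypothetical helper name.

```python
def disordered_segments(scores, threshold=0.5, min_length=3):
    """Return 1-based inclusive (start, end) pairs for runs scoring >= threshold.

    Runs shorter than min_length residues are discarded.
    """
    segments = []
    start = None
    for position, value in enumerate(scores, start=1):
        if value >= threshold:
            if start is None:
                start = position  # a new run begins here
        elif start is not None:
            # The run ended at the previous residue.
            if position - start >= min_length:
                segments.append((start, position - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start + 1 >= min_length:
        segments.append((start, len(scores)))
    return segments


if __name__ == "__main__":
    # Made-up per-residue scores for an 11-residue peptide.
    demo_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6, 0.3, 0.7, 0.8, 0.9, 0.95]
    print(disordered_segments(demo_scores))
```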

Scooby-domain (Sequence hydrophobicity predicts domains) is a method to identify globular regions in protein sequence that are suitable for structural studies. The Scooby-domain JAVA applet can be used as a tool to visually identify 'foldable' regions in protein sequence. Interesting graphics. (Reference: R.A. George et al. 2005. Nucl. Acids Res. 33: W160-W163).

For estimations on the antigenicity of regions of proteins see:

Antigenicity Plot (JaMBW module) - Given a sequence of amino acids, this program computes and plots the antigenicity along the polypeptide chain, as predicted by the algorithm of Hopp & Woods (1981).
Antigenicity Prediction (Princeton BioMolecules, Langhorne, PA, U.S.A.) - finds antigenic sequences and also reviews them from the point of view of hydrophobicity, aggregation, and steric hindrance.
EMBOSS Antigenic (EMBOSS package) - this program predicts potentially antigenic regions of a protein sequence, using the method of Kolaskar & Tongaonkar (1990). Also accessible here.
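The Hopp & Woods approach mentioned above is simple enough to sketch: each residue gets a hydrophilicity value, the values are averaged over a sliding window of six residues, and the highest-scoring windows are taken as candidate antigenic sites. The code below is a minimal sketch of that idea, not the JaMBW implementation; the function names are my own, and it assumes the sequence contains only the 20 standard residues (anything else raises a KeyError).

```python
# Hopp & Woods (1981) hydrophilicity values.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "P": 0.0, "T": -0.4,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(sequence, window=6):
    """Mean hydrophilicity of every window; entry i covers residues i..i+window-1."""
    values = [HOPP_WOODS[aa] for aa in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=6):
    """Return (1-based start, peptide, mean score) of the best-scoring window."""
    profile = hydrophilicity_profile(sequence, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    best = max(range(len(profile)), key=profile.__getitem__)
    return best + 1, sequence.upper()[best:best + window], profile[best]


if __name__ == "__main__":
    # Made-up peptide with a charged stretch flanked by leucines.
    print(most_hydrophilic_window("LLLLLLKDEKRDLLLLLL"))
```

As with secondary structure, a hydrophilic peak is only a hint that a region may be surface-exposed; it says nothing certain about actual antibody binding.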

To screen for coiled-coil regions in proteins use:

Coils - Prediction of Coiled Coil Regions in Proteins (Swiss node of EMBnet, Switzerland) - (Reference: A. Lupas et al. 1991. Science 252: 1162-1164).
Paircoils (MIT Laboratory for Computer Science, U.S.A.) - (Reference: B. Berger et al. 1995. Proc. Natl. Acad. Sci. USA 92: 8259-8263).
MultiCoil - is based on the PairCoil algorithm and is used for locating dimeric and trimeric coiled coils. (Reference: E. Wolf et al. 1997. Protein Sci. 6: 1179-1189).

REPPER (REPeats and their PERiodicities) - detects and analyzes regions with short gapless repeats in proteins. It finds periodicities by Fourier Transform (FTwin) and internal similarity analysis (REPwin). FTwin assigns numerical values to amino acids that reflect certain properties, for instance hydrophobicity, and gives information on corresponding periodicities. REPwin uses self-alignments and displays repeats that reveal significant internal similarities. They are complemented by PSIPRED and coiled coil prediction (COILS), making the server a useful analytical tool for fibrous proteins. (Reference: M. Gruber et al. 2005. Nucl. Acids Res. 33: W239-W243).
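The Fourier idea behind FTwin can be illustrated in a few lines: turn the sequence into numbers using a property scale, then measure how strongly that numerical profile oscillates at a given period. The sketch below is not REPPER's code. It assumes the Kyte & Doolittle hydropathy scale as the property and evaluates the power only at periods you supply; `power_at_period` and `dominant_period` are hypothetical names.

```python
import cmath
import math

# Kyte & Doolittle (1982) hydropathy values.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def power_at_period(sequence, period, scale=KYTE_DOOLITTLE):
    """Squared magnitude of the Fourier component with the given period (residues)."""
    values = [scale[aa] for aa in sequence.upper()]
    if not values:
        raise ValueError("sequence is empty")
    mean = sum(values) / len(values)
    # Subtract the mean so a constant offset does not dominate the spectrum.
    component = sum(
        (value - mean) * cmath.exp(-2j * math.pi * n / period)
        for n, value in enumerate(values)
    )
    return abs(component) ** 2


def dominant_period(sequence, periods):
    """Return the candidate period with the highest power."""
    return max(periods, key=lambda period: power_at_period(sequence, period))


if __name__ == "__main__":
    # Made-up heptad-like repeat: leucine at positions a and d, alanine elsewhere.
    heptad_repeat = "LAALAAA" * 6
    print(dominant_period(heptad_repeat, [2.0, 3.5, 5.0, 7.0]))
```

In this toy repeat the hydrophobic residues are spaced 3 and 4 apart, so the strongest signal appears near 3.5 residues, which is the periodicity associated with coiled coils.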

Domain linkers:

Armadillo Domain Linker Prediction (The Blueprint Initiative, Toronto, Canada) - Proteins are often composed of multiple structural/functional domains. Domain linkers link these domains together and have been found to contain an amino acid signature that is distinct from the structurally compact domains. Using a set of 211 two-domain contiguous proteins, the sensitivity was 56%.

Beta-barrel outer membrane proteins: (Test sequence)

PRED-TMBB (Bagos, P. G., et al. Dept Cell Biology & Biophysics, University of Athens, Greece) - employs a Hidden Markov Model method, capable of predicting and discriminating beta-barrel outer membrane proteins. Gives one the opportunity to download a custom image plot or a 2D representation.

BetaTPred2 (Bioinformatics Center, Institute of Microbial Technology, India) - predicts β turns in proteins from the given amino acid sequence, using a neural network and multiple alignment. For β turn prediction, it uses the position specific score matrices generated by PSI-BLAST and secondary structure predicted by PSIPRED. For a classification of the β turn type use BetaTurns.

TMB-Hunt - amino acid composition based TransMembrane Barrel-Hunt (A. Garrow, University of Leeds, England) - provides one with a color-coded score (& Evalue) for an individual or a series of proteins. (Reference: A.G. Garrow et al. 2005. Nucl. Acids Res. 33: W193-W197).

TMBETA-NET - Discrimination and Prediction of Transmembrane Beta Strands in Outer Membrane Proteins from amino acid sequence. Presents color-coded TM beta segments and their probabilities (Reference: M.M. Gromiha & M. Suwa. 2005. Bioinformatics 21: 961 - 968).

Metasite:

Scratch Protein Predictor - (Institute for Genomics and Bioinformatics, University of California, Irvine) - programs include:
ACCpro: the relative solvent accessibility of protein residues
CMAPpro: prediction of amino acid contact maps
COBEpro: prediction of continuous B-cell epitopes
CONpro: predicts whether the number of contacts of each residue in a protein is above or below the average for that residue
DIpro: prediction of disulphide bridges
DISpro: prediction of disordered regions
DOMpro: prediction of domains
SSpro: prediction of protein secondary structure
SVMcon: prediction of amino acid contact maps using Support Vector Machines
3Dpro: prediction of protein tertiary structure (ab initio)

Jinjiang

Monday, 6 February 2012
