
The POLYVIEW Server: General Information


Index

About POLYVIEW
Manual
Input data formats
Custom view settings
POLYVIEW-3D
Examples of applications
Automated annotations using scripts
Tips and tricks
Terms of use and disclaimer
References
Acknowledgements

Pictorial definitions used in Polyview-2D protein representations

Legend descriptions (legend images not shown):

  • Amino acid residue numeration
  • Protein secondary structure
    • H - alpha and other helices (model 1)
    • H - alpha and other helices (model 2)
    • E - beta-strand or bridge
    • C - coil
  • Relative solvent accessibility (RSA)
    • 0 - completely buried (0-9% RSA)
    • 9 - fully exposed (90-100% RSA)
  • Physical-chemical properties
    • H - hydrophobic: A, C, F, G, I, L, M, P, V
    • A - amphipathic: H, W, Y
    • P - polar: N, Q, S, T
    • N/C - charged: D, E (negative); R, K (positive)
  • Confidence level of prediction
    • 0 - the lowest level
    • 9 - the highest level
  • Transmembrane domain

Abbreviations used throughout documentation

SS - secondary structure
SA - solvent accessibility
RSA - relative solvent accessibility
PDB - Protein Data Bank
DSSP - Dictionary of Protein Secondary Structure
CASP - Critical Assessment of Techniques for Protein Structure Prediction

About POLYVIEW

The POLYVIEW protein structure visualization server can be used to generate sequence annotations, such as secondary structure, relative solvent accessibility, and physical-chemical property profiles. It can also be used to identify residues involved in protein-protein interactions and highlight other important sites and motifs. Customizable pictures with such annotations may be automatically generated using a script (for details see below).

Manual

This section explains how users can submit input data in order to generate graphical representations of protein structures with annotations available in POLYVIEW. Input data formats and view settings are described. Examples of how to customize the final protein sequence annotation are also provided.

Input data formats

The POLYVIEW server can process three types of input data:

  1. A file with a structure in the PDB format, or the four-character PDB code if the protein of interest can be found in the Protein Data Bank.
    • To enter a PDB entry code, type its four characters (e.g. 1a2x) in the corresponding text box.
    • To submit a file, click the respective Browse button and select the file to be uploaded.

  2. Results from protein structure prediction servers in the CASP format, including secondary structure (SS), relative solvent accessibility (SA), order-disorder regions (DR) and domain boundaries (DP). Results from our own prediction server, SABLE, may also be submitted in their original format.
    • One can visualize sequence profiles by uploading files in the CASP format, including SS, DR, DP, and SA. To have these predictions combined, the files can simply be concatenated into a single file before submission to the POLYVIEW server.
    • At present, there is no standardized CASP format for the prediction of relative solvent accessibility. However, the same format as for the secondary structure prediction can be used for this purpose, with the 'SS' value of the field 'PFRMAT' replaced by 'SA'. In that case, the columns in the sequence section of the file should contain, in order: (1) the one-letter amino acid residue label; (2) an integer from 0 to 99 representing the level of exposure to solvent, i.e. the percentage of solvent accessibility; (3) a confidence score for the prediction (a real number from 0 to 1).
    • The original output from the SABLE server, which is sent by e-mail, can be saved as a file and then uploaded by selecting that file with the Browse button.

  3. Arbitrary protein sequence profiles (using the clipboard copy and paste technique).
    • The amino acid sequence is the only required field when using this option. The remaining fields are optional.
    • One can specify any combination of sequence profiles, such as secondary structure, its prediction confidence, relative solvent accessibility, etc. The only requirement is that each profile has the same length as the amino acid sequence.
    • It is possible to submit multiple arbitrary sequences at a time. In this case all sequence profiles should be delimited by the hash sign (#). For example, 3 amino acid sequences ACDE#FGHI#KLMN with 3 corresponding secondary structure profiles CCCC#HHHH#CCEC can be submitted.
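The length and delimiter rules for arbitrary sequence profiles can be checked before pasting data into the form. The following is a minimal Python sketch of such a check, not the server's own code; the function name split_profiles and the profile name secondary_structure are hypothetical and chosen for illustration only.

```python
def split_profiles(sequences, **profiles):
    """Split '#'-delimited input and check each profile against its sequence.

    `sequences` holds one or more amino acid sequences separated by '#'.
    Each keyword argument is an optional profile using the same delimiter.
    The names used here are illustrative, not POLYVIEW form field names.
    """
    seqs = sequences.split("#")
    records = [{"sequence": s} for s in seqs]
    for name, text in profiles.items():
        parts = text.split("#")
        if len(parts) != len(seqs):
            raise ValueError(
                f"{name}: expected {len(seqs)} profiles, got {len(parts)}"
            )
        for record, part in zip(records, parts):
            # Every profile must be as long as its amino acid sequence.
            if len(part) != len(record["sequence"]):
                raise ValueError(
                    f"{name}: length {len(part)} does not match sequence "
                    f"{record['sequence']!r}"
                )
            record[name] = part
    return records


records = split_profiles("ACDE#FGHI#KLMN", secondary_structure="CCCC#HHHH#CCEC")
for record in records:
    print(record["sequence"], record["secondary_structure"])
```

A profile whose length differs from that of its sequence raises a ValueError, so inconsistent input can be caught before submission.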

The server can process all types of input submitted at the same time, i.e. one can supply data from different sources at once to obtain annotations both as isolated images and as aligned profiles for comparison (for details refer to the Custom view settings section). Data are processed in the order of the list above. For example, if a PDB code is entered and, at the same time, sequence data are pasted into the arbitrary sequence profile input section, POLYVIEW generates an annotation for the PDB protein first and then for the data given in the Arbitrary protein sequence input section.

Custom view settings

Once a protein sequence annotation is generated, the user can customize its appearance. A number of settings are provided for this purpose.

  • Presentation model allows one to switch between different models of graphical representations of protein secondary structures. There are two models available at present. Examples are shown in Figure 1.

    Model 1 Model 2
    Figure 1. Different models for graphical representation of protein secondary structure.

  • Background color option sets a custom color for the image background. One can specify a color as an R(ed)G(reen)B(lue) combination of decimal numbers, or choose a predefined value from the list. The default value is White.
  • Shown information offers a context-based set of options. The number and type of settings displayed depend on the input provided by the user. These options are divided into 2 groups: (1) options that allow one to hide some of the information shown by default; and (2) options that let one add data not shown in the default setting.

    Hide residue numbering option is always available and allows the user to switch the residue numeration in the protein sequence annotation on or off (see Figure 2, B).

    Hide amino acid sequence option is always available as well and lets the user hide the amino acid sequence (see Figure 2, C).

    Hide graphical secondary structure option appears only when related data are available. It hides the graphical representation of secondary structures when it is not to be displayed (see Figure 2, D).

    Hide bars of SS prediction confidence option appears only when data from a protein structure prediction server are submitted. It hides the graphical representation of the prediction confidence.

    Hide relative solvent accessibility option appears whenever residue relative solvent accessibility data are available. It lets the user exclude the information about RSA (see Figure 2, E).

    Hide bars of RSA prediction confidence option appears only when the respective data from a protein structure prediction server are submitted. It hides the graphical representation of the prediction confidence for relative solvent accessibility.

    Show chemical property profile option is always available and lets the user add the physical-chemical profile of the amino acids to the sequence annotation. This can be useful, for example, to find correlations between relative solvent accessibility patterns and chemical profiles (see Figure 2, F).

    Show letter code for secondary structure option appears whenever secondary structure data are provided. It lets the user show the traditional letter representation of secondary structure instead of, or in addition to, the graphical information (see Figure 2, G).

    Show numerical SS prediction confidence option appears when data from protein structure prediction servers are submitted. It lets the user analyse the predicted structure using numerically expressed confidence factors.

    Show numerical relative solvent accessibility option appears whenever the corresponding data are available. It is an alternative representation of the relative solvent accessibility, with numerical values of RSA rather than a graphical grayscale bar (see Figure 2, H). RSA ranges from 0 to 9, with 0 corresponding to a fully buried residue (0-9% RSA) and 9 corresponding to a fully exposed residue (90-100% RSA).

    Show numerical RSA prediction confidence option is available when the respective data are provided. Numerical confidence factors are presented instead of, or in addition to, the graphical bars of the prediction confidence.

    A. Default annotation; B. Hide residue numbering; C. Hide amino acid sequence; D. Hide graphical secondary structure
    E. Hide relative solvent accessibility; F. Show chemical profile; G. Show letter code for secondary structure; H. Show numerical relative solvent accessibility
    Figure 2. Examples of the different types of sequence annotations for the protein Mastoparan-X (PDB code 1a13) generated using options from the Shown information set. A. Annotation produced by default settings after initial data submission. B-H. Results of applying individual settings described above.

  • Highlighting options make it possible to customize the labeling of specific residues.

    Arbitrary residues option provides one with the opportunity to emphasize any amino acid sequence fragment or motif (see Figure 3). The highlighted residues can represent, for example, polymorphic or interfacial residues.

    • Residues can be highlighted by changing the color and/or the font style of amino acid labels. Available colors are red, green and blue. They are specified by their first characters.
    • The font style can only be switched between regular and bold.
    • In case of protein complexes, the chain label for residues of interest needs to be specified as well.
    • Numbers of residues to be highlighted should be enumerated with comma delimitation (white spaces are ignored). The use of dashes to define a range of numbers is also supported.

    The syntax of the string used to specify residues to be highlighted is the following: [Chain_label:]Residue_number[:Color], where '[ ]' denotes optional parts of the string. Capital letters R, G or B are used to highlight a residue in both color and bold font style, whereas lower case characters result in highlighting by the corresponding color only.

    Below are several examples:
    A:145:r - highlight the 145th residue in chain A using red color
    C:5-10 - highlight residues from 5 through 10 in chain C using bold style
    14-17,25-30,43:b - highlight residues 14-17 and 25-30 in the first chain using bold font, and residue 43 using blue color
    A:3,A:10-20,A:35,B:15-20,B:25,B:40 - highlight residues 3, 10-20, 35 in chain A using bold font and residues 15-20, 25, 40 in chain B using the same style
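The syntax and examples above can be turned into a small parser, which may help when preparing long highlighting strings. This is a minimal Python sketch written from the description in this section, not the server's implementation. The function name parse_highlight is hypothetical. It assumes that chain labels are not purely numeric, that residue numbers are non-negative integers, and (following the examples) that an item without a color means bold font only.

```python
import re

# A residue number or a dash-separated range, e.g. "43" or "5-10".
RESIDUES = re.compile(r"^(\d+)(?:-(\d+))?$")
COLORS = {"r": "red", "g": "green", "b": "blue"}


def parse_highlight(spec):
    """Return a list of (chain, residue_number, color, bold) tuples.

    chain is None when no chain label is given (the first chain);
    color is None when only bold font is requested.
    """
    selected = []
    # White spaces are ignored; items are comma-delimited.
    for item in "".join(spec.split()).split(","):
        if not item:
            continue
        parts = item.split(":")
        chain, color = None, None
        if len(parts) == 3:
            chain, residues, color = parts
        elif len(parts) == 2 and RESIDUES.match(parts[0]):
            residues, color = parts          # Residue_number:Color
        elif len(parts) == 2:
            chain, residues = parts          # Chain_label:Residue_number
        elif len(parts) == 1:
            residues = parts[0]
        else:
            raise ValueError(f"cannot parse item {item!r}")
        match = RESIDUES.match(residues)
        if match is None:
            raise ValueError(f"bad residue number or range in {item!r}")
        if color is not None and color.lower() not in COLORS:
            raise ValueError(f"unknown color in {item!r}")
        first = int(match.group(1))
        last = int(match.group(2) or first)
        if last < first:
            raise ValueError(f"descending range in {item!r}")
        # Capital letter: color and bold; lower case: color only; none: bold.
        bold = color is None or color.isupper()
        name = COLORS[color.lower()] if color else None
        selected.extend((chain, n, name, bold) for n in range(first, last + 1))
    return selected


for entry in parse_highlight("A:145:r, C:5-7, 43:B"):
    print(entry)
```

Expanding ranges into individual residues keeps the result easy to compare against a sequence, at the cost of longer output for large ranges.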

    Highlighting example
    Figure 3. Example of highlighting residues. To generate this picture, the option was set to 2-16:b,18,20,22,24-36:R.

    Trans-membrane residues option allows one to highlight the trans-membrane regions in a protein (see the example in Figure 8).

    • Residues are highlighted by shading their background, so no color needs to be specified. This highlighting option can be freely combined with the previous one.
    • In case of protein complexes, the chain label for residues of interest needs to be specified as well. Thus, a syntax for this option is: [Chain_label:]Residue_number.
    • Numbers of residues to be highlighted should be enumerated with comma delimitation (white spaces are ignored). The use of dashes to define a range of numbers is also supported.

    Residues at S-S bridges is an option that appears only when a protein structure is submitted in the PDB format. It automatically highlights those cysteine residues that the DSSP program finds to be in disulphide (S-S) bridges. Selected residues are marked using yellow and bold font style. In addition, the S-S bridges that the cysteines belong to are labeled by lowercase characters in the residue numeration line. Figure 4 gives an example of S-S bridge highlighting using the protein structure with the PDB code 1acw.

    S-S bridges
    Figure 4. Example of highlighting cysteines at S-S bridges.

    Protein complex interface set of options appears only when a protein complex is submitted in the PDB format. It enables automatic recognition and highlighting of those residues that are at the protein-protein interaction interface (see Figure 7 or Figure 7-2). Selected residues are marked using magenta and bold font style.

    • Residues at interface option can be used to initiate the automatic recognition of interfacial sites of a protein complex. When the corresponding checkbox is checked and the Update button is pressed, the POLYVIEW server determines and highlights the interface residues (and also fills in the Highlight residues option text field described above).
    • RSA change threshold option has two input fields and is available as long as the option Highlight residues at interface is checked. The structures of the protein complex and of its isolated chains are analyzed using the DSSP program in order to determine changes in the solvent accessibility of residues in the complex as opposed to the isolated chains. By default, the cutoff is defined on an absolute scale, with a difference of 10 Å² in surface exposed area triggering the selection (see the results of different cutoffs in Figure 7 and Figure 7-2 in the Examples section). One can also specify the cutoff for the change in terms of RSA to be used to identify interfacial residues.

  • Other options are related to the particular properties of the sequence annotation appearance.

    Number of residues per line option sets the width of the image (see Figure 5). It can be used to adjust the shape of the picture. The default value is 50.

    50 residues per line
    15 residues per line
    Figure 5. Number of residues per line (set to 50 in the upper panel and to 15 in the lower panel, respectively).

    Start numeration from option is useful for matching the numeration of sequence annotations derived from different sources. For example, one can set the original PDB starting number for a sequence taken from a protein structure prediction server, which facilitates a comparison. When a protein complex is submitted, it is possible to set the starting number for each chain individually by listing the numbers separated by commas. If only one number is given, the numeration of all sequences starts from this number. If no number is supplied (i.e. the text box is empty), the original numeration from the PDB is kept.

    Merge sequence annotations option appears only when the protein structure information submitted to the POLYVIEW server contains at least 2 identical amino acid sequences. It allows one to align different 1D profiles and merge them into a single picture with multiple annotations for better comparison.

    • All proteins in the request should have the same amino acid sequence, and their other 1D profiles should be consistent in length.
    • The resulting annotation will contain one common numeration, amino acid sequence and physical-chemical profile for all sequences, whereas the appearance of other information will depend on the data provided for each sequence and on the view settings described above. Figure 6-2 demonstrates the alignment of the same sequence annotations shown in Figure 6.

  • More options are to come...

All view settings mentioned above are applied when a new image is generated after the Update button is pressed. All options can be used in combination with one another. Previous values of the settings can be restored using the Reset button.

POLYVIEW-3D

If a protein structure is submitted in the PDB format, it is possible to generate 3D animated images and publication quality slides using POLYVIEW-3D. Along with high quality rendering, the server provides structural and functional analysis. For details and examples, please refer to the POLYVIEW-3D tutorial.

Examples of applications

Below are several examples demonstrating how the POLYVIEW server can be used for structural and functional annotations.

  • POLYVIEW is a fast and convenient tool to view the results from protein structure prediction servers.

    [Images, top to bottom: 1cqu from PDB; 1cqu by SABLE; 1cqu by Prof; 1cqu by PsiPred]
    Figure 6. Example of SS and SA predictions for the 50S ribosomal protein L9 (PDB code 1cqu). Results derived directly from the PDB and from the prediction servers SABLE, Prof, and PsiPred are compared. Colored bars below SS represent confidence level for structure prediction.

    [Image: 1cqu by PDB, SABLE, Prof, PsiPred; rows, top to bottom: PDB, SABLE, Prof, PsiPred]
    Figure 6-2. Example of the alignment of sequence annotations for the same data as shown in Figure 6, produced with the Merge sequence annotations option.

    Different types of input data were used to generate the above annotations. The PDB entry code was used to produce a graphical representation of the actual structure. The SABLE prediction was submitted from a file in the original format. The PsiPred prediction was submitted using the clipboard copy and paste technique. The Prof results were submitted using a file in the CASP format.

  • POLYVIEW can be used for automatic identification of residues located at protein-protein interaction interfaces.

    [Images: recognition of interfacial residues. Columns: Chain A, Chain B. Rows: in the complex, as isolated chain.]
    Figure 7. Example of the automatic recognition of residues at the protein-protein interaction interface (using protein complex 1a15). Residues highlighted in magenta and bold have a different RSA in the complex relative to the isolated chains and are, therefore, identified as sites of contact between the two chains. An absolute threshold of 10 Å² for the SA change was used to define interfacial sites.

    The individual values of solvent accessibility are normalized to the range 0-9 and are presented as grayscale bars. In some cases (e.g. for large residues such as tryptophan), a change of 10 Å² or even more may not shift a residue to another RSA bin, because the change does not exceed 10% of the residue's nominal SA. On the other hand, some changes in SA smaller than 10 Å² may change the RSA bin because of rounding. An example of the first case is residue 64 (lysine, K) in chain A, and an example of the second is residue 31 (threonine, T) in chain A (see Figure 7). Thus, it is strongly recommended to rely on the built-in automatic identification of interfacial sites rather than on visual comparison of the RSA patterns of a protein complex and its isolated chains.
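The binning effect described above can be illustrated with a short calculation. This is a minimal Python sketch, assuming that the 0-9 scale is obtained by dividing the RSA percentage into ten equal bins (0-9% gives 0, 90-100% gives 9); the exact rounding used by the server is not documented here. The function name rsa_bin and the nominal SA values are hypothetical.

```python
def rsa_bin(sa, nominal_sa):
    """Map an absolute SA (in Å²) to a 0-9 relative solvent accessibility bin.

    nominal_sa is the reference surface area of the residue type; the
    values used below are made up for illustration.
    """
    rsa_percent = 100.0 * sa / nominal_sa
    # Ten bins of 10% each; values of 100% or more fall into bin 9.
    return max(0, min(9, int(rsa_percent // 10)))


# Large residue (hypothetical nominal SA of 250 Å²): a 12 Å² change
# keeps the residue in the same bin.
print(rsa_bin(120, 250), rsa_bin(108, 250))

# Smaller residue (hypothetical nominal SA of 140 Å²): a 2 Å² change
# crosses a bin boundary.
print(rsa_bin(57, 140), rsa_bin(55, 140))
```

The two cases show why a visual comparison of grayscale bars can both miss a real change in a large residue and overstate a small change near a bin boundary.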

    [Images: recognition of interfacial residues. Columns: Chain A, Chain B. Rows: in the complex, as isolated chain.]
    Figure 7-2. Automatic recognition of residues at the protein-protein interaction interface for the same complex as in Figure 7. In this case, a relative threshold of 10% for the RSA change was used to define interfacial sites.

    The advantage of using the relative change in SA to distinguish interfacial from non-interfacial sites is that this definition is more likely to capture conserved residues that have real changes in RSA (rather than slight random changes caused by being in the neighborhood of the interface). On the other hand, this measure is less sensitive to changes in SA for large amino acid residues, such as tryptophan. In that case, the absolute change in SA must exceed 20 Å² for the residue to be classified as interfacial with the relative cutoff of 10% RSA.
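The difference between the absolute and the relative cutoff can be sketched as follows. This Python example is an illustration under stated assumptions, not the server's code: the function name is_interfacial is hypothetical, the nominal SA values are made up, and the comparison is assumed to be "greater than or equal to" the cutoff.

```python
def is_interfacial(sa_isolated, sa_complex, nominal_sa=None,
                   abs_cutoff=10.0, rel_cutoff=None):
    """Decide whether a residue is interfacial from its change in SA.

    sa_isolated and sa_complex are the solvent accessible areas (Å²) of
    the residue in the isolated chain and in the complex. If rel_cutoff
    (in % RSA) is given, the change is measured relative to nominal_sa;
    otherwise the absolute cutoff in Å² is used.
    """
    buried = sa_isolated - sa_complex
    if rel_cutoff is not None:
        if nominal_sa is None:
            raise ValueError("nominal_sa is required for a relative cutoff")
        return 100.0 * buried / nominal_sa >= rel_cutoff
    return buried >= abs_cutoff


# Large residue, hypothetical nominal SA of 250 Å², 15 Å² buried:
# selected by the absolute cutoff, but only 6% in relative terms.
print(is_interfacial(120, 105))
print(is_interfacial(120, 105, nominal_sa=250, rel_cutoff=10))

# Small residue, hypothetical nominal SA of 75 Å², 8 Å² buried:
# below the absolute cutoff, but above 10% in relative terms.
print(is_interfacial(40, 32))
print(is_interfacial(40, 32, nominal_sa=75, rel_cutoff=10))
```

With these made-up numbers the two definitions disagree for both residues, which mirrors the differences between Figure 7 and Figure 7-2.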

  • POLYVIEW allows one to analyse trans-membrane proteins in order to detect trans-membrane regions.

    PDB Trans-membrane protein
    SABLE Trans-membrane protein
    Figure 8. Visualization of the trans-membrane protein Sensory Rhodopsin II (PDB code 1h68). Residues highlighted by yellow background are located in trans-membrane regions according to Swiss-Prot database (Swiss-Prot code P42196). Upper panel shows the results obtained from the DSSP program as applied to this structure without accounting for different environments. Lower panel contains the SABLE server prediction that indicates residues with low water accessible surface area coinciding with membrane regions.

    The combination of the SABLE prediction and the POLYVIEW annotation provides a convenient tool for identifying trans-membrane regions. The sequence of a known membrane protein was submitted to SABLE in order to show how SABLE can be used to indicate the presence of membrane domains. The prediction shown above reveals long alpha-helices and fully "buried" residues (meaning residues with low water accessible surface area). This coincides with the actual data about trans-membrane regions derived from the corresponding Swiss-Prot entry.

  • One of the functions of POLYVIEW-3D is to visualize SPPIDER's protein functional sites prediction mapped to the corresponding 3D structure.

    A. SPPIDER prediction: 1f4j_A; B. SPPIDER prediction: 1g3n_A; C. SPPIDER prediction: 1lqb_C
    Figure 9. Panel A: Human erythrocyte catalase as a part of the oxidoreductase complex (PDB entry 1f4j, chain A). Panel B: Cyclin-dependent kinase 6 (CDK6) as a part of the p18(ink4c)-cdk6-k-cyclin ternary complex (1g3n:A). Panel C: Von Hippel-Lindau disease tumor suppressor from the pvhl/elongin-c/elongin-b complex (1lqb:C). Color scheme used: Red - true positives (residues correctly predicted to be at interface); White - true negatives (residues with no functional annotation); Yellow - false positives (residues wrongly predicted to be at interface); Blue - false negatives (known but not recognized interfacial residues).

Automated annotations using scripts

The POLYVIEW server can be used in conjunction with the script provided below to automate submissions to the server for large scale annotation tasks. We provide here a Perl script that allows one to set default view settings for multiple protein sequence annotations. In batch mode, the script can read input files with multiple queries and submit them one by one.

To download the script, click here. It was last edited on December 1, 2005.

The package consists of 2 files:

  • polyview.pl - the Perl script itself.
  • options.txt - a file with options for the script.

Make sure that your computer has the following software installed:

Tips and tricks

Below are some tips for working with the pictures of protein annotations:

  • To get an amino acid sequence, as well as secondary structure and relative solvent accessibility, in text (FASTA) format, check the appropriate checkbox in the Sequence processing toolbar at the Get sequence tool. The sequence will appear in the text area of a popup window, from which it can be selected and copied to the clipboard as plain text.

  • To get an image in a publishable format, click on the Get as tool in the Image processing toolbar next to each image. You will be prompted to save a PS or TIFF file to disk.

  • After generating an animated GIF image (256 colors only) using the POLYVIEW-3D extension, one can download selected slides in True Color and at a larger size in both PNG and TIFF (300dpi) formats.

  • If a TIFF image looks tiny when embedded in a document, scale it to the desired size; the quality will not get worse.

Terms of use and disclaimer

All images generated by the POLYVIEW server can be freely saved, printed, and distributed by means of any media without our written permission for academic and non-commercial purposes. However, the use of POLYVIEW's pictures should be acknowledged by a reference to the server.

The use of the POLYVIEW web site and server is at your own risk, and no liability is accepted for any loss or damage arising through the use of the web site and of graphical representations generated by the server.

References

Please use the following references:

A. Porollo, R. Adamczak, J. Meller (2004) POLYVIEW: A Flexible Visualization Tool for Structural and Functional Annotations of Proteins, Bioinformatics, 20: 2460-2462.

A. Porollo, J. Meller (2007) Versatile Annotation and Publication Quality Visualization of Protein Complexes Using POLYVIEW-3D, BMC Bioinformatics, 8:316.

Acknowledgements

This work was supported by the University of Cincinnati College of Medicine, Cincinnati Children's Hospital Research Foundation, and NIH through grants: AI055338, R01 AR050688, 5R01GM067823-02.

The following software is used to run the POLYVIEW web server and to provide the services described above:

  • The DSSP program by W. Kabsch and C. Sander, available at the Centre for Molecular and Biomolecular Informatics, University of Nijmegen, the Netherlands.
    It is used to calculate the protein secondary structure and relative solvent accessibility for structures submitted in the PDB format.
  • Lincoln D. Stein's Perl GD graphics library, which can be found at the Boutell.Com, Inc. web server or in any CPAN modules collection.
    It is used to generate images representing protein sequence-structure-function 2D annotation.
  • Roger Sayle's RasMol (v 2.7.3) from Biomolecular Structures Group, Hertfordshire, UK and
    W.L. DeLano's PyMol from DeLano Scientific, San Carlos, CA, USA.
    They are used to generate 3D animated images for protein structures submitted in the PDB format as a part of POLYVIEW-3D.

Last update of the document: September, 2007