|
The POLYVIEW Server: General Information
|
|
Index
- About POLYVIEW
- Manual
- Input data formats
- Custom view settings
- POLYVIEW-3D
- Examples of applications
- Automated annotations using scripts
- Tips and tricks
- Terms of use and disclaimer
- Reference
- Acknowledgements
Pictorial definitions used in Polyview-2D protein representations
| Legend | Description |
| (image) | Amino acid residue numeration |
| (image) | Protein secondary structure: H - alpha and other helices (model 1) |
| (image) | Protein secondary structure: H - alpha and other helices (model 2) |
| (image) | Protein secondary structure: E - beta-strand or bridge |
| (image) | Protein secondary structure: C - coil |
| (image) | Relative solvent accessibility (RSA): 0 - completely buried (0-9% RSA), 9 - fully exposed (90-100% RSA) |
| HAPNC | Physical-chemical properties: H - hydrophobic (A, C, F, G, I, L, M, P, V); A - amphipathic (H, W, Y); P - polar (N, Q, S, T); N/C - charged (D, E negative; R, K positive) |
| (image) | Confidence level of prediction: 0 - the lowest level, 9 - the highest level |
| (image) | Transmembrane domain |
Abbreviations used throughout documentation
| SS | - | secondary structure |
| SA | - | solvent accessibility |
| RSA | - | relative solvent accessibility |
| PDB | - | protein data bank |
| DSSP | - | dictionary of protein secondary structure |
| CASP | - | critical assessment of techniques for protein structure prediction |
|
About POLYVIEW
The POLYVIEW protein structure visualization server can be used to
generate sequence annotations, such as secondary structure, relative
solvent accessibility, and physical-chemical property profiles. It
can also be used to identify residues involved in protein-protein
interactions and highlight other important sites and
motifs. Customizable pictures with such annotations may be
automatically generated using a script (for
details see below).
Manual
This section explains how users can submit input data in order to
generate graphical representations of protein structures with
annotations available in POLYVIEW. Input data formats and view
settings are described. Examples of how to customize the final protein
sequence annotation are also provided.
The POLYVIEW server can process three types of input data:
- A file with a structure in the
PDB format or the four-character PDB code
if the protein of interest can be found in the
Protein Data Bank.
- To enter the PDB entry code, type its 4 characters (e.g. 1a2x) in the
corresponding text box.
- To submit a file, click on the respective Browse button and select
a file to be uploaded.
- Results from the protein structure prediction
servers with secondary structures (SS) and relative solvent
accessibilities (SA), order-disorder regions (DR) and domain
boundaries (DP) in the
CASP format. Results from our own prediction
server SABLE
may be submitted in the original format, as well.
- One can visualize sequence profiles by uploading files in the CASP
format, including SS, DR, DP, and SA. To combine these
predictions, simply concatenate the files into a single file
before submission to the POLYVIEW server.
- At present, there is no standardized CASP format for the
prediction of relative solvent accessibility. However, the same
format as for secondary structure prediction can be used for this
purpose, with the 'SS' value of the field 'PFRMAT' replaced by
'SA'. In that case, the columns in the sequence section of the
file should contain, respectively: (1) the one-letter amino acid residue label;
(2) an integer from 0 to 99 representing the level of
exposure to solvent, i.e. the percentage of solvent accessibility; and (3) a
confidence score for the prediction (a real number from 0
to 1).
- Original output from the SABLE server, which is sent by e-mail,
can be saved as a file and then uploaded using the Browse button to
select the appropriate file.
- Arbitrary protein sequence profiles (using the
clipboard copy and paste technique).
- Amino acid sequence is the only obligatory data field when
using this option. The remaining fields are optional.
- One can specify any combination of sequence profiles,
such as secondary structure, its prediction confidence, relative
solvent accessibility, etc. The only requirement is that each
profile has the same length as the amino acid
sequence.
- It is possible to submit multiple arbitrary sequences at a
time. In this case all sequence profiles should be delimited by the hash
sign (#). For example, 3 amino acid sequences
ACDE#FGHI#KLMN with 3 corresponding secondary structure
profiles CCCC#HHHH#CCEC can be submitted.
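The length requirement and the hash-delimited form described above can be checked before pasting data into the form. The following is only a minimal sketch; the function names are hypothetical and are not part of the POLYVIEW server or its script.

```python
def split_profiles(field: str) -> list[str]:
    """Split a hash-delimited input field into one string per sequence."""
    return [part.strip() for part in field.split("#")]


def check_profiles(sequences: str, profile: str) -> list[tuple[str, str]]:
    """Pair every amino acid sequence with its profile and verify the lengths.

    Raises ValueError if the number of entries differs or if a profile
    is not as long as its amino acid sequence.
    """
    seqs = split_profiles(sequences)
    profs = split_profiles(profile)
    if len(seqs) != len(profs):
        raise ValueError(f"{len(seqs)} sequences but {len(profs)} profiles")
    for index, (seq, prof) in enumerate(zip(seqs, profs), start=1):
        if len(seq) != len(prof):
            raise ValueError(
                f"entry {index}: sequence has {len(seq)} residues, "
                f"profile has {len(prof)} symbols"
            )
    return list(zip(seqs, profs))


# The example from the text: three sequences with three SS profiles.
pairs = check_profiles("ACDE#FGHI#KLMN", "CCCC#HHHH#CCEC")
```

Running the check locally catches a truncated profile before submission rather than after the server has processed the request.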
The server can process all types of input submitted at the same
time, i.e. one can supply data from different sources at once to
get annotations both as isolated images and as aligned profiles for
comparison (for details refer to the
Custom view settings section). Data are processed
in the same order as in the list above. For example, if
one enters a PDB code and at the same time pastes sequence data
into the arbitrary sequence profile input section, POLYVIEW
first generates an annotation for the PDB protein and then
one for the data given in the Arbitrary protein sequence input section.
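Returning to the 'SA' variant of the CASP format described in the list above, the three columns of the sequence section can be produced as sketched below. This is an illustration under stated assumptions: the function name, the single-space column separator, and the two-decimal confidence formatting are hypothetical, and all header records other than the PFRMAT line mentioned in the text are deliberately left out.

```python
def sa_sequence_section(sequence, rsa_percent, confidence):
    """Build the PFRMAT line and the three-column sequence section.

    sequence     one-letter amino acid codes
    rsa_percent  integers from 0 to 99 (percentage of solvent accessibility)
    confidence   real numbers from 0 to 1
    """
    if not (len(sequence) == len(rsa_percent) == len(confidence)):
        raise ValueError("all three inputs must have the same length")
    lines = ["PFRMAT SA"]  # 'SS' replaced by 'SA', as described in the text
    for residue, rsa, conf in zip(sequence, rsa_percent, confidence):
        if not 0 <= rsa <= 99:
            raise ValueError(f"RSA value out of range: {rsa}")
        if not 0.0 <= conf <= 1.0:
            raise ValueError(f"confidence out of range: {conf}")
        # Column separator and number formatting are assumptions.
        lines.append(f"{residue} {int(rsa)} {conf:.2f}")
    return lines


sa_lines = sa_sequence_section("ACD", [5, 42, 99], [0.9, 0.55, 1.0])
```

Files for different prediction types built this way can then be concatenated into a single file, as noted above.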
Custom view settings
Once a protein sequence annotation is generated, the user can customize
its appearance. A number of settings are provided for this purpose.
-
Presentation model allows one to switch between different models
of graphical representations of protein secondary structures. There are
two models available at present. Examples are shown in
Figure 1.
|
|
|
Figure 1. Different models for graphical representation of
protein secondary structure.
|
-
Background color option sets the custom color for the image
background. One can set up a color by
R(ed)G(reen)B(lue) code combination using decimal
numbers, or choose a predefined alternative value from the
list. Default value is White.
-
Shown information offers a context-based set of options. The number
and type of settings displayed depend on the input provided
by the user. These options are divided into 2 groups: (1) options
that allow one to hide some information shown by default; and (2)
options that let one add data not shown in the default
setting.
-
Hide residue numbering option is always
available and allows the user to switch the residue
numeration in the protein sequence annotation on or off (see
Figure 2, B).
-
Hide amino acid sequence option is always
available as well and lets the user hide the amino acid sequence
(see Figure 2, C).
-
Hide graphical secondary structure option
appears only when related data are available. It allows one to hide
the graphical representation of secondary structures when it is not
needed (see Figure 2, D).
-
Hide bars of SS prediction confidence option
appears only when data from a protein structure prediction server
are submitted. It allows one to hide the graphical representation of
the prediction confidence.
-
Hide relative solvent accessibility option
appears whenever residue relative solvent accessibility data are
available. It lets the user exclude the information about RSA
(see Figure 2, E).
-
Hide bars of RSA prediction confidence option
appears only when the respective data from a protein structure
prediction server are submitted. It allows one to hide the graphical
representation of the prediction confidence for relative solvent
accessibility.
-
Show chemical property profile option is
always available and lets the user add to the sequence annotation
the information about the corresponding physical-chemical
profile of amino acids. This can be useful, for example, to find
correlations between relative solvent accessibility patterns
and chemical profiles (see Figure 2, F).
-
Show letter code for secondary structure
option appears whenever secondary structure data are
provided. This allows the user to use the traditional letter-code
representation of secondary structure instead of or in addition
to the graphical information (see Figure 2, G).
-
Show numerical SS prediction confidence
option appears when data from protein structure prediction
servers are submitted. This lets the user analyse the predicted
structure using numerically expressed confidence factors.
-
Show numerical relative solvent accessibility
option appears whenever the corresponding data are available. This
is an alternative representation of the relative solvent
accessibility, with numerical values of RSA rather than a graphical
grayscale bar (see Figure 2, H). RSA
ranges from 0 to 9, with 0 corresponding to a fully buried residue (0-9% RSA)
and 9 corresponding to a fully exposed residue (90-100% RSA).
-
Show numerical RSA prediction confidence
option is available when respective data are provided. Numerical
confidence factors are presented instead of or in addition to
graphical bars of the prediction confidence.
-
| A | B | C | D |
|
|
|
|
| E | F | G | H |
|
|
|
|
|
Figure 2. Examples of the different types of
sequence annotations for the protein Mastoparan-X (PDB code
1a13) generated using options from the Shown
information set. A. Annotation produced by default
settings after initial data submission. B-H. Results of
applying individual settings described above.
|
-
Highlighting options make it possible to customize the
labeling of specific residues.
-
Arbitrary residues option provides one
with the opportunity to emphasize any amino acid sequence
fragment or motif (see Figure 3). The
highlighted residues can represent, for example, polymorphic or
interfacial residues.
-
Residues can be highlighted by changing the color and/or the
font style of amino acid labels. Available colors are
red, green or blue. They are specified
by their first characters.
-
The font style can only be switched between regular and
bold.
-
In case of protein complexes, the chain label for residues of
interest needs to be specified as well.
-
Numbers of residues to be highlighted should be enumerated with
comma delimitation (white spaces are ignored). The use of dashes
to define a range of numbers is also supported.
The syntax of the string used to specify residues to be
highlighted is the following:
[Chain_label:]Residue_number[:Color], where '[ ]'
denotes optional parts of the string. Capital letters R, G or B
are used to highlight a residue in both color and bold font
style, whereas lower case characters result in highlighting by
the corresponding color only.
Below are several examples:
A:145:r - highlight the 145th residue in chain A using red
color
C:5-10 - highlight residues from 5 through 10 in chain C
using bold style
14-17,25-30,43:b - highlight residues 14, 15, 16, 17, 25,
26, 27, 28, 29, 30 in the first chain using bold font and blue
color for residue 43
A:3,A:10-20,A:35,B:15-20,B:25,B:40 - highlight residues
3, 10-20, 35 in chain A using bold font and residues 15-20, 25,
40 in chain B using the same style
|
|
Figure 3. Example of highlighting residues. To generate
this picture, the option was set to 2-16:b,18,20,22,24-36:R.
|
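The highlighting syntax [Chain_label:]Residue_number[:Color] described above can be parsed as sketched below. This is not the server's own parser; the names are hypothetical, and two assumptions are made: a two-part item whose first part is numeric is read as residues plus color (so purely numeric chain labels are not handled), and negative residue numbers are not supported.

```python
import re
from typing import NamedTuple, Optional


class Highlight(NamedTuple):
    chain: Optional[str]   # None means the first chain
    first: int             # first residue of the range
    last: int              # last residue of the range (same as first for one residue)
    color: Optional[str]   # 'red', 'green', 'blue', or None
    bold: bool


COLORS = {"r": "red", "g": "green", "b": "blue"}
RANGE = re.compile(r"^(\d+)(?:-(\d+))?$")


def parse_highlights(spec: str) -> list[Highlight]:
    """Parse a comma-delimited highlighting string; white space is ignored."""
    result = []
    for token in "".join(spec.split()).split(","):
        if not token:
            continue
        parts = token.split(":")
        chain = code = None
        if len(parts) == 3:
            chain, residues, code = parts
        elif len(parts) == 2 and RANGE.match(parts[0]):
            residues, code = parts      # assumption: numeric first part = residues
        elif len(parts) == 2:
            chain, residues = parts
        elif len(parts) == 1:
            residues = parts[0]
        else:
            raise ValueError(f"cannot parse item: {token!r}")
        match = RANGE.match(residues)
        if match is None:
            raise ValueError(f"bad residue number or range: {residues!r}")
        first = int(match.group(1))
        last = int(match.group(2) or match.group(1))
        if code is None:
            color, bold = None, True    # no color given: bold style only
        elif code.lower() in COLORS:
            # Capital letter: color and bold; lower case: color only.
            color, bold = COLORS[code.lower()], code.isupper()
        else:
            raise ValueError(f"unknown color code: {code!r}")
        result.append(Highlight(chain, first, last, color, bold))
    return result


figure3 = parse_highlights("2-16:b,18,20,22,24-36:R")
```

Here `figure3` holds five items: residues 2-16 in blue, residues 18, 20 and 22 in bold, and residues 24-36 in red and bold.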
-
Trans-membrane residues option allows
one to highlight the trans-membrane regions in a protein (see the
example in Figure 8).
-
Residues are highlighted by shading their background, therefore
no color needs to be specified. This highlighting
option can be freely combined with the previous one.
-
In case of protein complexes, the chain label for residues of
interest needs to be specified as well. Thus, the syntax for this
option is:
[Chain_label:]Residue_number.
-
Numbers of residues to be highlighted should be enumerated with
comma delimitation (white spaces are ignored). The use of dashes
to define a range of numbers is also supported.
-
Residues at S-S bridges is an option
that appears only when a protein structure is submitted in the
PDB format. It automatically highlights
those cysteine residues that are found by the DSSP program to form disulphide (S-S)
bridges. Selected residues are marked using
yellow color and bold font style. Moreover, the S-S bridges that the cysteines
belong to are labeled by lowercase characters in the residue
numeration line. Figure 4 gives an example
of S-S bridge highlighting using the protein structure with the
PDB code 1acw.
|
|
Figure 4. Example of highlighting cysteines at S-S bridges.
|
-
Protein complex interface set of options
appears only when a protein complex is submitted in the PDB
format. It enables automatic recognition and highlighting of
those residues that are at the protein-protein interaction
interface (see Figure 7 or
Figure 7-2). Selected residues are
marked using magenta and bold font style.
-
Residues at interface option can be used to
initiate the automatic recognition of interfacial sites of a
protein complex. When the corresponding checkbox is checked and
the Update button is pressed, the POLYVIEW server determines and
highlights the interface residues (and also fills in the
Highlight residues option text field described above).
-
RSA change threshold option has two input fields and is
available as long as the option Highlight residues at
interface is checked. The structures of the protein complex and
of its isolated chains are analyzed using the DSSP program in order
to determine changes in the solvent accessibility of residues in the
complex as opposed to the isolated chains. By default, the cutoff is
defined on an absolute scale, with a 10 Å² difference in
surface exposed area triggering the selection (see the results of
different cutoffs in Figure 7 and
Figure 7-2 in the
Examples section). Alternatively, one can
specify the cutoff for the change in terms of RSA to be
used to identify interfacial residues.
-
Other options are related to the particular properties of
the sequence annotation appearance.
-
Number of residues per line option sets the
width of the image (see Figure 5). It can be
used to adjust the shape of the picture. The default
value is 50.
|
|
|
Figure 5. Number of residues per line (set to 50 in the
upper panel and to 15 in the lower panel, respectively).
|
-
Start numeration from option is useful for
aligning the numeration of sequence annotations derived from
different sources. For example, one can set the original PDB
starting number for a sequence taken from a protein structure
prediction server, which facilitates comparison. When
a protein complex is submitted, it is possible to set the
starting number for each chain individually by listing them
separated by commas. If only one number is
given, the numeration of all sequences starts from this
number. If no number is supplied (i.e. the text box is empty), the original
numeration from the PDB is kept.
-
Merge sequence annotations option
appears only when the protein structure information
submitted to the POLYVIEW server contains at least 2 identical amino acid
sequences. It allows one to align different 1D
profiles and merge them into a single picture with multiple
annotations for better comparison.
- All proteins in the request should have the same amino acid
sequence, and their other 1D profiles should be consistent in
length.
- The resulting annotation will contain one common numeration,
amino acid sequence and physical-chemical profile for all
sequences, whereas the appearance of other information will depend on
the data provided for each sequence and the selection of view settings
described above.
Figure 6-2 demonstrates the alignment of the
same sequence annotations shown in Figure 6.
More options are to come...
All view settings mentioned above are applied during the generation of
a new image after the Update button is pressed. All options can be
used in combination with others. Previous values of settings can be
restored using the Reset button.
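As one illustration of the settings above, the rules for the Start numeration from field (empty, one number, or one number per chain) can be expressed as follows. This is a sketch only; the function name is hypothetical, and raising an error when the count of numbers does not match the number of chains is an assumption, since the server's behavior in that case is not described here.

```python
from typing import Optional


def chain_start_numbers(field: str, n_chains: int) -> list[Optional[int]]:
    """Interpret the 'Start numeration from' text field.

    Returns one entry per chain; None means "keep the original
    numeration from the PDB".
    """
    text = field.strip()
    if not text:
        return [None] * n_chains          # empty text box
    numbers = [int(item) for item in text.split(",")]
    if len(numbers) == 1:
        return numbers * n_chains         # one number applies to all chains
    if len(numbers) != n_chains:
        # Assumption: a mismatch is treated as an input error.
        raise ValueError(f"{len(numbers)} numbers given for {n_chains} chains")
    return numbers


per_chain = chain_start_numbers("1, 101", 2)
```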
POLYVIEW-3D
If a protein structure is submitted in the PDB format, it is possible
to generate 3D animated images and publication quality
slides using
POLYVIEW-3D. Along
with high quality rendering, the server provides structural
and functional analysis. For details and examples, please
refer to the POLYVIEW-3D
tutorial.
Examples of applications
Below are several examples demonstrating how the POLYVIEW server can
be used for structural and functional annotations.
-
POLYVIEW is a fast and convenient tool to view the results from
protein structure prediction servers.
| PDB |
|
| SABLE |
|
| Prof |
|
| PsiPred |
|
|
Figure 6. Example of SS and SA predictions for the 50S ribosomal
protein L9 (PDB code 1cqu). Results derived directly from the
PDB
and from the prediction servers
SABLE,
Prof,
and
PsiPred
are compared. Colored bars below SS represent confidence level for
structure prediction.
|
| PDB |
|
| SABLE |
| Prof |
| PsiPred |
|
Figure 6-2. Example of the alignment of sequence annotations for
the same data as shown in Figure 6, using the option Merge sequence
annotations.
|
To generate the above annotations, different types of input data were
used. The PDB entry code was used to produce a graphical representation of
the actual structure. The SABLE prediction was
submitted from a file in the original format. The PsiPred prediction
was submitted using the clipboard copy-paste technique. The
Prof results were submitted using a file in the CASP format.
-
POLYVIEW can be used for automatic identification of residues
located at protein-protein interaction interfaces.
| State/Chain |
Chain A |
Chain B |
| In the complex |
|
|
| As isolated chain |
|
|
|
Figure 7. Example of automatic recognition of residues at
the protein-protein interaction interface (using protein complex
1a15). Residues highlighted in magenta and bold have different RSA
in the complex relative to isolated chains and are, therefore,
identified as sites of contact between the two chains.
An absolute threshold of 10 Å² for the SA change
was used to define an interfacial site.
|
The individual values of solvent accessibility are normalized to the
range 0-9 and are presented in the form of grayscale bars. In some
cases (e.g. for big residues such as tryptophan), a change of
10 Å² or even more may not shift a residue to
another RSA bin because it does not exceed 10% of the residue's
nominal SA. On the other hand, some changes in SA smaller than 10 Å²
may lead to a change of the RSA bin because of
rounding. One can find an example of the first case in chain A,
residue 64 (lysine, K), and of the latter case in chain A, residue 31
(threonine, T) (see Figure 7). Thus, it is strongly
recommended to rely on the built-in automatic identification
of interfacial sites rather than on visual comparison of changes in RSA
patterns in a protein complex relative to isolated chains.
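The two effects described above can be reproduced with a small binning sketch. The nominal SA values below (250 Å² and 140 Å²) are purely illustrative assumptions, not the values used by POLYVIEW, and the bins are formed by truncating to 10% intervals to match the 0-9 legend; the server's exact rounding rule is not stated here.

```python
def rsa_bin(sa: float, nominal_sa: float) -> int:
    """Map an absolute solvent accessibility (in Å²) to an RSA bin from 0 to 9.

    Bin 0 covers 0-9% RSA and bin 9 covers 90-100% RSA.
    """
    rsa_percent = 100.0 * sa / nominal_sa
    return min(9, max(0, int(rsa_percent // 10)))


# Hypothetical big residue (nominal SA assumed to be 250 Å²):
# a 12 Å² change keeps the residue in the same bin (56.0% and 51.2%).
big_isolated = rsa_bin(140.0, 250.0)
big_complex = rsa_bin(128.0, 250.0)

# Hypothetical smaller residue (nominal SA assumed to be 140 Å²):
# a 4 Å² change crosses a bin boundary (about 32.1% and 29.3%).
small_isolated = rsa_bin(45.0, 140.0)
small_complex = rsa_bin(41.0, 140.0)
```

In this sketch the larger absolute change is invisible in the grayscale bars, while the smaller one is visible, which is why visual comparison alone can mislead.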
| State/Chain |
Chain A |
Chain B |
| In the complex |
|
|
| As isolated chain |
|
|
|
Figure 7-2. The same complex as in
Figure 7 was used for automatic recognition of
residues at the protein-protein interaction interface, but in this case a relative
threshold of 10% for the RSA change was used to define an
interfacial site.
|
The advantage of using the relative change in SA to identify
interfacial vs non-interfacial sites is that this definition is more
likely to capture conserved residues that have real changes in RSA
(rather than slight random changes caused by proximity to the
interface). On the other hand, this measure is less sensitive to
changes in SA for big amino acid residues, such as tryptophan. In the
latter case the absolute change in SA should be more than
20 Å² for the residue to be classified as interfacial
with the relative cutoff of 10% RSA.
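The difference between the absolute and the relative cutoff can be sketched as below. The function and parameter names are hypothetical, the use of "greater than or equal" at the cutoff is an assumption, and the nominal SA of 250 Å² for a big residue is an illustrative value only.

```python
from typing import Optional


def is_interfacial(
    sa_isolated: float,
    sa_complex: float,
    nominal_sa: float,
    abs_cutoff: float = 10.0,
    rel_cutoff: Optional[float] = None,
) -> bool:
    """Decide whether a residue loses enough accessibility upon complex formation.

    sa_isolated, sa_complex  solvent accessibility in Å² (isolated chain, complex)
    nominal_sa               nominal SA of the residue type in Å²
    abs_cutoff               absolute threshold in Å² (default 10)
    rel_cutoff               relative threshold in % RSA; if given, it is
                             used instead of the absolute threshold
    """
    change = sa_isolated - sa_complex
    if rel_cutoff is not None:
        return 100.0 * change / nominal_sa >= rel_cutoff
    return change >= abs_cutoff


# Hypothetical big residue that loses 15 Å² in the complex.
by_absolute = is_interfacial(140.0, 125.0, nominal_sa=250.0)
by_relative = is_interfacial(140.0, 125.0, nominal_sa=250.0, rel_cutoff=10.0)
```

With these assumed numbers the residue passes the default absolute cutoff but not the 10% relative cutoff, because 15 Å² is only 6% of its nominal SA.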
-
POLYVIEW allows one to analyse trans-membrane proteins in order to
detect trans-membrane regions.
| PDB |
|
|
|
| SABLE |
|
|
|
Figure 8. Visualization of the trans-membrane protein Sensory
Rhodopsin II (PDB code 1h68). Residues highlighted by yellow
background are located in trans-membrane regions according to
Swiss-Prot database (Swiss-Prot code P42196). Upper panel shows the
results obtained from the DSSP program as applied to this structure
without accounting for different environments. Lower panel contains
the SABLE server prediction that indicates residues with low
water accessible surface area coinciding
with membrane regions.
|
The combination of the SABLE prediction and the POLYVIEW annotation provides
a convenient tool for identifying trans-membrane regions. The
sequence of a known membrane protein was submitted to SABLE in order to
show how SABLE can be used to indicate the presence
of membrane domains. The prediction shown above reveals long
alpha-helices and fully "buried" residues (meaning residues
with low water accessible surface area). This
coincides with the actual data about trans-membrane regions
derived from the corresponding Swiss-Prot entry.
-
One of the functions of POLYVIEW-3D is to visualize SPPIDER's protein
functional sites prediction mapped to the corresponding 3D
structure.
 |
|
 |
|
 |
| A |
|
B |
|
C |
|
Figure 9. Panel A: Human erythrocyte catalase as a part of
the oxidoreductase complex (PDB entry 1f4j, chain A). Panel B:
Cyclin-dependent kinase 6 (CDK6) as a part of the
p18(ink4c)-cdk6-k-cyclin ternary complex (1g3n:A). Panel C: Von
Hippel-Lindau disease tumor suppressor from the
pvhl/elongin-c/elongin-b complex (1lqb:C).
Color scheme used: Red - true positives (residues correctly
predicted to be at interface); White - true negatives (residues with
no functional annotation); Yellow - false positives (residues
wrongly predicted to be at interface); Blue - false negatives (known
but not recognized interfacial residues).
|
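The color scheme of Figure 9 amounts to comparing a predicted set of interface residues with a known set. A minimal sketch follows; the function names and the example residue numbers are hypothetical and are not taken from SPPIDER or POLYVIEW-3D.

```python
def interface_color(predicted: bool, known: bool) -> str:
    """Return the Figure 9 color for one residue."""
    if predicted and known:
        return "red"      # true positive
    if predicted:
        return "yellow"   # false positive
    if known:
        return "blue"     # false negative
    return "white"        # true negative


def color_residues(residues, predicted, known) -> dict[int, str]:
    """Assign a color to every residue number in `residues`."""
    predicted, known = set(predicted), set(known)
    return {
        number: interface_color(number in predicted, number in known)
        for number in residues
    }


# Made-up residue numbers, for illustration only.
colors = color_residues(range(1, 6), predicted={1, 2}, known={2, 3})
```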
Automated annotations using scripts
The POLYVIEW server can be used in conjunction with the script provided
below to automate submissions to the server for large-scale annotation
tasks. We provide here a Perl script that allows one to set up default
view settings for multiple protein sequence annotations. In batch mode,
the script can read input files with multiple queries and submit them
one by one.
To download a script, click
here.
The script was last updated on December 1, 2005.
The package consists of 2 files:
polyview.pl - the Perl script itself.
options.txt - a file with options for the script.
Make sure that your computer has the following software installed:
Tips and tricks
Below are some tips for working with
pictures of protein annotations:
To get an amino acid sequence, as well as the secondary
structure and relative solvent accessibility, in text
(FASTA) format, check the appropriate checkbox in the Sequence
processing toolbar at the Get sequence
tool. The sequence will appear in the text area of a popup window, allowing
one to select and copy it to the clipboard as plain text.
To get an image in a publishable format, click on the tool
Get as in the corresponding Image processing
toolbar next to each image. You will be prompted to save a PS or TIFF file
to disk.
After generating an animated GIF image (256 colors only) using the
POLYVIEW-3D extension, one can download selected slides in True Color
and at a bigger size in both PNG and TIFF (300 dpi) formats.
If a TIFF image looks tiny when embedded in a
document, scale it to the desired size; the quality will not get
worse.
Terms of use and disclaimer
All images generated by the POLYVIEW server can be FREEly saved,
printed, and distributed by means of any media without our written
permission for academic and non-commercial purposes. However, the use
of POLYVIEW's pictures SHOULD be acknowledged by
a reference to the server.
The use of the POLYVIEW web site and server is at your own risk and no
liability is accepted for any loss or damage arising through the use
of the web site and graphical representations generated by server.
References
Please use the following references:
A. Porollo, R. Adamczak, J. Meller (2004)
POLYVIEW: A Flexible Visualization Tool for Structural and Functional Annotations of Proteins,
Bioinformatics, 20: 2460-2462.
A. Porollo, J. Meller (2007)
Versatile Annotation and Publication Quality Visualization of Protein Complexes Using POLYVIEW-3D,
BMC Bioinformatics, 8:316.
Acknowledgements
This work was supported by the
University of Cincinnati College of Medicine,
Cincinnati Children's Hospital Research Foundation, and
NIH through grants: AI055338, R01 AR050688, 5R01GM067823-02.
The following software is being used to have the POLYVIEW web server
running and providing services described above:
-
The DSSP program
by W. Kabsch and C. Sander
available at the Centre for Molecular and Biomolecular Informatics,
University of Nijmegen, the Netherlands.
It is being used for calculation of the protein secondary structure and
relative solvent accessibility for submitted structures in the PDB format.
-
Lincoln D. Stein's perl GD graphics library
that can be found at Boutell.Com, Inc. web server or in any
CPAN modules collection.
It is being used to generate images representing protein
sequence-structure-function 2D annotation.
-
Roger Sayle's RasMol
(v 2.7.3) from Biomolecular Structures Group, Hertfordshire, UK and
W.L. DeLano's PyMol
from DeLano Scientific, San Carlos, CA, USA.
They are being used to generate 3D animated images for protein
structures submitted in the PDB format as a part of POLYVIEW-3D.
Last update of the document: September, 2007
|