Alignment of sequence to tertiary structure
Remember that the alignments of sequence onto tertiary structure produced by fold
recognition methods may be inaccurate. In instances where one has identified a remote homologue,
the fold recognition methods can sometimes give a very accurate alignment, though it
is still sometimes fruitful to edit the alignment around variable regions (see the Multiple Sequence Alignment section for ways of doing this).
In other cases, it may be wise to create your own alignment by starting with the alignment from
the fold recognition method and considering the
alignment of secondary structures.
There is no generally accepted way of doing this, though one method (i.e. mine) involves:
- Ensuring that residues predicted to be buried/exposed align to those known to
be buried or exposed in the template structure. Note that conserved hydrophobic/polar
residues are more likely to be buried/exposed than non-conserved residues, which could simply
be anomalies. One can predict residue accessibility manually, or by use of an automated
server like PHD.
- Ensuring that critical hydrogen bonding patterns are not disrupted in beta-sheet
structures.
- Trying to conserve residue properties (i.e. size, polarity, hydrophobicity) as well as
possible across the known and unknown structures.
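As a rough illustration of the first and third checks above, the sketch below scores a candidate alignment by how often burial states and broad residue classes agree column by column. The function names, the 'b'/'e'/'-' string encoding, and the two-class hydrophobic/polar scheme are assumptions made for this example only; they are not taken from any particular program.

```python
# Illustrative sketch only: the names and the two-class residue scheme
# below are assumptions for this example, not a published method.

# Simplified hydrophobic class; every other residue is treated as polar.
HYDROPHOBIC = set("AVILMFWYC")
GAP = "-"


def burial_agreement(template_burial, predicted_burial):
    """Fraction of aligned, ungapped columns with the same burial state.

    Both arguments are equal-length strings with one character per
    alignment column: 'b' (buried), 'e' (exposed) or '-' (gap).
    """
    pairs = [
        (t, p)
        for t, p in zip(template_burial, predicted_burial)
        if t != GAP and p != GAP
    ]
    if not pairs:
        return 0.0
    return sum(t == p for t, p in pairs) / len(pairs)


def property_agreement(template_seq, query_seq):
    """Fraction of aligned, ungapped columns sharing a residue class.

    Both arguments are equal-length aligned sequences in one-letter code,
    with '-' marking gaps.
    """
    pairs = [
        (t, q)
        for t, q in zip(template_seq, query_seq)
        if t != GAP and q != GAP
    ]
    if not pairs:
        return 0.0
    same = sum((t in HYDROPHOBIC) == (q in HYDROPHOBIC) for t, q in pairs)
    return same / len(pairs)


if __name__ == "__main__":
    # Toy strings, not real data.
    print(burial_agreement("bbee-b", "bebe-b"))
    print(property_agreement("AVLK", "ILKD"))
```

Scores like these are only a guide for comparing alternative hand edits of the same region. They say nothing about hydrogen bonding in beta-sheets, which still has to be checked separately.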
For example, in trying to align the prediction of the glutamyl tRNA reductases (hemA) with one alpha/beta
barrel structure (2acs):
[Sec.= known secondary structure from PDB code 2ACS (E = extended, H = alpha helix, G = 3-10 helix,
B = beta-bridge); Bur. = known residue exposure for 2ACS (b = buried, h = half-buried, e = exposed);
in/out = positioning of residues in the beta-barrel (i = pointing inwards, o = pointing outwards);
Res. cons = conservation of residues (totally conserved = UPPER CASE, h = hydrophobic, p = polar,
c = charged, a = aromatic, s = small, - = negative, + = positive);
Pred denotes predicted burial and secondary structure for the glutamyl tRNA reductase family;
boxed positions are those with the same known/predicted burial. Shaded positions show a
conservation of hydrophobic character in BOTH families of proteins, and positions in inverse text
show a conservation of polar character in BOTH families.]
In the construction of this alignment, several things were considered:
- The observed residue burial or exposure
- The predicted residue burial or exposure
- The conservation of residue properties in known and unknown structures
- Whether or not the side chains on the core beta-strands pointed in towards the barrel or
out towards the helices
- The hydrogen bonding pattern of the beta-strands comprising the core beta-barrel.
By using an initial alignment from one of the fold recognition methods as a guide, the alignment
above was created by trying to optimise the match of the features described above.
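One part of this manual optimisation can be sketched in code: sliding the predicted burial pattern of a strand by a residue or two relative to the template and keeping the register that agrees best. This is a minimal sketch under stated assumptions, not the procedure actually used to build the alignment above. The function name best_register, the 'b'/'e' encoding and the default shift window are all hypothetical.

```python
# Illustrative sketch only: best_register and its parameters are
# hypothetical names chosen for this example.

def best_register(template_burial, predicted_burial, max_shift=2):
    """Find the small shift that best matches two burial patterns.

    template_burial  : burial string for a template strand ('b' or 'e').
    predicted_burial : predicted burial string for the query segment.
    max_shift        : largest offset (in residues) to try either way.

    Template position i is compared with predicted position i + shift.
    Returns (shift, matches). Shifts are tried in order of increasing
    absolute size, so ties favour staying close to the initial alignment.
    """
    best_shift, best_matches = 0, -1
    for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
        matches = 0
        for i, state in enumerate(template_burial):
            j = i + shift
            if 0 <= j < len(predicted_burial) and predicted_burial[j] == state:
                matches += 1
        if matches > best_matches:
            best_shift, best_matches = shift, matches
    return best_shift, best_matches


if __name__ == "__main__":
    # Toy patterns, not real data: the prediction is the template pattern
    # with one extra exposed residue at the start.
    print(best_register("bebbe", "ebebbe"))
```

On a beta-strand in a barrel, a shift of one residue swaps which side chains point inwards and which point outwards, so even a small register change like this can matter a great deal. Any shift suggested by such a score should still be checked against residue conservation and the hydrogen bonding pattern.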
Remember that proteins having similar three-dimensional structures with little or no
sequence similarity can differ substantially with respect to the finer details of their
structures (e.g. loops, precise orientation of side chains, orientation of secondary structures).
See here for some work I did with Geoff Barton on this subject.