Alignment of sequence to tertiary structure

Remember that the alignments of sequence onto tertiary structure produced by fold recognition methods may be inaccurate. In instances where one has identified a remote homologue, fold recognition methods can sometimes give a very accurate alignment, though it is still sometimes fruitful to edit the alignment around variable regions (see the Multiple Sequence Alignment section for ways of doing this). In other cases, it may be wise to create your own alignment, starting with the alignment from the fold recognition method and considering the alignment of secondary structures.

There is no generally accepted way of doing this, though one method (i.e. mine) involves:

Ensuring that residues predicted to be buried/exposed align to those known to be buried or exposed in the template structure. Note that conserved hydrophobic/polar residues are more likely to be buried/exposed than non-conserved residues, which could simply be anomalies. One can predict residue accessibility manually, or by use of an automated server like PHD.
Ensuring that critical hydrogen bonding patterns are not disrupted in beta-sheet structures.
Trying to conserve residue properties (i.e. size, polarity, hydrophobicity) as far as possible between the known and unknown structures.
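The first of these checks can be sketched in a few lines of Python. This is only an illustration, not the procedure used to build the alignment discussed here: the function name `burial_agreement`, the one-letter burial codes for the prediction (b = buried, e = exposed) and the decision to skip half-buried template positions as ambiguous are all assumptions made for the example.

```python
def burial_agreement(query_aln, template_aln, query_pred, template_obs):
    """Fraction of aligned columns where predicted and observed burial agree.

    query_aln, template_aln: aligned sequences of equal length ('-' = gap).
    query_pred: predicted burial per query residue ('b' or 'e'), ungapped.
    template_obs: observed burial per template residue ('b', 'h' or 'e'),
        ungapped. Half-buried ('h') positions are skipped (an assumption).
    """
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")

    qi = ti = 0            # indices into the ungapped burial strings
    matches = compared = 0
    for q, t in zip(query_aln, template_aln):
        if q != '-' and t != '-':
            predicted, observed = query_pred[qi], template_obs[ti]
            if observed != 'h':
                compared += 1
                if predicted == observed:
                    matches += 1
        # Advance each index only when that sequence has a residue here.
        if q != '-':
            qi += 1
        if t != '-':
            ti += 1
    return matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy data, not taken from hemA or 2ACS.
    print(burial_agreement("AV-LK", "AILL-", "bbbe", "bebh"))
```

A higher value for one candidate alignment than for another suggests that it places predicted buried residues at buried template positions more consistently, but the number should be read alongside the other criteria rather than on its own.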

For example, in trying to align the prediction of the glutamyl tRNA reductases (hemA) with one alpha/beta barrel structure (2acs):

[Sec. = known secondary structure from PDB code 2ACS (E = extended, H = alpha helix, G = 3-10 helix, B = beta-bridge); Bur. = known residue exposure for 2ACS (b = buried, h = half-buried, e = exposed); in/out = positioning of residues in the beta-barrel (i = pointing inwards, o = pointing outwards); Res. cons = conservation of residues (totally conserved = UPPER CASE, h = hydrophobic, p = polar, c = charged, a = aromatic, s = small, - = negative, + = positive); Pred = predicted burial and secondary structure for the glutamyl tRNA reductase family. Boxed positions are those with the same known/predicted burial. Shaded positions show a conservation of hydrophobic character in BOTH families of proteins, and positions in inverse text show a conservation of polar character in BOTH families.]

In the construction of this alignment, several things were considered:

The observed residue burial or exposure
The predicted residue burial or exposure
The conservation of residue properties in known and unknown structures
Whether or not the side chains on the core beta-strands pointed in towards the barrel or out towards the helices
The hydrogen bonding pattern of the beta-strands comprising the core beta-barrel.

Using an initial alignment from one of the fold recognition methods as a guide, the alignment above was created by trying to optimise the match of the features described above.
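As a rough illustration of this kind of manual optimisation, the sketch below compares an initial alignment with an edited one by counting the columns in which hydrophobic or polar character is conserved. It is a minimal sketch under stated assumptions: the two-class residue grouping in `HYDROPHOBIC`, the function names and the toy sequences are all hypothetical, and a real comparison would weigh burial, side-chain orientation and hydrogen bonding as well.

```python
# Hypothetical, deliberately crude grouping: anything not listed here is
# treated as polar. Real analyses use finer residue classes.
HYDROPHOBIC = set("AVLIMFWC")


def property_matches(query_aln, template_aln):
    """Count aligned columns where both residues fall in the same class."""
    count = 0
    for q, t in zip(query_aln, template_aln):
        if q == '-' or t == '-':
            continue  # gapped columns carry no property information
        if (q in HYDROPHOBIC) == (t in HYDROPHOBIC):
            count += 1
    return count


def rank_alignments(candidates):
    """Sort (query_aln, template_aln) pairs, best-scoring first."""
    return sorted(candidates,
                  key=lambda pair: property_matches(*pair),
                  reverse=True)


if __name__ == "__main__":
    # Toy strand: the template alternates hydrophobic/polar residues.
    initial = ("KVEVK", "LSLTL")    # e.g. as given by fold recognition
    shifted = ("-KVEVK", "LSLTL-")  # query moved along by one residue
    for query, template in rank_alignments([initial, shifted]):
        print(query, template, property_matches(query, template))
```

In this toy case the one-residue shift restores the alternating hydrophobic/polar pattern along the strand, so the edited alignment ranks first. The same comparison can be repeated for each region one considers editing.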

Remember that proteins having similar three-dimensional structures with little or no sequence similarity can differ substantially with respect to the finer details of their structures (i.e. loops, precise orientation of side chains, orientation of secondary structures, etc.). See here for some work I did with Geoff Barton on this subject.
