Loop modeling is a problem in protein structure prediction: predicting the conformations of loop regions in proteins without the use of a structural template. The problem arises often in homology modeling, where the tertiary structure of an amino acid sequence is predicted from a sequence alignment to a template, that is, a second sequence whose structure is known. Because loops have highly variable sequences even within a given structural motif or protein fold, they often correspond to unaligned regions in sequence alignments; they also tend to be located at the solvent-exposed surface of globular proteins and are thus more conformationally flexible. Consequently, they often cannot be modeled using standard homology modeling techniques. More constrained versions of loop modeling are also used in the data-fitting stages of solving a protein structure by X-ray crystallography, because loops can correspond to regions of low electron density and are therefore difficult to resolve.
Regions of a structural model predicted by loop modeling tend to be much less accurate than regions predicted using template-based techniques, and the inaccuracy increases with the number of amino acids in the loop. The side-chain dihedral angles of the loop residues are often approximated from a rotamer library, but this approximation can worsen the inaccuracy of side-chain packing in the overall model. Andrej Sali's homology modeling suite MODELLER includes a facility explicitly designed for loop modeling by a satisfaction of spatial restraints method.
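A side-chain dihedral angle is defined by four consecutive atoms; chi1, for example, is the angle about the CA-CB bond defined by the N, CA, CB and gamma atoms. The following is a minimal sketch of how such an angle can be computed from coordinates and then snapped to the nearest of a few discrete values, which is the basic idea behind a rotamer lookup. The function names are hypothetical, and the three wells (-60, 60 and 180 degrees) are idealized staggered values chosen for illustration, not entries from any published rotamer library.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    Cis is 0 and trans is 180 (or -180); the sign is positive when the
    p1->p0 bond must turn clockwise to eclipse p2->p3, viewed from p1 to p2.
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)          # central bond
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)        # normal to the plane of p0, p1, p2
    n2 = _cross(b1, b2)        # normal to the plane of p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    b1_unit = (b1[0] / b1_len, b1[1] / b1_len, b1[2] / b1_len)
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b1_unit)
    return math.degrees(math.atan2(y, x))

def nearest_well(angle, wells=(-60.0, 60.0, 180.0)):
    """Return the well closest to `angle`, measuring distance on a circle.

    The default wells are idealized staggered values (an assumption made
    for illustration), not values taken from a real rotamer library.
    """
    def circular_gap(w):
        return abs((angle - w + 180.0) % 360.0 - 180.0)
    return min(wells, key=circular_gap)
```

With made-up coordinates for the four atoms, `nearest_well(dihedral_degrees(n, ca, cb, cg))` would give the discrete chi1 state that a rotamer-based method would substitute for the true angle; the difference between the two is one source of the packing error described above.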
In general, the most accurate predictions are for loops of fewer than 8 amino acids. Extremely short loops of three residues can be determined from geometry alone, provided that the bond lengths and bond angles are specified. Slightly longer loops are often determined by a "spare parts" approach, in which loops of similar length are taken from known crystal structures and adapted to the geometry of the flanking segments. In some methods, the bond lengths and angles of the loop region are allowed to vary in order to obtain a better fit; in others, the constraints imposed by the flanking segments may be varied to find more "protein-like" loop conformations. Predictions for such short loops may be almost as accurate as the homology model on which they are based. Loops in proteins may also not be well structured, and may therefore have no single conformation that could be predicted; NMR experiments indicate that solvent-exposed loops are "floppy" and adopt many conformations, while the loop conformations seen by X-ray crystallography may merely reflect crystal packing interactions or the stabilizing influence of crystallization co-solvents.
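The first step of the "spare parts" approach, picking candidate loops whose ends could fit the flanking segments, can be sketched as a simple screen. The example below assumes a hypothetical in-memory collection of fragments, each stored as a list of alpha-carbon coordinates that includes one anchor residue at either end, and keeps those of the right length whose end-to-end distance matches the gap in the model. The `Fragment` type, the function name and the 1.0 angstrom default tolerance are illustrative assumptions; a real method would go on to superimpose the anchors and adapt the loop geometry, which this sketch does not attempt.

```python
import math
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float, float]

class Fragment(NamedTuple):
    """Hypothetical record for a loop taken from a known structure."""
    name: str
    ca_coords: List[Point]  # alpha carbons: N-side anchor, loop residues, C-side anchor

def candidate_loops(fragments, n_anchor_ca, c_anchor_ca, loop_length, tolerance=1.0):
    """Return fragment names ranked by how well their span matches the gap.

    A fragment qualifies if it has `loop_length` residues between its two
    anchors and its anchor-to-anchor distance differs from the target
    distance by at most `tolerance` (same units as the coordinates).
    """
    target_span = math.dist(n_anchor_ca, c_anchor_ca)  # requires Python 3.8+
    hits = []
    for frag in fragments:
        if len(frag.ca_coords) != loop_length + 2:
            continue  # wrong number of residues for this gap
        span = math.dist(frag.ca_coords[0], frag.ca_coords[-1])
        mismatch = abs(span - target_span)
        if mismatch <= tolerance:
            hits.append((mismatch, frag.name))
    hits.sort()  # best-matching span first
    return [name for _, name in hits]
```

Screening on a single distance is deliberately crude: it ignores the relative orientation of the flanking segments, so it only narrows the pool of candidates before any fitting is done.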
- Mount DM. (2004). Bioinformatics: Sequence and Genome Analysis 2nd ed. Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY.
- Chung SY, Subbiah S. (1996). A structural explanation for the twilight zone of protein sequence homology. Structure 4: 1123–27.
- MODLOOP, public server for access to MODELLER's loop modeling facility