MRC Analysis of the Protein Fold
The wavelet approximation of the protein backbone produces interesting visuals. A clean view of the packing and interdigitation of the helices in interferon is generated with 1/8 the geometric information. The lower-resolution 'ribbon' closely matches cylinders fit to the helices (Figure 5). The visual experiment is complicated by subjectivity and the variety of graphic representations. The consensus of half a dozen colleagues was that 1/4 the information suggested the same fold.
The plot of distance difference versus resolution in Figure 7 may suggest a best value. At integer levels, the rms deviation is 2.1 Å at 1/4 the segments and 3.3 Å at 1/8 the segments. The rms is less than the ideal Cα-Cα distance (3.8 Å) at fractional levels between 2.0 and 3.0. At level 2.6 (1/6 the segments), the rms for helix/sheet residues is less than 3 Å.
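The levels quoted above appear to follow a power-of-two rule: level 2 keeps 1/4 of the segments, level 3 keeps 1/8, and level 2.6 keeps about 1/6. A minimal Python sketch of that arithmetic, together with an rms deviation between paired points, is given here. The function names `segment_fraction` and `rms_deviation` are illustrative assumptions, the pairing of original and approximated points is assumed to be given, and this is not the procedure used to produce Figure 7.

```python
import math


def segment_fraction(level):
    """Fraction of the original segments kept at a (possibly fractional) level.

    Assumes each full level halves the number of segments, which is
    consistent with the 1/4, 1/6 and 1/8 figures quoted in the text.
    """
    return 2.0 ** (-level)


def rms_deviation(coords_a, coords_b):
    """Root-mean-square distance between paired 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Report the reduction implied by the levels discussed in the text.
for level in (2.0, 2.6, 3.0):
    reduction = 1.0 / segment_fraction(level)
    print(f"level {level:.1f}: about 1/{reduction:.1f} of the segments")
```

Under this assumption a fractional level is simply a non-integer exponent, which is why level 2.6 lands close to 1/6.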
At what reduced level of detail is a protein fold recognizable? Can this be expressed as a unique small integer fraction, say 1/4 or 1/6? This number should be optimal for lower-resolved comparisons. This number should also have bearing on the information content of the protein fold. An investigation of the number of possible folds has just been published [36], including a procedure to approximate the protein fold using the discrete cosine transform. The lower-resolution curves in that work, which resemble the wavelet curves, suggest that 1/3 of the points always generates a recognizable fold.
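The idea of approximating a fold with a truncated discrete cosine transform can be sketched as follows. This is a generic low-pass illustration, not the procedure of reference [36]: the function names, the choice of an unnormalized DCT-II, and the toy helix coordinates (1.5 Å rise, 100° per residue, 2.3 Å radius) are all assumptions made for the example.

```python
import math


def dct(signal):
    """Unnormalized DCT-II of a 1D sequence (direct O(n^2) evaluation)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        coeffs[0] / n
        + 2.0 / n * sum(
            coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k) for k in range(1, n)
        )
        for i in range(n)
    ]


def lowpass_trace(points, keep_fraction):
    """Keep only the lowest-frequency DCT coefficients of each coordinate axis."""
    n = len(points)
    keep = max(1, round(n * keep_fraction))
    smoothed_axes = []
    for axis in zip(*points):  # process x, y and z independently
        coeffs = dct(list(axis))
        coeffs[keep:] = [0.0] * (n - keep)  # discard high-frequency detail
        smoothed_axes.append(idct(coeffs))
    return list(zip(*smoothed_axes))


# Toy alpha-helix trace used only to exercise the functions.
helix = [
    (2.3 * math.cos(math.radians(100 * i)), 2.3 * math.sin(math.radians(100 * i)), 1.5 * i)
    for i in range(12)
]
smoothed = lowpass_trace(helix, 1.0 / 3.0)
print(len(smoothed), "points reconstructed from 1/3 of the coefficients")
```

Keeping every coefficient reproduces the input exactly, so the fraction kept is the only parameter that controls how much detail is lost.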
Does each run of 3, 4, or 6 residues establish the local folding path in space? Einstein said everything should be made as simple as possible, but no simpler. Three nucleotides translate to one amino acid. Three amino acids could code a 3D coordinate. This speculation is based on recent work in theoretical conchology [37]. The serpentine meandering of certain shells may be described by a geodesic (straight line) in a generalized 6D isoEuclidean space. It was noted that this seashell mathematics may apply to proteins (the shell pictures might pass for the low-resolution ribbon tubes shown herein).
Topological Comparison
The usual technique of substructure detection is a distance geometry approach, followed by the computationally more expensive 3D superposition [15,38]. Sophisticated alignment techniques [39] assign multiple properties (e.g., hydrophobicity) to each Ca position to aid in the superposition.
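A distance-based comparison of the kind mentioned above can be sketched without any superposition, because intramolecular distances do not change under rotation or translation. The following Python sketch compares two equal-length Cα traces through their distance matrices. The function names are hypothetical, the residue correspondence is assumed to be known in advance, and this is not the specific method of references [15,38].

```python
import math


def distance_matrix(points):
    """All pairwise distances between 3D points."""
    return [[math.dist(p, q) for q in points] for p in points]


def rms_distance_difference(points_a, points_b):
    """RMS difference between corresponding intramolecular distances.

    Only the upper triangle is used, since the matrices are symmetric
    and the diagonal is zero.
    """
    n = len(points_a)
    if n != len(points_b) or n < 2:
        raise ValueError("need two traces of equal length with at least 2 points")
    da = distance_matrix(points_a)
    db = distance_matrix(points_b)
    total = sum(
        (da[i][j] - db[i][j]) ** 2 for i in range(n) for j in range(i + 1, n)
    )
    return math.sqrt(total / (n * (n - 1) / 2))


# A translated copy of a trace has identical internal distances.
trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 1.0)]
shifted = [(x + 5.0, y - 2.0, z + 1.0) for x, y, z in trace]
print(rms_distance_difference(trace, shifted))
```

One known limitation of this measure is that a structure and its mirror image have the same distance matrix, so handedness has to be checked separately, for example in the superposition step.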
Improved methods would be welcome due to the exponential increase in the number of structures being deposited in the Protein Data Bank. How can a structural motif be recognized, and will a lower-resolution model be useful? Searches conducted using the standard techniques will be compared to searches using the wavelet approximation. The wavelet decomposition of the hydrophobicity or other properties should provide a 'texturing', useful both for alignment and for interesting graphics. This research is in progress.
Multiresolution Editing
There are currently no structural constraints imposed between the residues, each of which moves as an independent rigid body tied to the space curve. It is easy to convert the spline curve from a right-handed helix to a left-handed one by editing a very low resolution version, as none of the well-known constraints on protein structure are used. A geometric minimizer [40] could easily be added in real time. More work is needed to handle non-bonded interactions. The reconstruction of the underlying atoms from an edited spline is a question for research.
Considerable trial-and-error experimentation will be required to discover the best interactive and numerical techniques. Prototyping is being done in the SGI Inventor programming environment. The 'virtual reality' (direct manipulation) interface of the environment has great potential as a general tool to construct and edit proteins through interactive computer graphics.
Molecular Surfaces and Volumes
The multiresolution NURBS surface visuals look much like molecular surfaces Max [25] generated with spherical harmonics. A procedure to texture map spherical harmonic surfaces has just been published [41]. Methods for more general topologies need to be explored.
Wavelet techniques can be applied to the 3-dimensional grid of an electron density map or of the various potential maps used in structure-based design. The wavelet approximation techniques could then be combined with docking methods: either brute-force surface comparison [42] or searching of an energy grid [43].
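As an illustration of what one level of wavelet approximation on a 3D grid involves, the sketch below averages each 2×2×2 block of grid values, which is the low-pass (scaling) part of a Haar decomposition up to a normalization constant. The Haar basis is swapped in here for simplicity and is an assumption; the function name `coarsen_grid` and the nested-list grid layout are likewise hypothetical.

```python
def coarsen_grid(grid):
    """Halve the resolution of a 3D grid by averaging 2x2x2 blocks.

    `grid` is a nested list indexed as grid[x][y][z]; every dimension
    must be even so that the blocks tile the grid exactly.
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    if nx % 2 or ny % 2 or nz % 2:
        raise ValueError("all grid dimensions must be even")
    offsets = (0, 1)
    return [
        [
            [
                sum(
                    grid[2 * i + a][2 * j + b][2 * k + c]
                    for a in offsets
                    for b in offsets
                    for c in offsets
                )
                / 8.0
                for k in range(nz // 2)
            ]
            for j in range(ny // 2)
        ]
        for i in range(nx // 2)
    ]


# Toy 4x4x4 "map" whose value equals the x index of each grid point.
density = [[[float(x) for _ in range(4)] for _ in range(4)] for x in range(4)]
coarse = coarsen_grid(density)
print(len(coarse), len(coarse[0]), len(coarse[0][0]))
```

Each call reduces the number of grid points by a factor of eight, which is where the potential saving for a grid search would come from; the discarded detail coefficients are not computed in this sketch.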
Comparison of Wavelet Transforms
Molecular shapes can be compared by generating molecular envelope volumes, superposing the envelopes, computing their Fourier transforms, and comparing the transforms at various resolutions [44]. This process was used to automatically classify antigen binding sites.
How can wavelet transforms be compared? Are there any advantages over Fourier techniques [45]? Wavelet analysis should allow much faster comparisons at low resolution. The localization properties of the wavelet transform show promise [46].
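One possible way to compare wavelet transforms at low resolution is to compute a distance over only the coarsest coefficients. The sketch below does this for 1D signals (for example, one coordinate of a backbone or a per-residue property) using an orthonormal Haar transform. The Haar basis, the function names, and the power-of-two length requirement are assumptions made for the illustration, not the scheme proposed in the references.

```python
import math


def haar_decompose(signal):
    """Orthonormal Haar transform.

    Returns (approx, details), where approx holds the single coarsest
    scaling coefficient and details lists the detail bands from
    coarsest to finest.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    root2 = math.sqrt(2.0)
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.insert(0, [(a - b) / root2 for a, b in pairs])
        approx = [(a + b) / root2 for a, b in pairs]
    return approx, details


def coarse_distance(sig_a, sig_b, levels):
    """Euclidean distance over the scaling coefficient plus the
    `levels` coarsest detail bands only."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signals must have equal length")
    approx_a, det_a = haar_decompose(sig_a)
    approx_b, det_b = haar_decompose(sig_b)
    ca = approx_a + [c for band in det_a[:levels] for c in band]
    cb = approx_b + [c for band in det_b[:levels] for c in band]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [1.5, 1.5, 3.5, 3.5, 5.0, 7.0, 7.0, 9.0]
for levels in range(4):
    print(levels, coarse_distance(a, b, levels))
```

Because the transform in this sketch is orthonormal, the coarse distance never exceeds the full Euclidean distance between the signals, so it could serve as a cheap first filter before a full-resolution comparison.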
Wavelet decomposition of protein backbones and molecular surfaces has been demonstrated and the hierarchy of resolution levels visualized. The multiresolution editing capability introduces a new interactive tool for protein modeling. The minimum specification of the protein fold is of philosophic interest.
Comparisons of lower-resolution versions of backbones, molecular surfaces, and density volumes should be possible through their wavelet transforms, yielding substantial computational speedup for database and brute-force searches. Wavelet analysis may allow much faster comparisons at low resolution, but at what cost? Important detail may be lost. This is the focus of future research.
Thanks to Lily Yang and all, past and present, at the UAB CMC. Thanks to Maxine Rice for preparing and to Susan Baum for editing the manuscript.