Biochem.Biophys.Res.Comm. 195:686-696.
Summary: Jan Dohlman conceived this study and I executed it using Lupas' program to detect coiled-coil proteins from their primary sequence. Key conclusion: Coiled-coils correlate in part with known autoantibody epitopes.
Abstract
Autoimmune diseases are characterized by the presence of antibodies and T-cells targeting restricted sets of host proteins. This phenomenon may be due in part to greater non-specific immunogenicity of these proteins compared to others which are not autoantigenic. We used computer-based methods to analyze the sequenced human autoantigens for distinctive patterns of potential immunologic importance. Sequences longer than 27 residues predicted by these methods to form coiled-coil alpha-helices with a probability greater than 0.9 were detected in 40 of the 109 (36.7%) known human autoantigens. These include many predominantly systemic disease-specific autoantigens not previously known to contain this structure. In comparison, 8.7% of human proteins in the Swissprot database and 1.1% of the proteins in the Brookhaven database were found to contain such segments. These predicted coiled-coil alpha-helices are distinguished from most globular protein helices by greater length, higher charge content, and a heptad repeat multivalency. Coiled-coil segments correlate in part with known autoantibody epitopes and may contribute to autoantigenic potential. Systemic autoantigens generally are either basic or contain extended, multivalent, charge-rich segments such as coiled-coils.
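The selection criterion in the abstract (a segment longer than 27 residues with coiled-coil probability greater than 0.9) can be illustrated with a minimal Python sketch. It assumes the prediction program's output has already been reduced to one probability per residue; that input format, the function name high_probability_segments, and its parameters are hypothetical and are not taken from Lupas' program.

```python
def high_probability_segments(probs, threshold=0.9, min_length=28):
    """Find uninterrupted runs of residues whose probability exceeds threshold.

    probs      -- per-residue coiled-coil probabilities (assumed input format)
    threshold  -- a residue counts only if its probability is strictly greater
    min_length -- 28, i.e. "longer than 27 residues"

    Returns a list of (start, end) index pairs, with end exclusive.
    """
    segments = []
    start = None  # index where the current run began, or None if not in a run
    for i, p in enumerate(probs):
        if p > threshold:
            if start is None:
                start = i
        else:
            # The run (if any) ended at residue i - 1; keep it if long enough.
            if start is not None and i - start >= min_length:
                segments.append((start, i))
            start = None
    # Handle a run that extends to the end of the sequence.
    if start is not None and len(probs) - start >= min_length:
        segments.append((start, len(probs)))
    return segments
```

Under these assumptions, a protein would count as containing a predicted coiled-coil if the returned list is non-empty. The reported fraction follows the same arithmetic as 40 / 109 = 0.367.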
Figure 2. Observed conformation in proteins with high CC probability.
Fig. 2A. Observed conformation in proteins with high CC probability. All residues are shown for unique sequences with Pmax > 0.70 whose coordinates are deposited in the PDB, or are published but not yet in the PDB. Darker shading on the cartoons of secondary structure conformation represents higher probability. Charged residues are shown in boldface. All sequences in the PDB had their secondary structure conformation determined automatically from full-atom or C alpha-only models with our programs (16-18). Proteins are referred to by their PDB codes or by abbreviations: 2TMA (tropomyosin), a long homodimeric CC (10); SRNA (E. coli seryl-tRNA synthetase), a long N-terminal anti-parallel CC (12); GCN4 (yeast transcriptional activator), a two-stranded parallel CC (10); 2HMG (hemagglutinin), a trimeric CC region (10); APOE (apolipoprotein E), parallel helices in a long 4-helix bundle (14); 1LRD (lambda repressor), surface helix-loop-helix away from the DNA-binding site (10); 1YPI (triosephosphate isomerase), surface helix-turn-helix (10); APO3 (locust apolipophorin III), helix 2 of an elongated 5-helix bundle (15); SOMA (porcine growth hormone), helix 3 in a long 4-helix bundle (13); 256B (cytochrome B-562), helix-turn-helix in a bundle (10); 4TSI (tyrosyl tRNA synthetase), surface helix-turn-helix (10). The mean minimum uninterrupted helix length is 28.4 (±9.0) residues. The mean charge content of the CC segments depicted is 39.5 (±7.4)%, and of the p >= 90% segments, 43.1 (±5.4)%.
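As an illustration of the charge content figures quoted in this caption, the percentage of charged residues in a segment could be computed as in the following sketch. The caption does not say which residues were counted as charged; the set used here (Asp, Glu, Lys, Arg, with histidine excluded) is an assumption, and the function name charge_content is hypothetical.

```python
# Assumption: D, E, K and R are treated as charged; histidine is excluded.
CHARGED_RESIDUES = frozenset("DEKR")


def charge_content(segment):
    """Return the percentage of residues in a one-letter-code segment that are charged."""
    if not segment:
        raise ValueError("segment must contain at least one residue")
    charged = sum(1 for residue in segment.upper() if residue in CHARGED_RESIDUES)
    return 100.0 * charged / len(segment)
```

Counting histidine as charged would raise the values, so any comparison with the percentages above depends on that choice.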
Fig. 2B. Distribution of helix lengths in PDB proteins. Defining a helix as 5 or more consecutive residues in helical conformation based on the coordinates of the alpha-carbons, we find 4088 helices in 953 peptide chains in the PDB with our automated procedures. The mean length is 12.0 (±8.1) residues. Only 1.4% of the helices are 28 residues or longer.
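The helix definition used for this distribution (5 or more consecutive residues in helical conformation) can be sketched in Python as follows. The sketch assumes the conformation of each chain is already available as a string with one symbol per residue, where "H" marks a helical residue. That encoding and the function names are hypothetical, and the step that assigns conformation from alpha-carbon coordinates is not reproduced here.

```python
from itertools import groupby


def helix_lengths(conformation, helix_symbol="H", min_length=5):
    """Return the lengths of all runs of helix_symbol that are at least min_length long."""
    lengths = []
    # groupby yields one group per maximal run of identical consecutive symbols.
    for symbol, run in groupby(conformation):
        run_length = sum(1 for _ in run)
        if symbol == helix_symbol and run_length >= min_length:
            lengths.append(run_length)
    return lengths


def fraction_at_least(lengths, cutoff=28):
    """Return the fraction of helices whose length is cutoff residues or more."""
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n >= cutoff) / len(lengths)
```

Pooling the lengths from every chain and passing them to fraction_at_least with a cutoff of 28 corresponds to the kind of tally reported in the caption.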