REMARK  ---------------------------------------------------------- 
REMARK  Molecule : T0084AL019_1_1 
REMARK  Alignment model prepared for CASP3 experiment 
REMARK  by group : UCSC-COMPBIO 
REMARK  ---------------------------------------------------------- 
TARGET T0084  
AUTHOR 9070-5088-8627  
REMARK   
REMARK Prediction date: 2 Sept 1998  
REMARK Group name: UCSC-compbio  
REMARK Students: Christian Barrett, Melissa Cline, Mark Diekhans, and Leslie Grate
REMARK Faculty:  Kevin Karplus, David Haussler, and Richard Hughey  
REMARK University of California, Santa Cruz  
REMARK   
METHOD Overview  
METHOD   
METHOD Fold recognition was performed using the Target98 (SAM-T98) method  
METHOD [3] using SAM version 2.1.1 [1], a refinement of the methods developed  
METHOD by this group for CASP2 [2].  This method attempts to find and multiply   
METHOD align a set of homologs to a given sequence, then create an HMM from that   
METHOD multiple alignment.  
METHOD   
METHOD First, a set of sequence weights is determined from the alignment.  Next,   
METHOD Modelfromalign is used to build the model from the alignment and the   
METHOD sequence weights.  Finally, hmmscore performs a local, all-paths scoring   
METHOD of the sequences, using a reversed-sequence normalization feature.  
METHOD   
METHOD The weighting method, detailed in upcoming publications [3,4],  
METHOD combines the Henikoffs' scheme [5], Dirichlet mixtures [6], and an  
METHOD entropy method to set the final weights.  
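METHOD
METHOD As a rough illustration of the position-based component [5] only, the
METHOD sketch below computes Henikoff-style weights for a toy alignment.  The
METHOD Dirichlet-mixture and entropy steps are not shown.  The function name,
METHOD the gap handling, and the toy alignment are assumptions made for this
METHOD example; this is not the SAM implementation.

```python
from collections import Counter

GAP_CHARS = "-."  # assumption: these characters mark gaps and earn no weight


def henikoff_weights(alignment):
    """Position-based sequence weights, normalized to sum to 1.

    In each column, a sequence receives 1 / (k * n), where k is the number
    of distinct residue types in the column and n is the number of
    sequences sharing this sequence's residue in that column.
    """
    if not alignment:
        return []
    ncols = len(alignment[0])
    if any(len(row) != ncols for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    weights = [0.0] * len(alignment)
    for col in range(ncols):
        counts = Counter(
            row[col] for row in alignment if row[col] not in GAP_CHARS
        )
        k = len(counts)
        for i, row in enumerate(alignment):
            residue = row[col]
            if residue in counts:  # gap characters are skipped
                weights[i] += 1.0 / (k * counts[residue])

    total = sum(weights)
    return [w / total for w in weights] if total else weights


# Toy alignment (made up for illustration, not taken from this prediction).
TOY_ALIGNMENT = ["GYVGS", "GFDGF", "GYDGF", "GYQGG"]

if __name__ == "__main__":
    for seq, w in zip(TOY_ALIGNMENT, henikoff_weights(TOY_ALIGNMENT)):
        print(seq, round(w, 4))
```

METHOD In the toy alignment the third sequence shares every residue with at
METHOD least one other sequence, so it receives the smallest weight.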
METHOD   
METHOD Alignment generation  
METHOD   
METHOD The initial step uses BLASTP to search NRP twice: once to produce a set  
METHOD of very close homologs, and once to produce a set of possible homologs.  
METHOD   
METHOD The method then uses multiple iterations of a selection, training, and   
METHOD alignment procedure.  Each iteration involves an initial alignment, a set   
METHOD of search sequences, a threshold value, and a transition regularizer.   
METHOD   
METHOD The first iteration uses a single sequence (or seed alignment) as the
METHOD initial alignment and the close homologs found by BLASTP as the search
METHOD set.  The threshold is set very strictly, so that only good matches
METHOD to the sequence are considered.  This iteration uses a transition regularizer   
METHOD that was designed to match the gap costs used by BLASTP.  
METHOD   
METHOD On subsequent iterations the input alignment is the output from the  
METHOD previous iteration, the search set is the larger set of possible  
METHOD homologs found by BLASTP, and the thresholds are gradually loosened.  
METHOD The second through second-from-last iterations use a ``long-match''
METHOD transition regularizer, and the final iteration uses a transition regularizer   
METHOD trained on FSSP alignments.  
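METHOD
METHOD The per-iteration settings described above can be summarized as a small
METHOD schedule.  The sketch below is only a bookkeeping aid: the class, the
METHOD function, the string labels, and the threshold values are hypothetical
METHOD and are not the settings used for this prediction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IterationSettings:
    number: int             # 1-based iteration number
    initial_alignment: str  # what the iteration starts from
    search_set: str         # which BLASTP hit set is searched
    threshold: float        # caller-supplied; placeholder values below
    regularizer: str        # label for the transition regularizer


def plan_iterations(thresholds):
    """Build one IterationSettings per supplied threshold.

    Iteration 1 starts from the seed, searches the close homologs, and
    uses the BLASTP-like regularizer.  Later iterations start from the
    previous output and search the possible homologs; the last one uses
    the FSSP-trained regularizer and the others use "long-match".
    """
    n = len(thresholds)
    if n < 2:
        raise ValueError("need at least two iterations")
    plan = []
    for i, threshold in enumerate(thresholds):
        if i == 0:
            alignment = "seed"
            search_set = "close_homologs"
            regularizer = "blastp_like"
        else:
            alignment = f"output_of_iteration_{i}"
            search_set = "possible_homologs"
            regularizer = "fssp_trained" if i == n - 1 else "long_match"
        plan.append(
            IterationSettings(i + 1, alignment, search_set, threshold, regularizer)
        )
    return plan


if __name__ == "__main__":
    # Placeholder thresholds, ordered from strictest to loosest.
    for settings in plan_iterations([-40.0, -30.0, -20.0, -10.0]):
        print(settings)
```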
METHOD   
METHOD References  
METHOD [1] R. Hughey and A. Krogh, CABIOS 12(2): 95-107, 1996.  
METHOD     http://www.cse.ucsc.edu/research/compbio/sam.html.    
METHOD [2] K. Karplus, K. Sjolander, C. Barrett, M. Cline, D. Haussler, R.  
METHOD     Hughey, L. Holm, and C. Sander, Proteins: Structure, Function, and   
METHOD     Genetics, Suppl. 1, 134-9, 1997.  
METHOD [3] K. Karplus, C. Barrett, and R. Hughey, Technical Report UCSC-CRL-98-06,  
METHOD     Department of Computer Engineering, Univ. of California, Santa Cruz, 1998.  
METHOD [4] J. Park, K. Karplus, C. Barrett, R. Hughey, D. Haussler, T. Hubbard,  
METHOD     and C. Chothia, http://cyrah.med.harvard.edu/~jong/assess_final.html, 1998.  
METHOD [5] S. Henikoff and J. C. Henikoff, JMB, vol 243, pp 574-578, Nov 1994.  
METHOD [6] K. Sjolander, K. Karplus, M. P. Brown, R. Hughey, A. Krogh, I. S.
METHOD     Mian, and D. Haussler, CABIOS 12(4): 327-345, 1996.
METHOD   
METHOD   
METHOD Since this peptide was supposed to be a de novo design, we did not  
METHOD expect evolutionary information from similar sequences in the protein
METHOD database, but we searched anyway.  We found two proteins with pieces
METHOD that matched the peptide: one from desmoplakin I [Homo sapiens] and  
METHOD one from ependymin [4 different species].  
METHOD   
METHOD gi|2134996|pir||A38194	tvcldLDKVEAYRCGLKKIKNDLNLKKSLLATMKTELQKAQQihsqt. 607:643   
METHOD 			         E     L    N L   KSLL   K ELQK  Q  
METHOD T0084			.....CGGREGVLKKLRAVENELHYNKSLLEEVKDELQKMRQ...... 1:37  
METHOD 			             KKLR VENE H NK     V  
METHOD gi|998296		diaegXFNYDSTAKKLRFVENESHANKTSHMDVLIHFEEGVLyeids. 20:56  
METHOD gi|998286 		diavgDFNYDSTAKKLRFVENESHANKTSHMDVLIHFEEGVLyemds. 31:67  
METHOD gi|998304 		diadgEFNYDSTAKKLRFVENESHSNKTSHMDVLIHFEEGVLyeids. 29:65  
METHOD gi|998302 		diaegEFNYDSTAKKLRFVENESHSNKTSHMDVLIHFEEGVLyeids. 27:63  
METHOD   
METHOD Unfortunately, neither of these proteins has a known structure, though
METHOD the desmoplakin I match is to part of a 2-strand coiled-coil domain.
METHOD   
METHOD We did not find a full-length match for this peptide in PDB, but we  
METHOD did find several partial matches.  Based on these partial matches,
METHOD we can assemble the prediction from three fragments:
METHOD          
METHOD 4blmA	IGGPESLKKELRKI  
METHOD 4blmA	LLLHHHHHHHHHHL  
METHOD          GG E   K LR  
METHOD   
METHOD 1ft1A	   RQWVIQEFRLWDNELQYVDQLLKE  
METHOD 1ft1A	   HHHHHHHLLLLLLHHHHHHHHHHH  
METHOD 	   R  V    R   NEL Y   LL E  
METHOD   
METHOD 1fgjA	              DDPLYYKKGKLEEVENNLRSM  
METHOD 1fgjA	              LLGGGHHHHHHHHHHHHHHHL  
METHOD 	                 L Y K  LEEV   L  M  
METHOD   
METHOD   
METHOD and predict the following secondary structure:  
METHOD 	  
METHOD 	CGGREGVLKKLRAVENELHYNKSLLEEVKDELQKMRQ  
METHOD 	LLLHHHHHHHLLLLLLHHHHHHHHHHHHHHHHHHLLL  
METHOD   
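METHOD For readers who want to check the fragment layout, the sketch below
METHOD places the three PDB fragments in target coordinates, counts identical
METHOD residues, and lists target positions that no fragment covers.  The
METHOD start offsets are read off the column layout of the alignment blocks
METHOD above; the helper names are illustrative.  It does not reproduce the
METHOD judgment used to choose the final secondary structure string.

```python
# Target sequence and fragments copied from the alignment blocks above.
TARGET = "CGGREGVLKKLRAVENELHYNKSLLEEVKDELQKMRQ"

# (name, 0-based start in the target, fragment sequence, fragment structure)
# Assumption: start offsets are taken from the indentation of each block.
FRAGMENTS = [
    ("4blmA", 0, "IGGPESLKKELRKI", "LLLHHHHHHHHHHL"),
    ("1ft1A", 3, "RQWVIQEFRLWDNELQYVDQLLKE", "HHHHHHHLLLLLLHHHHHHHHHHH"),
    ("1fgjA", 14, "DDPLYYKKGKLEEVENNLRSM", "LLGGGHHHHHHHHHHHHHHHL"),
]


def place(start, text, width, fill="."):
    """Pad a fragment string so that it lines up with the target."""
    return fill * start + text + fill * (width - start - len(text))


def identities(start, seq, target):
    """Count positions where fragment and target residues are identical."""
    return sum(a == b for a, b in zip(seq, target[start:]))


def coverage(fragments, width):
    """Number of fragments covering each target position."""
    depth = [0] * width
    for _name, start, seq, _ss in fragments:
        for pos in range(start, start + len(seq)):
            depth[pos] += 1
    return depth


if __name__ == "__main__":
    width = len(TARGET)
    print(f"{'target':8s}{TARGET}")
    for name, start, seq, ss in FRAGMENTS:
        print(f"{name:8s}{place(start, seq, width)}")
        print(f"{'':8s}{place(start, ss, width)}")
        print(f"{'':8s}identities: {identities(start, seq, TARGET)}")
    uncovered = [i + 1 for i, d in enumerate(coverage(FRAGMENTS, width)) if d == 0]
    print("uncovered target positions (1-based):", uncovered)
```

METHOD With these offsets the fragments together span target residues 1-35,
METHOD and residues 36-37 are covered by no fragment.
METHOD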
METHOD Since we do not have the tools here to put these fragments into a  
METHOD single coordinate system, we are just submitting them as separate  
METHOD pieces.  
MODEL 1  
REMARK  ---------------------------------------------------------- 
REMARK  AL2TS service [v. 08/06/1998]: Adam Zemla, adamz@llnl.gov 
REMARK  ---------------------------------------------------------- 
REMARK  Coordinates assigned from PDB entry: 4blm_A 
ATOM      1  N   CYS     1       2.154  83.679   8.788  1.00  0.00              
ATOM      2  CA  CYS     1       1.119  82.910   8.062  1.00  0.00              
ATOM      3  C   CYS     1      -0.130  83.710   7.744  1.00  0.00              
ATOM      4  O   CYS     1      -1.090  83.115   7.218  1.00  0.00              
ATOM      5  N   GLY     2      -0.118  84.985   8.075  1.00  0.00              
ATOM      6  CA  GLY     2      -1.267  85.847   7.814  1.00  0.00              
ATOM      7  C   GLY     2      -2.136  86.181   9.010  1.00  0.00              
ATOM      8  O   GLY     2      -3.233  86.734   8.717  1.00  0.00              
ATOM      9  N   GLY     3      -1.763  85.878  10.235  1.00  0.00              
ATOM     10  CA  GLY     3      -2.616  86.265  11.397  1.00  0.00              
ATOM     11  C   GLY     3      -3.635  85.188  11.751  1.00  0.00              
ATOM     12  O   GLY     3      -3.776  84.166  11.044  1.00  0.00              
ATOM     13  N   ARG     4      -4.343  85.437  12.852  1.00  0.00              
ATOM     14  CA  ARG     4      -5.337  84.505  13.377  1.00  0.00              
ATOM     15  C   ARG     4      -6.423  84.130  12.368  1.00  0.00              
ATOM     16  O   ARG     4      -6.755  82.927  12.258  1.00  0.00              
ATOM     17  N   GLU     5      -6.846  85.103  11.566  1.00  0.00              
ATOM     18  CA  GLU     5      -7.879  84.799  10.548  1.00  0.00              
ATOM     19  C   GLU     5      -7.385  83.812   9.503  1.00  0.00              
ATOM     20  O   GLU     5      -8.174  82.953   9.019  1.00  0.00              
ATOM     21  N   GLY     6      -6.112  83.925   9.113  1.00  0.00              
ATOM     22  CA  GLY     6      -5.565  82.980   8.115  1.00  0.00              
ATOM     23  C   GLY     6      -5.366  81.610   8.753  1.00  0.00              
ATOM     24  O   GLY     6      -5.518  80.540   8.112  1.00  0.00              
ATOM     25  N   VAL     7      -5.036  81.639  10.044  1.00  0.00              
ATOM     26  CA  VAL     7      -4.863  80.366  10.783  1.00  0.00              
ATOM     27  C   VAL     7      -6.228  79.630  10.814  1.00  0.00              
ATOM     28  O   VAL     7      -6.291  78.402  10.668  1.00  0.00              
ATOM     29  N   LEU     8      -7.248  80.435  11.056  1.00  0.00              
ATOM     30  CA  LEU     8      -8.623  79.886  11.154  1.00  0.00              
ATOM     31  C   LEU     8      -8.950  79.123   9.881  1.00  0.00              
ATOM     32  O   LEU     8      -9.251  77.933   9.879  1.00  0.00              
ATOM     33  N   LYS     9      -8.813  79.847   8.793  1.00  0.00              
ATOM     34  CA  LYS     9      -9.039  79.357   7.414  1.00  0.00              
ATOM     35  C   LYS     9      -8.297  78.068   7.120  1.00  0.00              
ATOM     36  O   LYS     9      -8.852  77.115   6.536  1.00  0.00              
ATOM     37  N   LYS    10      -7.031  77.970   7.523  1.00  0.00              
ATOM     38  CA  LYS    10      -6.218  76.770   7.315  1.00  0.00              
ATOM     39  C   LYS    10      -6.690  75.603   8.134  1.00  0.00              
ATOM     40  O   LYS    10      -6.598  74.444   7.662  1.00  0.00              
ATOM     41  N   LEU    11      -7.165  75.880   9.350  1.00  0.00              
ATOM     42  CA  LEU    11      -7.720  74.788  10.199  1.00  0.00              
ATOM     43  C   LEU    11      -9.036  74.274   9.582  1.00  0.00              
ATOM     44  O   LEU    11      -9.294  73.057   9.559  1.00  0.00              
END
