PFRMAT AL 
TARGET T0063 
AUTHOR 9070-5088-8627 
REMARK  
REMARK Prediction date: Tuesday July 14, 1998 
REMARK Group name: UCSC-compbio 
REMARK Authors: Christian Barrett, Melissa Cline, Mark Diekhans, Kevin Karplus, 
REMARK 	 David Haussler and Richard Hughey 
REMARK University of California, Santa Cruz 
REMARK  
METHOD Overview 
METHOD  
METHOD Fold recognition was performed using the Target98 (SAM-T98) method 
METHOD [3] using SAM version 2.1.1 [1], a refinement of the methods developed 
METHOD by this group for CASP2 [2].  This method attempts to find and multiply  
METHOD align a set of homologs to a given sequence, then create an HMM from that  
METHOD multiple alignment. 
METHOD  
METHOD First, a set of sequence weights is determined from the alignment.  Next,  
METHOD modelfromalign is used to build the model from the alignment and the  
METHOD sequence weights.  Finally, hmmscore performs a local, all-paths scoring  
METHOD of the sequences, using a reversed-sequence normalization feature. 
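METHOD  
METHOD The idea behind reversed-sequence normalization can be sketched as 
METHOD follows: the raw score of a sequence is compared with the raw score of 
METHOD the same sequence reversed, which has the same length and composition. 
METHOD This is only an illustration.  The raw_score argument and the toy 
METHOD scorer in the usage note are hypothetical stand-ins, not the local, 
METHOD all-paths HMM scoring that hmmscore performs. 

```python
def reverse_normalized_score(sequence, raw_score):
    """Return raw_score(sequence) minus raw_score(reversed sequence).

    raw_score is any callable mapping a sequence string to a number
    (higher = better match).  It is a hypothetical stand-in for a real
    HMM scoring routine.
    """
    forward = raw_score(sequence)
    backward = raw_score(sequence[::-1])  # same length and composition
    return forward - backward


def toy_score(sequence):
    """Toy scorer (assumption): counts occurrences of one fixed motif."""
    return float(sequence.count("GSY"))
```

METHOD With the toy scorer, a sequence containing the motif once scores 1.0 
METHOD after normalization, and a palindromic sequence scores 0.0. 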
METHOD  
METHOD The weighting method, detailed in upcoming publications [3,4], 
METHOD combines the Henikoffs' scheme [5], Dirichlet mixtures [6], and an 
METHOD entropy method to set the final weights. 
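METHOD  
METHOD As a rough illustration of one ingredient, the Henikoffs' 
METHOD position-based weights [5] can be sketched as below.  This covers only 
METHOD that component, not the Dirichlet mixtures or the entropy step, and it 
METHOD is not the SAM implementation.  Treating the gap character as an 
METHOD ordinary symbol and normalizing the weights to sum to 1 are 
METHOD simplifying assumptions. 

```python
from collections import Counter


def henikoff_weights(alignment):
    """Position-based sequence weights for equal-length aligned rows.

    In each column, a sequence receives 1 / (k * n), where k is the
    number of distinct symbols in the column and n is the number of
    sequences sharing this sequence's symbol.  Gap characters are
    treated as ordinary symbols here (a simplifying assumption).
    """
    if not alignment:
        return []
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")

    weights = [0.0] * len(alignment)
    for column in zip(*alignment):
        counts = Counter(column)
        k = len(counts)
        for i, symbol in enumerate(column):
            weights[i] += 1.0 / (k * counts[symbol])

    total = sum(weights)
    return [w / total for w in weights]  # normalized to sum to 1
```

METHOD For the rows AAA, AAA, and CCC, the two identical rows share weight 
METHOD and the divergent row receives half of the total. 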
METHOD  
METHOD Alignment generation 
METHOD  
METHOD The initial step uses BLASTP to search NRP twice: once to produce a set 
METHOD of very close homologs, and once to produce a set of possible homologs. 
METHOD  
METHOD The method then uses multiple iterations of a selection, training, and  
METHOD alignment procedure.  Each iteration involves an initial alignment, a set  
METHOD of search sequences, a threshold value, and a transition regularizer.  
METHOD  
METHOD The first iteration uses a single sequence (or seed alignment) as the 
METHOD initial alignment and the close homologs found by BLASTP as the search 
METHOD set.  The threshold is set very strictly, so that only good matches  
METHOD to the sequence are considered.  This iteration uses a transition regularizer  
METHOD that was designed to match the gap costs used by BLASTP. 
METHOD  
METHOD On subsequent iterations the input alignment is the output from the 
METHOD previous iteration, the search set is the larger set of possible 
METHOD homologs found by BLASTP, and the thresholds are gradually loosened. 
METHOD The second through second-from-last iterations use a ``long-match'' 
METHOD transition regularizer, and the final iteration uses a transition regularizer  
METHOD trained on FSSP alignments. 
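METHOD  
METHOD The iteration schedule described above can be summarized as a small 
METHOD data structure.  This is a sketch only: the class, function, and 
METHOD label names are hypothetical, and the number of iterations and the 
METHOD threshold values are supplied by the caller because they are not 
METHOD stated here. 

```python
from dataclasses import dataclass


@dataclass
class Iteration:
    search_set: str    # which BLASTP hit set is searched
    threshold: float   # inclusion threshold for this iteration
    regularizer: str   # transition regularizer used for training


def build_schedule(thresholds):
    """Build one Iteration per threshold, in the order they are applied.

    First iteration: close homologs, BLASTP-like regularizer.
    Middle iterations: possible homologs, long-match regularizer.
    Final iteration: possible homologs, FSSP-trained regularizer.
    """
    n = len(thresholds)
    if n < 2:
        raise ValueError("need at least a first and a final iteration")

    schedule = []
    for i, threshold in enumerate(thresholds):
        if i == 0:
            search_set, regularizer = "close_homologs", "blastp_like"
        elif i == n - 1:
            search_set, regularizer = "possible_homologs", "fssp_trained"
        else:
            search_set, regularizer = "possible_homologs", "long_match"
        schedule.append(Iteration(search_set, threshold, regularizer))
    return schedule
```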
METHOD  
METHOD References 
METHOD [1] R. Hughey and A. Krogh, CABIOS 12(2): 95-107, 1996. 
METHOD     http://www.cse.ucsc.edu/research/compbio/sam.html.   
METHOD [2] K. Karplus, K. Sjolander, C. Barrett, M. Cline, D. Haussler, R. 
METHOD     Hughey, L. Holm, and C. Sander, Proteins: Structure, Function, and  
METHOD     Genetics, Suppl. 1, 134-9, 1997. 
METHOD [3] K. Karplus, C. Barrett, and R. Hughey, Technical Report UCSC-CRL-98-06, 
METHOD     Department of Computer Engineering, Univ. of California, Santa Cruz, 1998. 
METHOD [4] J. Park, K. Karplus, C. Barrett, R. Hughey, D. Haussler, T. Hubbard, 
METHOD     and C. Chothia, http://cyrah.med.harvard.edu/~jong/assess_final.html, 1998. 
METHOD [5] S. Henikoff and J. C. Henikoff, JMB 243: 574-578, 1994. 
METHOD [6] K. Sjolander, K. Karplus, M. P. Brown, R. Hughey, A. Krogh, I. S. 
METHOD     Mian, and D. Haussler, CABIOS 12(4): 327-345, 1996. 
METHOD  
METHOD  
METHOD Our prediction with 1pex as the template structure is very tenuous: 
METHOD the score was in the region where 95% of the hits are false 
METHOD positives.  We had several higher-scoring templates (1plq, 1tbgE, 
METHOD 1eaf, 1pmaA, 3pchM, 1pud), but we decided to submit 1pex because 
METHOD double-blast also (very weakly) identified 1pex as a homolog (with 
METHOD GP:S72024 as an intermediate), because the secondary structure 
METHOD prediction matched fairly well, and because the two lysine residues 
METHOD that are subject to posttranslational modification in the target are 
METHOD conserved in our alignment and exposed on the surface. 
METHOD  
METHOD We would be more confident of our prediction if T0063 were known to be 
METHOD a dimer, since we are predicting half of a 4-sheet beta propeller. 
METHOD Experimental evidence, though, points to its existence as a monomer. 
METHOD  
METHOD There is a very strong chance that T0063 is a new fold, but it would 
METHOD be neat if it dimerized to form a beta propeller. 
METHOD  
METHOD We would have liked to submit two alignments: one generated 
METHOD automatically, and one tweaked a bit by hand.  Based on 
METHOD previous experience, we probably did more harm than good in tweaking 
METHOD the alignment.  The verification server will not accept multiple 
METHOD predictions at this time, so we are submitting the automatically 
METHOD generated alignment.  The second alignment, which we would have liked to 
METHOD submit, is shown below: 
METHOD                                                                        
METHOD T0063 mvlkwvm......................................................... 
METHOD 1pex  tpdkcdpslsldaitslrgetmifkdrffwrlhpqqvdaelfltksfwpelpnridaayehpsh 
METHOD  
METHOD  
METHOD                             10        20        30        40        50 
METHOD                              |         |         |         |         | 
METHOD T0063 ..............-------STKYVEAGELKE-----GSYVVIDGEPCRVVEIEKSKTGKHGS 
METHOD 1pex  dlififrgrkfwal------------------------NGYDILEGYPKKISELGLPKEVKKIS 
METHOD  
METHOD  
METHOD               60        70        80        90       100       110     
METHOD                |         |         |         |         |         |     
METHOD T0063 AKARIV---AVGVFDGGKRTLSLPVDAQVEVPIIEKFTAQILSVSGDVIQLMDMRDYKTIEVPM 
METHOD 1pex  AAVHFEDTGKTLLFSGNQVWRY-----------------------DDTNHIMD-KDY------P 
METHOD  
METHOD  
METHOD          120       130       140       150       160                   
METHOD            |         |         |         |         |                   
METHOD T0063 KYVEEEAKGRLAPGAEVEVWQILDRYKIIRVK---------------------........... 
METHOD 1pex  RLIEEDF-----PGIGDKVDAVYEKNGYIYFFNGP------------------iqfeysiwsnr 
METHOD  
METHOD  
METHOD                      
METHOD                      
METHOD T0063 .............. 
METHOD 1pex  ivrvmpansilwc. 
METHOD  
METHOD  
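METHOD The alignment records in the MODEL section below each have four 
METHOD fields: target residue, target position, parent residue, and parent 
METHOD position.  A minimal sketch for reading them follows.  The function 
METHOD names are hypothetical, and the reader assumes one-letter residue 
METHOD codes and plain non-negative integer positions (no insertion codes). 

```python
def parse_al_records(lines):
    """Extract (target_res, target_pos, parent_res, parent_pos) tuples.

    Lines that do not look like alignment records (REMARK, METHOD,
    PARENT, TER, END, ...) are skipped.
    """
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue
        t_res, t_pos, p_res, p_pos = fields
        if len(t_res) != 1 or len(p_res) != 1:
            continue
        if not (t_pos.isdigit() and p_pos.isdigit()):
            continue
        pairs.append((t_res, int(t_pos), p_res, int(p_pos)))
    return pairs


def target_breaks(pairs):
    """Return (previous_pos, next_pos) wherever target numbering jumps."""
    breaks = []
    for prev, curr in zip(pairs, pairs[1:]):
        if curr[1] != prev[1] + 1:
            breaks.append((prev[1], curr[1]))
    return breaks
```

METHOD Applied to this file, target_breaks lists the stretches of the target 
METHOD that are left unaligned to the parent. 
METHOD  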
MODEL 1 
PARENT 1pex 
G 20 N 352 
S 21 G 353 
Y 22 Y 354 
V 23 D 355 
V 24 I 356 
I 25 L 357 
D 26 E 358 
G 27 G 359 
E 28 Y 360 
P 29 P 361 
C 30 K 362 
R 31 K 363 
V 32 I 364 
V 33 S 365 
E 34 E 366 
I 35 L 368 
E 36 G 369 
K 37 L 370 
S 38 P 371 
K 39 K 372 
T 40 E 373 
G 41 V 374 
K 42 K 375 
H 43 K 376 
G 44 I 377 
S 45 S 378 
A 46 A 379 
K 47 A 380 
A 48 V 381 
R 49 H 382 
I 50 F 383 
V 51 E 384 
A 52 D 385 
V 53 T 386 
G 54 G 387 
V 55 L 391 
F 56 F 392 
D 57 S 393 
G 58 G 394 
G 59 N 395 
K 60 Q 396 
R 61 V 397 
T 62 W 398 
L 63 R 399 
S 64 Y 400 
G 88 D 401 
D 89 D 402 
V 90 T 403 
I 91 N 404 
Q 92 H 405 
L 93 I 406 
M 94 M 407 
D 95 D 408 
M 96 K 409 
R 97 D 410 
D 98 Y 411 
Y 99 P 412 
K 100 R 413 
T 101 L 414 
I 102 I 415 
E 103 E 416 
V 104 E 417 
P 105 D 418 
M 106 F 419 
K 107 P 420 
Y 108 G 421 
V 109 I 422 
E 110 G 423 
E 111 D 424 
E 112 K 425 
A 113 V 426 
K 114 D 427 
E 124 A 428 
V 125 V 429 
W 126 Y 430 
Q 127 E 431 
I 128 K 432 
L 129 N 433 
D 130 G 434 
R 131 Y 435 
Y 132 I 436 
K 133 Y 437 
I 134 F 438 
I 135 F 439 
R 136 N 440 
V 137 G 441 
K 138 P 442 
TER 
END 
