
PFRMAT AL
TARGET T0090
AUTHOR 3670-4530-6947
REMARK 
REMARK Prediction date: Friday June 30, 2000
REMARK Group name: UCSC-compbio
REMARK Authors: Christian Barrett, Melissa Cline, Mark Diekhans, Leslie Grate,
REMARK 	 Kevin Karplus, Richard Hughey, I. Saira Mian, and Spencer Tu
REMARK University of California, Santa Cruz
REMARK 
METHOD Overview
METHOD 
METHOD Fold recognition for this target was performed using the SAM-T99
METHOD method (which is similar to SAM-T98 [3]) using SAM version 3.1 [1], a
METHOD refinement of the methods developed by this group for CASP3 [7].  This
METHOD method attempts to find and multiply align a set of homologs to a
METHOD given sequence, then create an HMM from that multiple alignment.
METHOD 
METHOD First, a set of sequence weights is determined from the alignment.  Next, 
METHOD modelfromalign is used to build the model from the alignment and the 
METHOD sequence weights.  Finally, hmmscore performs a local, all-paths scoring 
METHOD of the sequences, using a reversed-sequence normalization feature.
METHOD 
METHOD The weighting method, detailed in publications [3,4],
METHOD combines the Henikoffs' scheme [5], Dirichlet mixtures [6], and an
METHOD entropy method to set the final weights.
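METHOD 
METHOD As an illustration only, the position-based part of the weighting
METHOD (the Henikoffs' scheme [5]) can be sketched as below.  This is a minimal
METHOD sketch, not the SAM implementation: the function name is hypothetical,
METHOD gap characters are assumed to be '-' or '.' and are skipped, and the
METHOD Dirichlet-mixture and entropy steps are not shown.

```python
from collections import Counter


def henikoff_weights(alignment):
    """Position-based sequence weights for equal-length aligned rows.

    Hypothetical helper for illustration; gap handling is an assumption.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows must have the same length")

    weights = [0.0] * len(alignment)
    for col in range(length):
        column = [row[col] for row in alignment]
        # Count residue types in this column, ignoring gap characters.
        counts = Counter(c for c in column if c not in "-.")
        if not counts:
            continue
        k = len(counts)  # number of distinct residue types in the column
        for i, c in enumerate(column):
            if c in counts:
                # Each residue type shares 1/k equally among its sequences.
                weights[i] += 1.0 / (k * counts[c])

    total = sum(weights)
    # Normalize so the weights sum to 1 (left as zeros if all columns are gaps).
    return [w / total for w in weights] if total else weights


if __name__ == "__main__":
    print(henikoff_weights(["AA", "AA", "CC"]))
```

METHOD Sequences that duplicate one another share weight, so the two identical
METHOD rows in the usage example each receive half the weight of the third.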
METHOD 
METHOD Alignment generation
METHOD 
METHOD The initial step uses BLASTP to search NRP twice: once to produce a set
METHOD of very close homologs, and once to produce a set of possible homologs.
METHOD 
METHOD The method then uses multiple iterations of a selection, training, and 
METHOD alignment procedure.  Each iteration involves an initial alignment, a set 
METHOD of search sequences, a threshold value, and a transition regularizer. 
METHOD 
METHOD The first iteration uses a single sequence (or seed alignment) as the
METHOD initial alignment and the close homologs found by BLASTP as the search
METHOD set.  The threshold is set very strictly, so that only good matches
METHOD to the sequence are considered.  This iteration uses a transition
METHOD regularizer designed to match the gap costs used by BLASTP.
METHOD 
METHOD On subsequent iterations the input alignment is the output from the
METHOD previous iteration, the search set is the larger set of possible
METHOD homologs found by BLASTP, and the thresholds are gradually loosened.
METHOD The second through second-from-last iterations use a "long-match"
METHOD transition regularizer, and the final iteration uses a transition
METHOD regularizer trained on FSSP alignments.
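METHOD 
METHOD The per-iteration settings described above can be laid out as a simple
METHOD schedule.  The sketch below is illustrative only: the function name,
METHOD the string labels, and the threshold values in the usage example are
METHOD assumptions, not the values or identifiers used by SAM-T99.

```python
def iteration_schedule(thresholds):
    """Return one settings dict per iteration.

    `thresholds` is a list of progressively looser threshold values,
    one per iteration (the values themselves are hypothetical).
    """
    n = len(thresholds)
    if n < 2:
        raise ValueError("need at least a first and a final iteration")

    plan = []
    for i, threshold in enumerate(thresholds):
        if i == 0:
            # First iteration: seed input, close homologs, BLASTP-like gap costs.
            initial, search_set = "seed", "close homologs"
            regularizer = "blastp-gap-costs"
        else:
            # Later iterations: previous output, larger set of possible homologs.
            initial, search_set = "previous output", "possible homologs"
            # Final iteration uses the FSSP-trained regularizer.
            regularizer = "fssp-trained" if i == n - 1 else "long-match"
        plan.append({
            "iteration": i + 1,
            "initial_alignment": initial,
            "search_set": search_set,
            "threshold": threshold,
            "regularizer": regularizer,
        })
    return plan


if __name__ == "__main__":
    # Hypothetical thresholds, strict to loose.
    for step in iteration_schedule([1e-5, 1e-3, 1e-2, 1e-1]):
        print(step)
```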
METHOD 
METHOD References
METHOD [1] R. Hughey and A. Krogh, CABIOS 12(2): 95-107, 1996.
METHOD     http://www.cse.ucsc.edu/research/compbio/sam.html.  
METHOD [2] K. Karplus, K. Sjolander, C. Barrett, M. Cline, D. Haussler, R.
METHOD     Hughey, L. Holm, and C. Sander, Proteins: Structure, Function, and 
METHOD     Genetics, Suppl. 1, 134-9, 1997.
METHOD [3] K. Karplus, C. Barrett, and R. Hughey, Technical Report UCSC-CRL-98-06,
METHOD     Department of Computer Engineering, Univ. of California, Santa Cruz, 1998.
METHOD [4] J. Park, K. Karplus, C. Barrett, R. Hughey, D. Haussler, T. Hubbard,
METHOD     and C. Chothia, http://cyrah.med.harvard.edu/~jong/assess_final.html, 1998.
METHOD [5] S. Henikoff and J. G. Henikoff, JMB 243: 574-578, 1994.
METHOD [6] K. Sjolander, K. Karplus, M. P. Brown, R. Hughey, A. Krogh, I. S.
METHOD     Mian, and D. Haussler, CABIOS 12(4): 327-345, 1996.
METHOD [7] K. Karplus, C. Barrett, M. Cline, M. Diekhans, L. Grate, and R.
METHOD     Hughey, Predicting protein structure using only sequence information,
METHOD     Proteins, Suppl. 3, 121-5, 1999.
METHOD 
METHOD We obtained strong hits to both 1mut and 1tum, plus a weak hit to 1lvl.
METHOD 1mut and 1tum are very similar; we chose to pursue 1tum, which is the
METHOD FSSP representative for the family.
METHOD 
METHOD We submitted two models for this prediction.  The first was a hand-edited
METHOD combination of two alignments: a global posterior-decoded alignment to the
METHOD 1tum FSSP alignment, and a local posterior-decoded alignment to the SAM-T2K
METHOD alignment for the 1tum family.  Our second model was the second of these
METHOD alignments on its own, the local posterior-decoded alignment to the
METHOD SAM-T2K 1tum family alignment, and was generated without manual
METHOD intervention.
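METHOD 
METHOD Each alignment record in the MODEL section has four fields: target
METHOD residue, target residue number, parent residue, and parent residue
METHOD number.  A minimal sketch for reading such records and locating jumps
METHOD in the target numbering follows.  The function names are hypothetical,
METHOD and the sketch assumes plain integer residue numbers with no chain
METHOD identifiers, as in this file.

```python
def parse_al_records(lines):
    """Collect (target_res, target_num, parent_res, parent_num) tuples.

    Lines that do not look like four-field alignment records
    (PARENT, TER, END, METHOD text, and so on) are skipped.
    """
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4 or len(fields[0]) != 1 or len(fields[2]) != 1:
            continue
        try:
            records.append((fields[0], int(fields[1]), fields[2], int(fields[3])))
        except ValueError:
            # Residue numbers were not plain integers; not handled in this sketch.
            continue
    return records


def target_gaps(records):
    """Return (previous, next) target numbers wherever numbering skips residues."""
    return [(a[1], b[1]) for a, b in zip(records, records[1:]) if b[1] - a[1] > 1]


if __name__ == "__main__":
    sample = ["PARENT 1tum", "A 81 A 27", "E 87 H 28", "T 88 M 29", "TER"]
    recs = parse_al_records(sample)
    print(recs)
    print(target_gaps(recs))
```

METHOD A jump in the target numbering marks target residues that are left
METHOD unaligned to the parent.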
METHOD 
MODEL 2
PARENT 1tum
G 57 K 3
H 58 L 4
A 59 Q 5
A 60 I 6
V 61 A 7
L 62 V 8
L 63 G 9
P 64 I 10
F 65 I 11
D 66 R 12
P 67 N 13
V 68 E 14
R 69 N 15
D 70 N 16
E 71 E 17
V 72 I 18
V 73 F 19
L 74 I 20
I 75 T 21
E 76 R 22
Q 77 R 23
I 78 A 24
R 79 A 25
I 80 D 26
A 81 A 27
E 87 H 28
T 88 M 29
P 89 A 30
W 90 N 31
L 91 K 32
L 92 L 33
E 93 E 34
M 94 F 35
V 95 P 36
A 96 G 37
G 97 G 38
M 98 K 39
I 99 I 40
E 100 E 41
E 101 M 42
G 102 G 43
E 103 E 44
S 104 T 45
V 105 P 46
E 106 E 47
D 107 Q 48
V 108 A 49
A 109 V 50
R 110 V 51
R 111 R 52
E 112 E 53
A 113 L 54
I 114 Q 55
E 115 E 56
E 116 E 57
A 117 V 58
G 118 G 59
L 119 I 60
I 120 T 61
V 121 P 62
K 122 Q 63
R 123 H 64
T 124 F 65
K 125 S 66
P 126 L 67
V 127 F 68
L 128 E 69
S 129 K 70
F 130 L 71
L 131 E 72
A 132 Y 73
S 133 E 74
P 134 F 75
G 135 P 76
G 136 D 77
T 137 R 78
S 138 H 79
E 139 I 80
R 140 T 81
S 141 L 82
S 142 W 83
I 143 F 84
M 144 W 85
V 145 L 86
G 146 V 87
E 147 E 88
V 148 R 89
D 149 W 90
A 150 E 91
T 151 G 92
T 152 E 93
A 153 P 94
S 154 W 95
E 162 G 96
N 163 K 97
E 164 E 98
D 165 G 99
I 166 Q 100
R 167 P 101
V 168 G 102
H 169 E 103
V 170 W 104
V 171 M 105
S 172 S 106
R 173 L 107
E 174 V 108
Q 175 G 109
A 176 L 110
Y 177 N 111
Q 178 A 112
W 179 D 113
V 180 D 114
E 181 F 115
E 182 P 116
G 183 P 117
K 184 A 118
I 185 N 119
TER
END

