
PFRMAT SS
TARGET T0098
AUTHOR 5287-1010-7667
METHOD We took the 20 x M (M = 121) log-odds substitution matrix from
METHOD PSI-BLAST and used it in place of the 121-residue amino acid
METHOD sequence. We then took a window of 8 amino acids on either side
METHOD of a given amino acid. This gave 340 attributes (17 positions x
METHOD 20 scores) for a single amino acid with a structure prediction.
METHOD The data was formatted for learning with C4.5 release 8.
METHOD Windowing the whole PDB generated 3.6 million examples. The
METHOD windowed data was split into 16 disjoint partitions, and decision
METHOD trees were built with C4.5 on each partition. The target data,
METHOD T0098, was windowed in the same way as the training data. Each
METHOD learned decision tree then classified the windowed target data,
METHOD and the votes were accumulated. The percentage of votes was
METHOD translated into the confidence level. The class outputs were
METHOD smoothed using a window of size 5: a prediction was changed if the
METHOD residue and its neighbors had a higher average confidence level
METHOD for a different structure. The average confidence level of each
METHOD window is assigned to the predicted structure. Singleton H's were
METHOD replaced by the prediction of the neighboring residue. See
METHOD http://morden.csee.usf.edu/~chawla for the relevant publications
METHOD and references.
MODEL 1
D C 0.88
N C 0.92
K C 0.73
P C 0.69
K C 0.60
N C 0.52
L C 0.44
D C 0.48
A H 0.49
S H 0.50
I H 0.50
T H 0.58
S H 0.60
I H 0.50
I H 0.46
H H 0.45
E C 0.43
I C 0.55
G C 0.60
V C 0.63
P C 0.61
A C 0.46
H C 0.43
I C 0.43
K C 0.38
G C 0.42
Y C 0.43
L C 0.38
Y H 0.41
L H 0.46
R H 0.57
E H 0.62
A H 0.68
I H 0.73
A H 0.74
M H 0.73
V H 0.74
Y H 0.68
H H 0.64
D H 0.60
I H 0.51
E H 0.51
L H 0.43
L H 0.39
G H 0.43
S H 0.44
I C 0.39
T H 0.40
K H 0.37
V C 0.47
L C 0.46
Y C 0.52
P C 0.52
D C 0.52
I C 0.43
A C 0.46
K C 0.41
K C 0.41
Y C 0.48
N C 0.57
T C 0.56
T C 0.57
A C 0.53
S H 0.47
R H 0.61
V H 0.64
E H 0.74
R H 0.72
A H 0.71
I H 0.68
R H 0.66
H H 0.55
A H 0.49
I H 0.47
E E 0.42
V E 0.45
A H 0.38
W H 0.36
S C 0.49
R C 0.53
G C 0.54
N C 0.50
L C 0.45
E E 0.35
S H 0.47
I H 0.56
S H 0.54
S H 0.55
L H 0.44
F C 0.48
G C 0.43
Y C 0.39
T E 0.44
V E 0.47
S E 0.44
V C 0.48
S C 0.61
K C 0.74
A C 0.79
K C 0.76
P C 0.75
T C 0.68
N C 0.57
S C 0.44
E H 0.47
F H 0.55
I H 0.53
A H 0.58
M H 0.58
V H 0.66
A H 0.68
D H 0.68
K H 0.66
L H 0.58
R H 0.58
L H 0.52
E H 0.52
H H 0.47
K C 0.50
A C 0.54
S C 0.92
END


