
Detailed description of the experiment

Goals

Scope

Timetable

Goals

The main goal of the CASP experiments is to obtain an in-depth and objective assessment of our current abilities and inabilities in the area of protein structure modeling. To this end, participants will produce models of soon to be released experimental structures. These models will be true predictions, not ‘post-dictions’ made on already known structures.

In addition to the traditional themes of CASP, there will be a strong focus on new methods for predicting three-dimensional contacts, modeling heterocomplexes and multimers (in collaboration with CAPRI), and assessing the extent to which a model can help address specific biological questions and to which sparse experimental data and predicted contacts can improve the accuracy of models.

CASP12 will address the following questions:

How similar are the models to the corresponding experimental structure?
Are domain orientations, subunit interactions and the protein interactions in complexes modeled correctly?
How much better are template-based models than those that can be obtained by simply copying the best template?
How reliable are overall, residue, and atomic level error estimates?
How much can current refinement methods improve the accuracy of models?
How effective are newly emerging methods at predicting protein three-dimensional contacts?
How well do the models help answer relevant biological questions?
How helpful is additional information, such as sparse NMR data, chemical cross-linking or SAXS in structure modeling?
What methods are most effective?
Has there been progress since the last CASP?
Where can future effort be most productively focused?

Assessment categories

The High Accuracy Modeling category will include domains where the majority of submitted models are of sufficient accuracy for detailed analysis.
For CASP12, established numerical methods will be used to evaluate main chain, side chains, atomic accuracy, and contacts, as well as hydrogen bonds and covalent geometry. This category replaces the previous Template Based Modeling category.
The Biological Relevance category will assess models on the basis of how well they provide answers to biological questions. This category builds on the CASP11 pilot assessment. Target providers will be asked what questions prompted the determination of the experimental structure. The ability of models to answer those questions will be compared with that of the experimental structure, in addition to assessing aspects of accuracy that include sequence alignment, backbone accuracy, and side chain placement.
The Topology category (formerly Free Modeling) will assess domains where all submitted models are of relatively low accuracy using the established CASP metrics together with assessor judgment.
The Data Assisted category will assess how much the accuracy of models is improved by the addition of sparse data. Targets for which such data are available will be re-released after initial data independent models have been collected, together with the available data. Data types are expected to include simulated and actual sparse NMR data, crosslinking data, and low angle X-ray scattering data.
The Contact Prediction category will assess the ability of methods to predict three-dimensional contacts in target structures.
The Refinement category will analyze success in refining models beyond the accuracy obtained in the initial submissions. Selected targets from among those released in the main modeling experiment will be included. We will select one of the best models received during the prediction season, and reissue it as a starting structure for refinement.
The Assembly category will assess how well current methods can determine domain-domain, subunit-subunit, and protein-protein interactions. As in CASP11, we expect to work closely with CAPRI in this category.
The Accuracy Estimation category will assess the ability to provide useful accuracy estimates for models at the overall, residue, and atomic levels.

Timetable

April 2016 - Start of the registration for CASP12 prediction experiment.
April 20, 2016 - Start of the testing of server connectivity ("dry run" for server predictors).
May 1, 2016 - Release of the first CASP12 prediction target.
July 15, 2016 - Last date for releasing regular prediction targets.
July 31, 2016 - End of the regular prediction season.
August 18, 2016 - End of the refinement and data-assisted prediction season.
July/August 2016 - Early bird registration for the December predictors' meeting.
September 2016 - Collection of abstracts describing the methods tested in CASP12.
October/November 2016 - Invitation of groups with the most accurate predictions and interesting methods to give talks at the meeting.
November 2016 - Publishing of the program of the meeting.
December 2016 - Predictors' meeting to discuss the results of the experiment.

Registration

Participation is open to all.

If you are new to CASP and don't have an account with the Prediction Center, you will have to register with the Prediction Center first and only then proceed to CASP12 registration.
If you already have an account with the Prediction Center, you can go directly to the CASP12 registration page. Please check, though, that your basic registration information is current. If it has changed - please update it through the My Personal Data link from the main Menu.

Predictors with servers are requested to register before April 19, 2016 as we are planning to start checking servers' format and connectivity on that day.

Targets

Targets suggested for prediction in CASP12 can be found on the Target List page.
Details on the target collection and release procedures are available at our Q&A page.
For the experiment to succeed, it is essential that we obtain the help of the experimental community. As in previous CASPs, we invite protein crystallographers and NMR spectroscopists to provide details of structures they expect to have made public before September 15, 2016. The last day for suggesting proteins as CASP targets is July 14, 2016. A target submission form is available here.

Predictions

Predictions can be submitted through the Prediction Submission form available from this web site or by the email provided on the CASP12 format page. Please comply with the instructions on submission procedures and format provided there. Server predictions will be made publicly available shortly after the closing of the prediction window for a specific target.

Assessment of Predictions

As in previous CASPs, independent assessors will evaluate the predictions. Assessors will be provided with the results of numerical prediction evaluations performed at the Prediction Center, and will judge the results primarily on that basis. They will be asked to focus particularly on the effectiveness of different methods. Evaluation criteria will as far as possible be similar to those used in previous CASPs, although the assessors are welcome to introduce additional measures.

There will be five assessors, focusing on the following areas of prediction:

Biological Relevance of models - Russ Altman (Stanford University, USA)
Topology and Data-Assisted modeling - Matteo Dal Peraro (EPFL, Lausanne, Switzerland)
Contacts - Alexandre Bonvin (University of Utrecht, Netherlands)
Refinement - Francesco Luigi Gervasio (University College London, UK)
Assembly (quaternary structure and complexes) - Guido Capitani (Paul Scherrer Institut, Switzerland)

High accuracy models and estimates of model accuracy will be assessed with standard CASP metrics.

Click here for the list of assessors in all CASPs held so far.

In accordance with CASP policy, assessors are not directly involved in the organization of the experiment, nor can they take part in the relevant parts of the experiment as predictors. Predictors must not contact assessors directly with queries, but rather these should be sent to the email address.

Results and Publication

All CASP predictions and results of numerical evaluation will be made available through this web site shortly before the meeting. The proceedings of the meeting will be published in a scientific journal (see publications of previous experiments). All participants will also be required to describe their methods in the abstracts (published locally at our web site) and encouraged to discuss them on the FORCASP forum. These contributions will be discussed and scored by other predictors, and this material will be taken into account in choosing some presentations at the meeting. Also, predictors presenting posters at the meeting should be prepared to give a short presentation, as some talks will be invited during the meeting based on the discussion of poster sessions.

Meeting

The meeting to discuss results of the experiment will be held at Hotel Serapo in Gaeta, Italy in December 2016. The meeting will start at 6pm on December 10 and run through noon of December 13. The total cost of the meeting, including the early registration fee and an all-inclusive lodging fee (room, all meals and coffee breaks for 3 nights) is 850 EURO per person in a single room and 750 EURO in double. Some financial assistance may be available for the most successful predictors and students. Registration for the meeting will open in July/August 2016.

Organizing Committee

       John Moult, CASP chair and founder; IBBR, University of Maryland, USA
       Krzysztof Fidelis, University of California, Davis, USA
       Andriy Kryshtafovych, University of California, Davis, USA
       Torsten Schwede, University of Basel, Switzerland
       Anna Tramontano, University of Rome, Italy

Scientific Advisory Board

       David Baker, University of Washington
       Nick Grishin, University of Texas
       David Jones, University College, London
       Justin MacCallum, University of Calgary
       Michael Sternberg, Imperial College, London

Submission rules for all types of groups

Predictions in CASP12 may be submitted in 4 formats:

  TS    # Atomic coordinates 
  RR    # Pairs of residues in contact
  QA    # Model accuracy assessment
  IA    # Interface accuracy assessment
  Note. The IA category was announced on June 20, 2016.

One team may make a prediction of a target by submitting up to five models in the TS categories, one model in the RR and IA categories and two models in the QA category (see the QA format section for the timeline example of a typical QA prediction).
Submissions for regular prediction targets as well as refinement and data-assisted targets should be submitted in the TS format. Input data for contact-assisted predictions will be provided in the RR format; a starting model for the refinement will be selected from among the best server models.
Each submission file should contain a prediction for only one target.
Each submission file should contain only one of the allowed format categories.
Submission files in RR and QA categories should contain only one model.
Submission files in TS categories may contain either one or several models. Most of the evaluation and assessment will focus on the model labeled '1' (model index 1, see MODEL record). Each model should begin with the MODEL record, end with the END record, and contain no target residue repetitions. You may specify only one set of required header fields (PFRMAT, TARGET, AUTHOR, METHOD) above the first MODEL record in the prediction file. A multiple-model file will be split into separate files (one model per file) and each model (up to 5) will be sent separately to the verification server.
Submission of a duplicate model (same target, format category, group, model index) will replace the previously accepted model, provided it is received before the deadline.
Each submission must begin with the PFRMAT, TARGET and AUTHOR records, contain the METHOD field and at least one block starting with the MODEL and ending with the END record.
Each submitted model is automatically verified by the format verification server. In case of successful submission no confirmation email will be sent. A unique model ACCESSION CODE is composed from the number of the target, prediction format category, prediction group number, and model index.
```
   Example:

   Accession code  T0444TS005_2  has the following components:
     T0444   target number
     TS      Tertiary Structure (PFRMAT TS)
     005     prediction group 5
     2       model index 2 
```
The accepted predictions can be viewed using the Model Viewer link from the CASP12 web page.
If the submission contains an error, the regular group leader or server contact person will be immediately notified through email. If your prediction is rejected for format inconsistency, you can correct the problems and resend the prediction(s) within the target prediction time window.
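The accession code layout described above can be sketched in a few lines of Python. This is only an illustration, not the Prediction Center's own code: the names `Accession`, `parse_accession` and `build_accession` are hypothetical, and the pattern assumes what the single example suggests (a target of the form "T" plus four digits, a three-digit zero-padded group number, and an underscore before the model index).

```python
import re
from typing import NamedTuple


class Accession(NamedTuple):
    target: str   # e.g. "T0444"
    pfrmat: str   # one of TS, RR, QA, IA
    group: int    # prediction group number
    model: int    # model index


# Assumed layout, inferred from the example T0444TS005_2 only.
_ACCESSION = re.compile(r"^(T\d{4})(TS|RR|QA|IA)(\d{3})_(\d+)$")


def parse_accession(code: str) -> Accession:
    """Split an accession code into its four components."""
    match = _ACCESSION.match(code)
    if match is None:
        raise ValueError(f"not a recognised accession code: {code!r}")
    target, pfrmat, group, model = match.groups()
    return Accession(target, pfrmat, int(group), int(model))


def build_accession(target: str, pfrmat: str, group: int, model: int) -> str:
    """Compose an accession code from its components."""
    return f"{target}{pfrmat}{group:03d}_{model}"
```

For example, `parse_accession("T0444TS005_2")` yields target `T0444`, format `TS`, group 5 and model index 2.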

Submission rules for expert groups (usually, 3-week deadline in TS and RR categories, 2 day deadline for QA)

Predictions can be submitted by a group leader or a group member with submission privileges. The group leader can set the privileges (regular member or submitter) for every member of his group using the 'Review member status' option from 'My CASP12 profile' link. Members of prediction groups who intend to submit predictions should receive submission permission from the group leader first and then use the 12-digit Registration Code of the group to submit predictions for that group.
Models for regular deadline groups should be submitted directly by e-mail to models AT predictioncenter.org or using the CASP12 model submission facility.
When sending predictions by email, please send them in the body of the message.
When sending predictions by email, please use only the email address registered with the Prediction Center as the point of origin (make sure we have your current email address on file by checking the "My Personal Data" link from the menu). If you temporarily cannot use the registered email address for submission, please use the submission form instead.
Time for returning regular group predictions is set separately for each target. Usually regular deadline predictors have around 3 weeks from the date of target release to return a prediction.
Predictions in TS and RR categories should normally be sent only on all-group targets.
Predictions in TS categories should contain sensible residue error estimates in the column reserved for the B-factor value in the PDB format.
Predictions in QA category should be sent for all targets.
Multimeric predictions may be sent for all targets.
Interface accuracy predictions may be sent for all multimeric targets, including homomultimers and heteromultimers. In CASP12, IA predictions should be sent on all applicable targets between July 1 and July 22.

Submission rules for server groups (3-day deadline in TS and RR categories, 2 day deadline for QA)

CASP12 queries will be sent to the registered servers from the CASP distribution server casp-meta AT predictioncenter.org. Email servers are advised to reply to this address immediately upon receiving the query with an acceptance email with subject: "T0xxx - query received by MY_SERVER". This will help us to track whether your server received a request from us so that we can address any connectivity issues promptly. Please do not send your predictions to this address as they will be ignored.
We will be sending 3 variables to your server's submission URL (or email): the SEQUENCE, the TARGET-NAME and the REPLY-E-MAIL (where to return the results). For the servers participating in model accuracy assessment and data-assisted categories, we will be sending the TARBALL-LOCATION variable instead of (or in addition to, if you specify so) the SEQUENCE. Names for these server-specific parameters will be taken from your server registration form.
Server models should be returned automatically to the address specified in the REPLY-E-MAIL field of the query. Please note that the return address should be always taken from our query and not hard-coded as we may change it during the season.
TS and RR servers are requested to return predictions within 72 hours of the target release time. No additional time for corrections will be allotted, but corrections will be accepted within the original 72-hour window. Please send your corrections manually to the address specified in the REPLY-E-MAIL field of the original query. Remember that corrections can be submitted only by a group leader or a group member with submission privileges. The group leader can set the privileges (regular member or submitter) for every member of the group using the 'Review member status' option from the 'My CASP12 profile' link. Members of prediction groups who intend to submit predictions should receive submission permission from the group leader first.
Server models must be submitted in the body of the email as plain text. The subject of the email should preferably contain the target number and the group name.
Each submission may contain several models. If a server returns more than 5 models, the models numbered 6 and higher will be ignored (or 2 and higher for the RR category). In the QA category either model 1 or model 2 will be accepted depending on the stage of the QA request (see the General Rules above or the description of the MODEL record below).
The submission engine will resend the query if it encounters obvious connection problems (network timeouts, 'no response', etc.). Failures that go beyond that require special attention, but we will make every effort to notify server curators as soon as possible if we suspect something is not working. A facility for checking accepted server predictions is available from our website.

Format description

All submissions should contain records described below. Each of these records must begin with a standard keyword. In all submissions standard keywords must begin in the first column of a record. The keyword set is as follows:

PFRMAT     Format specification code:  TS , RR , QA, IA 
TARGET     Target identifier from the CASP12 target table
AUTHOR     XXXX-XXXX-XXXX   Registration code of the Group Leader or Server Group Name 
SCORE      Reliability of the model (optional) 
REMARK     Comment record (may appear anywhere after the first 3 required lines, optional)
METHOD     Records describing the methods used
MODEL      Beginning of the data section for the submitted model
PARENT     Specifies structure template used to generate the TS model 
TER        Terminates independent segments of structure in the TS model
END        End of the submitted model

Models should be submitted in Plain Text format.

Record PFRMAT should appear on the first line of the prediction and is used for all submissions.

   PFRMAT TS
     TS  indicates that submission contains 3D atomic coordinates
         in standard PDB format

   PFRMAT RR
     RR  indicates that submission contains a residue-residue 
         separation distance prediction

   PFRMAT QA
     QA  indicates that submission contains estimates of model accuracy

   PFRMAT IA
     IA  indicates that submission contains prediction of interchain interfaces

Record TARGET should appear on the second line of the prediction and is used for all submissions.

   TARGET Txxxx
     Txxxx indicates id of the target predicted.

Record AUTHOR should appear on the third line of the prediction and is used for all submissions.

 For all groups:
   AUTHOR XXXX-XXXX-XXXX
          XXXX-XXXX-XXXX indicates the Group Registration code.
          This is the code obtained by the group leader upon registration.

          Note: Members of prediction groups who intend to submit predictions
          should receive submission permissions from the group leader and
          use the registration code of the Group for all predictions submitted by
          that group. If sending predictions by email, please send them from the
          registered emails of the group leader or group submitter.
          If you temporarily cannot use these emails for submission, please log in
          to our website and then use our web-based submission facility.

 Servers alternatively can be identified using their registered group names: 
   AUTHOR MY_SERVER_NAME     
      or 
   REMARK AUTHOR MY_SERVER_NAME
          where MY_SERVER_NAME is a name selected for the server group at registration

SCORE Optional. This record may be used to report a model reliability score. It will not influence the evaluation.

REMARK Optional. PDB style 'REMARK' records may be used anywhere in the submission. These records may contain any text and will in general not influence evaluation.

Records METHOD are used for all submissions.
These records describe the method used. Predictors are urged to provide a concise description of the method, including data libraries used, and values of default and non-default parameters.

Record MODEL is used for all submissions.
Signifies the beginning of model data.

   MODEL  n  
     n          Model index n is used to indicate predictor's ranking
                according to her/his belief which TS model is closest to the 
                target structure (1 <= n <= 5). Model index is included
                automatically in the ACCESSION CODE. All models with index
                higher than 5 will be discarded.

Model index should be set to 1 in RR category. In QA category, predictors are requested to use model index '1' for the predictions submitted at the first QA stage (i.e., for the quality estimates made on the selected set of server models released 5 days after the target release for tertiary structure prediction), and use model index '2' for the predictions submitted on a larger set of TS models at the second QA stage (i.e., for the quality estimates made on the models released 2 days after the release of the first set of models in QA category).
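The ordering rules for the header records (PFRMAT on line 1, TARGET on line 2, AUTHOR on line 3, a METHOD record, and at least one MODEL ... END block) lend themselves to a quick local check before submission. The sketch below is a hypothetical helper, `check_submission_skeleton`; it is not the format verification server and covers only the record skeleton, ignoring blank lines and not inspecting the data inside each model.

```python
from typing import List


def check_submission_skeleton(text: str) -> List[str]:
    """Return a list of problems found in the record skeleton (empty if none)."""
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    problems = []

    # The first three records must be PFRMAT, TARGET, AUTHOR, in that order.
    for pos, keyword in enumerate(["PFRMAT", "TARGET", "AUTHOR"]):
        tokens = lines[pos].split() if pos < len(lines) else []
        ok = tokens[:1] == [keyword]
        # Servers may alternatively use "REMARK AUTHOR MY_SERVER_NAME".
        if keyword == "AUTHOR" and tokens[:2] == ["REMARK", "AUTHOR"]:
            ok = True
        if not ok:
            problems.append(f"line {pos + 1} should start with {keyword}")

    keywords = [ln.split()[0] for ln in lines]
    if "METHOD" not in keywords:
        problems.append("missing METHOD record")
    if "MODEL" not in keywords:
        problems.append("missing MODEL record")

    # Every MODEL must be closed by an END before the next MODEL starts.
    open_model = False
    for kw in keywords:
        if kw == "MODEL":
            if open_model:
                problems.append("MODEL record before the previous END")
            open_model = True
        elif kw == "END":
            if not open_model:
                problems.append("END record without a preceding MODEL")
            open_model = False
    if open_model:
        problems.append("last MODEL block is not closed by END")
    return problems
```

An empty list means the skeleton looks plausible; it does not guarantee that the submission will be accepted.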

Record PARENT is required only for the submissions in the TS format.
PARENT record indicates structure templates used to generate any independent segment of MODEL (see description of the TS format below). The PARENT record should be placed as the first record of any such independent segment. Only one PARENT record per structure segment is allowed. For multimeric predictions only one PARENT record per whole structure is allowed.

   PARENT N/A
     Indicates that a prediction is not directly based on any known
     structure. Note that this is the only indication in the file that the
     prediction is ab initio, so is a critical piece of information.

   PARENT 1abc_A
     Indicates that the model or the independent segment of structure is
     based on a single PDB entry 1abc chain A (use _A to indicate chain A).
     All template-based predictions should be submitted with this form 
     of the PARENT record. Note that, in order to be accepted, the code 
     must correspond to a current PDB entry.

   PARENT 1cdc 2def_g [3hij_k ...]
     Indicates that the model is based on more than one structural template. 
     Up to five PDB chains may be listed here with additional detailed information 
     included in the METHOD records. Subdomains of the target structure found 
     to correspond to different known folds may be submitted as independent 
     segments of structure with reference to only one PDB chain per segment.

Record INTERFACE is required only for the submissions in the IA format.
INTERFACE record provides names of chains forming the interface. The INTERFACE record should be placed as the first record of an INTERFACE - TER block in the IA prediction.

   INTERFACE AB
     Indicates that data below correspond to the interface between
     chains A and B.

Record TER is used to terminate an independent segment of structure in TS prediction categories or data block for a specific interface in IA category. In TS predictions, every TER record should correspond to the preceding PARENT record in the model.
Use of only one PARENT-TER block within a TS prediction is strongly encouraged in CASP12.
In IA prediction, every TER record should correspond to the preceding INTERFACE record in the model.

TER

Atomic coordinates (PFRMAT TS).
Standard PDB atom records are used for the atomic coordinates. The submission format requires 80-column records; pad them with spaces when needed (see target template PDB files as provided in specific target descriptions available through the CASP12 target table).

Coordinates for each model or an independent structure segment should begin with a single PARENT record and terminate with a TER record (see above).

It is requested that coordinate data be supplied for at least all non-hydrogen main chain atoms, i.e. the N, CA, C and O atoms of every residue. Specifically, if only CA atoms are predicted by the method, predictors are encouraged to build the main chain atoms for every residue before submission to CASP. One program that can make such a conversion is Maxsprout server of Liisa Holm and co-workers. (If only CA atoms were submitted it would not be possible to run most of the analysis software, which would severely limit the evaluation of that prediction.)

When multiple independent segments of structure are used in a prediction, they will be evaluated separately with no assumption of a common frame of reference between the segments. For any given MODEL, no target residue may be repeated among all such independent structure segments. Even though all of the independent PARENT-TER frames will be evaluated, only the best scoring frame will contribute to the group score on any given evaluation domain. The potential multi-domain nature of targets will be addressed in the evaluation even if the prediction is made in a single frame of reference (i.e. without separation into multiple segments of structure). Use of only one PARENT-TER block within a prediction is strongly encouraged.

For quaternary structure predictions, coordinates for all chains should be submitted in the same frame of reference and therefore only one PARENT - TER section is allowed per prediction. This means that no TER record should separate different chains (this is different from the PDB!). The first chain should be labeled as A and all the subsequent chains should follow the Latin alphabet, e.g., a tetramer's chains should be labeled as A, B, C, D.
There will be no announcements or assignments of targets to the oligomeric prediction category. Instead, for every target you can submit either a tertiary structure prediction or a quaternary structure prediction. There is no need to submit a monomer in addition to a multimer: we will automatically extract coordinates of the first chain from the quaternary prediction and save it as a monomer for future mainstream evaluation alongside monomers submitted by other groups. Multimeric predictions will be evaluated separately. The tentative oligomeric state of the protein (if provided by the experimentalists) will be announced through our Target List page, but it is up to the predictor to decide what oligomerization state the protein is in.

Atoms for which a prediction has been made must contain a value between 0.01 and 1.00 (usually "1.00") in the occupancy field; those for which no prediction has been made must either contain "0.00" in that field or be skipped altogether.

In place of the temperature factor field, error estimates in Angstroms should be provided. We require predictors to submit error estimates for their own predictions, as these will be evaluated separately in the quality assessment category. Models in which all residues have the same 'B-factor' will be rejected. If your software predicts a per-residue B-factor-like score instead of a distance in Angstroms, please convert the score B to a distance d by inverting the formula B = (8*pi^2*d^2)/3, i.e. d = sqrt(3*B/(8*pi^2)), or indicate the nature of your score in REMARK records.
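As a worked illustration of inverting B = (8*pi^2*d^2)/3, the conversion can be written as a pair of small functions. The function names are hypothetical; only the formula itself comes from the text above.

```python
import math


def bfactor_to_distance(b: float) -> float:
    """Convert a B-factor-like score to a distance in Angstroms.

    Inverts B = 8 * pi^2 * d^2 / 3, giving d = sqrt(3 * B / (8 * pi^2)).
    """
    if b < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(3.0 * b / (8.0 * math.pi ** 2))


def distance_to_bfactor(d: float) -> float:
    """Forward formula, useful for round-trip checks."""
    return 8.0 * math.pi ** 2 * d ** 2 / 3.0
```

A distance of 1 Angstrom corresponds to B of roughly 26.3, so B-factor-like scores are typically much larger than the distances expected in this column.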

Residue-Residue contact prediction (PFRMAT RR).
Data in this format are inserted between MODEL and END records of the submission file.

The prediction should start with the sequence of the predicted target, split (if necessary) over several rows (see Example 2). The sequence should be followed by the list of contacts in the five-column format:

   i  j  d1  d2  p

   Notes (see Example 2):
     - indices i and j of the two residues in contact should be provided
       such that i < j, i.e. only half of the contact map is supplied.
     - the numbers d1 and d2 indicate the distance limits defining a contact.
       In CASP, a pair of residues is defined to be in contact when
       the distance between their C-beta atoms (C-alpha in case of glycine)
       is less than 8 Angstroms. Therefore, typically d1=0 and d2=8.
       These parameters are currently dummy values, left in the format
       only for consistency with previous CASPs.
     - the real number p indicates the probability of the two residues being
       in contact, and should be in the range 0.0 - 1.0. Values larger
       than 0.5 identify the pairs of residues that are predicted to be
       more likely in contact than not. In binary (two-class) evaluations,
       the probability value of 0.5 will be considered as the cutoff
       separating contacts from non-contacts.
       Contacts in the prediction should be listed in order of
       decreasing probability p. If several contacts
       are assigned the same probability, for evaluation purposes
       they will be considered in the order provided in the prediction.
     - any pair NOT listed is assumed to be predicted as not in contact.
     - for multichain predictions, residue indices should be composed of
       chain ID and residue number, e.g. A2, B44 (see Example 3B).
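The notes above translate fairly directly into a formatter for the data section of a single-chain RR prediction. The sketch below is illustrative only: `format_rr_body` is a hypothetical name, the 50-residue row width and two-decimal probabilities are assumptions (the specification only says the sequence may be split over several rows), and multichain indices such as A2 are not handled.

```python
from typing import Iterable, Tuple


def format_rr_body(
    sequence: str,
    contacts: Iterable[Tuple[int, int, float]],
    width: int = 50,  # assumed row width for the sequence
) -> str:
    """Format the sequence and contact list placed between MODEL and END."""
    rows = []
    for i, j, p in contacts:
        if i == j:
            raise ValueError("a residue cannot be in contact with itself")
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        lo, hi = (i, j) if i < j else (j, i)  # enforce i < j
        rows.append((lo, hi, p))

    # Decreasing probability; the stable sort keeps input order for ties.
    rows.sort(key=lambda row: -row[2])

    lines = [sequence[k:k + width] for k in range(0, len(sequence), width)]
    # d1=0 and d2=8 are the typical values for the two distance columns.
    lines += [f"{i} {j} 0 8 {p:.2f}" for i, j, p in rows]
    return "\n".join(lines)
```

Pairs that are omitted from `contacts` are simply not written, which matches the rule that unlisted pairs are treated as not in contact.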

Interface accuracy prediction (PFRMAT IA).
In this category predictors are asked to provide the probability of specific residues belonging to the interchain interface(s). Residues will be considered to be in the interface if the distance between the closest heavy atoms in the residues belonging to different chains is below 5 Angstroms.
Data in this format are inserted between MODEL and END records of the submission file and should be provided separately for each predicted interface (usually a pair of chains) as an INTERFACE - TER block. Data in each interface block should start with the keyword INTERFACE and be terminated with a TER card (see Example 5).

You may submit your estimates of residues belonging to different interfaces in one of the two different modes - IMODE 1 or IMODE 3. Keyword IMODE should be placed in the IA predictions immediately before the first MODEL keyword.
IMODE 1 : probability of residues being in interface of own quaternary structure models.
IMODE 3 : probability of residues being in interface predicted from sequence alone.

The prediction should contain a list of predicted interface residues in the two-column format:

   res  p

   Notes:
     - the string res identifies a residue in the interface and should be provided 
        in the form A123, where A is chain ID and 123 is a residue number within 
        the range of the target sequence length.
     - the real number p indicates probability of the residue belonging to the 
        interface between two chains and should be in the range 0.0 - 1.0. 
        Probability values larger than 0.5 identify residues that are predicted 
        to be more likely in the interface than not. In binary (two-class) 
        evaluations, the value of 0.5 may be considered as the cutoff separating 
        residues predicted to be in the interface from those that are not. 
     - any residue NOT included in the prediction is assumed to have the probability 
        of 0 of belonging to the interface. 
     - data for each predicted interface in the multimolecular complex should be 
        separated by the TER cards (see Example 5).

Estimation of model accuracy (PFRMAT QA).

In QA category, predictors are requested to use model index '1' for predictions submitted in the first stage (i.e., estimating quality of the selected server models released 5 days after the initial target release), and use model index '2' for predictions submitted on the second, larger set of TS models (i.e., estimating quality of models released 7 days after the initial target release).

Timeline example.
May 1, 9am PDT - target T0644 is released for prediction in non-QA categories.
May 4, noon - the deadline for submitting tertiary structure predictions by servers.
May 6, noon - the first set of server TS predictions (up to 20 models selected primarily to test single-model methods) is sent to the registered QA servers and posted on the casp12 archive page (http://predictioncenter.org/download_area/CASP12/server_predictions/). QA predictions (marked as MODEL 1) for this subset are accepted for two days.
May 8, noon - deadline for "stage 1" QA predictions. The second set of server TS predictions (150 models selected to test both single-model and clustering methods) is sent to the registered QA servers and posted on the casp12 archive page. QA predictions (marked as MODEL 2) for this second subset of models are accepted for two more days.
May 10, noon - deadline for "stage 2" QA predictions. All server TS predictions are posted on the casp12 archive page. No further QA predictions (from servers or manual groups) are accepted for this target.
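
The day offsets in the timeline example can be written down as simple date arithmetic. The sketch below is illustrative only: the offsets are read off this single example (May 1 to May 4, 6, 8 and 10) and are not an official schedule, and the name qa_schedule and the event labels are hypothetical.

```python
from datetime import date, timedelta

# Day offsets from the target release date, inferred from the timeline example only.
OFFSETS = {
    "server TS deadline": 3,
    "stage 1 models sent to QA servers": 5,
    "stage 1 QA deadline, stage 2 models sent": 7,
    "stage 2 QA deadline": 9,
}

def qa_schedule(release):
    """Map each event label to its calendar date for a given release date."""
    return {event: release + timedelta(days=n) for event, n in OFFSETS.items()}

# The year is arbitrary here; only the month and day come from the example.
for event, day in qa_schedule(date(2016, 5, 1)).items():
    print(f"{day:%b %d} - {event}")
```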

Data are inserted between MODEL and END records of the submission file. You may submit your quality assessment prediction in one of the two different modes:
QMODE 1 : global model quality score (MQS - one number per model)
QMODE 2 : MQS and error estimates on per-residue basis.

The first line of data should specify the mode identifier, i.e. QMODE (see Example 4).

In both modes, the first column in each line contains the model identifier (file name of the accepted 3D prediction). The second column contains the accuracy score for the model as a whole (MQS). The accuracy score is a real number between 0.0 and 1.0 (1.0 being a perfect model). If you don't provide an MQS for a model, please put "X" in the corresponding place. If you don't want to additionally provide error estimates on a per-residue basis (QMODE 1), your data table will consist of these two columns only.

If you do additionally provide residue error estimates (QMODE 2), each consecutive column should contain the error estimate in Angstroms for the consecutive residues in the target (i.e., column 3 corresponds to residue 1 in the target, column 4 to residue 2, and so on). This way the data constitute a table of (Number_of_models_for_the_target) BY (Number_of_residues_in_the_target + 1). Do not skip columns if you are not predicting error estimates for some residues; instead put "X" in the corresponding column.
Please specify in the REMARKS what you consider to be an error estimate for a residue (CA location error, geometrical center error, etc.).

Note 1. Please be advised that a QA record line may be very long and that some editors and mail programs may force line wrapping, potentially causing unexpected parsing errors. To avoid this problem, we recommend that you split long lines into shorter sublines (50-100 columns of data) yourself. Our parser will treat consecutive sublines (starting with the line containing the evaluated model name and ending before the line containing the next model name or the END tag) as part of the same logical line.
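
The subline rule in Note 1 can be illustrated with a small reader that joins sublines back into one logical record. This is a sketch under stated assumptions, not the CASP parser: a line is taken to start a new record when its first token is neither a number nor "X", and the name parse_qa_records is hypothetical.

```python
def _is_value(token):
    """True for a numeric token or the placeholder "X"."""
    if token == "X":
        return True
    try:
        float(token)
        return True
    except ValueError:
        return False

def parse_qa_records(lines):
    """Return {model_name: (mqs, per_residue_errors)}; "X" is returned as None."""
    records = {}
    name = None
    in_data = False  # becomes True once the QMODE line has been seen
    for raw in lines:
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0] == "QMODE":
            in_data = True
            continue
        if tokens[0] == "END":
            break
        if not in_data:
            continue
        if not _is_value(tokens[0]):
            # Assumption: a non-numeric first token is a model name.
            name, tokens = tokens[0], tokens[1:]
            records[name] = []
        if name is None:
            raise ValueError("data values found before any model name")
        records[name].extend(None if t == "X" else float(t) for t in tokens)
    return {n: (v[0] if v else None, v[1:]) for n, v in records.items()}
```

A model whose file name happens to parse as a number would be misread by this heuristic, so real submissions should rely on the server's own validation.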

Note 2. Please be advised that model quality predictions in CASP are evaluated by comparing submitted estimates of global reliability and per-residue accuracy of structural models with the values obtained from CASP model evaluation packages (LGA, LDDT, CAD-score and others). Since the evaluation score used across the categories in CASP is GDT_TS, predictors should strive to predict this score in QMODE 1 (QA1). Predicted per-residue distances in QMODE 2 should ideally reproduce those extracted from the LGA optimal model-target superpositions.
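
To make the link between per-residue distances and a GDT_TS-like global score concrete, the sketch below averages the fractions of residues within 1, 2, 4 and 8 Angstroms, which is the commonly cited definition of GDT_TS. It is only an approximation under stated assumptions: LGA searches over many superpositions for each cutoff, whereas this function takes a single list of distances as given. The name gdt_ts_from_distances is hypothetical, and the result is on the 0.0 - 1.0 scale used for MQS.

```python
def gdt_ts_from_distances(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Mean fraction of residues within each distance cutoff (0.0 - 1.0 scale).

    Entries equal to None (no estimate for that residue) count as outside
    every cutoff. The denominator is the number of entries supplied.
    """
    n = len(distances)
    if n == 0:
        raise ValueError("no residues supplied")
    fractions = [
        sum(1 for d in distances if d is not None and d <= c) / n
        for c in cutoffs
    ]
    return sum(fractions) / len(fractions)

# Four residues with estimated errors of 0.5, 1.5, 3.0 and 9.0 Angstroms.
print(gdt_ts_from_distances([0.5, 1.5, 3.0, 9.0]))
```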

END record is used for all predictions and indicates the end of a single model submission.

Example 1. Atomic coordinates (Tertiary Structure)

The primary CASP12 format used for tertiary structure prediction

(A) An example of prediction.

PFRMAT TS
TARGET T0999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1 
PARENT 1abc 1def_A
ATOM      1  N   GLU     1      10.982  -9.774   1.377  1.00  0.50
ATOM      2  CA  GLU     1       9.623  -9.833   1.984  1.00  0.50
ATOM      3  C   GLU     1       8.913 -11.104   1.521  1.00  0.50
ATOM      4  O   GLU     1       9.187 -11.630   0.461  1.00  0.50
ATOM      5  CB  GLU     1       8.814  -8.614   1.546  1.00  0.50
ATOM      6  CG  GLU     1       7.372  -8.754   2.039  1.00  0.50
ATOM      7  CD  GLU     1       7.339  -8.625   3.562  1.00  0.50
ATOM      8  OE1 GLU     1       8.370  -8.307   4.131  1.00  0.50
ATOM      9  OE2 GLU     1       6.284  -8.846   4.132  1.00  0.50
ATOM     10  N   THR     2       7.998 -11.599   2.304  1.00  1.60
ATOM     11  CA  THR     2       7.266 -12.832   1.907  1.00  1.60
ATOM     12  C   THR     2       6.096 -12.456   1.005  1.00  1.60
ATOM     13  O   THR     2       5.008 -12.217   1.466  1.00  1.60
ATOM     14  CB  THR     2       6.731 -13.533   3.157  1.00  1.60
ATOM     15  OG1 THR     2       7.662 -13.379   4.220  1.00  1.60
ATOM     16  CG2 THR     2       6.526 -15.019   2.864  1.00  1.60
ATOM     17  N   VAL     3       6.308 -12.396  -0.278  1.00  1.70
ATOM     18  CA  VAL     3       5.190 -12.030  -1.187  1.00  1.70
ATOM     19  C   VAL     3       3.954 -12.870  -0.844  1.00  1.70
ATOM     20  O   VAL     3       2.834 -12.471  -1.090  1.00  1.70
ATOM     21  CB  VAL     3       5.608 -12.274  -2.641  1.00  1.70
ATOM     22  CG1 VAL     3       5.542 -13.771  -2.959  1.00  1.70
ATOM     23  CG2 VAL     3       4.664 -11.514  -3.573  1.00  1.70
ATOM     24  N   GLU     4       4.146 -14.029  -0.272  1.00  1.70
ATOM     25  CA  GLU     4       2.976 -14.882   0.086  1.00  1.60
ATOM     26  C   GLU     4       2.153 -14.190   1.175  1.00  1.50
ATOM     27  O   GLU     4       0.942 -14.141   1.109  1.00  1.40
ATOM     28  CB  GLU     4       3.465 -16.238   0.597  1.00  1.30
ATOM     29  CG  GLU     4       2.336 -17.264   0.479  1.00  1.20
ATOM     30  CD  GLU     4       2.929 -18.671   0.391  1.00  1.10
ATOM     31  OE1 GLU     4       4.056 -18.846   0.823  1.00  1.00
ATOM     32  OE2 GLU     4       2.246 -19.551  -0.108  1.00  0.90
TER
END

(B) A model consisting of 2 independent structure segments (could be a target modeled from two PDB domains, where relative orientation is unknown; could be 2 fragments predicted by ab initio methods - ab initio example shown). In a single MODEL no residue should appear twice among all such segments.

PFRMAT TS
TARGET T0999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1
PARENT N/A
ATOM      1  N   GLU     1      10.982  -9.774   1.377  1.00  0.50
ATOM      2  CA  GLU     1       9.623  -9.833   1.984  1.00  0.50
ATOM      3  C   GLU     1       8.913 -11.104   1.521  1.00  0.50
ATOM      4  O   GLU     1       9.187 -11.630   0.461  1.00  0.50
ATOM      5  CB  GLU     1       8.814  -8.614   1.546  1.00  0.50
ATOM      6  CG  GLU     1       7.372  -8.754   2.039  1.00  0.50
ATOM      7  CD  GLU     1       7.339  -8.625   3.562  1.00  0.50
ATOM      8  OE1 GLU     1       8.370  -8.307   4.131  1.00  0.50
ATOM      9  OE2 GLU     1       6.284  -8.846   4.132  1.00  0.50
ATOM     10  N   THR     2       7.998 -11.599   2.304  1.00  1.60
ATOM     11  CA  THR     2       7.266 -12.832   1.907  1.00  1.60
ATOM     12  C   THR     2       6.096 -12.456   1.005  1.00  1.60
ATOM     13  O   THR     2       5.008 -12.217   1.466  1.00  1.60
ATOM     14  CB  THR     2       6.731 -13.533   3.157  1.00  1.60
ATOM     15  OG1 THR     2       7.662 -13.379   4.220  1.00  1.60
ATOM     16  CG2 THR     2       6.526 -15.019   2.864  1.00  1.60
ATOM     24  N   GLU     4       4.146 -14.029  -0.272  1.00  1.70
ATOM     25  CA  GLU     4       2.976 -14.882   0.086  1.00  1.60
ATOM     26  C   GLU     4       2.153 -14.190   1.175  1.00  1.50
ATOM     27  O   GLU     4       0.942 -14.141   1.109  1.00  1.40
ATOM     28  CB  GLU     4       3.465 -16.238   0.597  1.00  1.30
ATOM     29  CG  GLU     4       2.336 -17.264   0.479  1.00  1.20
ATOM     30  CD  GLU     4       2.929 -18.671   0.391  1.00  1.10
ATOM     31  OE1 GLU     4       4.056 -18.846   0.823  1.00  1.00
ATOM     32  OE2 GLU     4       2.246 -19.551  -0.108  1.00  0.90
TER
PARENT N/A
ATOM     17  N   VAL     3       6.308 -12.396  -0.278  1.00  1.70
ATOM     18  CA  VAL     3       5.190 -12.030  -1.187  1.00  1.70
ATOM     19  C   VAL     3       3.954 -12.870  -0.844  1.00  1.70
ATOM     20  O   VAL     3       2.834 -12.471  -1.090  1.00  1.70
ATOM     21  CB  VAL     3       5.608 -12.274  -2.641  1.00  1.70
ATOM     22  CG1 VAL     3       5.542 -13.771  -2.959  1.00  1.70
ATOM     23  CG2 VAL     3       4.664 -11.514  -3.573  1.00  1.70
TER
END
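
The rule that no residue may appear twice among the segments of a single MODEL can be checked before submission. The following is a minimal sketch, assuming the ATOM lines follow the standard PDB column layout (chain ID in column 22, residue number in columns 23-26), as the examples here appear to do. The function names are hypothetical.

```python
def residues_per_segment(lines):
    """Return one set of (chain, residue number) per TER-terminated segment."""
    segments, current = [], set()
    for line in lines:
        if line.startswith("ATOM"):
            # Fixed PDB columns: chain ID at index 21, residue number at 22-25.
            current.add((line[21].strip(), int(line[22:26])))
        elif line.startswith("TER"):
            segments.append(current)
            current = set()
        elif line.startswith("END"):
            break
    if current:  # atoms not followed by a TER card
        segments.append(current)
    return segments

def repeated_residues(lines):
    """Residues that occur in more than one segment of the MODEL."""
    seen, repeated = set(), set()
    for segment in residues_per_segment(lines):
        repeated |= seen & segment
        seen |= segment
    return sorted(repeated)
```

For the two-segment model in (B), where residues 1, 2 and 4 form one segment and residue 3 the other, repeated_residues returns an empty list.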

Example 2. Residue-Residue contact prediction

PFRMAT RR
TARGET T0999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1
HLEGSIGILLKKHEIVFDGC # <- entire target sequence (up to 50 
HDFGRTYIWQMSDASHMD   #   residues per line)
1 8 0 8 0.720        
1 10 0 8 0.715       # <- i=1 j=10: indices of residues (integers), 
31 38 0 8 0.710       
10 20 0 8 0.690      # <- d1=0  d2=8: the range of Cb-Cb distance   
30 37 0 8 0.678      #    predicted for the residue pair (i,j)  
11 29 0 8 0.673       
1 9 0 8 0.63         # <- p=0.63: probability of the residues i=1 and j=9 
21 37 0 8 0.502      #    being in contact (in descending order) 
8 15 0 8 0.401
3 14 0 8 0.400
5 15 0 8 0.307
7 14 0 8 0.30
END
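
A submission in the format of Example 2 can be read and sanity-checked with a few lines of code. This is a hedged sketch, not the official validator: it assumes that the "#" annotations are explanatory only and can be stripped, that sequence lines are purely alphabetic, and that residue indices must fall within the sequence length. The names parse_rr and check_rr are hypothetical.

```python
def parse_rr(lines):
    """Return (sequence, contacts); each contact is a tuple (i, j, d1, d2, p)."""
    sequence, contacts = [], []
    in_model = False
    for raw in lines:
        fields = raw.split("#", 1)[0].split()  # drop trailing annotations
        if not fields:
            continue
        if fields[0] == "MODEL":
            in_model = True
            continue
        if fields[0] == "END":
            break
        if not in_model:
            continue  # header records such as PFRMAT, TARGET, AUTHOR
        if len(fields) == 1 and fields[0].isalpha():
            sequence.append(fields[0])
        elif len(fields) == 5:
            i, j = int(fields[0]), int(fields[1])
            d1, d2, p = (float(x) for x in fields[2:])
            contacts.append((i, j, d1, d2, p))
        else:
            raise ValueError(f"unrecognized line: {raw!r}")
    return "".join(sequence), contacts

def check_rr(sequence, contacts):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    n = len(sequence)
    for i, j, d1, d2, p in contacts:
        if not (1 <= i <= n and 1 <= j <= n):
            problems.append(f"pair ({i}, {j}) is outside the sequence length {n}")
        if not 0.0 <= p <= 1.0:
            problems.append(f"probability {p} for pair ({i}, {j}) is out of range")
    probs = [c[4] for c in contacts]
    if probs != sorted(probs, reverse=True):
        problems.append("contacts are not in descending order of probability")
    return problems
```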

Example 3. Multichain predictions

(A) An example of 3D atomic coordinates model prediction.

PFRMAT TS
TARGET T0999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1 
PARENT N/A
ATOM      1  N   GLU A   1      22.576  19.032  -5.026  1.00  0.00
ATOM      2  CA  GLU A   1      22.879  20.313  -4.321  1.00  0.00
ATOM      3  CB  GLU A   1      22.285  21.478  -5.449  1.00  0.00
ATOM      4  CG  GLU A   1      23.018  21.946  -6.707  1.00  0.00
ATOM      5  CD  GLU A   1      24.351  22.625  -6.434  1.00  0.00
ATOM      6  OE1 GLU A   1      25.379  21.908  -6.380  1.00  0.00
ATOM      7  OE2 GLU A   1      24.381  23.879  -6.291  1.00  0.00
ATOM      8  O   GLU A   1      22.237  20.962  -2.117  1.00  0.00
ATOM      9  C   GLU A   1      21.857  20.684  -3.261  1.00  0.00
ATOM     10  N   VAL A   2      20.585  20.675  -3.601  1.00  0.00
ATOM     11  CA  VAL A   2      19.530  21.006  -2.624  1.00  0.00
ATOM     12  CB  VAL A   2      18.277  21.590  -3.319  1.00  0.00
ATOM     13  CG1 VAL A   2      17.182  21.859  -2.270  1.00  0.00
ATOM     14  CG2 VAL A   2      18.656  22.833  -4.079  1.00  0.00
ATOM     15  O   VAL A   2      18.770  18.750  -2.603  1.00  0.00
ATOM     16  C   VAL A   2      19.096  19.721  -1.933  1.00  0.00
ATOM     17  N   HIS A   3      19.115  19.700  -0.603  1.00  0.00
ATOM     18  CA  HIS A   3      18.780  18.489   0.122  1.00  0.00
ATOM     19  CB  HIS A   3      19.559  18.445   1.410  1.00  0.00
ATOM     20  CG  HIS A   3      21.015  18.684   1.224  1.00  0.00
ATOM     21  CD2 HIS A   3      21.767  19.803   1.367  1.00  0.00
ATOM     22  ND1 HIS A   3      21.851  17.721   0.702  1.00  0.00
ATOM     23  CE1 HIS A   3      23.072  18.220   0.589  1.00  0.00
ATOM     24  NE2 HIS A   3      23.048  19.478   0.985  1.00  0.00
ATOM     25  O   HIS A   3      16.777  19.181   1.220  1.00  0.00
ATOM     26  C   HIS A   3      17.296  18.417   0.409  1.00  0.00
REMARK 
REMARK Predictors should NOT use TER separator between chains 
REMARK
ATOM   1321  N   GLU B   1     -22.603 -17.981  -4.847  1.00  0.00
ATOM   1322  CA  GLU B   1     -22.889 -19.285  -4.180  1.00  0.00
ATOM   1323  CB  GLU B   1     -22.342 -20.410  -5.372  1.00  0.00
ATOM   1324  CG  GLU B   1     -23.122 -20.828  -6.619  1.00  0.00
ATOM   1325  CD  GLU B   1     -24.447 -21.511  -6.324  1.00  0.00
ATOM   1326  OE1 GLU B   1     -25.468 -20.792  -6.207  1.00  0.00
ATOM   1327  OE2 GLU B   1     -24.479 -22.769  -6.227  1.00  0.00
ATOM   1328  O   GLU B   1     -22.172 -20.020  -2.026  1.00  0.00
ATOM   1329  C   GLU B   1     -21.830 -19.701  -3.172  1.00  0.00
ATOM   1330  N   VAL B   2     -20.572 -19.685  -3.557  1.00  0.00
ATOM   1331  CA  VAL B   2     -19.485 -20.056  -2.630  1.00  0.00
ATOM   1332  CB  VAL B   2     -18.260 -20.619  -3.392  1.00  0.00
ATOM   1333  CG1 VAL B   2     -17.131 -20.932  -2.393  1.00  0.00
ATOM   1334  CG2 VAL B   2     -18.674 -21.832  -4.184  1.00  0.00
ATOM   1335  O   VAL B   2     -18.711 -17.807  -2.553  1.00  0.00
ATOM   1336  C   VAL B   2     -19.020 -18.800  -1.909  1.00  0.00
ATOM   1337  N   HIS B   3     -18.990 -18.829  -0.580  1.00  0.00
ATOM   1338  CA  HIS B   3     -18.623 -17.648   0.178  1.00  0.00
ATOM   1339  CB  HIS B   3     -19.356 -17.649   1.494  1.00  0.00
ATOM   1340  CG  HIS B   3     -20.819 -17.875   1.353  1.00  0.00
ATOM   1341  CD2 HIS B   3     -21.571 -18.995   1.480  1.00  0.00
ATOM   1342  ND1 HIS B   3     -21.667 -16.890   0.896  1.00  0.00
ATOM   1343  CE1 HIS B   3     -22.894 -17.378   0.809  1.00  0.00
ATOM   1344  NE2 HIS B   3     -22.864 -18.650   1.156  1.00  0.00
ATOM   1345  O   HIS B   3     -16.586 -18.389   1.177  1.00  0.00
ATOM   1346  C   HIS B   3     -17.129 -17.592   0.414  1.00  0.00
TER
END

(B) An example of how to use the RR format to submit a prediction of interchain (chains A and B) residue-residue contacts defined as Cb-Cb distances < 8 A.

PFRMAT RR
TARGET T0999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1
HLEGSIGILLKKHEIVFDGC         # <- entire target sequence (up to 50 
HDFGRTYIWQMSD                #    residues per line)
A1 B9   0  8  0.70        
A1 B10  0  8  0.70           # <- indices of residues: Ai and Bj, 
A1 B12  0  8  0.60           # <- the range of Cb-Cb distance predicted
A1 B14  0  8  0.20           #    for the residue pair: d1 and d2 (real),
A1 B15  0  8  0.10           # <- probability of the distance between 
A1 B17  0  8  0.30           #    Cb atoms being within the specified
A1 B19  0  8  0.50           #    range: p (real)
A2 B8   0  8  0.90
A3 B7   0  8  0.70
A3 B12  0  8  0.40
A3 B14  0  8  0.70
A3 B15  0  8  0.30
A4 B6   0  8  0.90
A7 B14  0  8  0.30
A9 B14  0  8  0.50
END

Example 4. Estimates of model accuracy prediction

(A) Global Model Quality Score

PFRMAT QA
TARGET T0999
AUTHOR 1234-5678-9000
METHOD Description of methods used
MODEL 1
QMODE 1
3D-JIGSAW_TS1 0.8 
FORTE1_AL1.pdb 0.7 
END

(B) Residue-based Quality Assessment (fragment of the table). Note that this case includes case (A), so there is no need to submit QMODE 1 predictions in addition to QMODE 2.

PFRMAT QA
TARGET T0999
AUTHOR 1234-5678-9000
REMARK Error estimate is CA-CA distance in Angstroms
METHOD Description of methods used
MODEL 1
QMODE 2
3D-JIGSAW_TS1 0.8 10.0 6.5 5.0 2.0 1.0  
5.0 4.3 4.6
FORTE1_AL1.pdb 0.7 8.0 5.5 4.5 X X 
4.5 4.2 5.0 
END
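
Records like the ones in (B) can be produced programmatically, splitting the per-residue values over sublines as recommended in Note 1. The sketch below is one possible way to do it: the function name format_qa_record, the values_per_line parameter and the number formatting (three decimals for MQS, one for errors) are assumptions, not requirements of the format.

```python
def format_qa_record(model_name, mqs, errors=(), values_per_line=50):
    """Return the lines of one QA record, wrapping per-residue values onto sublines.

    None is written as "X" for both the MQS and the per-residue estimates.
    """
    mqs_text = "X" if mqs is None else f"{mqs:.3f}"
    values = ["X" if e is None else f"{e:.1f}" for e in errors]
    chunks = [values[k:k + values_per_line]
              for k in range(0, len(values), values_per_line)]
    first = " ".join([model_name, mqs_text] + (chunks[0] if chunks else []))
    return [first] + [" ".join(chunk) for chunk in chunks[1:]]

# QMODE 2 record wrapped after five per-residue values, as in the example above.
for line in format_qa_record("3D-JIGSAW_TS1", 0.8,
                             [10.0, 6.5, 5.0, 2.0, 1.0, 5.0, 4.3, 4.6],
                             values_per_line=5):
    print(line)
```

Calling the function without errors yields a single two-column line suitable for QMODE 1.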

Example 5. Interface accuracy prediction

(A) Single interface

PFRMAT IA
TARGET T0999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
IMODE  1
MODEL  1
INTERFACE AB
A1 0.70
A2 0.60
A3 0.45
B12 0.80           
B13 0.70           
B14 0.30           
TER
END

(B) Multiple interfaces between pairs of chains

PFRMAT IA
TARGET T0999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
IMODE  1
MODEL  1
INTERFACE AB
A1 0.70
A2 0.60
A3 0.45
B12 0.80           
B13 0.70           
B14 0.30           
TER
INTERFACE AC 
A1 0.70
A2 0.60
A3 0.45
C121 0.60           
C122 0.55           
C141 0.30           
TER
INTERFACE BC 
B11 0.70
B12 0.60
B13 0.45
C30 0.80           
C31 0.70           
B44 0.30          
TER 
END

(C) Interface residues predicted from sequence alone (IMODE 3)

PFRMAT IA
TARGET T0999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
IMODE  3
MODEL  1
INTERFACE ABC
A1 0.70
A2 0.60
A3 0.45
B12 0.80           
B13 0.70           
B14 0.30           
C121 0.60           
C122 0.55           
C141 0.30           
TER
END