
Rotamer energies file


This file contains the singleton energies ("self energies") and pairwise rotamer energies for a protein structure that is to be used in the multistate protein design procedure performed by SPRINT.
The rotamer energies file is in the FastInf format for probabilistic graphical models. There are two alternative ways of creating energies files for use with SPRINT (and FastInf); in either case, the file contains a standard factor graph representation of a graphical model describing the interactions within a protein structure. See the example below.

The energies file consists of 4 relevant sections:

  1. Variables

    This section contains a list of the positions to be designed and the total number of rotamers at the respective position.
    The position consists of the chain ID and the residue number, where the chain MUST be delimited on BOTH sides by an underscore ("_"). For example, the design position at chain A residue number 165 will be denoted by "_A_165".
    Thus, the following line denotes that chain A residue number 165 has 100 rotamers in total (for all of its designed amino acids):
    _A_165  100
    

    Note that the position name can also contain optional information prepended to it, e.g., "design_A_165".
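
The naming rule above can be sketched in Python as follows. This is only an illustration, not part of SPRINT: the function names parse_position and parse_variable_line are hypothetical, and the residue number is kept as a string because the text does not say which forms it may take.

```python
def parse_position(name):
    """Split a position name such as "_A_165" or "design_A_165".

    Returns (prefix, chain_id, residue), where prefix is the optional
    information prepended to the name (empty if there is none).
    """
    # Split from the right so that underscores inside an optional prefix
    # do not interfere with the "_<chain>_<residue>" suffix.
    prefix, chain_id, residue = name.rsplit("_", 2)
    if not chain_id or not residue:
        raise ValueError(f"malformed position name: {name!r}")
    return prefix, chain_id, residue


def parse_variable_line(line):
    """Parse one "Variables" line into (position name, rotamer count)."""
    name, count = line.split()
    return name, int(count)
```

For instance, parse_position("design_A_165") yields the prefix "design", chain "A" and residue "165".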

  2. Cliques

    This section contains a list of the subsets for which singleton and pairwise energies exist. Each line consists of 5 fields:
    1. A name for the subset of positions (can be anything).
    2. The number of positions in the set.
    3. The indices of the positions in the set, where indices start from 0 and refer to the order in which the positions were listed in the "Variables" section above.
    4. The number of "neighbors" of this set (i.e., for singleton sets, this is the number of pairwise edges in which the position participates; for pairwise sets, this is 2).
    5. The indices of the "neighbors" of this set, where indices start from 0 and follow the order in which the sets are given in this section ("Cliques").
    Thus, the following lines indicate that set 0 contains a single variable (index 100), which is connected to one other set (set 2). Set 1 also contains a single variable (index 200) and is connected to one other set (set 2). Set 2 consists of variables 100 and 200 and is connected to 2 sets (sets 0 and 1):
    cliq0	1	100	1	2
    cliq1	1	200	1	2
    cliq2	2	100 200	2	0 1
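
As a rough illustration of the five-field layout, a "Cliques" line could be read with a sketch like the one below. The function name parse_clique_line is hypothetical and not part of SPRINT or FastInf. The sketch splits on the tab character only, with the indices inside a field separated by spaces as in the example lines.

```python
def parse_clique_line(line):
    """Parse one "Cliques" line into (name, variable indices, neighbor indices)."""
    # The five fields are tab-separated; indices within a field use spaces.
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-separated fields, got {len(fields)}")
    name, n_vars, var_field, n_nbrs, nbr_field = fields
    variables = [int(x) for x in var_field.split()]
    neighbors = [int(x) for x in nbr_field.split()]
    # The declared counts (fields 2 and 4) must match the listed indices.
    if len(variables) != int(n_vars):
        raise ValueError(f"{name}: variable count mismatch")
    if len(neighbors) != int(n_nbrs):
        raise ValueError(f"{name}: neighbor count mismatch")
    return name, variables, neighbors
```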
    
    NOTE: In the "Cliques" section, the 5 fields MUST be separated by TABS (the '\t' character), as in the examples shown here.


  3. Measures

    This section consists of the rotamer energies calculated for each position and each pair of positions in contact. Each line consists of one such "matrix" of rotamer energies, and must contain the following 4 fields:
    1. A name for the energies matrix (can be anything).
    2. The number of positions which the energies describe (1 for singleton energies, 2 for pairwise rotamer-rotamer energies).
    3. For each of the positions that this matrix describes, the total number of rotamers at each such position.
    4. The actual rotamer energies for this matrix, listed in the order in which the rotamer assignment advances like a binary number counter, i.e., 00 01 10 11.
    Thus, the following two lines indicate that energy matrix 0 describes a single variable with 3 rotamers, and matrix 1 describes two variables with 3 rotamers and 2 rotamers, respectively:
    matrix0	1	3	-3.5328 -4.5901 -3.6388
    matrix1	2	3 2	0.091058 0.066163 -0.066163 -0.91101 -0.39133 0.066163
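
The counter-style ordering can be made concrete with a short Python sketch. It rests on one assumption that should be checked against FastInf itself: the digits in "00 01 10 11" are read in the same order as the positions are listed, so the last-listed position varies fastest. The names parse_measure_line and energies_by_assignment are hypothetical.

```python
from itertools import product
from math import prod


def parse_measure_line(line):
    """Parse one "Measures" line into (name, rotamer counts, energies)."""
    # The four fields are tab-separated; values within a field use spaces.
    name, n_positions, size_field, value_field = line.rstrip("\n").split("\t")
    sizes = [int(s) for s in size_field.split()]
    energies = [float(v) for v in value_field.split()]
    if len(sizes) != int(n_positions):
        raise ValueError(f"{name}: position count mismatch")
    if len(energies) != prod(sizes):
        raise ValueError(f"{name}: expected {prod(sizes)} energies")
    return name, sizes, energies


def energies_by_assignment(sizes, energies):
    """Map each rotamer assignment (a tuple of indices) to its energy.

    ASSUMPTION: the last-listed position varies fastest, which is the
    order in which itertools.product enumerates: (0,0), (0,1), (1,0), ...
    """
    assignments = product(*(range(size) for size in sizes))
    return dict(zip(assignments, energies))
```

Under this assumption, the matrix1 line above assigns 0.066163 to rotamer 0 of the first position paired with rotamer 1 of the second, and -0.066163 to rotamer 1 of the first paired with rotamer 0 of the second.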
    
    NOTE: In the "Measures" section, the 4 fields MUST be separated by TABS (the '\t' character), as in the examples shown here.


  4. CliqueToMeasure

    This section maps the indices of the subsets of positions ("Cliques") to the indices of their respective energy matrices ("Measures"). For example, the following lines indicate that set 0 is mapped to matrix 0, set 1 to matrix 1, etc.:
    0       0
    1       1
    2       2
    

Each section is terminated by a line containing "@End", and the title of each section starts with "@" as well (e.g., "@Variables").


NOTE: In the "Cliques" and "Measures" sections, the fields MUST be separated by TABS (the '\t' character), as in the examples shown here.
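
The "@" section markers make the file easy to split before any field-level parsing. The following Python sketch shows one way to do this; split_sections is a hypothetical helper, not part of SPRINT. Lines are deliberately left unmodified so that the tab separators required in "Cliques" and "Measures" survive.

```python
def split_sections(text):
    """Group the lines of an energies file by section title.

    Returns a dict such as {"Variables": [...], "Cliques": [...]}, with
    each line kept exactly as written (tabs included).
    """
    sections = {}
    current = None
    for line in text.splitlines():
        marker = line.strip()
        if not marker:
            continue  # skip blank lines between sections
        if marker == "@End":
            current = None  # close the current section
        elif marker.startswith("@"):
            current = marker[1:]  # "@Variables" -> "Variables"
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
        else:
            raise ValueError(f"line outside any section: {line!r}")
    return sections
```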

A simple valid example is as follows:
@Variables
_A_165  2
_A_200  3
_B_10   2
@End

@Cliques
cliq0	1	0	1	3
cliq1	1	1	2	3 4
cliq2	1	2	1	4
cliq3	2	0 1	2	0 1
cliq4	2	1 2	2	1 2
@End

@Measures
matrix0	1	2	-3 -4
matrix1	1	3	-1 -4 -6
matrix2	1	2	-4 3
matrix3	2	2 3	0 1 -1 2 -3 -4
matrix4	2	3 2	1 2 0 -1 -5 -2
@End

@CliqueToMeasure
0       0
1       1
2       2
3       3
4       4
@End
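
To show how the four sections fit together, the Python sketch below hard-codes the example above and evaluates one rotamer assignment. It relies on two assumptions that are not stated in this page: the total energy of an assignment is the sum of its singleton and pairwise energies, and the last-listed position of each matrix varies fastest. The names flat_index and total_energy are hypothetical.

```python
# Data transcribed from the example file above.
rotamer_counts = [2, 3, 2]                      # @Variables, in file order
cliques = [[0], [1], [2], [0, 1], [1, 2]]       # positions of each set
measures = [                                    # (rotamer counts, energies)
    ([2], [-3, -4]),
    ([3], [-1, -4, -6]),
    ([2], [-4, 3]),
    ([2, 3], [0, 1, -1, 2, -3, -4]),
    ([3, 2], [1, 2, 0, -1, -5, -2]),
]
clique_to_measure = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}


def flat_index(sizes, rotamers):
    """Offset of an assignment in a matrix (ASSUMPTION: last position fastest)."""
    index = 0
    for size, rotamer in zip(sizes, rotamers):
        if not 0 <= rotamer < size:
            raise IndexError(f"rotamer {rotamer} out of range for size {size}")
        index = index * size + rotamer
    return index


def total_energy(assignment):
    """Sum the energies of every set for one rotamer choice per position."""
    total = 0.0
    for clique_index, positions in enumerate(cliques):
        sizes, values = measures[clique_to_measure[clique_index]]
        # Each matrix must agree with the rotamer counts of its positions.
        if sizes != [rotamer_counts[p] for p in positions]:
            raise ValueError(f"set {clique_index}: size mismatch")
        rotamers = [assignment[p] for p in positions]
        total += values[flat_index(sizes, rotamers)]
    return total
```

Under these assumptions, total_energy([1, 2, 0]) adds -4, -6 and -4 from the singleton matrices and -4 and -5 from the pairwise ones, giving -23.0.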

[Figure: the energies between the rotamers of the protein positions in the example above.]

