kinetics estimation
Which reactions are considered and which rate parameters are generated are governed by the libraries, seed mechanisms and families set in the input file. When RMG starts, all reactions from any seed mechanisms, along with their associated rate information, are included and used within the model. Tree averaging is then done to generate more general rules where they do not already exist within the tree (described in detail below). As RMG runs, whenever a new reaction is generated for a given family, any available libraries are first searched for rate parameter data. If rate data is found it is used; otherwise RMG estimates the rate using the families directly.
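The lookup order described above can be sketched in a few lines. This is a minimal illustration and not RMG's implementation: `find_kinetics`, the dictionary-based libraries, the `estimate_from_family` callback and all numerical values are hypothetical stand-ins introduced here.

```python
def find_kinetics(reaction, libraries, estimate_from_family):
    """Return (rate_data, source) for a reaction.

    `libraries` is an ordered list of (name, dict) pairs mapping a reaction
    key to rate data. All names here are hypothetical stand-ins.
    """
    # Libraries are searched first, in the order given.
    for name, entries in libraries:
        if reaction in entries:
            return entries[reaction], "library: " + name
    # No library entry was found, so fall back to the family estimate.
    return estimate_from_family(reaction), "family estimate"


# Placeholder data, chosen only for illustration.
libraries = [("example_library", {"CH4 + OH <=> CH3 + H2O": 1.0e6})]
estimate = lambda reaction: 5.0e4  # stands in for a tree-based estimate

known = find_kinetics("CH4 + OH <=> CH3 + H2O", libraries, estimate)
unknown = find_kinetics("C2H6 + OH <=> C2H5 + H2O", libraries, estimate)
```

Here `known` is taken from the library, while `unknown` falls through to the family estimate.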
Rates in the reverse direction are calculated from the forward rate (as defined by the family) and the equilibrium constant (as calculated from thermodynamic parameters). For the purposes of this discussion we consider only the forward rate, since the reverse rate follows from it.
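As a rough sketch of that relationship, the reverse rate coefficient is the forward rate coefficient divided by the equilibrium constant. The function names and numbers below are illustrative assumptions, and the example assumes a reaction with equal numbers of reactants and products, so that the equilibrium constant is dimensionless and no concentration-unit conversion is needed.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)


def equilibrium_constant(delta_g, temperature):
    """K_eq = exp(-dG / (R*T)), with dG in J/mol and T in K."""
    return math.exp(-delta_g / (R * temperature))


def reverse_rate(k_forward, k_eq):
    """k_reverse = k_forward / K_eq."""
    return k_forward / k_eq


# Placeholder values, chosen only for illustration.
T = 1000.0
k_f = 2.0e8
K = equilibrium_constant(-20000.0, T)  # forward direction is exergonic
k_r = reverse_rate(k_f, K)
```

Because the forward direction is favored in this example, `K` is greater than one and `k_r` is smaller than `k_f`.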
Families are composed of a tree or a series of trees whose nodes are groups. Each tree describes an independent segment of the reaction, usually one of the reactants or a reaction-related site, although in practice the trees can be built on any independent properties. Every group is connected to at most one parent group and any number of child groups. All child groups must be
Example 1: X-H + Y. <=> Y-H + X.
This one has one tree for X-H and one tree for Y.
Example 2: H-R2~R1. <=> .R2~R1-H
This one has three trees: one for the backbone (~) between endgroups R1 and R2, one for R1 and one for R2. The backbone is usually defined as all atoms between the endgroups that do not directly participate in the reaction.
Groups are more general versions of molecules. Unlike molecules, groups permit extra types of nodes within the graph. For example, a node in a group could be any of R, R!H, C, Cs, Cd, Cdd, Ct, CS, CO or Cb, all of which could represent a carbon atom. Different bond structure possibilities are also allowed.
For example:
Atom-Number Label GroupAtom Unpaired-Electrons Bonds
1 *1 R!H u1 {2,T}
2 *4 Ct u0 {1,T} {3,S}
3 *6 [Cd,Ct,Cb] u0 {2,S} {4,[D,T,B]}
4 *5 [Cd,Ct,Cb] u0 {3,[D,T,B]} {5,S}
5 *2 [Cd,Cdd,Ct,CO,N,CS] u0 {4,S} {6,[D,T]}
6 *3 [Cd,Ct,Od,Sd,Cdd,N] u0 {5,[D,T]}
This is a backbone group from Intra_R_Add_Endocyclic. The labels define the endgroups and the backbone structure (although in general not all atoms need to be, or should be, labeled). Brackets enclose all possibilities for a given GroupAtom or bond. R!H will match anything except hydrogen. The GroupAtoms allow flexible definition of the atoms. Note that a molecule that matches a given group usually has more atoms and bonds than the group itself: any group or molecule that meets the requirements of the above structure will match the group.
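The bracket and wildcard semantics can be illustrated with a small matching sketch. It is a simplification written for this page, not RMG's atom-type machinery: `atom_type_matches` and `bond_matches` are hypothetical helpers, and only the atom types named in the text above are handled.

```python
# Specific carbon atom types listed in the text above.
CARBON_TYPES = {"Cs", "Cd", "Cdd", "Ct", "CS", "CO", "Cb"}


def atom_type_matches(allowed, atom_type):
    """Return True if a specific atom type satisfies any allowed GroupAtom type."""
    for group_type in allowed:
        if group_type == "R":
            return True  # R matches any atom
        if group_type == "R!H" and atom_type != "H":
            return True  # anything except hydrogen
        if group_type == "C" and atom_type in CARBON_TYPES:
            return True  # any of the carbon types
        if group_type == atom_type:
            return True  # exact match
    return False


def bond_matches(allowed, bond_order):
    """A bond matches if its order is one of the bracketed possibilities."""
    return bond_order in allowed


# Atom 3 of the group above allows [Cd,Ct,Cb] and a [D,T,B] bond to atom 4.
ok_atom = atom_type_matches(["Cd", "Ct", "Cb"], "Ct")
bad_atom = atom_type_matches(["Cd", "Ct", "Cb"], "Cs")
ok_bond = bond_matches(["D", "T", "B"], "D")
```

A full match additionally requires that the group be subgraph-isomorphic to the molecule, which this sketch does not attempt.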
When RMG starts, it loads the trees and places the available rules and training reactions within them. It then averages its available data up into more general rules. Where possible, an average rule is created for each node by averaging the rules of the closest nodes that lie down the trees from (are less general than) the given node. This is done recursively, so that the nodes at the bottom of the multi-tree are averaged first and the resulting average rules are then used for averaging further up the tree. Averaging is currently done by filtering by the 1-norm and then filtering the result by the 2-norm, although after the nodal distances update this will be done exclusively by 2-norm filtering. It is important to note that this averaging takes place in the multi-tree: [Rn;doublebond_intra_pri_2H;radadd_intra_cddouble] will have child nodes from every tree in the family. It is also important to note that this does not fill the entire tree, because many if not most nodes will have no children with rules.
When a given reaction is examined, every tree in the family is descended. At each level, the appropriate molecules are compared with each child at that level until RMG finds a subgraph-isomorphic match that also matches the labels appropriately. It then descends to that match's children, and so on, until there are no children or the reaction matches none of them. This should place the reaction at a set of tree coordinates such as [R3;doublebond_intra_pri_2H;radadd_intra_cddouble].
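The descent can be sketched with a toy tree. Everything below is an assumption made for illustration: the `Node` class, the `descend` function, the dictionary standing in for a molecule, and the simple predicates that replace the real subgraph-isomorphism check. The two small trees reuse labels from this page but do not reproduce the actual family trees.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    label: str
    matches: Callable[[dict], bool]  # stand-in for a subgraph-isomorphism test
    children: List["Node"] = field(default_factory=list)


def descend(root, structure):
    """Follow the first matching child at each level; stop when none matches."""
    node = root
    while True:
        for child in node.children:
            if child.matches(structure):
                node = child
                break
        else:
            # No children, or no child matched: this is the final node.
            return node.label


# Toy trees with hypothetical matching predicates.
ring_tree = Node("Rn", lambda s: True, [
    Node("R3", lambda s: s["ring_size"] == 3),
])
bond_tree = Node("doublebond_intra_pri", lambda s: True, [
    Node("doublebond_intra_pri_2H", lambda s: s["hydrogens"] == 2),
])

structure = {"ring_size": 3, "hydrogens": 2}
coordinates = [descend(tree, structure) for tree in (ring_tree, bond_tree)]
```

Descending every tree yields one label per tree, and together these labels form the tree coordinates of the reaction.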
When the families are loaded, the rate rules (each associated with one group from each tree in the family) are loaded first and placed at the corresponding locations in the tree. Training reactions are then added by descending the trees and placing the rate data from each reaction at the associated combination of node coordinates. At this point averaging is done based on the distance between nodal coordinates. Currently the closest rules are determined by taking the rules that minimize the 1-norm distance and then, among those, taking the rules that minimize the 2-norm distance. After the nodal distances update this will be done by minimizing the 2-norm distance among all rules.
For example:
[R3;doublebond_intra_pri_2H;radadd_intra_cddouble] is a distance of 1 away from [Rn;doublebond_intra_pri_2H;radadd_intra_cddouble] because R3 is one node below Rn.
while
[R3;doublebond_intra_pri_2H;radadd_intra_cddouble] is a distance of sqrt(2) away from [R3;doublebond_intra_pri;radadd_intra] because doublebond_intra_pri is one away from doublebond_intra_pri_2H and radadd_intra is a distance of one away from radadd_intra_cddouble.
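The two distances above can be reproduced with a short calculation. This sketch assumes that each tree is stored as a child-to-parent map and that the more general coordinate in each tree is an ancestor of the more specific one. The `PARENT` map contains only the three parent relations stated in the example, and the helper names are hypothetical.

```python
import math

# Child -> parent relations taken from the example above.
PARENT = {
    "R3": "Rn",
    "doublebond_intra_pri_2H": "doublebond_intra_pri",
    "radadd_intra_cddouble": "radadd_intra",
}


def levels_between(node, ancestor):
    """Count the steps from a node up to one of its ancestors."""
    steps = 0
    while node != ancestor:
        node = PARENT[node]  # raises KeyError if `ancestor` is not above `node`
        steps += 1
    return steps


def distances(specific, general):
    """Return the (1-norm, 2-norm) distance between two sets of tree coordinates."""
    offsets = [levels_between(s, g) for s, g in zip(specific, general)]
    norm1 = sum(offsets)
    norm2 = math.sqrt(sum(o * o for o in offsets))
    return norm1, norm2


reaction = ("R3", "doublebond_intra_pri_2H", "radadd_intra_cddouble")
d_a = distances(reaction, ("Rn", "doublebond_intra_pri_2H", "radadd_intra_cddouble"))
d_b = distances(reaction, ("R3", "doublebond_intra_pri", "radadd_intra"))
```

The 2-norm in `d_a` is 1 and in `d_b` is sqrt(2), matching the two cases described above.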
Using these distances, each node in the tree receives the data present at the closest node that has a rule or training reaction. If two or more nodes with original rate data are equally distant, the geometric average of the data at these nodes is used instead. In this way rate information is placed at every node in the tree. When rate data is requested for a reaction from a given family, RMG descends all of the trees and takes the rate information present at the reaction's final nodal coordinates.
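The selection and averaging steps can be sketched as follows, under simplifying assumptions: each candidate rule is described by its per-tree offsets from the target node and a single scalar rate coefficient, whereas real rate data carries several parameters. The function names and candidate values are hypothetical.

```python
import math


def closest_rules(candidates):
    """Keep the candidates with minimal 1-norm, then minimal 2-norm among those.

    `candidates` is a list of (offsets, k) pairs, where `offsets` holds the
    per-tree level differences from the target node.
    """
    def norm1(offsets):
        return sum(abs(o) for o in offsets)

    def norm2(offsets):
        return math.sqrt(sum(o * o for o in offsets))

    best1 = min(norm1(o) for o, _ in candidates)
    stage1 = [(o, k) for o, k in candidates if norm1(o) == best1]
    best2 = min(norm2(o) for o, _ in stage1)
    return [(o, k) for o, k in stage1 if math.isclose(norm2(o), best2)]


def geometric_mean(values):
    """Geometric average, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Placeholder candidates, chosen only for illustration.
candidates = [
    ((1, 1, 0), 1.0e4),
    ((2, 0, 0), 1.0e6),
    ((0, 1, 1), 1.0e8),
    ((3, 0, 0), 1.0e2),
]
selected = closest_rules(candidates)
estimate = geometric_mean([k for _, k in selected])
```

In this example three candidates tie on the 1-norm, the 2-norm then removes `(2, 0, 0)`, and the two remaining rate coefficients are averaged geometrically.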
- group/rate rules/libraries/ etc. (may already be in main documentation)
- degeneracy
- database usage
- small code to generate reactions