Structure symmetry in OPTIMADE #35
Here is an argument for requiring a fully canonicalized output: One of the reasons for a non-canonical output was that otherwise one would frequently have to deal with implementations that return incorrect output. However, 'unsorted' symmetry operations are far from the only error an implementation can make when returning symmetry operators. One easy mistake is to return a set of symmetry operators that just cannot be correct together (because there is only a finite number of total possibilities). If we require translation coefficients to be in [0, 1), it is easy to return operations that do not fulfill this, etc. Hence, to avoid incorrect output it would be good if this field could be properly validated by our JSON schemas to belong exactly to the set of possibilities. If we require a canonical output (e.g., a sorted list of operations), then one of us with a framework that uses complete sets of symmetry operations can just generate a list of ALL valid combinations of symmetry operations, and we can put that in a schema and say: "The output MUST be exactly one of these alternatives". This is easy to validate. An unsorted list of operations (possibly with several equivalent choices of coefficients that mean the same thing) is not easy to validate. |
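To make the idea of a canonical list concrete, here is a minimal Python sketch. It assumes each operation is given as a 3x3 integer rotation matrix plus a translation vector of rational numbers; the function names `canonical_op` and `canonical_set` are hypothetical, and the sort order (plain tuple ordering) is only one possible choice, not something OPTIMADE prescribes.

```python
from fractions import Fraction


def canonical_op(rotation, translation):
    """Freeze one operation as tuples, with each translation reduced into [0, 1)."""
    rot = tuple(tuple(int(v) for v in row) for row in rotation)
    trans = tuple(Fraction(t) % 1 for t in translation)
    return rot, trans


def canonical_set(ops):
    """Canonicalize every operation, drop duplicates, and sort in a fixed order."""
    return tuple(sorted({canonical_op(r, t) for r, t in ops}))


# Two differently written lists describing the same two operations.
identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
inversion = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))
list_a = [(identity, (0, 0, 0)), (inversion, (Fraction(-1, 2), 0, Fraction(3, 2)))]
list_b = [(inversion, (Fraction(1, 2), 1, Fraction(1, 2))), (identity, (1, 0, 0))]
same = canonical_set(list_a) == canonical_set(list_b)
```

With a canonical form like this, two servers that describe the same symmetry produce identical output, which is what makes validation against a fixed list of alternatives possible in the first place.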
I suspect this will be impractical since the list of combinations will be huge beyond manageability... |
Perhaps I'm missing something obvious, but for fully periodic systems, is this not just one entry per Hall symbol? I.e., per row in this table: http://cci.lbl.gov/sginfo/hall_symbols.html#Table_6 But, right, that validation only works for fully periodic systems. If there are non-periodic directions, the situation gets a bit more tricky. Does the CIF x,y,z format even properly handle point groups in non-periodic systems? How does one represent, e.g., C∞ around an axis? |
On 20/11/2018 01.44, Rickard Armiento wrote:
When you mentioned "generate a list of ALL valid combinations of X,Y,Z is possible, but also: X,Y,Z and also X,Y,Z and so on. N! and more combinations – too much to list. Regards, |
CIF has provisions for some aperiodic structures but not for others. Modulated (incommensurately modulated) structures are handled. Thus, I do not think there is a "one size fits all" solution for all possible situations; we need to address what is needed for current models (one of them being the discrete atom model) and possibly leave options for expansion. |
Sorry if this was unclear: that part referred to the list of ALL canonical lists of symmetry operations. I.e., in practical terms, a canonical list of symmetry operations puts each individual symmetry operation in a canonical form, and then the list of those symmetry operations is 'sorted' according to some well-defined order. I'm presently under the impression that this list has a manageable length in fully periodic systems. Validation of a Regarding (partially) non-periodic structures, if possible, I would really prefer one single scheme for everything that fits the OPTIMaDe structure object; i.e., a collection of atoms; not, e.g., electron densities. If that isn't possible, I guess we'd have to divide into cases as you are describing for CIF. But is there really no standardized syntax for just listing symmetry groups in group theory form (e.g., C2v, D3h, C∞, etc.) and the axes/planes they operate on? Because it seems to me that would work universally. And such lists can be validated to some degree. |
I gather that there are standardization problems with symmetry operator lists (I miss discussion on floating point vs. fraction representation of translation components). However, maybe we could standardize on space group ITC number or Hall symbol and introduce symmetry operators later? |
Not sure if this is the right page, but if anyone is looking for a symmetry code, aflow offers a nice suite of symmetry symbols, operations, Wyckoff positions, etc. corey$ xzcat POSCAR.relax2.xz | aflow --edata |
On the discussion of implementations: we concluded recently in a discussion that what would be useful is a very permissively licensed (e.g. CC0) implementation of a Hall symbol <-> symmetry operator code, since this is a difficult mapping that will be relevant for a number of API implementations. |
Gemmi library (MPLv2 or LGPLv3) seems to be able to do so. |
aflow is GPL, which is compatible with CC0. |
Sadly, this is not true. GPL is only compatible with GPL. |
this is news to me, their FAQs say otherwise. https://creativecommons.org/faq/#Can_I_apply_a_Creative_Commons_license_to_software.3F
is this one way? |
This citation says that you can use CC0-licensed code inside GPL-licensed code, not the other way round. |
@blokhin I think this depends on whether space group number and Hall symbol are enough to uniquely define any set of symmetry operators (I tend to think they are not). If not, this issue should remain open until symmetry operator list representation is standardized in OPTIMADE. |
@blokhin @merkys Indeed, let's keep this open until we've sorted out what to do about representing the full set of symmetry operators. My wish is to find a canonical format that is fairly compact, trivially translatable to a set of symmetry operator matrices, and with the ability to represent any set of symmetry operators, including those relevant for slabs, wires, and molecules. Perhaps also the ability to indicate a tensor transform for the symmetry operation to represent magnetism, etc. |
I'd stick to the IUCr definition; however, there is probably no need to require symops for the standard settings / origins. So if they are omitted, one should assume ITC. |
Thanks for the link, this is a very interesting implementation, I'll have a closer look! It is definitely very useful as a sample implementation of symmetry handling. For the OPTIMADE we need, I think, a standardised way to represent symmetry operators, so that any standard-compliant implementation can handle them without conversion. |
IMHO such a level of standardisation is unnecessary. Validation can be done easily without this (see below), and managing the list of all alternatives in a standard is overkill and too much work, with virtually no added value. The symmetry operators are intended to be parsed first of all, and probably few searches will be done on the symop property. If someone does implement a search on symops, then it is very easy to canonicalise them internally, inside the server (we do so in the COD). The standard (OPTIMADE) just needs to mandate that all symops are faithfully converted to matrices, and all operator sets with the given matrices are found (matched). No need to spell out a list of alternatives for this.
Not true. Since symops are primarily intended to be parsed, there needs to be a grammar for this. AFAIK, there is no "official" grammar for the symmetry operators (I could not find a definitive reference to the Jones Faithful representation mentioned in Hall's 1981 paper), but there seems to be a community consensus on what these operators look like. Thus, we can easily write a BNF grammar to codify the current practices. I am prepared to write an EBNF syntax for the symmetry operators. When the grammar is in place, then validation of the symops is easy:
The step (2) is probably even unnecessary for protocol validation since it is a semantic check. |
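As an illustration of "faithfully converted to matrices" and matching operator sets by their matrices, here is a minimal Python sketch. The parser, the names `parse_symop` and `same_operations`, and the accepted syntax (signed `x`/`y`/`z` terms plus integer or fractional shifts) are assumptions made for this example; this is not the COD implementation and not an official grammar.

```python
import re
from fractions import Fraction

# One term: optional sign, optional number (integer or fraction), optional axis.
TERM = re.compile(r"([+-]?)(?:(\d+(?:/\d+)?)\*?)?([xyz])?")


def parse_symop(text):
    """Convert e.g. '-x+1/2,y,z' into three rows (cx, cy, cz, shift)."""
    rows = []
    for comp in text.lower().replace(" ", "").split(","):
        coeffs = {"x": Fraction(0), "y": Fraction(0), "z": Fraction(0)}
        shift = Fraction(0)
        pos = 0
        while pos < len(comp):
            m = TERM.match(comp, pos)
            sign, number, axis = m.groups()
            if m.end() == pos or (number is None and axis is None):
                raise ValueError(f"cannot parse {comp!r} at position {pos}")
            value = Fraction(number) if number else Fraction(1)
            if sign == "-":
                value = -value
            if axis:
                coeffs[axis] += value
            else:
                shift += value
            pos = m.end()
        # Reduce the shift into [0, 1) so equivalent spellings compare equal.
        rows.append((coeffs["x"], coeffs["y"], coeffs["z"], shift % 1))
    if len(rows) != 3 or any(row[:3] == (0, 0, 0) for row in rows):
        raise ValueError(f"not a 3D symmetry operation: {text!r}")
    return tuple(rows)


def same_operations(ops_a, ops_b):
    """True if both lists describe the same set of matrices, in any order."""
    return {parse_symop(s) for s in ops_a} == {parse_symop(s) for s in ops_b}
```

For example, `same_operations(["x,y,z", "-x,-y,z+1/2"], ["-X, -Y, 1/2+Z", "x,y,z"])` compares the two lists by matrix content rather than by string, so ordering, case, whitespace, and term order do not matter.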
While writing EBNF for the symmetry operation syntax, it occurred to me that what we are describing is a regular language, and so can be defined using a regular expression. I have checked the regexps for 3D symmetry operations, and the whole COD can be checked using the following ones:
Most COD symops match this regexp, and those that do not are clearly defective. This is of course a hard-to-read RE, but it can be typeset comfortably using the Perl (PCRE) /x option, which allows arbitrary white space to be used for formatting the code. Moreover, we could use variables with informative names to define recurring parts of the RE. The advantages of the RE would be:
Disadvantages
If we go for defining symops using an RE in OPTIMADE, we could use the Jones faithful notation, have a mathematically exact definition for it, and at the same time have an easy means to parse the symops. My questions:
|
This is a space-delimited version of the symop matching regexp, for better readability:
|
@sauliusg I strongly support the regex idea. However, I would also suggest that we limit the numeric part to rational numbers in the form of fractions, i.e. "1/2", but not "0.5" or "2", etc. since proper crystallographic symmetry operations can always be translated to this form. Removing the "real number" part would simplify the regular expression and avoid some ambiguities with the precision (i.e. if we get "0.333" should it be applied as is, or should it be further extrapolated into a more precise number like "0.333333"). Also, the fractional number part should probably be "[1-9]/[1-9]" and not "[1-9]/[0-9]" (no 0 in the denominator). |
That's true. The provided REs were the ones that a) can capture the current interpretable symmetry operators in the COD and b) demonstrate what the symop REs would look like. The OPTIMADE specification should of course mandate a strict(er) representation of symops. Not only do we not want 0 in the denominator; we can also explicitly list '[2346]' as permitted denominators, since only such axes are allowed in 3D space groups. Or we could list all permissible shifts explicitly (' We (I?) need to look at the permissible axes in 4D crystallographic symmetry operations; this would limit symops for modulated crystals and some quasicrystals. |
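A stricter, fraction-only pattern along these lines could be sketched as follows in Python. This is a hypothetical illustration built from named sub-patterns; it is neither the COD regular expression nor the one proposed for the specification. It assumes no whitespace, lower-case axes, shifts written only as fractions with denominators 2, 3, 4, or 6, and adds a small post-check (numerator smaller than denominator, fraction in lowest terms) that the pattern alone does not express.

```python
import re
from math import gcd

AXIS = r"[xyz]"
AXES = rf"{AXIS}(?:[+-]{AXIS})*"           # e.g. x, x-y
SHIFT = r"[1-5]/[2346]"                    # fractions only, no decimals
# Either axes followed by an optional shift, or a shift followed by axes.
COMPONENT = rf"(?:[+-]?{AXES}(?:[+-]{SHIFT})?|[+-]?{SHIFT}[+-]{AXES})"
SYMOP = re.compile(rf"{COMPONENT},{COMPONENT},{COMPONENT}")
FRACTION = re.compile(r"(\d)/(\d)")


def is_strict_symop(text):
    """Check the syntax, then require proper fractions in lowest terms."""
    if SYMOP.fullmatch(text) is None:
        return False
    for num, den in FRACTION.findall(text):
        n, d = int(num), int(den)
        if n >= d or gcd(n, d) != 1:
            return False
    return True
```

Under these assumptions `is_strict_symop("-x+1/2,y,-z+1/2")` is accepted, while `"x,y,z+0.5"`, `"x,y,z+1/0"`, and `"x,y,z+2/4"` are rejected. Note that the pattern still accepts syntactically valid but geometrically meaningless components such as `x+x`; rejecting those is a semantic check rather than a syntactic one.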
PR #464 has been opened to add the REs for crystallographic 3D symmetry operators. |
In view of #464, please let's define explicitly once again the I tend to the For instance, consider a space group number 62, should that mean |
We can continue the discussion of awkward cases in the relevant issues (or open new ones): |
We need to think a bit about how we represent symmetry in relation to structures if we are going to include that. There is indeed good reason to allow queries on symmetry data when available.
If symmetry information is given, it may be best to require giving the list of symmetry operations, which is what strict CIF files do to avoid ambiguity. In addition, space group number, ML, and Hall symbol can also be allowed as optional to give. However, it is probably unwise to allow giving those without also giving symmetry operations, since that would replicate issues found for less strict CIF files.
The question is how to represent symmetry operations. Some alternatives: