Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Property 'fractional_site_positions' #206

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from
Open

Property 'fractional_site_positions' #206

wants to merge 11 commits into from

Conversation

merkys
Copy link
Member

@merkys merkys commented Nov 22, 2019

In #196 @rartino and I discussed the need to have a structures property for reduced (a.k.a. fractional) atom coordinates. This PR introduces fractional_site_positions among the standardized properties. I have copied the most of the text from cartesian_site_positions and slightly adjusted the relationships of the two coordinate-based properties. I have also changed a bit the definition of lattice_vectors, thus this PR fixes #196.

@merkys merkys added PR/ready-for-review Add this flag if you are the author of the PR and you want it to be reviewed. Remove it when editing topic/property-standardization The specification of the precise data representation of properties and entries labels Nov 22, 2019
Copy link
Contributor

@rartino rartino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, except for the inline comment; and one general thing:

The way you have implemented this clearly requires the cartisian_site_positions and fractional_site_positions to be in the exact same order - line-by-line referring to the same coordinates. I think that requirement should be written down somewhere; I suggest in the list of requirements on fractional_site_positions.

optimade.rst Outdated Show resolved Hide resolved
@merkys
Copy link
Member Author

merkys commented Nov 25, 2019

The way you have implemented this clearly requires the cartisian_site_positions and fractional_site_positions to be in the exact same order - line-by-line referring to the same coordinates. I think that requirement should be written down somewhere; I suggest in the list of requirements on fractional_site_positions.

Requirement added in 8eefdd2.

@merkys merkys requested a review from rartino November 25, 2019 09:51
Copy link
Contributor

@rartino rartino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've written in other issues/commits, I very much like the idea of standardizing fractional_site_positions. But I realize now with this PR in hand that if we do it this way, fractional_site_positions and cartesian_site_positions become "equal citizens". I.e., it would be OK for an implementation to support only the fractional_site_positions part of the specification and refuse to deal with cartesian_site_positions. This is a problem from a standards-perspective. Can we solve that somehow?

optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
@merkys
Copy link
Member Author

merkys commented Nov 26, 2019

As I've written in other issues/commits, I very much like the idea of standardizing fractional_site_positions. But I realize now with this PR in hand that if we do it this way, fractional_site_positions and cartesian_site_positions become "equal citizens". I.e., it would be OK for an implementation to support only the fractional_site_positions part of the specification and refuse to deal with cartesian_site_positions. This is a problem from a standards-perspective. Can we solve that somehow?

In this PR cartesian_site_positions are defined as REQUIRED, and fractional_site_positions as OPTIONAL. However, if PRs #203 or #210 are accepted, both properties will become OPTIONAL with no requirement on their support.

Now I'm starting to understand the importance of REQUIRED in the response constraint. In the light of PRs #203 and #210 this constraint should be transformed to REQUIRED to be supported, that is, a property has to be required to return meaningful values upon request.

Edit: Actually, I've run through the specification, and it seems that even properties that MUST be included in the response may still be included with nulls for values. Thus we nowhere require any properties to be universally supported. Even a JSON document with all properties having null values is valid according to the specification. Or am I missing something?

merkys and others added 2 commits November 26, 2019 09:33
fractional\_site\_positions
~~~~~~~~~~~~~~~~~~~~~~~~~~~

- **Description**: Positions of each site expressed as fractions of the `lattice_vectors`_. Description of a site matches the one given in `cartesian_site_positions`_.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it is currently formulated, fractional_site_positions and cartesian_site_positions have different status: cartesian is the primary definition, and both species_at_sites and fractional... refer to cartesian.... However, mathematically, cartesian... and fractional... are equivalent descriptions, and either of them or even both may be absent. It then makes this description ambiguous, since, if cartesian_site_positions are absent, it is not clear what the fractional_site_positions must match.

It seems therefore that it would be more consistent to describe species_at_site as a primary definition of an atomic site, and to let both fractional_site_positions and cartesian_site_positions refer to the species_at_site when the order and number of array elements is defined for each fractional and cartesian.

optimade.rst Outdated Show resolved Hide resolved
@@ -1710,14 +1731,14 @@ nsites
species\_at\_sites
~~~~~~~~~~~~~~~~~~

- **Description**: Name of the species at each site (where values for sites are specified with the same order of the property `cartesian_site_positions`_).
- **Description**: Name of the species at each site (where values for sites are specified with the same order of the properties `cartesian_site_positions`_ and `fractional_site_positions`_).
The properties of the species are found in the property `species`_.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest that species_at_sites are moved to the beginning of the specification, and fractional_site_positions and cartesian_size_positions should both refer to the species_at_sites section.

The properties of the species are found in the property `species`_.
- **Type**: list of strings.
- **Requirements/Conventions**:

- **Response**: REQUIRED in the response unless explicitly excluded.
- **Query**: Support for queries on this property is OPTIONAL. If supported, filters MAY support only a subset of comparison operators.
- MUST have length equal to the number of sites in the structure (first dimension of the list property `cartesian_site_positions`_).
- MUST have length equal to the number of sites in the structure (first dimension of the list properties `cartesian_site_positions`_ and `fractional_site_positions`_).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When species_at_site are moved to the front, the fractional_site_positions should the refer to species_at_sites, not vice versa.

- If a component of the position is unknown, the :val:`null` value should be provided instead (see section `Properties with unknown value`_).
Otherwise, it should be a float value.
If at least one of the coordinates is unknown, the correct flag in the list property `structure_features`_ MUST be set.
- The order of the list MUST match the order of `cartesian_site_positions`_, i.e., the list refers to the same positions as `cartesian_site_positions` *in the exact same order*, only expressed in fractional coordinates instead of cartesian.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order of elements in both fractional_site_positions and cartesian_site_positions should refer to site descriptions, not to each other; the only suitable property that we specify currently is species_at_sites.

- **Type**: list of list of floats and/or unknown values
- **Requirements/Conventions**:

- **Response**: OPTIONAL in the response.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest specifying that:

`fractional_site_positions` and `cartesian_site_positions` MAY be both present in the same response. 
If they are both present, they SHOULD describe the same points in space, i.e. cartesian coordinates 
of sites should be the same as fractional coordinates of the same sites with orthogonalisation 
convention, as expressed in `lattice_vectors`, applied.

cartesian_site_positions[isite][x,y,z] = 
     fractional_site_positions[isite][x,y,z] * latice_vectors[][];

where:
   -  isite is index of the site (isite = 0..length(species_at_site[])-1);
   -  '*' is understood in matrix multiplcation sense;
   - each lattiec_vectors[i] element (i=0..2) is the 
     corresponding lattice vector, \vec{a}, \vec{b} and \vec{c}.
   - both cartesian_site_positions[isite][x,y,z] and
     fractional_site_positions[isite][x,y,z] are row-vectors
     (not column-vectors).

@rartino
Copy link
Contributor

rartino commented Dec 2, 2019

Edit: Actually, I've run through the specification, and it seems that even properties that MUST be included in the response may still be included with nulls for values. Thus we nowhere require any properties to be universally supported. Even a JSON document with all properties having null values is valid according to the specification. Or am I missing something?

I have been concerned about this since we started allowing null-valued properties, and I think the question was raised on the last physical meeting if we really mean to allow returning null for a REQUIRED property.

However, with null being allowed for REQUIRED properties these two alternatives have a rather different result:

  • For every type of data OPTiMaDe supports, there is only a single format to represent it. Implementations either provide data in that format, or return null.
  • OPTiMaDe allows the same data to be returned in N number of ways, e.g., cartesian_site_positions and fractional_site_positions and implementations can choose to return null for either one of them (or both).

In the first case, we still have interoperability for all cases where it makes sense. A database either does not support coordinates at all (in which case interoperability with a client that wants coordinates is not possible) or it supports coordinates via cartesian_site_coordinates in which case a client that wants coordinates gets them as expected.

In the second case, we set ourselves up for a possible mismatch. The client wants coordinates from cartesian_site_positions, but the database returns null for that, and only supports coordinates via fractional_site_positions. This breaks interoperability.

Hence, it is my firm belief that if we are to accept the present PR, there needs to be some facility that says that IF you support fractional_site_positions, you MUST support cartesian_site_positions, but not the other way around. I.e., which makes cartesian_site_positions "more standard" than fractional_site_positions.

With the realization that we are opening a can of worms by allowing more than one standardized way to represent data, perhaps we need to delay this PR beyond 1.0 and sort out exactly how this should be handled?

Copy link
Member

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @merkys!

I see you're updating now, so hopefully some of these change requests and comments have now been outdated. Mostly I am just agreeing with @sauliusg 😅

- **Requirements/Conventions**:

- **Response**: REQUIRED in the response unless explicitly excluded, except when property `dimension_types`_ is equal to :val:`[0, 0, 0]` (in this case it is OPTIONAL).
- **Response**: REQUIRED in the response unless explicitly excluded.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is where the line is cut for PR #210. I believe we should attempt to merge this PR before #210 then...

- **Query**: Support for queries on this property is OPTIONAL. If supported, filters MAY support only a subset of comparison operators.
- MUST be a list of three vectors *a*, *b*, and *c*, where each of the vectors MUST BE a list of the vector's coordinates along the x, y, and z Cartesian coordinates.
(Therefore, the first index runs over the three lattice vectors and the second index runs over the x, y, z Cartesian coordinates).
- For databases that do not define an absolute Cartesian system (e.g., only defining the length and angles between vectors), the first lattice vector SHOULD be set along *x* and the second on the *xy*-plane.
- This property MUST be an array of dimensions 3 times 3 regardless of the elements of property `dimension_types`_. The vectors SHOULD by convention be chosen so the determinant of the :property:`lattice_vectors` matrix is different from zero. The vectors in the non-periodic directions have no significance beyond fulfilling these requirements.
- :val:`null` values are allowed in the output. A vector of :val:`[null, null, null]` MAY be used for non-periodic directions where a numerical vector would not be meaningful. However, if the implementation also supports the `fractional_site_positions`_ property, the lattice vectors are relevant also for the non-periodic directions, and thus no all-`null` vectors are expected in the output.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- :val:`null` values are allowed in the output. A vector of :val:`[null, null, null]` MAY be used for non-periodic directions where a numerical vector would not be meaningful. However, if the implementation also supports the `fractional_site_positions`_ property, the lattice vectors are relevant also for the non-periodic directions, and thus no all-`null` vectors are expected in the output.
- :val:`null` values are allowed in the output. A vector of :val:`[null, null, null]` MAY be used for non-periodic directions where a numerical vector would not be meaningful (including a zero vector). However, if the implementation also supports the `fractional_site_positions`_ property, the lattice vectors are relevant also for the non-periodic directions, and thus no all-`null` vectors are allowed in the output.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess a related question is, why are null-vectors necessary, when one can apply zero vectors? Unless zero vectors have a meaning different from null-vectors when using fractional_site_positions.

nsites
~~~~~~

- **Description**: An integer specifying the length of the :property:`cartesian_site_positions` property.
- **Description**: An integer specifying the length of :property:`cartesian_site_positions` and :property:`fractional_site_positions` properties.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As was pointed out by @sauliusg, this should refer instead only to species_at_sites.

@@ -1710,14 +1731,14 @@ nsites
species\_at\_sites
~~~~~~~~~~~~~~~~~~

- **Description**: Name of the species at each site (where values for sites are specified with the same order of the property `cartesian_site_positions`_).
- **Description**: Name of the species at each site (where values for sites are specified with the same order of the properties `cartesian_site_positions`_ and `fractional_site_positions`_).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- **Description**: Name of the species at each site (where values for sites are specified with the same order of the properties `cartesian_site_positions`_ and `fractional_site_positions`_).
- **Description**: Name of the species at each site (where values for sites are specified with the same order of the properties `cartesian_site_positions`_ as well as `fractional_site_positions`_).

I discourage and, even though it is correct, simply due to the fuzzyness it may present for an implementer in terms of "should it be the sum of these properties then"?

@@ -1912,7 +1933,7 @@ structure\_features
- **List of strings used to indicate special structure features**:

- :val:`disorder`: This flag MUST be present if any one entry in the :property:`species` list has a :property:`chemical_symbols` list that is longer than 1 element.
- :val:`unknown_positions`: This flag MUST be present if at least one component of the :property:`cartesian_site_positions` list of lists has value :val:`null`.
- :val:`unknown_positions`: This flag MUST be present if at least one component of the :property:`cartesian_site_positions` and :property:`fractional_site_positions` lists of lists has value :val:`null`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- :val:`unknown_positions`: This flag MUST be present if at least one component of the :property:`cartesian_site_positions` and :property:`fractional_site_positions` lists of lists has value :val:`null`.
- :val:`unknown_positions`: This flag MUST be present if at least one component of the :property:`cartesian_site_positions` or :property:`fractional_site_positions` lists of lists has value :val:`null`.

@merkys merkys self-assigned this Dec 4, 2019
@rartino rartino added PR/waiting-for-update This PR has been reviewed and is waiting for the author to response or update the PR and removed PR/ready-for-review Add this flag if you are the author of the PR and you want it to be reviewed. Remove it when editing labels Dec 4, 2019
@ml-evs
Copy link
Member

ml-evs commented Jul 7, 2021

Did the consensus evolve on this at either of the last 2 workshops? If this is something we still cannot deal with, I would suggest closing this PR (even if only temporarily).

@rartino
Copy link
Contributor

rartino commented Jul 7, 2021

Did the consensus evolve on this at either of the last 2 workshops? If this is something we still cannot deal with, I would suggest closing this PR (even if only temporarily).

I don't think we yet have a resolution to my comment above:


Hence, it is my firm belief that if we are to accept the present PR, there needs to be some facility that says that IF you support fractional_site_positions, you MUST support cartesian_site_positions, but not the other way around. I.e., which makes cartesian_site_positions "more standard" than fractional_site_positions.


But the PR is good otherwise, so isn't it better to mark it with a label that means that there is something fundamental that needs to be worked out, but leaving it open?

@merkys
Copy link
Member Author

merkys commented Jul 8, 2021

I prefer keeping the PR open as well. Once closed, a PR becomes virtually invisible and it is easy to forget it, scrapping all the effort.

@rartino
Copy link
Contributor

rartino commented Dec 30, 2022

It is my understanding that this is going to be superseded by asym_fractional_coords in a PR spawned by #416. (No one has so far suggested that we should include both a "filled cell" and an "asymetric unit" version of fractional coords).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR/requires-discussion PR/waiting-for-update This PR has been reviewed and is waiting for the author to response or update the PR topic/property-standardization The specification of the precise data representation of properties and entries
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Format of 'lattice_vectors' is unclear for 2D, 1D and 0D systems
5 participants