[Feature request]: Entity referencing (AJV's $data keyword) #69

Open
M-casado opened this issue Jun 22, 2023 · 2 comments

M-casado commented Jun 22, 2023

Summary

Inclusion of AJV's solution for entity referencing: $data or something similar.

Motivation

Entity referencing within the schemas would make it possible to construct more complex validation constraints.

Details

Similar to what AJV implemented as part of their combining-schemas documentation, the idea would be to allow Biovalidator to interpret not only $ref, which so far has been incredibly useful, but also the $data keyword in the schemas.

The $data keyword would be used to dynamically reference values in the data being validated from within the constraints of a JSON Schema definition. In other words, a value that is not known when the schema is written could still be used in a constraint.
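To make the referencing mechanism concrete, here is a minimal Python sketch of how a $data-style pointer such as "1/MD5_1" could be resolved against a document. It is an illustration only, not AJV's or Biovalidator's code: `resolve_data_ref` is a hypothetical helper, and it assumes the pointer is either an absolute JSON pointer ("/a/b") or a relative one made of a number of levels to go up followed by a path ("1/a"). The "#" form of relative pointers is not handled.

```python
def resolve_data_ref(root, current_path, ref):
    """Resolve a $data-style pointer against a parsed JSON document.

    root         -- the whole document being validated
    current_path -- keys/indices leading from root to the value under validation
    ref          -- absolute ("/a/b") or relative ("1/a") JSON pointer
    """
    if ref == "" or ref.startswith("/"):
        # Absolute pointer: start from the document root.
        base, tail = [], ref
    else:
        # Relative pointer: leading digits say how many levels to go up.
        digits = 0
        while digits < len(ref) and ref[digits].isdigit():
            digits += 1
        if digits == 0:
            raise ValueError(f"not a JSON pointer: {ref!r}")
        levels_up = int(ref[:digits])
        if levels_up > len(current_path):
            raise ValueError(f"{ref!r} points above the document root")
        base = list(current_path[: len(current_path) - levels_up])
        tail = ref[digits:]
        if tail and not tail.startswith("/"):
            raise ValueError(f"unsupported pointer form: {ref!r}")

    # Unescape the reference tokens ("~1" is "/", "~0" is "~").
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in tail.split("/")[1:]]

    value = root
    for token in base + tokens:
        value = value[int(token)] if isinstance(value, list) else value[token]
    return value


document = {
    "MD5_1": "06266488e1b14195523df877eac39b31",
    "MD5_2": "06266488e1b14195523df877eac39b31",
}
# While validating MD5_2, "1/MD5_1" goes one level up (to the enclosing
# object) and then down into its MD5_1 property.
print(resolve_data_ref(document, ["MD5_2"], "1/MD5_1"))
```

A validator would call something like this just before evaluating a keyword, substituting the resolved value for the `{ "$data": ... }` object.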

I tested the examples below against a locally deployed Biovalidator server, and the validation did not behave as I expected, so I assume $data is not currently supported.

Examples

Some time ago I tested this feature with AJV and made three mock examples with schemas here. Below I format some of them in the schema & data format of Biovalidator's message:

# The following should pass validation, given that the first and second MD5 are equal, and that is the constraint established
#    in the schema (i.e. the data from MD5_1 should be the constant of MD5_2). 
{
    "schema": {
        "type": "object",
        "required": ["MD5_1", "MD5_2"],
        "properties": {
            "MD5_1": {
                "type": "string"
            },
            "MD5_2": {
                "type": "string",
                "const": { "$data": "1/MD5_1" }
            }
        }
    },
    "data": {
        "MD5_1": "06266488e1b14195523df877eac39b31",
        "MD5_2": "06266488e1b14195523df877eac39b31"    
    }
}

# The following should not pass validation, but it currently does, since the $data reference to MD5_1 inside the
#    "not" constraint is not interpreted.
{
    "schema": {
        "type": "object",
        "required": ["MD5_1", "MD5_2"],
        "properties": {
            "MD5_1": {
                "type": "string"
            },
            "MD5_2": {
                "type": "string",
                "not": { 
                    "const": { "$data": "1/MD5_1" } 
                }
            }
        }
    },
    "data": {
        "MD5_1": "06266488e1b14195523df877eac39b31",
        "MD5_2": "06266488e1b14195523df877eac39b31"
    }
}
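
The intended outcome of both messages can be reproduced with a small stand-alone evaluator. This is a minimal Python sketch, not AJV's or Biovalidator's implementation: `resolve` and `is_valid` are hypothetical helpers that cover only the keywords used in the two schemas above (type, required, properties, const, not) and only relative pointers into objects.

```python
def resolve(root, path, ref):
    """Resolve a relative pointer such as "1/MD5_1" (object keys only)."""
    levels, _, tail = ref.partition("/")
    base = path[: len(path) - int(levels)]
    value = root
    for key in base + (tail.split("/") if tail else []):
        value = value[key]
    return value


def is_valid(schema, value, root, path=()):
    """Evaluate the small subset of JSON Schema used in the examples."""
    path = list(path)
    expected_type = {"object": dict, "string": str}.get(schema.get("type"))
    if expected_type is not None and not isinstance(value, expected_type):
        return False
    if "const" in schema:
        expected = schema["const"]
        # Substitute the referenced data for the {"$data": ...} object.
        if isinstance(expected, dict) and "$data" in expected:
            expected = resolve(root, path, expected["$data"])
        if value != expected:
            return False
    if "not" in schema and is_valid(schema["not"], value, root, path):
        return False
    if isinstance(value, dict):
        if any(key not in value for key in schema.get("required", [])):
            return False
        for key, subschema in schema.get("properties", {}).items():
            if key in value and not is_valid(subschema, value[key], root, path + [key]):
                return False
    return True


data = {
    "MD5_1": "06266488e1b14195523df877eac39b31",
    "MD5_2": "06266488e1b14195523df877eac39b31",
}


def md5_schema(md5_2_constraint):
    """Build the example schema with the given extra constraint on MD5_2."""
    return {
        "type": "object",
        "required": ["MD5_1", "MD5_2"],
        "properties": {
            "MD5_1": {"type": "string"},
            "MD5_2": {"type": "string", **md5_2_constraint},
        },
    }


must_match = md5_schema({"const": {"$data": "1/MD5_1"}})
must_differ = md5_schema({"not": {"const": {"$data": "1/MD5_1"}}})

print(is_valid(must_match, data, data))   # True: both checksums are equal
print(is_valid(must_differ, data, data))  # False: equal checksums are rejected
```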

Use-cases

The flexibility that $data provides is enormous, but a few use cases, at least for the EGA, could be:

  • Checking whether the encrypted MD5 and the unencrypted MD5 checksums are different (e.g. submitters providing the same value incorrectly)
  • Checking whether the number of samples is above the number of referenced samples
  • Checking whether object identifiers that have the same details are the same
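
As an illustration of the first two use cases, the Python sketch below builds Biovalidator-style messages whose schemas rely on $data. The property names (`encrypted_md5`, `unencrypted_md5`, `referenced_samples`, `total_samples`), the sample values, and the `to_message` helper are hypothetical and chosen only for this example. AJV documents $data support for `const` and `minimum`, among other keywords. "Above" is read here as "at least", and `exclusiveMinimum` would be the strict variant.

```python
import json

# Use case 1: the two checksums must differ.
checksum_schema = {
    "type": "object",
    "required": ["encrypted_md5", "unencrypted_md5"],
    "properties": {
        "encrypted_md5": {"type": "string"},
        "unencrypted_md5": {
            "type": "string",
            # Reject a value equal to the sibling "encrypted_md5".
            "not": {"const": {"$data": "1/encrypted_md5"}},
        },
    },
}

# Use case 2: the sample count must not be below the referenced count.
sample_count_schema = {
    "type": "object",
    "required": ["referenced_samples", "total_samples"],
    "properties": {
        "referenced_samples": {"type": "integer"},
        "total_samples": {
            "type": "integer",
            # The lower bound is read from the sibling "referenced_samples".
            "minimum": {"$data": "1/referenced_samples"},
        },
    },
}


def to_message(schema, data):
    """Wrap a schema and a document in the {"schema", "data"} message format."""
    return json.dumps({"schema": schema, "data": data}, indent=4)


print(to_message(sample_count_schema, {"referenced_samples": 3, "total_samples": 5}))
```

With $data support enabled, a validator would be expected to reject a checksum message whose two values are identical and a sample message whose `total_samples` is lower than `referenced_samples`.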

M-casado commented

As additional context, the entity referencing that AJV allows with the $data keyword is similar to the entity referencing of JSON-LD through the @id keyword. Therefore, were Biovalidator to become JSON-LD-aware (#68), this feature could be replaced with the @id solution.

M-casado commented

Another use case in the JSON Schemas would be referencing the core identifier of the object. For example, in our relationships model, we allow directional and tagged linkages to be made within the objects. Each linkage has a source and a target, pointing to the two ends of the relationship.

So far, since we couldn't make use of $data, we avoided duplicating the core identifier of the object in every relationship by leaving one of the two ends out and inferring it later in the logic. In other words, if only the source is provided, the target is assumed to be the object itself, and vice versa.
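
The inference described above could look roughly like the following Python sketch. The field names `source` and `target`, the identifiers, and the `complete_relationship` helper are assumptions made for illustration and are not taken from the actual model. Treating "both ends given" or "no end given" as an error is also an assumption.

```python
def complete_relationship(object_id, relationship):
    """Fill in the missing end of a relationship with the enclosing object's identifier."""
    has_source = "source" in relationship
    has_target = "target" in relationship
    if has_source == has_target:
        # Assumption: exactly one end is expected to be provided.
        raise ValueError("exactly one of 'source' or 'target' must be provided")

    completed = dict(relationship)  # leave the input untouched
    if has_source:
        completed["target"] = object_id
    else:
        completed["source"] = object_id
    return completed


# Hypothetical identifiers, for illustration only.
print(complete_relationship("OBJ-1", {"type": "derived_from", "source": "OBJ-2"}))
```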

Although we may not go back to a $data-based solution, had we been able to use it from the beginning, we might have designed the model around it.
