Define Geospatial Accepted Formats for DCAT-US #5010
Comments
the schema error occurs here in the datajson extension
What does DCAT-US define as valid "spatial" values? Of those, what do we support?
Per the spec, the "spatial" field is optional in DCAT-US 1.1. If it exists, it must be a string with at least 1 character.
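A minimal sketch of the rule as stated above (optional, but a non-empty string when present). The function name `check_spatial` and the plain-dict dataset shape are assumptions for illustration; this is not the datajson extension's actual validator, and it checks nothing beyond the rule quoted in this comment.

```python
def check_spatial(dataset):
    """Return None if "spatial" passes the rule quoted above, else a message.

    Hypothetical helper: only covers "optional, but a string with at
    least 1 character when present".
    """
    if "spatial" not in dataset:
        return None  # the field is optional
    value = dataset["spatial"]
    if not isinstance(value, str):
        return f"spatial must be a string, got {type(value).__name__}"
    if len(value) < 1:
        return "spatial must contain at least 1 character"
    return None


print(check_spatial({"title": "no spatial field"}))
print(check_spatial({"spatial": "[[3,4],[5,6]]"}))
print(check_spatial({"spatial": {"type": "Polygon", "coordinates": []}}))
```

Under this reading, a "spatial" value supplied as a JSON object rather than a string is reported as a type problem before any geometry is inspected.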
Other cases
two_points = "[[3,4],[5,6]]"
translate_spatial(two_points) # returns => '{"type": "Polygon", "coordinates": [[[3, 4], [3, 6], [5, 6], [5, 4], [3, 4]]]}'
geojson = '{"type":"Polygon","coordinates":[[[-124.3926,32.5358],[-124.3926,42.0022],[-114.1252,42.0022],[-114.1252,32.5358],[-124.3926,32.5358]]]}'
translate_spatial(geojson) # returns => same as input
Just because the input can be JSON-deserialized doesn't mean it's compatible with Solr. We could check whether the input is valid GeoJSON instead of letting Solr complain when something is incompatible (assuming this happens; the point is that some downstream process complains).
import geojson
data = '{"type":"Polygon","coordinates":[[[-124.3926,32.5358],[-124.3926,42.0022],[-114.1252,42.0022],[-114.1252,32.5358],[-124.3926,32.5358]]]}'
geojson.loads(data) # => doesn't throw an exception which means it's valid
Conclusion
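On the idea of checking GeoJSON before a downstream process complains, here is a rough standard-library sketch of a structural check for Polygon values. The helper name `polygon_problems` is hypothetical, and the check covers only a few basics (object type, at least four positions per ring, closed rings); it is not a full GeoJSON validator and says nothing about what Solr accepts.

```python
import json


def polygon_problems(text):
    """Return a list of structural problems found in a GeoJSON Polygon string.

    Hypothetical helper; an empty list means none of these basic checks failed.
    """
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not JSON: {exc.msg}"]
    if not isinstance(obj, dict) or obj.get("type") != "Polygon":
        return ["not a GeoJSON Polygon object"]
    rings = obj.get("coordinates")
    if not isinstance(rings, list) or not rings:
        return ["coordinates must be a non-empty list of rings"]
    problems = []
    for i, ring in enumerate(rings):
        # A linear ring needs at least 4 positions, the first equal to the last.
        if not isinstance(ring, list) or len(ring) < 4:
            problems.append(f"ring {i} has fewer than 4 positions")
        elif ring[0] != ring[-1]:
            problems.append(f"ring {i} is not closed")
    return problems


data = '{"type":"Polygon","coordinates":[[[-124.3926,32.5358],[-124.3926,42.0022],[-114.1252,42.0022],[-114.1252,32.5358],[-124.3926,32.5358]]]}'
print(polygon_problems(data))
print(polygon_problems("[[3,4],[5,6]]"))
```

The two-point string from "Other cases" deserializes as JSON but is reported as not being a Polygon object, which is the distinction drawn above between "can be deserialized" and "is usable geometry".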
I don't know how to interpret what
means. docs in GML mention both
simple features profile: https://portal.ogc.org/files/?artifact_id=39853
"spatial" is optional in all DCAT-US schemas, but if present it needs to be a string with at least 1 character in it. If the spatial data is an object, like the 3rd example in this source (control+f "spatial" and navigate to it), then validation will fail. DCAT-US specifies a JSON object as an acceptable value in some circumstances, which is different from a string. Basically, the root of a common problem we see ( e.g.
As long as we're using Solr for search we have to conform to what it supports. "spatial" data expressed as GeoJSON (e.g.
We can add support for translating a GeoJSON "envelope" in
# from this
{'coordinates': [[-78.9823, 35.5216], [-78.2607, 36.0742]], 'type': 'envelope'}
# into this
"""{
"type": "Polygon",
"coordinates": [
[[-78.9823, 35.5216], [-78.2607, 35.5216], [-78.2607, 36.0742], [-78.9823, 36.0742], [-78.9823, 35.5216]]
]
}"""
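The envelope translation above could be sketched roughly as follows. The function name `envelope_to_polygon` is hypothetical, and the sketch assumes the two coordinates are opposite corners given as [x, y] pairs; it is not the existing `translate_spatial` logic.

```python
import json


def envelope_to_polygon(envelope):
    """Turn a two-corner "envelope" dict into a GeoJSON Polygon string.

    Hypothetical helper; assumes "coordinates" holds two opposite corners
    as [x, y] pairs.
    """
    if str(envelope.get("type", "")).lower() != "envelope":
        raise ValueError("expected an object with type 'envelope'")
    (x1, y1), (x2, y2) = envelope["coordinates"]
    min_x, max_x = sorted((x1, x2))
    min_y, max_y = sorted((y1, y2))
    # Walk the four corners and repeat the first one to close the ring.
    ring = [
        [min_x, min_y],
        [max_x, min_y],
        [max_x, max_y],
        [min_x, max_y],
        [min_x, min_y],
    ]
    return json.dumps({"type": "Polygon", "coordinates": [ring]})


envelope = {"coordinates": [[-78.9823, 35.5216], [-78.2607, 36.0742]], "type": "envelope"}
print(envelope_to_polygon(envelope))
```

Sorting the corner values means the result does not depend on which corner is listed first in the input.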
facet query of "old-spatial" on catalog. Interestingly, there's
<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<gml:Polygon xmlns:gml=\"http://www.opengis.net/gml/3.2\" srsName=\"EPSG:9825\">
<gml:outerBoundaryIs>
<gml:LinearRing>
<gml:posList>-90.0 -180.0 -90.0 180.0 90.0 180.0 90.0 -180.0 -90.0 -180.0</gml:posList>
</gml:LinearRing>
</gml:outerBoundaryIs>
<gml:innerBoundaryIs>
</gml:innerBoundaryIs>
</gml:Polygon>
Looks like we have 971 datasets with an old-spatial value containing the word
I validated this XML against a simple feature profile level-2 schematron
Quick summary:
Recommendations:
User Story
In order to support data providers and questions around DCAT-US accepted spatial field values, data.gov admins want a detailed list and test/example use cases for what should be valid spatial values for DCAT-US.
Acceptance Criteria
[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]
spatial field filled out
WHEN that field is examined
THEN it is clear whether that format is supported or not.
Background
Current examples: https://github.com/GSA/ckanext-datajson/tree/main/ckanext/datajson/tests/datajson-samples
Security Considerations (required)
None
Sketch
We need to fully define the list of acceptable sources. The current logic of support is here: https://github.com/GSA/ckanext-geodatagov/blob/main/ckanext/geodatagov/logic.py#L445-L515
Need to start the list of use cases, and decide if the envelope use case (see here) is an acceptable format and should be included.
Make sure every test case is defined.