Integrate MDTranslator into Datagov Harvesting Logic #4565
Labels
H2.0/Harvest-Runner
Harvest Source Processing for Harvesting 2.0
H2.0/Harvest-Transform
Transform Logic for Harvesting 2.0
rshewitt moved this from 🏗 In Progress [8] to 👀 Needs Review [2] in data.gov team board, Dec 21, 2023
btylerburton added the H2.0/Harvest-Runner (Harvest Source Processing for Harvesting 2.0) label, Jan 10, 2024
btylerburton moved this from 📔 Product Backlog to 📟 Sprint Backlog [7] in data.gov team board, Feb 29, 2024
btylerburton moved this from 📟 Sprint Backlog [7] to 📔 Product Backlog in data.gov team board, Feb 29, 2024
#4940 reminded me to configure the rails app for production ( e.g. |
spoke with james and tyler and we've decided to update the values for schema_type & source_type. schema_type can be |
Moved to Sprint backlog to focus on non-harvester stories |
User Story
In order to successfully transform datasets from one schema to another, datagovteam would like to use the MDTranslator library, exposed via a Rails application.
As an interim step, while the DCAT-US writer is still in active development, datagovteam would like to transform an FGDC/CSDGM source into ISO, in order to validate that the mdTranslator is functioning correctly.
Depends on:
Acceptance Criteria
WHEN that source has been loaded into a variation of the Airflow ETL Pipeline, which does not include the validate and load steps
THEN datagov-harvesting-logic will utilize the MDTranslator Rails application to transform the source into valid ISO 19115 format.
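As a rough illustration of the THEN step, here is a minimal sketch of how datagov-harvesting-logic might hand an FGDC/CSDGM record to the Rails application over HTTP. Everything specific in it is an assumption rather than something stated in this issue: the URL, the JSON field names (`file`, `reader`, `writer`), the reader/writer values, and the helper names `build_transform_request` and `transform` are all hypothetical and would need to be checked against the deployed application.

```python
import json
import urllib.request

# Assumption: hypothetical internal address of the MDTranslator Rails app.
TRANSLATOR_URL = "http://mdtranslator.example.internal/translate"


def build_transform_request(fgdc_xml, url=TRANSLATOR_URL,
                            reader="fgdc", writer="iso19115_2"):
    """Build (but do not send) a POST request asking for an FGDC -> ISO transform.

    The payload keys and the reader/writer values are assumptions about the
    Rails app's interface, not a confirmed API.
    """
    payload = {"file": fgdc_xml, "reader": reader, "writer": writer}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def transform(fgdc_xml, opener=urllib.request.urlopen, timeout=60):
    """Send the record to the translator and return the decoded JSON response.

    `opener` is injectable so the call can be exercised without a network.
    """
    request = build_transform_request(fgdc_xml)
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the transform step testable on its own, which suits a pipeline variation that stops before the validate and load steps.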
Background
This will require work in the datagov-harvesting-logic repo to integrate the MDTranslator Rails application, and may also require work in datagov-harvester (our Airflow / Orchestration repo) to allow for deferred operations.
At present, this additional work to support deferred operations is not necessarily a given, and could be a future enhancement to unblock pipeline workers while the transform is processing.
Security Considerations (required)
None. All work will reside within the Cloud.gov boundary and no external routes should be necessary.
Sketch