
FEAT: OPMI Box Loader #5

Merged
merged 6 commits into from
Dec 5, 2024

Conversation

rymarczy
Contributor

@rymarczy rymarczy commented Dec 4, 2024

This change adds a new ETL job that loads files from the OPMI Box account into the AWS OPMI Research Server RDS.

Job steps:
1. Confirm structure of BOX_IMPORT_FOLDER_ID
1.a BOX_IMPORT_FOLDER_ID should contain a folder named "aws_loaded" (where successfully processed files will be moved)
1.b BOX_IMPORT_FOLDER_ID should contain a folder named "aws_error" (where files with load errors will be moved)
1.c If either expected folder does not exist, create it
2. Iterate through all .csv files in BOX_IMPORT_FOLDER_ID
2.a Files should follow the naming convention targetSchema_targetTable.csv
2.b The first row of each file should contain column names that match targetTable
3. Download file to local disk from Box
4. Load file into RDS targetSchema.targetTable via csv load function
4.a On successful load, move file to "aws_loaded" folder
4.b If a load error occurs, move file to "aws_error" folder
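
For reviewers, here is a minimal sketch of the naming convention in step 2.a and the routing in steps 4.a and 4.b. It is not the code in this PR. `parse_target`, `route_files`, and the injected `load` and `move` callables are hypothetical names that stand in for the RDS csv load and the Box move call. It assumes the schema name contains no underscore (the split happens at the first one) and that a badly named file is treated like a load error; the description above does not specify either point.

```python
from __future__ import annotations

from pathlib import PurePath
from typing import Callable, Iterable, NamedTuple


class Target(NamedTuple):
    schema: str
    table: str


def parse_target(filename: str) -> Target:
    """Split 'targetSchema_targetTable.csv' into (schema, table).

    Assumption: the schema name has no underscore, so the split is made
    at the first underscore and the rest is the table name.
    """
    path = PurePath(filename)
    if path.suffix.lower() != ".csv":
        raise ValueError(f"not a .csv file: {filename!r}")
    schema, sep, table = path.stem.partition("_")
    if not (sep and schema and table):
        raise ValueError(
            f"expected targetSchema_targetTable.csv, got {filename!r}"
        )
    return Target(schema, table)


def route_files(
    filenames: Iterable[str],
    load: Callable[[str, Target], None],
    move: Callable[[str, str], None],
) -> dict[str, str]:
    """Load each .csv file, then move it to 'aws_loaded' or 'aws_error'.

    `load` and `move` are hypothetical stand-ins for the RDS csv load and
    the Box move operation, injected so the flow runs without either service.
    """
    outcome: dict[str, str] = {}
    for name in filenames:
        if not name.lower().endswith(".csv"):
            continue  # step 2 only iterates over .csv files
        try:
            load(name, parse_target(name))
            dest = "aws_loaded"
        except Exception:
            # Assumption: naming errors are routed like load errors.
            dest = "aws_error"
        move(name, dest)
        outcome[name] = dest
    return outcome
```

Passing `load` and `move` as arguments keeps the control flow testable with plain functions, so the loaded/error routing can be checked without Box credentials or a database connection.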

IsaacDOT and others added 6 commits October 24, 2024 15:11
Updated gitignore to 1) ignore .DS_Store files generated by macOS, 2) ignore the token file, and 3) no longer ignore Jupyter notebooks
@rymarczy rymarczy merged commit 9309ec2 into main Dec 5, 2024
4 checks passed
2 participants