mamluk studies journal ingestion #2
verbalhanglider
added this to the ingest 291 PDFs into knowledgespace.uchicago.edu milestone
May 6, 2017
verbalhanglider
changed the title
metadata extracted needs to be evaluated
mamluk studies journal ingestion
May 6, 2017
verbalhanglider
pushed a commit
that referenced
this issue
Jul 6, 2017
verbalhanglider
pushed a commit
that referenced
this issue
Jul 6, 2017
verbalhanglider
pushed a commit
that referenced
this issue
Jul 12, 2017
verbalhanglider
pushed a commit
that referenced
this issue
Jul 14, 2017
I've created a mapper class and am feeding an input dict to each mapper to generate a Dublin Core ElementTree containing the required elements with the appropriate values.
TODO: get stakeholder approval of this mapping before moving forward.
TODO: fold the existing SAF creation code base into this one.
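A minimal sketch of what such a mapper could look like, for readers following along. The class name `DublinCoreMapper`, the `field_map` argument, and the input keys are hypothetical and not taken from the commit. The `dublin_core`/`dcvalue` layout is assumed because it is what DSpace Simple Archive Format items use; the actual mapping is the one still awaiting stakeholder approval.

```python
import xml.etree.ElementTree as ET


class DublinCoreMapper:
    """Hypothetical mapper: turns one input dict into a dublin_core ElementTree."""

    def __init__(self, field_map):
        # field_map maps an input-dict key to an (element, qualifier) pair,
        # e.g. {"authors": ("contributor", "author")}. Assumed shape.
        self.field_map = field_map

    def map(self, record):
        root = ET.Element("dublin_core")
        for key, (element, qualifier) in self.field_map.items():
            values = record.get(key)
            if values is None:
                # Field missing from this record: emit nothing for it.
                continue
            if isinstance(values, str):
                values = [values]
            for value in values:
                # One dcvalue element per value, so repeated fields
                # (several authors, for instance) stay separate.
                dcvalue = ET.SubElement(
                    root, "dcvalue", element=element, qualifier=qualifier
                )
                dcvalue.text = value
        return ET.ElementTree(root)


if __name__ == "__main__":
    mapper = DublinCoreMapper(
        {"title": ("title", "none"), "authors": ("contributor", "author")}
    )
    tree = mapper.map({"title": "Example article", "authors": ["A. Author"]})
    print(ET.tostring(tree.getroot(), encoding="unicode"))
```

Keeping the mapping in a plain dict means the stakeholder-approved mapping could later be swapped in without touching the class itself.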
verbalhanglider
pushed a commit
that referenced
this issue
Jul 17, 2017
Split the major loops in main() into two separate functions. Also consolidated code that was being retyped in several places into single functions that are called multiple times.
verbalhanglider
pushed a commit
that referenced
this issue
Jul 18, 2017
I've created a separate function for each field extracted from the data, since each field needs to be read and normalized in its own way. This should make errors in the data much easier to understand than before, when everything was done in one giant main() function.
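The per-field approach can be sketched roughly as follows. The field names (`Title`, `Author`), the semicolon-separated author convention, and every function name here are illustrative assumptions, not the ones used in the commit.

```python
def extract_title(raw):
    """Collapse runs of whitespace in the title; None if there is no title."""
    value = raw.get("Title")
    if not value:
        return None
    return " ".join(value.split())


def extract_authors(raw):
    """Split a semicolon-separated author string into a clean list."""
    value = raw.get("Author", "")
    return [name.strip() for name in value.split(";") if name.strip()]


# One extractor per output field (hypothetical field names).
EXTRACTORS = {
    "title": extract_title,
    "authors": extract_authors,
}


def extract_record(raw):
    """Run every extractor and collect failures per field instead of aborting."""
    record, errors = {}, {}
    for field, extractor in EXTRACTORS.items():
        try:
            record[field] = extractor(raw)
        except Exception as exc:
            # Keeping the failing field's name is what makes bad data traceable.
            errors[field] = repr(exc)
    return record, errors


if __name__ == "__main__":
    print(extract_record({"Title": "  Some   title ", "Author": "A; B ;"}))
```

Because each extractor fails independently, a malformed value in one field is reported against that field rather than surfacing as an opaque failure somewhere inside main().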
SAFs for full volumes have been generated. Stakeholders asked that each individual article carry the DOI of its volume, so SAF generation for individual articles is waiting on the volumes being uploaded and assigned DOIs.
verbalhanglider
pushed a commit
that referenced
this issue
Nov 7, 2017
I've created a mapper class and am feeding an input dict to each mapper to generate a Dublin Core ElementTree containing the required elements with the appropriate values.
TODO: get stakeholder approval of this mapping before moving forward.
TODO: fold the existing SAF creation code base into this one.
verbalhanglider
pushed a commit
that referenced
this issue
Nov 7, 2017
Split the major loops in main() into two separate functions. Also consolidated code that was being retyped in several places into single functions that are called multiple times.
verbalhanglider
pushed a commit
that referenced
this issue
Nov 7, 2017
I've created a separate function for each field extracted from the data, since each field needs to be read and normalized in its own way. This should make errors in the data much easier to understand than before, when everything was done in one giant main() function.
The metadata extracted from the 291 PDF files needs to be evaluated by stakeholders, who should reach a consensus on:
A. whether it looks acceptable
B. what should or should not be added for records that have no value in a particular field