mamluk studies journal ingestion #2

Open
verbalhanglider opened this issue May 6, 2017 · 1 comment
@verbalhanglider
Contributor

The metadata extracted from the 291 PDF files needs to be evaluated by stakeholders, and a consensus should be reached on:

A. whether it looks acceptable
B. what should or should not be added for any records that have no value in a particular field

@verbalhanglider changed the title from "metadata extracted needs to be evaluated" to "mamluk studies journal ingestion" May 6, 2017
verbalhanglider pushed a commit that referenced this issue Jul 14, 2017
I've created a mapper class and am feeding an input dict to each mapper
to generate a dublin core ElementTree with the required elements with
the appropriate values

TODO: need to get stakeholder approval of this mapping before moving
forward
TODO: fold existing SAF creation code base into this one
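
For readers following along, the mapper idea in the commit message above could look roughly like the sketch below. This is not the code from the commit: the class name `DublinCoreMapper`, the `FIELD_MAP` keys, the `REQUIRED` tuple, and the `metadata` root element are all assumptions made for illustration, and the real mapping still needs stakeholder approval as noted in the TODO.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core elements namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"


class DublinCoreMapper:
    """Hypothetical mapper: turns one input dict into a Dublin Core tree."""

    # Assumed mapping from input dict keys to Dublin Core element names.
    FIELD_MAP = {"title": "title", "author": "creator", "date": "date"}
    # Assumed list of fields a record must have.
    REQUIRED = ("title",)

    def __init__(self, record):
        self.record = record

    def missing_required(self):
        """Return the required input keys that have no value."""
        return [key for key in self.REQUIRED if not self.record.get(key)]

    def to_tree(self):
        """Build an ElementTree with one dc:* element per value."""
        ET.register_namespace("dc", DC_NS)
        root = ET.Element("metadata")
        for key, dc_name in self.FIELD_MAP.items():
            value = self.record.get(key)
            if not value:
                continue  # leave empty fields out rather than emit blanks
            values = value if isinstance(value, list) else [value]
            for item in values:
                child = ET.SubElement(root, f"{{{DC_NS}}}{dc_name}")
                child.text = str(item)
        return ET.ElementTree(root)
```

Calling `DublinCoreMapper(record).missing_required()` before `to_tree()` gives a list of empty required fields, which is one way to surface the records that stakeholders need to decide on.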
verbalhanglider pushed a commit that referenced this issue Jul 17, 2017
split major loops in main() into two separate functions. also, moved all
functions that were being re-typed into single functions being called
multiple times.
verbalhanglider pushed a commit that referenced this issue Jul 18, 2017
I've created a separate function for each field being extracted from the
data that needs to be read/normalized in its own way. This should make
understanding the errors in the data a lot easier rather than the
previous way where it was all being done in one giant main() function
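
A minimal sketch of the one-function-per-field approach described above, assuming three fields (`title`, `author`, `date`) and simple normalization rules. The function names, the semicolon author separator, and the four-digit-year rule are illustrative assumptions, not the rules used in the repository.

```python
import re


def normalize_title(raw):
    """Collapse runs of whitespace; return None for an empty value."""
    cleaned = re.sub(r"\s+", " ", raw or "").strip()
    return cleaned or None


def normalize_authors(raw):
    """Split a semicolon-separated author string into a list of names."""
    return [name.strip() for name in (raw or "").split(";") if name.strip()]


def normalize_year(raw):
    """Pull the first standalone four-digit number out of a date string."""
    match = re.search(r"\b(\d{4})\b", raw or "")
    return match.group(1) if match else None


# One normalizer per field, so a bad value points at exactly one function.
NORMALIZERS = {
    "title": normalize_title,
    "author": normalize_authors,
    "date": normalize_year,
}


def normalize_record(raw_record):
    """Return (cleaned fields, names of fields that produced no value)."""
    cleaned, problems = {}, []
    for field, func in NORMALIZERS.items():
        value = func(raw_record.get(field))
        if value:
            cleaned[field] = value
        else:
            problems.append(field)
    return cleaned, problems
```

Because each field has its own function, the `problems` list names the field that failed for a given record instead of leaving that to be traced through a single large `main()`.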
@verbalhanglider
Contributor Author

SAFs for the full volumes have been generated. They are waiting on upload and DOI assignment: per stakeholder request, each volume's DOI will be added to the records of its individual articles before the SAFs for the individual articles are generated.
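
To make the article step concrete, here is a rough sketch of writing one article item with its volume's DOI attached. It assumes "SAF" means the DSpace Simple Archive Format (an item directory holding a `dublin_core.xml` of `dcvalue` elements and a `contents` file listing bitstreams). The function `write_saf_item` and the choice of `relation.ispartof` for the volume DOI are hypothetical; the actual field is a stakeholder decision.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_saf_item(item_dir, fields, pdf_name, volume_doi=None):
    """Write dublin_core.xml and contents for one item directory.

    fields is a list of (element, qualifier, value) tuples.
    """
    item_dir = Path(item_dir)
    item_dir.mkdir(parents=True, exist_ok=True)

    rows = list(fields)
    if volume_doi:
        # Assumption: the parent volume's DOI goes in relation.ispartof.
        rows.append(("relation", "ispartof", volume_doi))

    root = ET.Element("dublin_core")
    for element, qualifier, value in rows:
        dcvalue = ET.SubElement(
            root, "dcvalue", element=element, qualifier=qualifier
        )
        dcvalue.text = value

    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"),
        encoding="utf-8",
        xml_declaration=True,
    )
    # The contents file lists one bitstream filename per line.
    (item_dir / "contents").write_text(pdf_name + "\n", encoding="utf-8")
    return item_dir
```

The sketch only writes the two metadata files; copying the PDF itself into the item directory is left out.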

verbalhanglider pushed a commit that referenced this issue Nov 7, 2017
I've created a mapper class and am feeding an input dict to each mapper
to generate a dublin core ElementTree with the required elements with
the appropriate values

TODO: need to get stakeholder approval of this mapping before moving
forward
TODO: fold existing SAF creation code base into this one
verbalhanglider pushed a commit that referenced this issue Nov 7, 2017
split major loops in main() into two separate functions. also, moved all
functions that were being re-typed into single functions being called
multiple times.
verbalhanglider pushed a commit that referenced this issue Nov 7, 2017
I've created a separate function for each field being extracted from the
data that needs to be read/normalized in its own way. This should make
understanding the errors in the data a lot easier rather than the
previous way where it was all being done in one giant main() function