Create basic ddi parser #13
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##            main      #13      +/-   ##
==========================================
+ Coverage   1.19%   20.26%   +19.07%
==========================================
  Files         25       26        +1
  Lines        420      523      +103
==========================================
+ Hits           5      106      +101
- Misses       415      417        +2
This looks great! Left some comments as a first pass -- mostly to simplify some things and to document/organize various parts of the code. I'll test out the API to see how it feels as well. Thanks @00krishna !
…files existence causes error.
Really great stuff here -- the docstrings are fantastic as are the examples within them! Left some feedback and thoughts.
What else are you planning to do with this PR? I am almost wondering if we could make the package extensions a separate PR as well as definitely making the way the parsed information gets returned into a separate PR. What do you think @00krishna ?
Oh also, do you know why Documentation is failing in the CI @00krishna ? Finally, in the PR, apparently the ArgumentErrors are not being covered by the tests. Could you perhaps investigate why that is? Link here: https://app.codecov.io/gh/JuliaHealth/IPUMS.jl/pull/13?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaHealth
Ah wait -- I broke documentation. 🤦‍♂️
…n DDIInfo for more info.
Yeah, we can create a separate PR for the package extensions stuff. That is not too bad. I think I am done with this PR, as I have the parsing working, and I also created a simple output that shows the captured metadata. The next PR will actually import the data file into a DataFrame, based on the parsing info.
This PR develops a basic DDI file parser for an IPUMS extract. The DDI file provides all of the IPUMS and variable metadata, which is necessary for importing the data file itself. In the basic IPUMS download, the data file contains no metadata.
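To illustrate why the metadata matters, here is a minimal sketch, assuming the extract's data file is fixed-width, of how column positions taken from a DDI file could be used to split one record. The `VariableLayout` type, the `split_record` function, the variable names, and the sample record are all hypothetical and not part of this PR; the actual `DDIVariable` fields may differ.

```julia
# Hypothetical stand-in for per-variable metadata read from a DDI file.
# This is NOT the PR's DDIVariable type; field names are assumptions.
struct VariableLayout
    name::String
    start_pos::Int   # 1-based first column of the variable
    end_pos::Int     # 1-based last column of the variable (inclusive)
end

# Slice one fixed-width record into name => integer value pairs,
# using only the column positions supplied by the layout.
function split_record(line::AbstractString, layout::Vector{VariableLayout})
    return Dict(v.name => parse(Int, line[v.start_pos:v.end_pos]) for v in layout)
end

# Made-up layout and record, for illustration only.
layout = [
    VariableLayout("YEAR", 1, 4),
    VariableLayout("SEX", 5, 5),
    VariableLayout("AGE", 6, 8),
]

record = "20101034"
row = split_record(record, layout)
```

Without the layout, the record is just an undifferentiated string of digits, which is why the DDI metadata has to be parsed before the data file can be imported.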
Included items:
- `parsers` and added `ddi_parser.jl` file.
- `DDIInfo` and `DDIVariable` objects to hold metadata for the extract.
- `parse_ddi` that takes in a DDI XML file and returns a DDI object. The details of the DDI object may be updated later.
- `IPUMS.jl` file.
- `testdata` subfolder to the `test` folder, and include some small test files.
- `runtests.jl` file.

TODO items: