-
-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Import Documents: Read metas as the right type #677
Conversation
Not that I'm aware of. |
filtered[column].to_numpy(), | ||
to_metas=True | ||
) | ||
filtered = table_from_frame(filtered) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to use the table_from_frame
, you need to reset the df.index:
filtered = table_from_frame(filtered.reset_index(drop=True))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try the above suggestion first, thanks! ^^
0833d28
to
25b7676
Compare
25b7676
to
6024a65
Compare
Codecov Report
@@ Coverage Diff @@
## master #677 +/- ##
==========================================
+ Coverage 74.02% 74.07% +0.04%
==========================================
Files 72 72
Lines 9155 9230 +75
Branches 1239 1253 +14
==========================================
+ Hits 6777 6837 +60
- Misses 2138 2150 +12
- Partials 240 243 +3 |
@@ -284,6 +287,8 @@ def _read_meta_data(self): | |||
content = data.content | |||
if isinstance(content, dict): | |||
content = pd.DataFrame(content, index=[0]) | |||
# if reader is YamlMetaReader: | |||
# content = content.replace("None", np.nan) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VesnaT Should we keep this? Currently, null values in yamls are treated as "None" (strings).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather put fixes like this inside the YamlMetaReader
.
Besides, I'm not sure the fix works correctly for theStringVariable
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this could work:
class YamlMetaReader(Reader):
ext = [".yaml"]
def read_file(self):
with open(self.path, "r") as f:
self.content = yaml.safe_load(f)
for k in self.content:
if self.content[k] is None:
self.content[k] = ""
Issue
Import Documents added meta files if found, but they were all read as StringVariables not as their own proper types.
Description of changes
Use table_from_frame to translate numeric values to ContinuousVariables and categorical to DiscreteVariables.CONS: now some attributes are in X, not in metas. Any easy way to fix this?Decided to follow @VesnaT suggestion and sanitize variables before sending them to
guess_data_type
, which is now used instead of StringVariable.Includes