-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add optional --discover capability #8
base: master
Are you sure you want to change the base?
Conversation
This comment has been minimized.
This comment has been minimized.
Update: I've now got it working successfully with or without the catalog input. Still "todo" is to ensure that if something is deselected in the catalog that it would also be ignored/excluded from the data stream. |
# else: | ||
# logger.info(catalog_stream_override) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todo: remove these after testing.
# else: | ||
# logger.info(metadata_override) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. Remove after testing.
tap_exchangeratesapi/__init__.py
Outdated
singer.Transformer().transform( | ||
data=record, | ||
schema=catalog_stream_override[0]["schema"], | ||
# metadata=metadata_override, # TODO: debug (not working) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get errors when I try to pass the metadata
arg.
- Any ideas?? ❓
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be that the metadata is in list
format and it's expecting dict
or vice-versa. My gut says it needs to be a dict
, but haven't written this code in awhile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are correct. The below seems to resolve the error, now pulling in the full dict
which is one level up from the "metadata" element (which is itself of type list
).
singer.Transformer().transform(
data=record,
schema=catalog_stream_override[0]["schema"],
metadata=catalog_stream_override[0],
)
Note: It's still possible that I have this wrong and the failure just went away because I fixed the argument type.
- More testing needed to confirm whether those metadata arguments (especially
selected
status) are correctly getting piped through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More to follow (and more code cleanup to do) once I confirm the results are as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the behavior might be incorrect. Without 100% tracing the code, I would expect it to be something like metadata.to_map(metadata_override)
using the metadata
module in singer-python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API for the metadata is a bit strange since it uses a module/"functional" pattern, so I can chat over DM if anything needs cleared up. I'm trying to figure out how to explain it myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a good start. I like how safe the catalog passing is for a gentle introduction. The code is a bit more complex, so that might be able to be refactored into functions to divide the intro (no config, no catalog, etc)
from the advanced (discovery + catalog + metadata)
and make the "stages" clearer to a reader.
Just a couple of comments/changes that came to mind. Thanks for taking this up!
tap_exchangeratesapi/__init__.py
Outdated
singer.Transformer().transform( | ||
data=record, | ||
schema=catalog_stream_override[0]["schema"], | ||
# metadata=metadata_override, # TODO: debug (not working) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be that the metadata is in list
format and it's expecting dict
or vice-versa. My gut says it needs to be a dict
, but haven't written this code in awhile.
One thing that occurs to me just now is that the |
Description of change
This PR adds (optional) discovery capability. This is a broadly used tap and care must be taken to avoid breaking existing samples.
In order to not break downstream consumers and samples:
--catalog
argument provided.sync
operation.--catalog
or-c
argument is provided, the tap should respect the singer spec, for instance column selections.Manual QA steps
--discover
flag and confirmed emitted json matches with expected schema.Risks
Rollback steps