SNOW-870432: use_logical_type for inferring timezone in pandas df #1134
@@ -1478,6 +1479,41 @@ def test_create_dataframe_with_semi_structured_data_types(session):
)


@pytest.mark.skipif(not is_pandas_available, reason="pandas is required")
def test_create_dataframe_with_pandas_df(session):
Do we need to port some changes to the stored-procedure (sp) connector too, to support use_logical_type in stored procedures?
That's right. We need to port this to sproc too.
src/snowflake/snowpark/session.py
Outdated
@@ -2008,6 +2014,7 @@ def create_dataframe(
quote_identifiers=True,
auto_create_table=True,
table_type="temporary",
use_logical_type=True,
Is this a behavior change (i.e. wrong behavior -> correct behavior)? Should we highlight the implication in the changelog?
I'll update the changelog to detail exactly what is going to change.
…timestamp-correctly
Code looks good. One comment on the changelog.
CHANGELOG.md
Outdated
- Earlier timestamp columns without a timezone would be inferred as `LongType()` but will now be correctly inferred as `TimestampType(TimestampTimeZone.NTZ)`.
- Earlier timestamp columns without a timezone would be converted to nanosecond epochs, but will now be correctly maintained as timestamp values.
Are these two lines for the same logic branch? If so please consider merging them.
Sure. I'll merge them.
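For readers unfamiliar with the two representations in the changelog entry, here is a minimal standard-library sketch (not Snowpark or connector code) of what "converted to nanosecond epochs" means for a timezone-naive timestamp. The helper names `to_epoch_nanos` and `from_epoch_nanos` are hypothetical and exist only for this illustration; the sketch assumes the naive value is counted from 1970-01-01 00:00:00.

```python
from datetime import datetime, timedelta

# Timezone-naive reference point for the Unix epoch (assumption of this sketch).
EPOCH = datetime(1970, 1, 1)


def to_epoch_nanos(ts: datetime) -> int:
    """Return the nanoseconds between the epoch and a naive timestamp."""
    delta = ts - EPOCH
    # Integer arithmetic avoids float rounding at nanosecond scale.
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def from_epoch_nanos(nanos: int) -> datetime:
    """Recover the naive timestamp (microsecond precision) from nanoseconds."""
    return EPOCH + timedelta(microseconds=nanos // 1_000)


naive_ts = datetime(2023, 11, 1, 12, 30, 0)

# Old behavior as described in the changelog: the column arrives as a long integer.
as_long = to_epoch_nanos(naive_ts)
print(type(as_long).__name__, as_long)

# New behavior as described in the changelog: the value stays a timestamp.
print(type(naive_ts).__name__, naive_ts)

# The integer form is lossless here, but callers had to convert it back themselves.
print(from_epoch_nanos(as_long) == naive_ts)
```

The point of the sketch is only that the integer carries the same instant but loses its type, which is why the column was previously inferred as `LongType()` rather than a timestamp type.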
Please answer these questions before submitting your pull requests. Thanks!
What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes #870432, SNOW-886649: write_pandas inserts datetime64[ns] to Snowflake as an Invalid Date #991
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Port the fix from the Snowflake Python connector to use `use_logical_type` when inferring data types from Parquet files.