Release Release · snowflakedb/snowpark-python

1.22.1 (2024-09-11)

This is a re-release of 1.22.0. Please refer to the 1.22.0 release notes for detailed release content.

1.22.0 (2024-09-10)

Snowpark Python API Updates

New Features

Added the following new functions in snowflake.snowpark.functions:
- array_remove
- ln
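
As a rough illustration of what these two functions compute, here is a plain-Python model. It does not call Snowpark and needs no session. The helper names array_remove_model and ln_model are hypothetical stand-ins, and the sketch assumes array_remove drops every element equal to the given value and that ln is the natural logarithm.

```python
import math


def array_remove_model(values, element):
    """Hypothetical stand-in: return a new list without items equal to `element`."""
    return [v for v in values if v != element]


def ln_model(x):
    """Hypothetical stand-in: natural logarithm of a positive number."""
    return math.log(x)


cleaned = array_remove_model([1, 2, 2, 3], 2)
natural_log = ln_model(math.e)
print(cleaned, natural_log)
```

In Snowpark itself the functions operate on Column expressions inside a DataFrame query rather than on Python lists.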

Improvements

Improved documentation for Session.write_pandas by making the use_logical_type option more explicit.
Added support for specifying the following to DataFrameWriter.save_as_table:
- enable_schema_evolution
- data_retention_time
- max_data_extension_time
- change_tracking
- copy_grants
- iceberg_config: A dictionary that can hold the following Iceberg configuration options:
  - external_volume
  - catalog
  - base_location
  - catalog_sync
  - storage_serialization_policy
Added support for specifying the following to DataFrameWriter.copy_into_table:
- iceberg_config: A dictionary that can hold the following Iceberg configuration options:
  - external_volume
  - catalog
  - base_location
  - catalog_sync
  - storage_serialization_policy
Added support for specifying the following parameters to DataFrame.create_or_replace_dynamic_table:
- mode
- refresh_mode
- initialize
- clustering_keys
- is_transient
- data_retention_time
- max_data_extension_time
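
The iceberg_config option listed above for save_as_table and copy_into_table is an ordinary dictionary, so it can be assembled and checked before any query runs. The sketch below is one possible way to do that. The helper build_iceberg_config is hypothetical (not part of Snowpark), and every value shown is a placeholder. Only the key names come from the list above.

```python
# Key names taken from the iceberg_config list in these release notes.
ICEBERG_CONFIG_KEYS = frozenset(
    {
        "external_volume",
        "catalog",
        "base_location",
        "catalog_sync",
        "storage_serialization_policy",
    }
)


def build_iceberg_config(**options):
    """Hypothetical helper: reject keys that the release notes do not list."""
    unknown = sorted(set(options) - ICEBERG_CONFIG_KEYS)
    if unknown:
        raise ValueError(f"unsupported iceberg_config keys: {unknown}")
    return dict(options)


# Placeholder values; real ones depend on the account setup.
config = build_iceberg_config(
    external_volume="my_external_volume",
    catalog="my_catalog",
    base_location="my/base/location",
)

# With a live session and a DataFrame `df` (not created in this sketch),
# the dictionary would be passed through the new keyword argument:
# df.write.save_as_table("my_iceberg_table", iceberg_config=config)
print(config)
```

Validating the keys locally surfaces a misspelled option before anything is sent to Snowflake.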

Bug Fixes

Fixed a bug in session.read.csv that caused an error when setting PARSE_HEADER = True in an externally defined file format.
Fixed a bug in query generation from set operations that allowed generation of duplicate queries when children have common subqueries.
Fixed a bug in session.get_session_stage that referenced a non-existing stage after switching database or schema.
Fixed a bug where calling DataFrame.to_snowpark_pandas without explicitly initializing the Snowpark pandas plugin caused an error.
Fixed a bug where using the explode function in dynamic table creation caused a SQL compilation error due to improper boolean type casting on the outer parameter.

Snowpark Local Testing Updates

New Features

Added support for type coercion when passing columns as input to UDF calls.
Added support for Index.identical.

Bug Fixes

Fixed a bug where the truncate mode in DataFrameWriter.save_as_table incorrectly handled DataFrames containing only a subset of columns from the existing table.
Fixed a bug where the to_timestamp function did not set the default timezone of the column datatype.

Snowpark pandas API Updates

New Features

Added limited support for the Timedelta type, including the following features. Snowpark pandas will raise NotImplementedError for unsupported Timedelta use cases.
- support for tracking the Timedelta type through copy, cache_result, shift, sort_index, assign, bfill, ffill, fillna, compare, diff, drop, dropna, duplicated, empty, equals, insert, isin, isna, items, iterrows, join, len, mask, melt, merge, nlargest, nsmallest, to_pandas.
- support for converting non-timedelta to timedelta via astype.
- NotImplementedError is raised for the remaining methods that do not support Timedelta.
- support for subtracting two timestamps to get a Timedelta.
- support for indexing with Timedelta data columns.
- support for adding or subtracting timestamps and Timedelta.
- support for binary arithmetic between two Timedelta values.
- support for binary arithmetic and comparisons between Timedelta values and numeric values.
- support for lazy TimedeltaIndex.
- support for pd.to_timedelta.
- support for GroupBy aggregations min, max, mean, idxmax, idxmin, std, sum, median, count, any, all, size, nunique, head, tail, aggregate.
- support for GroupBy filtrations first and last.
- support for TimedeltaIndex attributes: days, seconds, microseconds and nanoseconds.
- support for diff with timestamp columns on axis=0 and axis=1.
- support for TimedeltaIndex methods: ceil, floor and round.
- support for TimedeltaIndex.total_seconds method.
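
A few of the Timedelta operations listed above can be sketched as follows. This example runs against native pandas as a stand-in, on the assumption that Snowpark pandas mirrors the pandas behavior for these calls. With Snowpark pandas the import would instead be modin.pandas plus the snowflake.snowpark.modin.plugin module, and a live session would be required, so that setup is not shown. The sample timestamps are made up.

```python
import pandas as pd

# Made-up sample data: two start/end timestamp pairs.
start = pd.Series(pd.to_datetime(["2024-09-10 08:00:00", "2024-09-10 09:30:00"]))
end = pd.Series(pd.to_datetime(["2024-09-11 08:00:00", "2024-09-10 10:15:30"]))

# Subtracting two timestamp columns yields a Timedelta column.
elapsed = end - start

# Adding a Timedelta (built with pd.to_timedelta) to timestamps.
shifted = start + pd.to_timedelta("1 day")

# Binary arithmetic between Timedelta values and a numeric value.
doubled = elapsed * 2

# Timedelta accessors on a Series.
seconds = elapsed.dt.total_seconds()
days = elapsed.dt.days

print(elapsed, shifted, doubled, seconds, days, sep="\n")
```

Operations outside the supported set are expected to raise NotImplementedError in Snowpark pandas, as noted above, whereas native pandas may accept them.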
Added support for Index arithmetic and comparison operators.
Added support for Series.dt.round.
Added documentation pages for DatetimeIndex.
Added support for Index.name, Index.names, Index.rename, and Index.set_names.
Added support for Index.__repr__.
Added support for DatetimeIndex.month_name and DatetimeIndex.day_name.
Added support for Series.dt.weekday, Series.dt.time, and DatetimeIndex.time.
Added support for Index.min and Index.max.
Added support for pd.merge_asof.
Added support for Series.dt.normalize and DatetimeIndex.normalize.
Added support for Index.is_boolean, Index.is_integer, Index.is_floating, Index.is_numeric, and Index.is_object.
Added support for DatetimeIndex.round, DatetimeIndex.floor and DatetimeIndex.ceil.
Added support for Series.dt.days_in_month and Series.dt.daysinmonth.
Added support for DataFrameGroupBy.value_counts and SeriesGroupBy.value_counts.
Added support for Series.is_monotonic_increasing and Series.is_monotonic_decreasing.
Added support for Index.is_monotonic_increasing and Index.is_monotonic_decreasing.
Added support for pd.crosstab.
Added support for pd.bdate_range and included business frequency support (B, BME, BMS, BQE, BQS, BYE, BYS) for both pd.date_range and pd.bdate_range.
Added support for lazy Index objects as labels in DataFrame.reindex and Series.reindex.
Added support for Series.dt.days, Series.dt.seconds, Series.dt.microseconds, and Series.dt.nanoseconds.
Added support for creating a DatetimeIndex from an Index of numeric or string type.
Added support for string indexing with Timedelta objects.
Added support for Series.dt.total_seconds method.
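
Three of the newly supported top-level functions (pd.merge_asof, pd.crosstab, and pd.bdate_range) can be sketched as below. The example uses native pandas as a stand-in and assumes Snowpark pandas follows the same semantics. All table contents are invented for illustration.

```python
import pandas as pd

# pd.merge_asof: match each trade to the most recent quote at or before it.
trades = pd.DataFrame(
    {
        "time": pd.to_datetime(["2024-09-10 09:00:01", "2024-09-10 09:00:05"]),
        "qty": [10, 20],
    }
)
quotes = pd.DataFrame(
    {
        "time": pd.to_datetime(["2024-09-10 09:00:00", "2024-09-10 09:00:04"]),
        "price": [100.0, 101.5],
    }
)
matched = pd.merge_asof(trades, quotes, on="time")

# pd.crosstab: count occurrences of each (region, status) pair.
orders = pd.DataFrame(
    {
        "region": ["east", "east", "west"],
        "status": ["open", "closed", "open"],
    }
)
counts = pd.crosstab(orders["region"], orders["status"])

# pd.bdate_range: business days only, so the weekend of 14-15 September is skipped.
business_days = pd.bdate_range("2024-09-13", "2024-09-17")

print(matched, counts, business_days, sep="\n")
```

Both inputs to merge_asof must already be sorted on the key column, which the sample data satisfies.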

Improvements

Improved concat and join performance when the operations are performed on Series coming from the same DataFrame by avoiding unnecessary joins.
Refactored quoted_identifier_to_snowflake_type to avoid making metadata queries if the types have been cached locally.
Improved pd.to_datetime to handle all local input cases.
Enabled creating a lazy index from another lazy index without pulling data to the client.
Raised NotImplementedError for Index bitwise operators.
Displayed a clearer error message when Index.names is set to a non-list-like object.
Raised a warning whenever MultiIndex values are pulled in locally.
Improved the warning message for pd.read_snowflake to include the creation reason when temp table creation is triggered.
Improved performance for DataFrame.set_index, and for setting DataFrame.index or Series.index, by avoiding checks that require eager evaluation. As a consequence, when the new index does not match the length of the current Series/DataFrame object, a ValueError is no longer raised. Instead, when the Series/DataFrame object is longer than the provided index, the new index is filled with NaN values for the "extra" elements. Otherwise, the extra values in the provided index are ignored.
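
The length-mismatch rule described above can be modeled in plain Python. This is only a sketch of the documented outcome, not Snowpark pandas code. The function align_new_index is hypothetical and works on lists rather than lazy Index objects.

```python
import math


def align_new_index(new_index, n_rows):
    """Hypothetical model: fit `new_index` to an object with `n_rows` rows."""
    # Labels beyond the object's length are ignored.
    labels = list(new_index)[:n_rows]
    # Rows without a provided label get NaN.
    labels.extend([math.nan] * (n_rows - len(labels)))
    return labels


too_short = align_new_index(["a"], 3)
too_long = align_new_index(["a", "b", "c"], 2)
print(too_short, too_long)
```

Code that relied on the earlier ValueError to detect a mismatched index would need its own length check under this behavior.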

Bug Fixes

Stopped ignoring nanoseconds in pd.Timedelta scalars.
Fixed AssertionError in tree of binary operations.
Fixed a bug in Series.dt.isocalendar when using a named Series.
Fixed inplace argument for Series objects derived from DataFrame columns.
Fixed a bug where Series.reindex and DataFrame.reindex did not update the result index's name correctly.
Fixed a bug where Series.take did not error when axis=1 was specified.
