Release
1.22.1 (2024-09-11)
This is a re-release of 1.22.0. Please refer to the 1.22.0 release notes for detailed release content.
1.22.0 (2024-09-10)
Snowpark Python API Updates
New Features
- Added the following new functions in snowflake.snowpark.functions:
  - array_remove
  - ln
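As a rough illustration of what the two new functions compute, the sketch below models them in plain Python. It is not Snowpark code: array_remove_model and ln_model are hypothetical names, the model assumes that the underlying SQL ARRAY_REMOVE drops every element equal to the given value, and SQL NULL handling is not modeled at all.

```python
import math


def array_remove_model(values, element):
    """Plain-Python model (assumption): drop every item equal to `element`."""
    return [v for v in values if v != element]


def ln_model(x):
    """Plain-Python model: natural (base e) logarithm of a positive number."""
    return math.log(x)


print(array_remove_model([1, 5, 5, 7], 5))  # [1, 7]
print(ln_model(1.0))  # 0.0
```

In Snowpark itself, both functions operate on columns and are evaluated in Snowflake rather than on local Python values.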
Improvements
- Improved documentation for Session.write_pandas by making the use_logical_type option more explicit.
- Added support for specifying the following to DataFrameWriter.save_as_table:
  - enable_schema_evolution
  - data_retention_time
  - max_data_extension_time
  - change_tracking
  - copy_grants
  - iceberg_config: a dictionary that can hold the following Iceberg configuration options:
    - external_volume
    - catalog
    - base_location
    - catalog_sync
    - storage_serialization_policy
- Added support for specifying the following to DataFrameWriter.copy_into_table:
  - iceberg_config: a dictionary that can hold the following Iceberg configuration options:
    - external_volume
    - catalog
    - base_location
    - catalog_sync
    - storage_serialization_policy
- Added support for specifying the following parameters to DataFrame.create_or_replace_dynamic_table:
  - mode
  - refresh_mode
  - initialize
  - clustering_keys
  - is_transient
  - data_retention_time
  - max_data_extension_time
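The new keyword arguments listed above might be used along the following lines. This is a minimal sketch, not a recommended configuration: the helper names are hypothetical, df is assumed to be a Snowpark DataFrame obtained from an active session, and every option value (volume name, catalog name, base location, lag, retention period) is a placeholder chosen for illustration.

```python
def save_regular_table(df, table_name):
    """Overwrite a table while setting some of the newly supported options."""
    return df.write.save_as_table(
        table_name,
        mode="overwrite",
        enable_schema_evolution=True,
        data_retention_time=1,  # placeholder value, in days
        change_tracking=True,
        copy_grants=True,
    )


def save_iceberg_table(df, table_name):
    """Write to an Iceberg table; all config values are hypothetical."""
    return df.write.save_as_table(
        table_name,
        iceberg_config={
            "external_volume": "example_volume",  # hypothetical name
            "catalog": "example_catalog",  # hypothetical name
            "base_location": "/iceberg_root",  # hypothetical path
        },
    )


def create_dynamic_table(df, name, warehouse):
    """Create or replace a dynamic table using the new parameters."""
    return df.create_or_replace_dynamic_table(
        name,
        warehouse=warehouse,
        lag="20 minutes",  # placeholder target lag
        mode="overwrite",
        refresh_mode="INCREMENTAL",
        initialize="ON_CREATE",
        is_transient=False,
        data_retention_time=1,  # placeholder value, in days
    )
```

The helpers only forward keyword arguments, so they can be exercised with a stand-in object before being pointed at a real session.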
Bug Fixes
- Fixed a bug in session.read.csv that caused an error when setting PARSE_HEADER = True in an externally defined file format.
- Fixed a bug in query generation from set operations that allowed generation of duplicate queries when children have common subqueries.
- Fixed a bug in session.get_session_stage that referenced a non-existing stage after switching database or schema.
- Fixed a bug where calling DataFrame.to_snowpark_pandas without explicitly initializing the Snowpark pandas plugin caused an error.
- Fixed a bug where using the explode function in dynamic table creation caused a SQL compilation error due to improper boolean type casting on the outer parameter.
Snowpark Local Testing Updates
New Features
- Added support for type coercion when passing columns as input to UDF calls.
- Added support for Index.identical.
Bug Fixes
- Fixed a bug where the truncate mode in DataFrameWriter.save_as_table incorrectly handled DataFrames containing only a subset of columns from the existing table.
- Fixed a bug where the function to_timestamp did not set the default timezone of the column datatype.
Snowpark pandas API Updates
New Features
- Added limited support for the Timedelta type, including the following features. Snowpark pandas will raise NotImplementedError for unsupported Timedelta use cases.
  - support for tracking the Timedelta type through copy, cache_result, shift, sort_index, assign, bfill, ffill, fillna, compare, diff, drop, dropna, duplicated, empty, equals, insert, isin, isna, items, iterrows, join, len, mask, melt, merge, nlargest, nsmallest, to_pandas.
  - support for converting non-timedelta data to timedelta via astype.
  - NotImplementedError is raised for the remaining methods that do not support Timedelta.
  - support for subtracting two timestamps to get a Timedelta.
  - support for indexing with Timedelta data columns.
  - support for adding or subtracting timestamps and Timedelta.
  - support for binary arithmetic between two Timedelta values.
  - support for binary arithmetic and comparisons between Timedelta values and numeric values.
  - support for lazy TimedeltaIndex.
  - support for pd.to_timedelta.
  - support for GroupBy aggregations min, max, mean, idxmax, idxmin, std, sum, median, count, any, all, size, nunique, head, tail, aggregate.
  - support for GroupBy filtrations first and last.
  - support for TimedeltaIndex attributes: days, seconds, microseconds and nanoseconds.
  - support for diff with timestamp columns on axis=0 and axis=1.
  - support for TimedeltaIndex methods: ceil, floor and round.
  - support for the TimedeltaIndex.total_seconds method.
- Added support for Index arithmetic and comparison operators.
- Added support for Series.dt.round.
- Added documentation pages for DatetimeIndex.
- Added support for Index.name, Index.names, Index.rename, and Index.set_names.
- Added support for Index.__repr__.
- Added support for DatetimeIndex.month_name and DatetimeIndex.day_name.
- Added support for Series.dt.weekday, Series.dt.time, and DatetimeIndex.time.
- Added support for Index.min and Index.max.
- Added support for pd.merge_asof.
- Added support for Series.dt.normalize and DatetimeIndex.normalize.
- Added support for Index.is_boolean, Index.is_integer, Index.is_floating, Index.is_numeric, and Index.is_object.
- Added support for DatetimeIndex.round, DatetimeIndex.floor, and DatetimeIndex.ceil.
- Added support for Series.dt.days_in_month and Series.dt.daysinmonth.
- Added support for DataFrameGroupBy.value_counts and SeriesGroupBy.value_counts.
- Added support for Series.is_monotonic_increasing and Series.is_monotonic_decreasing.
- Added support for Index.is_monotonic_increasing and Index.is_monotonic_decreasing.
- Added support for pd.crosstab.
- Added support for pd.bdate_range and included business frequency support (B, BME, BMS, BQE, BQS, BYE, BYS) for both pd.date_range and pd.bdate_range.
- Added support for lazy Index objects as labels in DataFrame.reindex and Series.reindex.
- Added support for Series.dt.days, Series.dt.seconds, Series.dt.microseconds, and Series.dt.nanoseconds.
- Added support for creating a DatetimeIndex from an Index of numeric or string type.
- Added support for string indexing with Timedelta objects.
- Added support for the Series.dt.total_seconds method.
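Several of the Timedelta features above expose a duration as separate fields (days, seconds, microseconds, nanoseconds) or as a single total_seconds value. The sketch below models how those fields relate to one integer nanosecond count, using the floor-division convention of native pandas Timedelta; that Snowpark pandas follows the same convention is an assumption here. The function names are hypothetical and no Snowflake connection is involved.

```python
NS_PER_US = 1_000
NS_PER_S = 1_000_000_000
NS_PER_DAY = 86_400 * NS_PER_S


def timedelta_fields(total_ns):
    """Split a nanosecond count into (days, seconds, microseconds, nanoseconds).

    Floor division is used throughout, so a negative duration yields a
    negative day count and non-negative remaining fields.
    """
    days, rem_ns = divmod(total_ns, NS_PER_DAY)
    seconds, rem_ns = divmod(rem_ns, NS_PER_S)
    microseconds, nanoseconds = divmod(rem_ns, NS_PER_US)
    return days, seconds, microseconds, nanoseconds


def total_seconds(total_ns):
    """Express the whole duration as a single float number of seconds."""
    return total_ns / NS_PER_S


# 1 day, 2 seconds, 3 microseconds, 4 nanoseconds
print(timedelta_fields(86_402_000_003_004))  # (1, 2, 3, 4)
print(timedelta_fields(-1))  # (-1, 86399, 999999, 999)
print(total_seconds(1_500_000_000))  # 1.5
```

The second call shows why the individual fields of a negative duration can look surprising: only the day count carries the sign.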
Improvements
- Improved concat and join performance when operations are performed on Series coming from the same DataFrame by avoiding unnecessary joins.
- Refactored quoted_identifier_to_snowflake_type to avoid making metadata queries if the types have been cached locally.
- Improved pd.to_datetime to handle all local input cases.
- Created a lazy index from another lazy index without pulling data to the client.
- Raised NotImplementedError for Index bitwise operators.
- Displayed a clearer error message when Index.names is set to a non-list-like object.
- Raised a warning whenever MultiIndex values are pulled in locally.
- Improved the warning message for pd.read_snowflake to include the creation reason when temp table creation is triggered.
- Improved performance for DataFrame.set_index, or setting DataFrame.index or Series.index, by avoiding checks that require eager evaluation. As a consequence, when the new index does not match the current Series/DataFrame object length, a ValueError is no longer raised. Instead, when the Series/DataFrame object is longer than the provided index, the Series/DataFrame's new index is filled with NaN values for the "extra" elements. Otherwise, the extra values in the provided index are ignored.
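The length-mismatch behavior described for DataFrame.set_index and index assignment can be modeled in plain Python as follows. This is only a sketch of the rule as stated in these notes: align_new_index is a hypothetical helper, not part of Snowpark pandas, and it assumes that the ignored "extra" values are the trailing ones.

```python
import math


def align_new_index(new_index, n_rows):
    """Model how a new index of any length is fitted to an object with n_rows rows."""
    labels = list(new_index)
    if len(labels) >= n_rows:
        # Provided index is long enough: surplus trailing labels are ignored.
        return labels[:n_rows]
    # Provided index is too short: pad the remaining rows with NaN labels.
    return labels + [math.nan] * (n_rows - len(labels))


print(align_new_index(["a", "b"], 4))  # ['a', 'b', nan, nan]
print(align_new_index(["a", "b", "c"], 2))  # ['a', 'b']
```

Code that previously relied on the ValueError to detect a mismatched index would need an explicit length check before assigning.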
Bug Fixes
- Stopped ignoring nanoseconds in pd.Timedelta scalars.
- Fixed an AssertionError in a tree of binary operations.
- Fixed a bug in Series.dt.isocalendar when using a named Series.
- Fixed the inplace argument for Series objects derived from DataFrame columns.
- Fixed a bug where Series.reindex and DataFrame.reindex did not update the result index's name correctly.
- Fixed a bug where Series.take did not error when axis=1 was specified.