kedro-datasets-6.0.0

@merelcht merelcht released this 18 Dec 16:47
87d5e62

Major features and improvements

  • Supported passing database to ibis.TableDataset for load and save operations.
  • Added functionality to save pandas DataFrames directly to Snowflake, facilitating seamless .csv ingestion.
  • Added Python 3.9, 3.10 and 3.11 support for snowflake.SnowflakeTableDataset.
  • Enabled connection sharing between ibis.FileDataset and ibis.TableDataset instances, thereby allowing nodes to save data loaded by one to the other (as long as they share the same connection configuration).
  • Added the following new experimental datasets:
    | Type | Description | Location |
    | --- | --- | --- |
    | databricks.ExternalTableDataset | A dataset for accessing external tables in Databricks. | kedro_datasets_experimental.databricks |
    | safetensors.SafetensorsDataset | A dataset for securely saving and loading files in the SafeTensors format. | kedro_datasets_experimental.safetensors |
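The connection sharing described above can be pictured with a small standalone sketch. It does not use kedro-datasets or Ibis: `get_connection`, `FileLikeDataset` and `TableLikeDataset` are hypothetical stand-ins, and the "backend" is a plain dictionary. The only idea it illustrates is that two dataset objects with equal connection configuration resolve to the same backend object, so a table registered through one is visible to the other.

```python
import json

# Hypothetical registry: one backend object per distinct connection config.
_CONNECTIONS = {}


def get_connection(config):
    """Return the shared backend for this config, creating it on first use."""
    # Sorting keys makes the lookup independent of dictionary ordering.
    key = json.dumps(config, sort_keys=True)
    if key not in _CONNECTIONS:
        _CONNECTIONS[key] = {}  # a dict of tables stands in for a real backend
    return _CONNECTIONS[key]


class FileLikeDataset:
    """Stand-in for a file-backed dataset that registers rows as a table."""

    def __init__(self, rows, table_name, connection):
        self._rows = rows
        self._table_name = table_name
        self._connection = connection

    def load(self):
        backend = get_connection(self._connection)
        backend[self._table_name] = self._rows
        return self._rows


class TableLikeDataset:
    """Stand-in for a table-backed dataset that reads from the backend."""

    def __init__(self, table_name, connection):
        self._table_name = table_name
        self._connection = connection

    def load(self):
        # Returns None when the table is not known to this backend.
        return get_connection(self._connection).get(self._table_name)


config = {"backend": "duckdb", "database": "demo.db"}
FileLikeDataset([{"id": 1}], "events", config).load()
print(TableLikeDataset("events", dict(config)).load())
```

A dataset built with a different configuration (for example another `database` value) would get its own backend and would not see the `events` table, which mirrors the "same connection configuration" condition in the note above.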

Bug fixes and other changes

  • Delayed backend connection for pandas.GBQTableDataset. In practice, this means that a dataset's connection details aren't used (or validated) until the dataset is accessed. On the plus side, the cost of connecting is no longer incurred up front, regardless of when or whether the dataset is used. Furthermore, this makes the dataset object serializable (e.g. for use with ParallelRunner), because the unserializable client is no longer part of it.
  • Removed the unused BigQuery client created in pandas.GBQQueryDataset. This makes the dataset object serializable (e.g. for use with ParallelRunner) by removing the unserializable object.
  • Adopted Snowflake's local testing framework for the dataset tests.
  • Improved the dependency management for Spark-based datasets by refactoring the Spark and Databricks utility functions used across the datasets.
  • Added deprecation warning for tracking.MetricsDataset and tracking.JSONDataset.
  • Moved kedro-catalog JSON schemas from Kedro core to kedro-datasets.
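The serialization point made for the BigQuery datasets above follows a general pattern, sketched here with the standard library only. `FakeClient` and `LazyDataset` are hypothetical names, not the kedro-datasets implementation: the client holds a lock (which cannot be pickled), while the dataset keeps only the connection details and builds a client when `load` is called.

```python
import pickle
import threading


class FakeClient:
    """Stand-in for a backend client; the lock makes it unpicklable."""

    def __init__(self, project):
        self.project = project
        self._lock = threading.Lock()


class LazyDataset:
    """Keeps connection details only; no client is created in __init__."""

    def __init__(self, project):
        self._project = project  # not used or validated until load()

    def _connect(self):
        # The connection cost is paid here, at access time.
        return FakeClient(self._project)

    def load(self):
        client = self._connect()
        return f"rows from {client.project}"


def is_picklable(obj):
    """Report whether pickle can serialize the object."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


dataset = LazyDataset("demo-project")
print(is_picklable(dataset))                     # dataset holds only a string
print(is_picklable(FakeClient("demo-project")))  # client holds a lock
print(dataset.load())
```

The trade-off is the one the notes mention: a misconfigured connection surfaces on first access rather than when the dataset object is constructed.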

Breaking changes

  • Demoted video.VideoDataset from core to experimental dataset.
  • Removed file handling capabilities from ibis.TableDataset. Use ibis.FileDataset to load and save files with an Ibis backend instead.

Community contributions

Many thanks to the following Kedroids for contributing PRs to this release: