
Releases: mlcommons/croissant

v1.0.12

04 Dec 15:51
3c3a7fa

What's Changed

  • Notebooks: Load right split names for fashionmnist by @ccl-core in #773
  • Don't handle untested and bugged case for excludes. by @marcenacp in #771
  • Handle non-capturing groups in regex transforms (partial-train/*.parquet). by @marcenacp in #774
  • Drop all useless operations when we filter on a field - so we know its value in advance. by @marcenacp in #775
  • Properly handle python variable. by @marcenacp in #777
  • Fix errors with nested subfields by @ccl-core in #776
  • Early return for num_shards==0 in the Beam pipeline. by @marcenacp in #778
  • Clean code by checking attribute. by @marcenacp in #779
  • Simplify ReadFromCroissant by removing the pipeline argument and making it a PCollection. by @marcenacp in #780
  • Create new version mlcroissant==1.0.12 with the new ReadFromCroissant. by @marcenacp in #781
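
To illustrate the non-capturing-group item above (#774): the sketch below shows how a non-capturing group `(?:...)` can match an optional prefix such as `partial-` without adding to the captured values, so the only captured value is the split name. This uses only Python's standard `re` module; the pattern, the `extract_split` helper, and the file paths are hypothetical and are not taken from mlcroissant's implementation.

```python
import re

# Hypothetical pattern: the (?:partial-)? group is non-capturing, so the
# only capturing group is the split name (train or test).
SPLIT_PATTERN = re.compile(r"^(?:partial-)?(train|test)/.+\.parquet$")


def extract_split(path):
    """Return the split name captured from a file path, or None if no match."""
    match = SPLIT_PATTERN.match(path)
    if match is None:
        return None
    # group(1) is the first (and only) capturing group.
    return match.group(1)


print(extract_split("partial-train/0000.parquet"))
print(extract_split("test/0001.parquet"))
```

Because the prefix group is non-capturing, `SPLIT_PATTERN.groups` stays at 1 and code that reads the first captured group keeps working when the prefix is present.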

Full Changelog: v1.0.11...v1.0.12

v1.0.11

26 Nov 14:37
6e233aa

What's Changed

New Contributors

Full Changelog: v1.0.10...v1.0.11

v1.0.10

17 Oct 13:33
60233a4

What's Changed

Full Changelog: v1.0.9...v1.0.10

v1.0.9

27 Sep 08:45
0cd52d5

What's Changed

Full Changelog: v1.0.8...v1.0.9

v1.0.8

06 Sep 10:11
94e3f84

What's Changed

Full Changelog: v1.0.7...v1.0.8

v1.0.7

24 Jul 09:00
a6b3fb5

What's Changed

  • Add URLs to pyproject.toml by @PGijsbers in #705
  • Implement filtering in the case of filename regular expression and add a test for this feature. by @marcenacp in #716
  • Fix broken Unit tests. by @ccl-core in #717
  • Add more info links on how to do releases. by @ccl-core in #718
  • Apply filters to a Hugging Face dataset to avoid repeating all variants. by @marcenacp in #719
  • Move filters from Dataset init to self.records by @ccl-core in #720
  • Release 1.0.7 by @ccl-core in #721

Full Changelog: v1.0.6...v1.0.7

v1.0.6

18 Jul 19:08
a2224e2

What's Changed

Full Changelog: v1.0.5...v1.0.6

v1.0.5

08 Apr 08:36
ed139d1

Following the Croissant v1.0 standard, we now use IDs (instead of names) to identify RecordSets.
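
As a rough illustration of what identifying RecordSets by ID means, the sketch below looks up a RecordSet in a plain list of JSON-LD-style dictionaries by its `@id` rather than by its `name`. The sample data and the `find_record_set` helper are hypothetical and written with the standard library only; they do not reproduce mlcroissant's API.

```python
# Hypothetical, simplified RecordSet entries as they might appear in a
# Croissant JSON-LD file. Only "@id" is used for identification here.
RECORD_SETS = [
    {"@type": "cr:RecordSet", "@id": "default", "name": "Default split"},
    {"@type": "cr:RecordSet", "@id": "labels", "name": "Label names"},
]


def find_record_set(record_sets, record_set_id):
    """Return the RecordSet whose "@id" equals record_set_id."""
    for record_set in record_sets:
        if record_set.get("@id") == record_set_id:
            return record_set
    raise KeyError(f"No RecordSet with @id={record_set_id!r}")


print(find_record_set(RECORD_SETS, "labels")["name"])
```

Passing a display name such as `"Label names"` raises `KeyError` in this sketch, since names are no longer treated as identifiers.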

v1.0.4

02 Apr 07:24
f9be163
  • schema.org/description is no longer enforced as optional, which clarifies the error messages.
  • Add a warning in case the user deviates from the standard JSON-LD @context.
  • Standardize JSON-LD @context across example datasets.

v1.0.3

13 Mar 13:43
1afdaf9
  • mlcroissant is more compliant with Croissant 1.0.
  • New user-facing Python API using dataclass_transform.