parse table metadata.configuration as TableProperties
#453
Merged
zachschuermann
merged 27 commits into
delta-io:main
from
zachschuermann:table-properties
Nov 25, 2024
Commits (all by zachschuermann):
ba8309e checkpoint with serde but think i need to change that
9d4b599 rough draft serde for table props
1b7b193 make everything optional
02d50ee errors, comments, cleanup
e4676d6 fix
9f8afa4 use new col name list parsing
ed2c10a Merge remote-tracking branch 'upstream/main' into table-properties
42e6028 docs
f1b9a16 Merge remote-tracking branch 'upstream/main' into table-properties
82370b4 remove derive
00b9d8e make deserializer work on hashmap ref
f748f87 fix column mapping mode check
af08092 testing, errors, docs, cleanup
4587794 cleanup
1e7d286 fix skipping dat test
bd9ac7a address feedback, cleanup
fa48054 Merge branch 'main' into table-properties
ff78623 remove unused const
b667a15 no more serde
b3cdc61 cleanup
a891b52 Merge remote-tracking branch 'upstream/main' into table-properties
d8a2933 add back col mapping mode fn
d1ce73d address ryan review
d8af98c Merge branch 'main' into table-properties
6d1b466 use NonZero<u64>
f18b885 Merge remote-tracking branch 'refs/remotes/origin/table-properties' i…
437b8db clippy
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,58 +1,79 @@ | ||
//! Code to handle column mapping, including modes and schema transforms | ||
use std::str::FromStr; | ||
use super::ReaderFeatures; | ||
use crate::actions::Protocol; | ||
use crate::table_properties::TableProperties; | ||
|
||
use serde::{Deserialize, Serialize}; | ||
|
||
use crate::{DeltaResult, Error}; | ||
use strum::EnumString; | ||
|
||
/// Modes of column mapping a table can be in | ||
#[derive(Serialize, Deserialize, Debug, Copy, Clone, PartialEq)] | ||
#[derive(Debug, Default, EnumString, Serialize, Deserialize, Copy, Clone, PartialEq, Eq)] | ||
#[strum(serialize_all = "camelCase")] | ||
#[serde(rename_all = "camelCase")] | ||
Review comment: aside: Do these classes still need serde support?
Reply: unfortunately yea, since it is still a field in |
||
pub enum ColumnMappingMode { | ||
/// No column mapping is applied | ||
None, | ||
/// Columns are mapped by their field_id in parquet | ||
Id, | ||
/// Columns are mapped to a physical name | ||
#[default] | ||
Name, | ||
} | ||
|
||
// key to look in metadata.configuration for to get column mapping mode | ||
pub(crate) const COLUMN_MAPPING_MODE_KEY: &str = "delta.columnMapping.mode"; | ||
|
||
impl TryFrom<&str> for ColumnMappingMode { | ||
type Error = Error; | ||
|
||
fn try_from(s: &str) -> DeltaResult<Self> { | ||
match s.to_ascii_lowercase().as_str() { | ||
"none" => Ok(Self::None), | ||
"id" => Ok(Self::Id), | ||
"name" => Ok(Self::Name), | ||
_ => Err(Error::invalid_column_mapping_mode(s)), | ||
/// Determine the column mapping mode for a table based on the [`Protocol`] and [`TableProperties`] | ||
pub(crate) fn column_mapping_mode( | ||
protocol: &Protocol, | ||
table_properties: &TableProperties, | ||
) -> ColumnMappingMode { | ||
match table_properties.column_mapping_mode { | ||
Some(mode) if protocol.min_reader_version() == 2 => mode, | ||
Some(mode) | ||
if protocol.min_reader_version() == 3 | ||
&& protocol.has_reader_feature(&ReaderFeatures::ColumnMapping) => | ||
{ | ||
mode | ||
} | ||
_ => ColumnMappingMode::None, | ||
} | ||
} | ||
|
||
impl FromStr for ColumnMappingMode { | ||
type Err = Error; | ||
#[cfg(test)] | ||
mod tests { | ||
use super::*; | ||
use std::collections::HashMap; | ||
|
||
fn from_str(s: &str) -> Result<Self, Self::Err> { | ||
s.try_into() | ||
} | ||
} | ||
#[test] | ||
fn test_column_mapping_mode() { | ||
let table_properties: HashMap<_, _> = | ||
[("delta.columnMapping.mode".to_string(), "id".to_string())] | ||
.into_iter() | ||
.collect(); | ||
let table_properties = TableProperties::from(table_properties.iter()); | ||
|
||
impl Default for ColumnMappingMode { | ||
fn default() -> Self { | ||
Self::None | ||
} | ||
} | ||
let protocol = Protocol::try_new(2, 5, None::<Vec<String>>, None::<Vec<String>>).unwrap(); | ||
assert_eq!( | ||
column_mapping_mode(&protocol, &table_properties), | ||
ColumnMappingMode::Id | ||
); | ||
|
||
impl AsRef<str> for ColumnMappingMode { | ||
fn as_ref(&self) -> &str { | ||
match self { | ||
Self::None => "none", | ||
Self::Id => "id", | ||
Self::Name => "name", | ||
} | ||
let empty_features = Some::<[String; 0]>([]); | ||
let protocol = | ||
Protocol::try_new(3, 7, empty_features.clone(), empty_features.clone()).unwrap(); | ||
assert_eq!( | ||
column_mapping_mode(&protocol, &table_properties), | ||
ColumnMappingMode::None | ||
); | ||
|
||
let protocol = Protocol::try_new( | ||
3, | ||
7, | ||
Some([ReaderFeatures::DeletionVectors]), | ||
empty_features, | ||
) | ||
.unwrap(); | ||
assert_eq!( | ||
column_mapping_mode(&protocol, &table_properties), | ||
ColumnMappingMode::None | ||
); | ||
} | ||
} |
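The resolution rule in `column_mapping_mode` can be tried in isolation with a minimal, std-only sketch. Everything here is a simplified stand-in rather than the kernel's API: `SketchProtocol`, `resolve_mode`, and the hand-written `FromStr` impl are hypothetical names (the PR itself derives parsing via `strum` and reads the mode from `TableProperties`). Treating an unparsable property value as absent is also an assumption of this sketch, not behavior confirmed by the diff.

```rust
use std::collections::HashMap;
use std::str::FromStr;

/// Stand-in for the kernel's `ColumnMappingMode` (no serde or strum derives here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColumnMappingMode {
    None,
    Id,
    Name,
}

/// Hand-written parser for the sketch; the PR derives this with strum instead.
impl FromStr for ColumnMappingMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "none" => Ok(Self::None),
            "id" => Ok(Self::Id),
            "name" => Ok(Self::Name),
            other => Err(format!("invalid column mapping mode: {other}")),
        }
    }
}

/// Hypothetical, simplified protocol: only the fields the rule inspects.
pub struct SketchProtocol {
    pub min_reader_version: i32,
    pub reader_features: Vec<String>,
}

/// Mirrors the match in the diff: the configured mode is honored only when the
/// reader version is 2, or is 3 with the column mapping reader feature listed.
pub fn resolve_mode(
    protocol: &SketchProtocol,
    configuration: &HashMap<String, String>,
) -> ColumnMappingMode {
    // Assumption: a missing or unparsable value is treated as "not set".
    let requested = configuration
        .get("delta.columnMapping.mode")
        .and_then(|value| value.parse::<ColumnMappingMode>().ok());

    match requested {
        Some(mode) if protocol.min_reader_version == 2 => mode,
        Some(mode)
            if protocol.min_reader_version == 3
                && protocol
                    .reader_features
                    .iter()
                    .any(|feature| feature == "columnMapping") =>
        {
            mode
        }
        _ => ColumnMappingMode::None,
    }
}
```

As in the PR's test, a table that sets the property to `id` resolves to `Id` under reader version 2, but to `None` under reader version 3 when the feature list omits column mapping.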
Review comment: only meaningful change is adding `pub mod table_properties`, other changes are shifting to colocate module declarations