parse table metadata.configuration as TableProperties
#453
Changes from 10 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -63,8 +63,16 @@ pub mod actions; | |
pub mod engine_data; | ||
pub mod error; | ||
pub mod expressions; | ||
pub(crate) mod predicates; | ||
pub mod table_features; | ||
pub mod scan; | ||
pub mod schema; | ||
pub mod snapshot; | ||
pub mod table; | ||
pub mod table_properties; | ||
only meaningful change is adding |
||
pub mod transaction; | ||
|
||
pub(crate) mod predicates; | ||
pub(crate) mod utils; | ||
|
||
#[cfg(feature = "developer-visibility")] | ||
pub mod path; | ||
|
@@ -76,13 +84,6 @@ pub mod log_segment; | |
#[cfg(not(feature = "developer-visibility"))] | ||
pub(crate) mod log_segment; | ||
|
||
pub mod scan; | ||
pub mod schema; | ||
pub mod snapshot; | ||
pub mod table; | ||
pub mod transaction; | ||
pub(crate) mod utils; | ||
|
||
pub use delta_kernel_derive; | ||
pub use engine_data::{DataVisitor, EngineData}; | ||
pub use error::{DeltaResult, Error}; | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -95,7 +95,7 @@ impl ScanBuilder { | |||||
let (all_fields, read_fields, have_partition_cols) = get_state_info( | ||||||
logical_schema.as_ref(), | ||||||
&self.snapshot.metadata().partition_columns, | ||||||
self.snapshot.column_mapping_mode, | ||||||
self.snapshot.table_properties().get_column_mapping_mode(), | ||||||
We don't normally use
Suggested change
(rust can handle structs with both

I'm curious to hear everyone's thoughts on this. I think we need two things here (and the API above was attempting to answer the second part)
This probably isn't the PR for tackling (2), but I would love to hear some thoughts so we can get started on a follow-up. For now I'll just do

I think it makes a lot of sense. These are checks that are going to be made for every table feature, so it helps to have all of them in one place.

Yeah. Likely we don't need a method for each one and should rather check a struct of some sort. The API design is a bit subtle though, as some features can be set to multiple modes (e.g. column mapping) and others can just be on/off. |
||||||
)?; | ||||||
let physical_schema = Arc::new(StructType::new(read_fields)); | ||||||
|
||||||
|
@@ -251,7 +251,7 @@ impl Scan { | |||||
partition_columns: self.snapshot.metadata().partition_columns.clone(), | ||||||
logical_schema: self.logical_schema.clone(), | ||||||
read_schema: self.physical_schema.clone(), | ||||||
column_mapping_mode: self.snapshot.column_mapping_mode, | ||||||
column_mapping_mode: self.snapshot.table_properties().get_column_mapping_mode(), | ||||||
} | ||||||
} | ||||||
|
||||||
|
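The review thread on this hunk touches on accessor naming and on how to model features that have several modes versus features that are simply on or off. The following is a minimal sketch of one possible shape, not the PR's actual API: the private fields, the `deletion_vectors` flag, and the fallback to `ColumnMappingMode::None` are all assumptions made for illustration. It follows the Rust API Guidelines convention of naming getters without a `get_` prefix, which works because struct fields and methods live in separate namespaces.

```rust
/// Multi-mode feature: column mapping can be off, by id, or by name.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum ColumnMappingMode {
    None,
    Id,
    Name,
}

/// Hypothetical typed table properties (names are illustrative only).
#[derive(Debug, Default)]
pub struct TableProperties {
    // `None` here means the property was absent from the table configuration.
    column_mapping_mode: Option<ColumnMappingMode>,
    // Example of a plain on/off feature (assumed for this sketch).
    deletion_vectors: bool,
}

impl TableProperties {
    /// Getter with the same name as the field; no `get_` prefix needed.
    /// Assumption: an absent property is treated as `ColumnMappingMode::None`.
    pub fn column_mapping_mode(&self) -> ColumnMappingMode {
        self.column_mapping_mode.unwrap_or(ColumnMappingMode::None)
    }

    /// On/off features can be exposed as a simple `bool`.
    pub fn deletion_vectors(&self) -> bool {
        self.deletion_vectors
    }
}
```

With a shape like this, a call site would read `props.column_mapping_mode()` rather than going through a `get_`-prefixed method, and the multi-mode versus on/off distinction is visible in the return types.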
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,7 +6,7 @@ use serde::{Deserialize, Serialize}; | |
use crate::{DeltaResult, Error}; | ||
|
||
/// Modes of column mapping a table can be in | ||
#[derive(Serialize, Deserialize, Debug, Copy, Clone, PartialEq)] | ||
#[derive(Serialize, Deserialize, Debug, Copy, Clone, PartialEq, Eq)] | ||
#[serde(rename_all = "camelCase")] | ||
aside: Do these classes still need serde support?

unfortunately yea, since it is still a field in |
||
pub enum ColumnMappingMode { | ||
/// No column mapping is applied | ||
|
@@ -17,9 +17,6 @@ pub enum ColumnMappingMode { | |
Name, | ||
} | ||
|
||
// key to look in metadata.configuration for to get column mapping mode | ||
pub(crate) const COLUMN_MAPPING_MODE_KEY: &str = "delta.columnMapping.mode"; | ||
|
||
impl TryFrom<&str> for ColumnMappingMode { | ||
type Error = Error; | ||
|
||
|
I wonder if we should just eagerly parse TableProperties instead of doing the HashMap<String,String> to TableProperties conversion separately. Do we split it up because of the Schema derive?

Yup, exactly. For now this seemed like the best way to separate the schema (the Metadata struct) from the actual parsing of TableProperties. Perhaps in the future we can look into unifying these? We could just omit the derive and impl Schema ourselves (or add some new fancy mechanism that lets us annotate fields with [derive(Schema)]
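To make the separation discussed above concrete, here is a minimal sketch of parsing a raw `HashMap<String, String>` configuration into a typed struct as a second step. Only the `delta.columnMapping.mode` key and the `ColumnMappingMode` enum come from the diff; the `TableProperties` layout, the `parse` function, the `unknown` map, the case-sensitive matching, and the `String` error type are assumptions for illustration and may differ from the PR's implementation.

```rust
use std::collections::HashMap;

/// Key in `metadata.configuration` that holds the column mapping mode.
const COLUMN_MAPPING_MODE_KEY: &str = "delta.columnMapping.mode";

#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum ColumnMappingMode {
    None,
    Id,
    Name,
}

impl TryFrom<&str> for ColumnMappingMode {
    // The real crate uses its own `Error` type; a `String` keeps the sketch small.
    type Error = String;

    fn try_from(s: &str) -> Result<Self, Self::Error> {
        match s {
            "none" => Ok(Self::None),
            "id" => Ok(Self::Id),
            "name" => Ok(Self::Name),
            other => Err(format!("invalid column mapping mode: {other}")),
        }
    }
}

/// Hypothetical typed view over the raw configuration map.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct TableProperties {
    pub column_mapping_mode: Option<ColumnMappingMode>,
    /// Keys this sketch does not recognize, kept verbatim.
    pub unknown: HashMap<String, String>,
}

impl TableProperties {
    /// Parse the raw string map; the map itself stays as the schema-facing field.
    pub fn parse(config: &HashMap<String, String>) -> Result<Self, String> {
        let mut props = TableProperties::default();
        for (key, value) in config {
            match key.as_str() {
                COLUMN_MAPPING_MODE_KEY => {
                    let mode = ColumnMappingMode::try_from(value.as_str())?;
                    props.column_mapping_mode = Some(mode);
                }
                _ => {
                    props.unknown.insert(key.clone(), value.clone());
                }
            }
        }
        Ok(props)
    }
}
```

Keeping the raw map on the metadata struct and parsing it on demand leaves the derived schema untouched, at the cost of re-validating the strings whenever the typed view is built.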