Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
allow typeLength to come from opts.column when decoding FIXED_LEN_BYT…
…E_ARRAY (#108) Problem ======= typeLength is present in column options but decoding is throwing an error. `thrown: "missing option: typeLength (required for FIXED_LEN_BYTE_ARRAY)"` options object for reference: ``` { type: 'FIXED_LEN_BYTE_ARRAY', rLevelMax: 0, dLevelMax: 1, compression: 'SNAPPY', column: { name: 'BLOCK_NUMBER', primitiveType: 'FIXED_LEN_BYTE_ARRAY', originalType: 'DECIMAL', path: [ 'BLOCK_NUMBER' ], repetitionType: 'OPTIONAL', encoding: 'PLAIN', statistics: undefined, compression: 'UNCOMPRESSED', precision: 38, scale: 0, typeLength: 16, rLevelMax: 0, dLevelMax: 1 }, num_values: { buffer: <Buffer 00 00 00 00 00 00 27 10>, offset: 0 } } ``` using `parquet-tools schema` here is the schema for this column: ``` optional fixed_len_byte_array(16) BLOCK_NUMBER (DECIMAL(38,0)) ``` The parquet file is a direct export from snowflake and the data type of the column is `NUMBER(38,0)`. Solution ======== I traced through the code to find where the decode was erroring and added the ability to take the `typeLength` from `column` in the column options when it is not present at the top level. Change summary: --------------- see above Steps to Verify: ---------------- decode a parquet file with this type of field. --------- Co-authored-by: Wil Wade <[email protected]> Co-authored-by: Wil Wade <[email protected]>
- Loading branch information