Update index.ts to support RLE_DICTIONARY (#112)
Problem
=======
Reading a Parquet file that was generated with Parquet V2 and uses the
RLE_DICTIONARY encoding failed with the error:
invalid encoding: RLE_DICTIONARY

Reported issue: #96

Solution
========
Added the following export to lib/codec/index.ts:
export * as RLE_DICTIONARY from './plain_dictionary';
----------------
I added this line to the copy of the library in an existing project's
node_modules. Without the line, reading the file fails with the error
above; with the line added, the read succeeds.
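
To illustrate why a one-line export can resolve an "invalid encoding" error, here is a minimal sketch of a name-keyed codec lookup. It assumes the reader selects a codec by the encoding name found in the file; the `Codec` interface, `lookupCodec`, and both registry objects are hypothetical names for illustration and are not taken from the library's source.

```typescript
// Hypothetical codec shape; the real codec modules may expose a different API.
interface Codec {
  decodeValues(raw: number[]): number[];
}

// Stand-in for the './plain_dictionary' module (illustrative only).
const plainDictionary: Codec = {
  decodeValues: (raw) => raw.slice(),
};

// Registry keyed by encoding name, similar in shape to what
// `export * as NAME from '...'` produces on the module namespace.
const codecsBefore: Record<string, Codec> = {
  PLAIN_DICTIONARY: plainDictionary,
};

// After the change: RLE_DICTIONARY is a second name for the same codec.
const codecsAfter: Record<string, Codec> = {
  ...codecsBefore,
  RLE_DICTIONARY: plainDictionary,
};

// Assumed lookup: fail when the file names an encoding with no registered codec.
function lookupCodec(codecs: Record<string, Codec>, encoding: string): Codec {
  if (!Object.prototype.hasOwnProperty.call(codecs, encoding)) {
    throw new Error(`invalid encoding: ${encoding}`);
  }
  return codecs[encoding];
}
```

Under these assumptions, `lookupCodec(codecsBefore, 'RLE_DICTIONARY')` throws, while `lookupCodec(codecsAfter, 'RLE_DICTIONARY')` returns the same object that serves `PLAIN_DICTIONARY`.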

---------

Co-authored-by: Wil Wade <[email protected]>
saritvakrat and wilwade authored Jan 18, 2024
1 parent 0a42955 commit 6fdb9da
Showing 5 changed files with 24 additions and 5 deletions.
2 changes: 1 addition & 1 deletion lib/codec/index.ts
@@ -1,5 +1,5 @@
export * as PLAIN from './plain'
export * as RLE from './rle'
export * as PLAIN_DICTIONARY from './plain_dictionary'

export * as RLE_DICTIONARY from './plain_dictionary'

27 changes: 23 additions & 4 deletions test/test-files.js
@@ -146,7 +146,7 @@ describe('test-files', function() {
const scale = schema.fields["value"].scale;
assert.equal(scale, 2);
const divider = 10 ** scale;

for (let i = 0; i < data.length; i++) {
const valueToMatch = i + 1;
// Decimal values whose primitive types are fixed length byte array will
@@ -160,11 +160,11 @@ describe('test-files', function() {
assert.equal(numericalValue, valueToMatch);
}
});

it('byte_array_decimal.parquet loads', async function () {
const schema = await readSchema('byte_array_decimal.parquet');
const data = await readData('byte_array_decimal.parquet');

const scale = schema.fields["value"].scale;
assert.equal(scale, 2);
const divider = 10 ** scale;
@@ -173,7 +173,7 @@ describe('test-files', function() {
const valueToMatch = i + 1;
// Decimal values whose primitive types are byte array will
// be returned as raw buffer values.
// For the test data, the actual decimal values and the corresponding buffer lengths
// are small enough so we can treat the buffer as a positive integer and compare the values.
// In reality, the user will need to use a more novel approach to parse the
// buffer to an object that can handle large fractional numbers.
@@ -188,4 +188,23 @@ describe('test-files', function() {
assert.equal(decimalValue, valueToMatch);
}
});

  describe("RLE", function () {
    // Tracked in https://github.com/LibertyDSNP/parquetjs/issues/113
    it.skip('rle_boolean_encoding.parquet loads', async function() {
      const data = await readData('rle/rle_boolean_encoding.parquet');
      assert.deepEqual(data[0],{ datatype_boolean: true });
      assert.deepEqual(data[1],{ datatype_boolean: false });
    });

    it('rle-dict-snappy-checksum.parquet loads', async function() {
      const data = await readData('rle/rle-dict-snappy-checksum.parquet');
      assert.deepEqual(data[0],{ binary_field: "c95e263a-f5d4-401f-8107-5ca7146a1f98", long_field: "0" });
    });

    it('rle-dict-uncompressed-corrupt-checksum.parquet loads', async function() {
      const data = await readData('rle/rle-dict-uncompressed-corrupt-checksum.parquet');
      assert.deepEqual(data[0],{ binary_field: "6325c32b-f417-41aa-9e02-9b8601542aff", long_field: "0" });
    });
  })
});
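
The comments in the byte_array_decimal test above note that byte-array decimals come back as raw buffers and that callers need their own way to handle large fractional values. One possible approach is sketched below. It is not part of this commit or of the library API; `unscaledFromBytes` and `formatDecimal` are hypothetical helper names. It relies on the Parquet convention that DECIMAL byte arrays hold the unscaled value as a big-endian two's-complement integer, and it assumes a runtime with `BigInt` support.

```typescript
// Interpret the bytes as a big-endian two's-complement integer.
function unscaledFromBytes(bytes: Uint8Array): bigint {
  let value = BigInt(0);
  for (let i = 0; i < bytes.length; i++) {
    value = (value << BigInt(8)) | BigInt(bytes[i]);
  }
  // A set high bit in the first byte means the number is negative.
  if (bytes.length > 0 && (bytes[0] & 0x80) !== 0) {
    value -= BigInt(1) << BigInt(8 * bytes.length);
  }
  return value;
}

// Render the unscaled integer as a decimal string with `scale` fractional
// digits, avoiding floating-point arithmetic entirely.
function formatDecimal(unscaled: bigint, scale: number): string {
  const negative = unscaled < BigInt(0);
  const magnitude = negative ? -unscaled : unscaled;
  const digits = magnitude.toString().padStart(scale + 1, '0');
  const whole = digits.slice(0, digits.length - scale);
  const frac = digits.slice(digits.length - scale);
  return (negative ? '-' : '') + (scale > 0 ? `${whole}.${frac}` : whole);
}
```

With the test's scale of 2, `formatDecimal(unscaledFromBytes(new Uint8Array([0x00, 0x64])), 2)` yields the string `1.00`. Returning a string keeps full precision; a caller who needs arithmetic could pass that string to a decimal library of their choice.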
Binary file added test/test-files/rle/rle-dict-snappy-checksum.parquet
Binary file not shown.
Binary file not shown.
Binary file added test/test-files/rle/rle_boolean_encoding.parquet
Binary file not shown.
