feat(crypto): Add encoding parameter for hashes (#2657)
ibgreen authored Sep 24, 2023
1 parent 40135f3 commit ef6613c
Showing 26 changed files with 345 additions and 137 deletions.
17 changes: 16 additions & 1 deletion docs/modules/crypto/README.md
@@ -6,10 +6,25 @@

The `@loaders.gl/crypto` module provides a selection of optional cryptographic hash plugins for loaders.gl.

## Cryptographic Formats
Terminology:
- A **hash** is an

## Cryptographic Algorithms

MD5, SHA256 and many more; see [crypto-js](https://github.com/brix/crypto-js).

## Encoding

A hash algorithm takes input data and generates a digest. The digest is a number, usually with a fixed number of bits.
Since the number of bits is often large (e.g. SHA256 generates a 256-bit digest), digests are typically encoded as strings rather than numbers.

The two most common string encodings of numbers are *hex* and *base64*.
Which one you want to use often depends on what you are planning to do with the digest.
If you are calling an existing API (perhaps a cloud storage service)
you will want to generate the encoding that matches or is required by that API.

All hash functions in the `@loaders.gl/crypto` module take an `encoding` parameter that lets you specify the encoding.
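
To make the difference between the two encodings concrete, here is a minimal, self-contained sketch that encodes the same digest bytes both ways. The helper names (`bytesToHex`, `bytesToBase64`, `encodeDigest`) are hypothetical and chosen for illustration only; they are not the module's API or its implementation.

```typescript
// Illustrative helpers (hypothetical names, not part of @loaders.gl/crypto).
type DigestEncoding = 'hex' | 'base64';

const BASE64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

/** Encode digest bytes as lowercase hex, two characters per byte. */
function bytesToHex(bytes: Uint8Array): string {
  let hex = '';
  for (let i = 0; i < bytes.length; i++) {
    hex += (bytes[i] >>> 4).toString(16) + (bytes[i] & 0xf).toString(16);
  }
  return hex;
}

/** Encode digest bytes as standard base64 with '=' padding. */
function bytesToBase64(bytes: Uint8Array): string {
  let base64 = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const hasSecond = i + 1 < bytes.length;
    const hasThird = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = hasSecond ? bytes[i + 1] : 0;
    const b2 = hasThird ? bytes[i + 2] : 0;
    // Split each 3-byte group into four 6-bit values
    base64 += BASE64_ALPHABET[b0 >> 2];
    base64 += BASE64_ALPHABET[((b0 & 0x03) << 4) | (b1 >> 4)];
    base64 += hasSecond ? BASE64_ALPHABET[((b1 & 0x0f) << 2) | (b2 >> 6)] : '=';
    base64 += hasThird ? BASE64_ALPHABET[b2 & 0x3f] : '=';
  }
  return base64;
}

/** Pick the string encoding for a digest, mirroring the `encoding` parameter. */
function encodeDigest(bytes: Uint8Array, encoding: DigestEncoding): string {
  return encoding === 'hex' ? bytesToHex(bytes) : bytesToBase64(bytes);
}

// The same 4-byte digest in both encodings
const digest = new Uint8Array([0xde, 0xad, 0xbe, 0xef]);
const asHex = encodeDigest(digest, 'hex'); // 'deadbeef'
const asBase64 = encodeDigest(digest, 'base64'); // '3q2+7w=='
```

Hex uses two characters per byte, while base64 uses four characters per three bytes, so a 32-byte SHA256 digest is 64 characters in hex and 44 characters in base64.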

## Cryptographic Hash API

The API offers "transforms" that can calculate a cryptographic hash incrementally on data as it comes in on a stream.
26 changes: 21 additions & 5 deletions docs/modules/crypto/api-reference/hash.md
@@ -16,23 +16,39 @@ The name of the hash algorithm

## Methods

#### `preload(): Promise<void>`
#### `preload()`

`preload(): Promise<void>`

Asynchronously loads required libraries. For some hash classes this must be completed before
`hashSync()` is available.

#### `hash(data: ArrayBuffer): Promise<ArrayBuffer>`
#### `hash()`

```typescript
hash.hash(data: ArrayBuffer, encoding: 'hex' | 'base64'): Promise<string>
```

Asynchronously hashes data.

#### `hashSync(data: ArrayBuffer): ArrayBuffer`
#### `hashSync()`

```typescript
hash.hashSync(data: ArrayBuffer, encoding: 'hex' | 'base64'): string
```

Synchronously hashes data.

For some hashions `preload()` must have been called and completed before
:::caution
For some hash sub classes, `preload()` must have been called and completed before
synchronous operations are available.
:::
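
The shape of a synchronous hash call with an `encoding` argument can be sketched as follows. This is a self-contained illustration under stated assumptions: `crc32`, `encodeUint32` and `hashSyncSketch` are hypothetical stand-ins written for this example, and the base64 form assumes the digest's four bytes are taken in big-endian order. It is not the module's implementation.

```typescript
// Hypothetical stand-in for a synchronous hash; not the @loaders.gl/crypto implementation.
type Encoding = 'hex' | 'base64';

const ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

/** Bitwise CRC32 (reflected polynomial 0xEDB88320) over the given bytes. */
function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i];
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 1 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1;
    }
  }
  // Final XOR, then force the result to an unsigned 32-bit value
  return (crc ^ 0xffffffff) >>> 0;
}

/** Encode an unsigned 32-bit digest as 8 hex characters, or as base64 of its 4 big-endian bytes. */
function encodeUint32(value: number, encoding: Encoding): string {
  if (encoding === 'hex') {
    // Left-pad so small digests still produce 8 characters
    return ('00000000' + value.toString(16)).slice(-8);
  }
  const b = [value >>> 24, (value >>> 16) & 0xff, (value >>> 8) & 0xff, value & 0xff];
  // 4 bytes = one full 3-byte group plus one trailing byte, so the output ends in '=='
  return (
    ALPHABET[b[0] >> 2] +
    ALPHABET[((b[0] & 3) << 4) | (b[1] >> 4)] +
    ALPHABET[((b[1] & 15) << 2) | (b[2] >> 6)] +
    ALPHABET[b[2] & 63] +
    ALPHABET[b[3] >> 2] +
    ALPHABET[(b[3] & 3) << 4] +
    '=='
  );
}

/** Same call shape as `hashSync(data, encoding)`: bytes in, encoded digest string out. */
function hashSyncSketch(data: ArrayBuffer, encoding: Encoding): string {
  return encodeUint32(crc32(new Uint8Array(data)), encoding);
}

// ASCII bytes of "123456789", the conventional CRC check input
const checkInput = new Uint8Array([49, 50, 51, 52, 53, 54, 55, 56, 57]).buffer;
const crcHex = hashSyncSketch(checkInput, 'hex'); // 'cbf43926'
const crcBase64 = hashSyncSketch(checkInput, 'base64'); // 'y/Q5Jg=='
```

Both strings represent the same 32-bit digest; only the string encoding differs, which is why the caller has to pick the one the receiving API expects.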

#### `hashBatches()`

#### `hashBatches(data: AsyncIterable<ArrayBuffer>): AsyncIterable<ArrayBuffer>`
```typescript
hash.hashBatches(data: AsyncIterable<ArrayBuffer>, encoding: 'hex' | 'base64'): AsyncIterable<ArrayBuffer>
```

Asynchronously hashes data in batches.

6 changes: 6 additions & 0 deletions docs/upgrade-guide.md
@@ -56,6 +56,12 @@ which now provide a built-in browser-compatible `fetch()` function by default.
This new built-in Node.js `fetch` function does not support reading from the file system,
and loaders.gl v4.0 aligns with this practice.

---

**@loaders.gl/crypto**

- All hashes now require an encoding parameter. To get the previous behavior, specify `'base64'`.

**@loaders.gl/arrow**

- Batches now contain a Table with a single `RecordBatch` (instead of just a `RecordBatch`).
4 changes: 4 additions & 0 deletions docs/whats-new.mdx
@@ -113,6 +113,10 @@ Documentation:
- New functions that allow loaders to call sub-loaders with full type inference, without importing the core module.
- `parseWithContext()`, `parseSyncWithContext()` and `parseInBatchesWithContext()`

**@loaders.gl/crypto**

- All hashes now accept an encoding parameter allowing the app to choose between `'hex'` and `'base64'` encoding (rather than getting `'base64'` by default).

### Loader Modules

**@loaders.gl/gltf**
17 changes: 10 additions & 7 deletions modules/crypto/src/index.ts
@@ -1,4 +1,8 @@
// import type {WorkerObject} from '@loaders.gl/worker-utils';
// loaders.gl, MIT license

// __VERSION__ is injected by babel-plugin-version-inline
// @ts-ignore TS2304: Cannot find name '__VERSION__'.
const VERSION = typeof __VERSION__ !== 'undefined' ? __VERSION__ : 'latest';

export {CRC32Hash} from './lib/crc32-hash';
export {CRC32CHash} from './lib/crc32c-hash';
@@ -8,12 +12,6 @@ export {SHA256Hash} from './lib/sha256-hash';
export {CryptoHash} from './lib/crypto-hash';
export {NodeHash} from './lib/node-hash';

export {hexToBase64 as _hexToBase64, toHex as _toHex} from './lib/utils/digest-utils';

// __VERSION__ is injected by babel-plugin-version-inline
// @ts-ignore TS2304: Cannot find name '__VERSION__'.
const VERSION = typeof __VERSION__ !== 'undefined' ? __VERSION__ : 'latest';

/**
* Small, fast worker for CRC32, CRC32c and MD5 Hashes
*/
@@ -40,3 +38,8 @@ export const CryptoJSWorker = {
cryptojs: {}
}
};

// EXPERIMENTAL

export {encodeNumber, encodeHex, encodeBase64} from './lib/utils/digest-utils';
export {asciiToBase64, base64ToAscii} from './lib/utils/base64-utils';
7 changes: 4 additions & 3 deletions modules/crypto/src/lib/algorithms/md5-wasm.ts
@@ -326,7 +326,7 @@ function makeMD5WA() {
setX = exports.setX;
memView = mView;
digestbytes = crypt.wordsToBytes(md5WA(message));
return options && options.asBytes ? digestbytes : crypt.bytesToHex(digestbytes);
return options && options.asBytes ? digestbytes : crypt.bytesconvertNumberToHex(digestbytes);
};
}

@@ -482,7 +482,8 @@ function makeMD5JS() {

return function (message, options) {
var digestbytes = crypt.wordsToBytes(md5JS(message, options)),
result = options && options.asBytes ? digestbytes : crypt.bytesToHex(digestbytes);
result =
options && options.asBytes ? digestbytes : crypt.bytesconvertNumberToHex(digestbytes);
return result;
};
}
@@ -534,7 +535,7 @@ function makeCrypt() {
return bytes;
},

bytesToHex: function (bytes) {
bytesconvertNumberToHex: function (bytes) {
for (var hex = [], i = 0; i < bytes.length; i++) {
hex.push((bytes[i] >>> 4).toString(16));
hex.push((bytes[i] & 0xf).toString(16));
22 changes: 10 additions & 12 deletions modules/crypto/src/lib/crc32-hash.ts
@@ -1,7 +1,7 @@
// CRC32
import {Hash} from './hash';
import CRC32 from './algorithms/crc32';
import {toHex, hexToBase64} from './utils/digest-utils';
import {encodeNumber} from './utils/digest-utils';

/**
* Calculates CRC32 Cryptographic Hash
@@ -23,28 +23,26 @@ export class CRC32Hash extends Hash {
* Atomic hash calculation
* @returns base64 encoded hash
*/
async hash(input: ArrayBuffer): Promise<string> {
return this.hashSync(input);
async hash(input: ArrayBuffer, encoding: 'hex' | 'base64'): Promise<string> {
return this.hashSync(input, encoding);
}

hashSync(input: ArrayBuffer): string {
hashSync(input: ArrayBuffer, encoding: 'hex' | 'base64'): string {
this._hash.update(input);
const hashValue = this._hash.finalize();
const hex = toHex(hashValue);
const hash = hexToBase64(hex);
return hash;
const digest = this._hash.finalize();
return encodeNumber(digest, encoding);
}

async *hashBatches(
asyncIterator: AsyncIterable<ArrayBuffer> | Iterable<ArrayBuffer>
asyncIterator: AsyncIterable<ArrayBuffer> | Iterable<ArrayBuffer>,
encoding: 'hex' | 'base64' = 'base64'
): AsyncIterable<ArrayBuffer> {
for await (const chunk of asyncIterator) {
this._hash.update(chunk);
yield chunk;
}
const hashValue = this._hash.finalize();
const hex = toHex(hashValue);
const hash = hexToBase64(hex);
const digest = this._hash.finalize();
const hash = encodeNumber(digest, encoding);
this.options.crypto?.onEnd?.({hash});
}
}
22 changes: 10 additions & 12 deletions modules/crypto/src/lib/crc32c-hash.ts
@@ -1,7 +1,7 @@
// CRC32c
import {Hash} from './hash';
import CRC32C from './algorithms/crc32c';
import {toHex, hexToBase64} from './utils/digest-utils';
import {encodeNumber} from './utils/digest-utils';

/**
* A transform that calculates CRC32c Hash
@@ -26,30 +26,28 @@ export class CRC32CHash extends Hash {
* Atomic hash calculation
* @returns base64 encoded hash
*/
async hash(input: ArrayBuffer): Promise<string> {
return this.hashSync(input);
async hash(input: ArrayBuffer, encoding: 'hex' | 'base64'): Promise<string> {
return this.hashSync(input, encoding);
}

hashSync(input: ArrayBuffer): string {
hashSync(input: ArrayBuffer, encoding: 'hex' | 'base64'): string {
this._hash.update(input);
const hashValue = this._hash.finalize();
const hex = toHex(hashValue);
const hash = hexToBase64(hex);
return hash;
const digest = this._hash.finalize();
return encodeNumber(digest, encoding);
}

// runInBatches inherited

async *hashBatches(
asyncIterator: AsyncIterable<ArrayBuffer> | Iterable<ArrayBuffer>
asyncIterator: AsyncIterable<ArrayBuffer> | Iterable<ArrayBuffer>,
encoding: 'hex' | 'base64' = 'base64'
): AsyncIterable<ArrayBuffer> {
for await (const chunk of asyncIterator) {
this._hash.update(chunk);
yield chunk;
}
const hashValue = this._hash.finalize();
const hex = toHex(hashValue);
const hash = hexToBase64(hex);
const digest = this._hash.finalize();
const hash = encodeNumber(digest, encoding);
this.options.crypto?.onEnd?.({hash});
}
}
15 changes: 10 additions & 5 deletions modules/crypto/src/lib/crypto-hash.ts
@@ -53,17 +53,20 @@ export class CryptoHash extends Hash {
* Atomic hash calculation
* @returns base64 encoded hash
*/
async hash(input: ArrayBuffer): Promise<string> {
async hash(input: ArrayBuffer, encoding: 'hex' | 'base64'): Promise<string> {
await this.preload();
// arrayBuffer is accepted, even though types and docs say no
// https://stackoverflow.com/questions/25567468/how-to-decrypt-an-arraybuffer
// @ts-expect-error
const typedWordArray = CryptoJS.lib.WordArray.create(input);
return this._hash.update(typedWordArray).finalize().toString(CryptoJS.enc.Base64);
// Map our encoding constant to Crypto library
const enc = encoding === 'base64' ? CryptoJS.enc.Base64 : CryptoJS.enc.Hex;
return this._hash.update(typedWordArray).finalize().toString(enc);
}

async *hashBatches(
asyncIterator: AsyncIterable<ArrayBuffer> | Iterable<ArrayBuffer>
asyncIterator: AsyncIterable<ArrayBuffer> | Iterable<ArrayBuffer>,
encoding: 'hex' | 'base64' = 'base64'
): AsyncIterable<ArrayBuffer> {
await this.preload();
for await (const chunk of asyncIterator) {
@@ -74,7 +77,9 @@ export class CryptoHash extends Hash {
this._hash.update(typedWordArray);
yield chunk;
}
const hash = this._hash.finalize().toString(CryptoJS.enc.Base64);
this.options?.crypto?.onEnd?.({hash});
// Map our encoding constant to Crypto library
const enc = encoding === 'base64' ? CryptoJS.enc.Base64 : CryptoJS.enc.Hex;
const digest = this._hash.finalize().toString(enc);
this.options?.crypto?.onEnd?.({hash: digest});
}
}
7 changes: 4 additions & 3 deletions modules/crypto/src/lib/hash.ts
@@ -21,18 +21,19 @@ export abstract class Hash {
return;
}

abstract hash(arrayBuffer: ArrayBuffer): Promise<string>;
abstract hash(arrayBuffer: ArrayBuffer, encoding: 'hex' | 'base64'): Promise<string>;

async *hashBatches(
asyncIterator: AsyncIterable<ArrayBuffer> | Iterable<ArrayBuffer>
asyncIterator: AsyncIterable<ArrayBuffer> | Iterable<ArrayBuffer>,
encoding: 'hex' | 'base64' = 'base64'
): AsyncIterable<ArrayBuffer> {
const arrayBuffers: ArrayBuffer[] = [];
for await (const arrayBuffer of asyncIterator) {
arrayBuffers.push(arrayBuffer);
yield arrayBuffer;
}
const output = await this.concatenate(arrayBuffers);
const hash = await this.hash(output);
const hash = await this.hash(output, encoding);
this.options.crypto?.onEnd?.({hash});
}

6 changes: 3 additions & 3 deletions modules/crypto/src/lib/md5-hash.ts
@@ -1,7 +1,7 @@
// Fork of https://github.com/briantbutton/md5-wasm under MIT license
import {Hash} from './hash';
import md5WASM from './algorithms/md5-wasm';
import {hexToBase64} from './utils/digest-utils';
import {encodeHex} from './utils/digest-utils';

/**
* A transform that calculates MD5 hashes, passing data through
@@ -19,12 +19,12 @@ export class MD5Hash extends Hash {
* Atomic hash calculation
* @returns base64 encoded hash
*/
async hash(input: ArrayBuffer): Promise<string> {
async hash(input: ArrayBuffer, encoding: 'hex' | 'base64'): Promise<string> {
const md5Promise = new Promise<string>((resolve, reject) =>
// @ts-expect-error
md5WASM(input).then(resolve).catch(reject)
);
const hex = await md5Promise;
return hexToBase64(hex);
return encodeHex(hex, encoding);
}
}
9 changes: 6 additions & 3 deletions modules/crypto/src/lib/node-hash.ts
@@ -34,7 +34,7 @@ export class NodeHash extends Hash {
* Atomic hash calculation
* @returns base64 encoded hash
*/
async hash(input: ArrayBuffer): Promise<string> {
async hash(input: ArrayBuffer, encoding: 'hex' | 'base64'): Promise<string> {
await this.preload();
const algorithm = this.options?.crypto?.algorithm?.toLowerCase();
try {
@@ -47,7 +47,8 @@ export class NodeHash extends Hash {
}

async *hashBatches(
asyncIterator: AsyncIterable<ArrayBuffer> | Iterable<ArrayBuffer>
asyncIterator: AsyncIterable<ArrayBuffer> | Iterable<ArrayBuffer>,
encoding: 'hex' | 'base64' = 'base64'
): AsyncIterable<ArrayBuffer> {
await this.preload();
const hash = createHash(this.options?.crypto?.algorithm?.toLowerCase());
@@ -57,6 +58,8 @@ export class NodeHash extends Hash {
hash.update(inputArray);
yield chunk;
}
this.options?.crypto?.onEnd?.({hash: hash.digest('base64')});
// We can pass our encoding constant directly to Node.js digest as it already supports `hex` and `base64`
const digest = hash.digest(encoding);
this.options?.crypto?.onEnd?.({hash: digest});
}
}