A Command-Line Interface (CLI) tool for computing and adding Merkle Tree information to your SpatioTemporal Asset Catalog (STAC) directory structure. This tool ensures metadata integrity for your STAC Items, Collections, and Catalogs by encoding them in a Merkle tree via hashing.
- Overview
- Features
- Prerequisites
- Installation
- Directory Structure
- Usage
- Merkle Tree Extension Specification
- Output
- Contributing
- Verification Steps
The STAC Merkle Tree CLI Tool automates the process of computing and embedding Merkle Tree information into your STAC catalog. By integrating this tool into your workflow, you can:
- Ensure Metadata Integrity: Verify that your STAC objects (Items, Collections, Catalogs) have not been tampered with.
- Facilitate Verification: Enable users to verify the integrity of STAC objects using the Merkle hashes.
- Maintain Consistency: Automatically compute and update Merkle information across your entire catalog hierarchy.
- Recursive Processing: Traverses the entire STAC catalog, including Catalogs, Collections, and Items.
- Merkle Hash Computation: Computes merkle:object_hash for each STAC object based on specified hashing methods.
- Merkle Root Calculation: Builds Merkle trees for Collections and Catalogs to compute merkle:root.
- Extension Compliance: Adheres to the Merkle Tree Extension Specification for STAC.
- User-Friendly CLI: Built with the Click library for an intuitive command-line experience.
- Customizable Hash Methods: Supports various hash functions and field selections.
- Python 3.6 or higher
- pip (Python package installer)
pip install stac-merkle-tree-cli
- Clone the Repository
  git clone https://github.com/stacchain/stac-merkle-tree-cli.git
  cd stac-merkle-tree-cli
- Install the Package
  pip install -e .
Ensure your STAC catalog follows one of the directory structures below for optimal processing:
In this structure, all items are at the same level as the collection.json file:
collection/
├── collection.json
├── item1.json
├── item2.json
└── ...
In this structure, items can be nested inside their own subdirectories within a collection:
collection/
├── collection.json
├── item1/
│ └── item1.json
├── item2/
│ └── item2.json
└── ...
A full STAC catalog with collections, where items can be either at the same level as the collection.json or nested within subdirectories:
catalog/
├── catalog.json
├── collections/
│ ├── collection1/
│ │ ├── collection.json
│ │ ├── item1.json
│ │ ├── item2/
│ │ │ └── item2.json
│ ├── collection2/
│ │ ├── collection.json
│ │ ├── item1/
│ │ │ └── item1.json
│ │ └── item2.json
└── ...
- Catalog Level:
  - catalog.json: Root catalog file.
  - collections/: Directory containing all collections.
- Collections Level:
  - Each collection has its own directory inside collections/, named after the collection.
  - Inside each collection directory:
    - collection.json: Collection metadata.
    - item1.json, item2.json, ...: Items belonging to the collection, either at the same level or nested within subdirectories.
After installing the package, you can use the stac-merkle-tree-cli command to compute or verify Merkle information in your STAC catalog.
The compute command computes and adds Merkle information (merkle:object_hash, merkle:root, merkle:hash_method) to your STAC catalog.
stac-merkle-tree-cli compute path/to/catalog_directory [OPTIONS]
- path/to/catalog_directory: (Required) Path to the root directory containing catalog.json.
- --merkle-tree-file TEXT: (Optional) Path to the output Merkle tree structure file. Defaults to merkle_tree.json within the provided catalog_directory.
Assuming your directory structure is as follows:
my_stac_catalog/
├── catalog.json
├── collections/
│ ├── collection1/
│ │ ├── collection.json
│ │ ├── item1.json
│ │ └── item2/
│ │ └── item2.json
│ └── collection2/
│ ├── collection.json
│ ├── item1/
│ │ └── item1.json
│ └── item2.json
Run the tool:
stac-merkle-tree-cli compute my_stac_catalog/
Expected Output:
Processed Item: /path/to/my_stac_catalog/collections/collection1/item1.json
Processed Item: /path/to/my_stac_catalog/collections/collection1/item2/item2.json
Processed Collection: /path/to/my_stac_catalog/collections/collection1/collection.json
Processed Item: /path/to/my_stac_catalog/collections/collection2/item1/item1.json
Processed Item: /path/to/my_stac_catalog/collections/collection2/item2.json
Processed Collection: /path/to/my_stac_catalog/collections/collection2/collection.json
Processed Catalog: /path/to/my_stac_catalog/catalog.json
Merkle tree structure saved to /path/to/my_stac_catalog/merkle_tree.json
The verify command validates the integrity of a Merkle tree JSON file by recalculating merkle:root values and comparing them to the expected values.
stac-merkle-tree-cli verify path/to/merkle_tree.json
- path/to/merkle_tree.json: (Required) Path to the Merkle tree JSON file to verify.
Run the command:
stac-merkle-tree-cli verify my_stac_catalog/merkle_tree.json
Example Output (Success):
Verification Successful: The merkle:root matches.
Example Output (Failure):
Verification Failed:
- Expected merkle:root: 5808b480d9bed10e7663d52c218571d053c7b5df42a5aefc11e216c66c711f77
- Calculated merkle:root: f0ed08b316b917a98c085e699c090af1cea964b697dd0bc44491ebced4d0006c
Discrepancies found in the following nodes:
- Collection 'COP-DEM' has mismatched merkle:root.
- Catalog 'Catalogue' has mismatched merkle:root.
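The recalculation that verify performs can be illustrated with a short Python sketch. It is not the tool's implementation: the combination rule below (sort the node's own hash together with its children's roots in ascending order, then hash adjacent pairs with SHA-256 until one value remains) is an assumption, so it will not necessarily reproduce the roots the CLI writes. The function names merkle_root and find_mismatches are hypothetical. The one behaviour taken from the example output is that a node without children has a merkle:root equal to its merkle:object_hash.

```python
import hashlib
from typing import List


def _combine(hashes: List[str]) -> str:
    """Reduce hex digests to one root by pairwise SHA-256 (assumed rule)."""
    level = sorted(hashes)  # "ascending" ordering, as named in merkle:hash_method
    while len(level) > 1:
        if len(level) % 2:  # odd count: duplicate the last hash (assumption)
            level.append(level[-1])
        level = [
            hashlib.sha256(bytes.fromhex(a) + bytes.fromhex(b)).hexdigest()
            for a, b in zip(level[0::2], level[1::2])
        ]
    return level[0]


def merkle_root(node: dict) -> str:
    """Recompute a node's root from object hashes only, ignoring stored roots."""
    children = node.get("children")
    if not children:  # Items and empty Collections: root is the object hash
        return node["merkle:object_hash"]
    child_roots = [merkle_root(child) for child in children]
    return _combine([node["merkle:object_hash"]] + child_roots)


def find_mismatches(node: dict) -> List[str]:
    """List nodes whose stored merkle:root differs from the recomputed one."""
    mismatches = []
    for child in node.get("children") or []:
        mismatches.extend(find_mismatches(child))
    if "merkle:root" in node and node["merkle:root"] != merkle_root(node):
        mismatches.append("{} '{}'".format(node["type"], node["node_id"]))
    return mismatches
```

Because each root depends on the hashes below it, a change to a single Item surfaces as a mismatch in its Collection and in every ancestor up to the Catalog, which is the pattern shown in the failure output above.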
This tool complies with the Merkle Tree Extension Specification, which outlines how to encode STAC objects in a Merkle tree to ensure metadata integrity.
- merkle:object_hash (string, REQUIRED in Items, Collections, Catalogs)
  - A cryptographic hash of the object's metadata, used to verify its integrity.
  - For Items: Located within the properties field.
  - For Collections and Catalogs: Located at the top level.
- merkle:hash_method (object, REQUIRED in Collections and Catalogs)
  - Describes the method used to compute merkle:object_hash and merkle:root, including:
    - function: The hash function used (e.g., sha256).
    - fields: Fields included in the hash computation (e.g., ["*"] for all fields).
    - ordering: How child hashes are ordered when building the Merkle tree (e.g., ascending).
    - description: Additional details about the hash computation method.
- merkle:root (string, REQUIRED in Collections and Catalogs)
  - The Merkle root hash representing the Collection or Catalog, computed from child object hashes.
All STAC objects processed by this tool will include the Merkle extension URL in their stac_extensions array:
"stac_extensions": [
"https://stacchain.github.io/merkle-tree/v1.0.0/schema.json"
]
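To make the role of merkle:object_hash concrete, here is a minimal Python sketch of hashing an object's metadata with function sha256 and fields ["*"], leaving the Merkle fields themselves out of the input. The canonical form used here (JSON with sorted keys and compact separators) is an assumption, as is the helper name object_hash; the tool and the specification may serialize differently, so the digests are illustrative only.

```python
import hashlib
import json

MERKLE_FIELDS = {"merkle:object_hash", "merkle:root", "merkle:hash_method"}


def object_hash(stac_object: dict) -> str:
    """SHA-256 over a STAC object's metadata with Merkle fields removed."""
    data = {k: v for k, v in stac_object.items() if k not in MERKLE_FIELDS}
    # Items store merkle:object_hash inside "properties", so strip it there too.
    if isinstance(data.get("properties"), dict):
        data["properties"] = {
            k: v for k, v in data["properties"].items() if k not in MERKLE_FIELDS
        }
    # Assumption: sorted keys and compact separators give a stable byte string.
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Excluding the Merkle fields keeps the hash stable after it has been written back into the object, so hashing an already processed file yields the same value as hashing the original.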
After running the tool, each STAC object will be updated with the appropriate Merkle fields.
The tool generates a merkle_tree.json file that represents the hierarchical Merkle tree of your STAC catalog. Below is an example of the merkle_tree.json structure:
{
"node_id": "Catalogue",
"type": "Catalog",
"merkle:object_hash": "b14fd102417c1d673f481bc053d19946aefdc27d84c584989b23c676c897bd5a",
"merkle:root": "2c637f0bae066e89de80839f3468f73e396e9d1498faefc469f0fd1039e19e0c",
"children": [
{
"node_id": "COP-DEM",
"type": "Collection",
"merkle:object_hash": "17789b31f8ae304de8dbe2350a15263dbf5e31adfc0d17a997e7e55f4cfc2f53",
"merkle:root": "2f4aa32184fbe70bd385d5b6b6e6d4ec5eb8b2e43611b441febcdf407c4e0030",
"children": [
{
"node_id": "DEM1_SAR_DGE_30_20101212T230244_20140325T230302_ADS_000000_1jTi",
"type": "Item",
"merkle:object_hash": "ce9f56e695ab1751b8f0c8d9ef1f1ecedaf04574ec3077e70e7426ec9fc61ea4"
}
]
},
{
"node_id": "TERRAAQUA",
"type": "Collection",
"merkle:object_hash": "6ae6f97edd2994b632b415ff810af38639faa84544aa8a33a88bdf867a649374",
"merkle:root": "6ae6f97edd2994b632b415ff810af38639faa84544aa8a33a88bdf867a649374",
"children": []
},
{
"node_id": "S2GLC",
"type": "Collection",
"merkle:object_hash": "84ab0e102924c012d4cf2a3b3e10ed4f768f695001174cfd5d9c75d4335b7a48",
"merkle:root": "33631c1a3d9339ffc66b3f3a3eb3de8f558bcabe4900494b55ca17aff851e661",
"children": [
{
"node_id": "S2GLC_T30TWT_2017",
"type": "Item",
"merkle:object_hash": "3a3803a0dae5dbaf9561aeb4cce2770bf38b5da4b71ca67398fb24d48c43a68f"
}
]
}
]
}
- Root Node (Catalogue):
  - node_id: Identifier of the Catalog.
  - type: Specifies that this node is a Catalog.
  - merkle:object_hash: Hash of the Catalog's metadata.
  - merkle:root: The Merkle root representing the entire Catalog.
  - children: Array containing child nodes, which can be Collections or Items.
- Child Nodes (e.g., COP-DEM, TERRAAQUA, S2GLC):
  - node_id: Identifier of the Collection.
  - type: Specifies that this node is a Collection.
  - merkle:object_hash: Hash of the Collection's metadata.
  - merkle:root: The Merkle root representing the Collection, calculated from its children.
  - children: Array containing child nodes, which can be Items or further sub-Collections.
- Leaf Nodes (e.g., DEM1_SAR_DGE_30_20101212T230244_20140325T230302_ADS_000000_1jTi, S2GLC_T30TWT_2017):
  - node_id: Identifier of the Item.
  - type: Specifies that this node is an Item.
  - merkle:object_hash: Hash of the Item's metadata.
  - No merkle:root or children: As Items are leaf nodes, they do not contain these fields.
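The node layout described above lends itself to a simple recursive traversal. The following Python sketch walks a parsed merkle_tree.json structure and produces an indented summary; walk and summarize are hypothetical helper names, not part of the CLI, and the sketch relies only on the node_id, type, merkle:root, and children keys shown in the example.

```python
from typing import Iterator, List, Tuple


def walk(node: dict, depth: int = 0) -> Iterator[Tuple[int, dict]]:
    """Yield (depth, node) pairs in depth-first order."""
    yield depth, node
    # Items are leaf nodes and carry no "children" key.
    for child in node.get("children", []):
        yield from walk(child, depth + 1)


def summarize(tree: dict) -> List[str]:
    """One line per node: indentation, type, node_id, and a shortened root."""
    lines = []
    for depth, node in walk(tree):
        root = node.get("merkle:root", "-")  # "-" marks Items, which have no root
        lines.append(
            "{}{} {} root={}".format("  " * depth, node["type"], node["node_id"], root[:8])
        )
    return lines
```

To use it, load the file with json.load and print each line returned by summarize; the output mirrors the Catalog, Collection, and Item nesting.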
{
"type": "Catalog",
"stac_version": "1.1.0",
"id": "my-catalog",
"description": "My STAC Catalog",
"links": [],
"stac_extensions": [
"https://stacchain.github.io/merkle-tree/v1.0.0/schema.json"
],
"merkle:object_hash": "abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890",
"merkle:root": "1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef",
"merkle:hash_method": {
"function": "sha256",
"fields": ["*"],
"ordering": "ascending",
"description": "Computed by excluding Merkle fields and including merkle:object_hash values in ascending order to build the Merkle tree."
}
}
{
"type": "Collection",
"stac_version": "1.1.0",
"id": "collection1",
"description": "My STAC Collection",
"extent": {},
"links": [],
"stac_extensions": [
"https://stacchain.github.io/merkle-tree/v1.0.0/schema.json"
],
"merkle:object_hash": "fedcba0987654321fedcba0987654321fedcba0987654321fedcba0987654321",
"merkle:root": "0987654321fedcba0987654321fedcba0987654321fedcba0987654321fedcba",
"merkle:hash_method": {
"function": "sha256",
"fields": ["*"],
"ordering": "ascending",
"description": "Computed by excluding Merkle fields and including merkle:object_hash values in ascending order to build the Merkle tree."
}
}
{
"type": "Feature",
"stac_version": "1.1.0",
"id": "item1",
"properties": {
"merkle:object_hash": "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
},
"geometry": {},
"links": [],
"assets": {},
"stac_extensions": [
"https://stacchain.github.io/merkle-tree/v1.0.0/schema.json"
]
}
Contributions are welcome! If you encounter issues or have suggestions for improvements, please open an issue or submit a pull request on the GitHub repository.
Use the compute command to process your STAC catalog and generate a Merkle tree structure.
stac-merkle-tree-cli compute path/to/catalog_directory
--merkle-tree-file <file_path>: Specify the output file name for the Merkle tree JSON (default is merkle_tree.json).
Processed Item: /path/to/catalog_directory/collections/collection1/item1.json
Processed Item: /path/to/catalog_directory/collections/collection1/item2/item2.json
Processed Collection: /path/to/catalog_directory/collections/collection1/collection.json
Processed Item: /path/to/catalog_directory/collections/collection2/item1/item1.json
Processed Item: /path/to/catalog_directory/collections/collection2/item2.json
Processed Collection: /path/to/catalog_directory/collections/collection2/collection.json
Processed Catalog: /path/to/catalog_directory/catalog.json
Merkle tree structure saved to /path/to/catalog_directory/merkle_tree.json
- The tool will generate a merkle_tree.json file (or the specified output file), which represents the hierarchical structure of your STAC catalog, including merkle:object_hash and merkle:root values.
Use the verify command to validate the integrity of the generated Merkle tree JSON file.
stac-merkle-tree-cli verify path/to/catalog_directory/merkle_tree.json
Verification Successful: The merkle:root matches.
Verification Failed:
- Expected merkle:root: 5808b480d9bed10e7663d52c218571d053c7b5df42a5aefc11e216c66c711f77
- Calculated merkle:root: f0ed08b316b917a98c085e699c090af1cea964b697dd0bc44491ebced4d0006c
Discrepancies found in the following nodes:
- Collection 'COP-DEM' has mismatched merkle:root.
- Catalog 'Catalogue' has mismatched merkle:root.
Ensure that the STAC files (e.g., catalog.json, collection.json, item files) have been updated correctly:
- catalog.json should include:
  - merkle:object_hash
  - merkle:root
  - merkle:hash_method
- Each collection.json should include:
  - merkle:object_hash
  - merkle:root
  - merkle:hash_method
- Each Item JSON should have merkle:object_hash within its properties field.
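This checklist can be automated with a few lines of Python. The sketch below reports which expected fields are absent from a parsed STAC object, using the field locations and the extension URL given earlier in this document. The function name missing_merkle_fields is hypothetical, and the check covers only the presence of the fields, not the correctness of the hash values.

```python
from typing import List

EXTENSION_URL = "https://stacchain.github.io/merkle-tree/v1.0.0/schema.json"
CONTAINER_FIELDS = ("merkle:object_hash", "merkle:root", "merkle:hash_method")


def missing_merkle_fields(stac_object: dict) -> List[str]:
    """Return the names of expected Merkle entries that are not present."""
    missing = []
    if EXTENSION_URL not in stac_object.get("stac_extensions", []):
        missing.append("stac_extensions")
    if stac_object.get("type") == "Feature":
        # Items keep their hash inside the properties field.
        props = stac_object.get("properties") or {}
        if "merkle:object_hash" not in props:
            missing.append("properties.merkle:object_hash")
    else:
        # Catalogs and Collections keep all three fields at the top level.
        missing.extend(f for f in CONTAINER_FIELDS if f not in stac_object)
    return missing
```

An empty list means the object carries every expected field; running the function over each JSON file in the catalog directory gives a quick pass or fail overview.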
Review the generated merkle_tree.json file to confirm:
- Proper hierarchical representation of the catalog.
- Correct merkle:object_hash and merkle:root values for each node.
Ensure that all tests pass by executing:
pytest -v