Commit
update documentation
matthiasprobst committed Dec 12, 2023
1 parent 51b2be2 commit 555dc99
Showing 7 changed files with 119 additions and 640 deletions.
56 changes: 50 additions & 6 deletions docs/conventions/examples/Provenance.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/conventions/ontologies.ipynb
@@ -551,7 +551,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.0"
"version": "3.8.18"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion docs/database/index.rst
@@ -1,7 +1,7 @@
HDF5-Database
=============

HDF5 can be considered a database itself, as it allows multiple datasets and their metadata (attributes) to be stored in a single file. Most of the time you want to find records in an HDF5 file based on the attributes. However, the `h5py` package does not provide a function to do this.
HDF5 can be considered a database itself, as it allows multiple datasets and their metadata (attributes) to be stored in a single file. Most of the time, you want to find records in an HDF5 file based on the attributes. However, the `h5py` package does not provide a function to do this.

The `h5rdmtoolbox` provides an interface to perform queries on a single or even multiple HDF5 files. This is shown in one of the subchapters here. However, this may not always be the fastest way to find data in an HDF5 file. A more effective way is to map the metadata to a dedicated database. One such example is MongoDB. The query is performed on the much more efficient dedicated database, then returned to the original file to continue working.

4 changes: 1 addition & 3 deletions docs/database/mongo.ipynb
@@ -52,9 +52,7 @@
"import numpy as np\n",
"from pprint import pprint\n",
"\n",
"h5tbx.use(None)\n",
"\n",
"h5tbx.__version__"
"h5tbx.use(None)"
]
},
{
659 changes: 37 additions & 622 deletions docs/wrapper/DumpFile.ipynb

Large diffs are not rendered by default.

32 changes: 26 additions & 6 deletions docs/wrapper/GroupCreation.ipynb
@@ -100,26 +100,46 @@
"id": "70a68b4a-6715-46ff-889d-fcc1a7f7ec3a",
"metadata": {},
"source": [
"The equivalent could be achieved by using the `find` query - in fact it is called in the method `get_groups()` (More on `find()` can be found [here](../database/Serverless.ipynb))"
"An alternative way to explore HDF5 files is to use a database. The HDF5 file can serve as a database itself. For this and more on database interfaces, see [here](../database/firstSteps.ipynb)."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "b9edd387-e3a2-49b9-b51d-cf1da347c803",
"id": "50de3a0b-2aa6-4832-8d1b-2081dad74a42",
"metadata": {},
"outputs": [],
"source": [
"from h5rdmtoolbox.database import hdfdb"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "c90e3188-bb22-48f3-b7dc-88bc02daa5e4",
"metadata": {},
"outputs": [],
"source": [
"db = hdfdb.FileDB(h5.hdf_filename)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "22c37bdb-b7cd-4d58-a59a-2e478e704778",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[<HDF5 wrapper group \"/mygrp\" (members: 0, convention: \"h5py\")>]\n"
"<LGroup \"/mygrp\" in \"C:\\Users\\da4323\\AppData\\Local\\h5rdmtoolbox\\h5rdmtoolbox\\tmp\\tmp_3\\tmp1.hdf\">\n"
]
}
],
"source": [
"with h5tbx.File(h5.hdf_filename) as h5:\n",
" print(h5.find({'$name': {'$regex': '.*'}}, rec=False))"
"for r in db.find({'$name': {'$regex': '.*'}}, recursive=False):\n",
" print(r)"
]
}
],
@@ -139,7 +159,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.17"
"version": "3.8.18"
}
},
"nbformat": 4,
4 changes: 3 additions & 1 deletion h5rdmtoolbox/database/hdfdb/filedb.py
@@ -22,7 +22,9 @@ def find_one(self, *args, **kwargs) -> lazy.LHDFObject:
def find(self, *args, **kwargs) -> Generator[lazy.LHDFObject, None, None]:
"""Please refer to the docstring of the find method of the GroupDB class"""
with h5py.File(self.filename, 'r') as h5:
return GroupDB(h5).find(*args, **kwargs)
results = list(GroupDB(h5).find(*args, **kwargs))
for r in results:
yield r


class FilesDB(NonInsertableDatabaseInterface, HDF5DatabaseInterface):
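
A likely motivation for the `filedb.py` change above: the `find` method is annotated as returning a generator, and the removed line returned the result of `GroupDB(h5).find(...)` from inside the `with` block. If that result is lazy, the file is already closed by the time the caller iterates, and access to a closed HDF5 file would typically fail. The sketch below reproduces the pattern with the standard library only. `FakeFile`, `find_lazy` and `find_eager` are hypothetical stand-ins, not part of `h5py` or `h5rdmtoolbox`, and the placement of the `for` loop relative to the `with` block is an assumption, since the indentation is not recoverable from this page.

```python
from typing import Generator, Iterator


class FakeFile:
    """Hypothetical stand-in for an open file handle (not h5py)."""

    def __init__(self) -> None:
        self.closed = False

    def __enter__(self) -> "FakeFile":
        return self

    def __exit__(self, *exc) -> None:
        # Leaving the with block closes the "file".
        self.closed = True

    def records(self) -> Iterator[str]:
        # Lazy: nothing runs until the caller starts iterating.
        for name in ("/mygrp", "/other"):
            if self.closed:
                raise ValueError("file is closed")
            yield name


def find_lazy() -> Iterator[str]:
    # Pattern of the removed line: the unevaluated generator is returned,
    # and the with block closes the file before anyone iterates over it.
    with FakeFile() as f:
        return f.records()


def find_eager() -> Generator[str, None, None]:
    # Pattern of the added lines: collect all results while the file is
    # still open, then yield them one by one.
    with FakeFile() as f:
        results = list(f.records())
    for r in results:
        yield r
```

Iterating over `find_lazy()` raises `ValueError` in this sketch, while `find_eager()` yields both names. The cost of the eager variant is that all results are held in memory at once.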
