OfficeGraph is a large, real-world knowledge graph containing measurements taken by 444 IoT devices over 11 months in a seven-story office building. The devices span 17 different sensor models, which measure many different properties.
This is a zipped version of OfficeGraph. Instead of one file containing the entire knowledge graph, the zipped folders contain a separate file for each individual device. Each file contains all the measurements made by that device.
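Because each device has its own file, the archives can be read one device at a time without unpacking everything. The following is a minimal sketch using only the Python standard library. The archive name in the usage comment is hypothetical, and since the serialization format of the device files is not stated here, the sketch returns raw bytes and leaves parsing to an RDF library of your choice.

```python
import zipfile


def iter_device_files(archive):
    """Yield (member_name, raw_bytes) for every file in a zip archive.

    `archive` may be a path or a binary file-like object.
    Directory entries are skipped, so each yielded item is one device file.
    """
    with zipfile.ZipFile(archive) as zipped:
        for info in zipped.infolist():
            if info.is_dir():
                continue
            with zipped.open(info) as handle:
                yield info.filename, handle.read()


# Usage (the archive name below is a hypothetical placeholder):
# for name, data in iter_device_files("officegraph_devices.zip"):
#     print(name, len(data), "bytes")
```

Reading one member at a time keeps memory use bounded by the largest single device file rather than by the whole graph.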
The devices in room enrichment adds more information about which devices are located in which rooms, and on which floor those rooms are.
The Wikidata days enrichment provides a link to Wikidata, by matching the dates of the measurements to those dates' entities in Wikidata.
The graph learning enrichment provides additional properties that have beneficial effects on the learning process when using graph embedding models.
The enrichments are included in separate files. The graph learning enrichment only contains the enrichments for the devices on the 7th floor, which were used in the machine learning experiment (the code of this experiment is available here).
All scripts used to create the dataset from the original JSON files are available in the mapping scripts folder, and an example of the raw data is supplied in the mapping example folder.
OfficeGraph is expressed in the saref ontology, a domain standard model specifically created to model measurements of different IoT devices. The main structure we used from saref can be seen in the following figure. For each individual device, the “device template” creates triples for all consistent information about the device, such as the device type and model. For each individual measurement, the “measurement template” creates triples to describe the measurement, its value, unit of measurement and timestamp. The device instance and measurement instances are connected in two ways: directly through the saref:makesMeasurement property, and indirectly through the saref:Property instance. The latter describes what has been measured, such as temperature or humidity. In addition to the saref ontology, we use two of its extensions: saref4bldg, which provides classes used to describe the relation between devices and rooms, and between rooms and buildings, and saref4ener, which provides additional classes for information about the device. As suggested in the saref documentation, the om1.8 ontology is used to represent the units of measure of the measurements.
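The two templates can be illustrated with a small sketch that emits triples as plain Python tuples. This is not the mapping code used to build OfficeGraph. The `EX` base IRI, the identifier scheme, and the function names are hypothetical, and the choice of `saref:hasModel`, `saref:measuresProperty`, `saref:relatesToProperty`, `saref:hasValue`, `saref:isMeasuredIn` and `saref:hasTimestamp` is an assumption based on the saref core vocabulary; only `saref:makesMeasurement` and `saref:Property` are named in the description above.

```python
SAREF = "https://saref.etsi.org/core/"          # assumed saref core namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "https://example.org/officegraph/"         # hypothetical base IRI


def device_template(device_id, model):
    """Triples for information that is stated once per device."""
    device = EX + "device/" + device_id
    return [
        (device, RDF_TYPE, SAREF + "Device"),
        (device, SAREF + "hasModel", model),
    ]


def measurement_template(device_id, measurement_id, prop, value, unit, timestamp):
    """Triples for a single measurement and its links to the device."""
    device = EX + "device/" + device_id
    measurement = EX + "measurement/" + measurement_id
    prop_iri = EX + "property/" + prop
    return [
        (measurement, RDF_TYPE, SAREF + "Measurement"),
        (measurement, SAREF + "hasValue", value),
        (measurement, SAREF + "isMeasuredIn", unit),
        (measurement, SAREF + "hasTimestamp", timestamp),
        # Direct link: device -> measurement.
        (device, SAREF + "makesMeasurement", measurement),
        # Indirect link: device -> property <- measurement.
        (prop_iri, RDF_TYPE, SAREF + "Property"),
        (device, SAREF + "measuresProperty", prop_iri),
        (measurement, SAREF + "relatesToProperty", prop_iri),
    ]


# Example with made-up identifiers and a placeholder unit IRI.
triples = device_template("dev1", "ModelX") + measurement_template(
    "dev1", "m1", "temperature", 21.5,
    EX + "unit/degree_Celsius", "2022-03-01T12:00:00Z",
)
```

Device-level triples are emitted once, while the measurement template is applied per reading, which is why the per-device files are dominated by measurement triples.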
- Zenodo: The dataset is archived on Zenodo, including the raw JSON files used to create OfficeGraph. DOI 10.5281/zenodo.10245814
- Live SPARQL endpoint: OfficeGraph is available as a live triplestore and can be queried through a YASGUI editor, or directly at the endpoint: https://data.interconnect.labs.vu.nl/sparql.
- Jupyter notebook examples: Examples of using the endpoint are available in this repository: Data analytics cases.
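As a rough illustration of querying the endpoint directly, the sketch below sends a SPARQL query over HTTP GET using only the Python standard library. It assumes the endpoint follows the standard SPARQL 1.1 Protocol and can return JSON results. The namespace IRI and the `saref:hasValue` and `saref:hasTimestamp` properties in the example query are assumptions about how the graph is modelled, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://data.interconnect.labs.vu.nl/sparql"

# Example query: ten measurements with their device, value and timestamp.
# The prefix IRI and the hasValue/hasTimestamp properties are assumptions.
EXAMPLE_QUERY = """
PREFIX saref: <https://saref.etsi.org/core/>
SELECT ?device ?measurement ?value ?timestamp
WHERE {
  ?device saref:makesMeasurement ?measurement .
  ?measurement saref:hasValue ?value ;
               saref:hasTimestamp ?timestamp .
}
LIMIT 10
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that asks for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and return the list of result bindings."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]


# Usage (requires network access to the endpoint):
# for row in run_query(EXAMPLE_QUERY):
#     print(row["device"]["value"], row["value"]["value"])
```

Keeping a `LIMIT` in exploratory queries is advisable, since the full graph holds measurements from 444 devices over 11 months.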
The resource paper describing this dataset is currently under submission.
This work is licensed under a Creative Commons Attribution 4.0 International License.