6 update readme file #8

Closed
wants to merge 6 commits into from
94 changes: 92 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,92 @@
# vocab-registry
FAIR vocabulary registry
# FAIR vocabularies registry

This repository contains the code and documentation of the FAIR vocabulary registry project, started under CLARIAH (1) and continued under SSHOC.nl (3). It gathers the documentation for, and references to, the project's components (sub-projects).

# 1. About the FAIR vocabularies project

## Introduction
The FAIR vocabularies project gathers vocabularies that are relevant to researchers, developers, and curators in the humanities and social sciences who are connected to the CLARIN (2), CLARIAH (1), and SSHOC.nl (3) communities. The results of the project come together in a "vocabulary registry": a one-stop reference service for vocabularies that are useful to these communities or were created in any of their projects.

The project aligns with other international initiatives that aim to increase the FAIRness (Findability, Accessibility, Interoperability, and Reusability) of research data and cultural heritage metadata, in this case with a focus on semantic artefacts: controlled vocabularies and other knowledge organization systems such as authority lists, taxonomies, thesauri, and classification schemes, as well as schemas and abstract models such as ontologies (see, for example, Zeng, 2008).

What distinguishes this FAIR vocabulary registry from similar international initiatives is that it aims to:
- curate semi-automatically: selection and processing are partly automated, while the community of experts is also involved in selecting and curating the vocabularies;
- serve a clear user group, which helps when selecting vocabularies to include and datasets to link to, and allows a closer relationship with users during the development and evaluation of the registry;
- formalize the characteristics that make a vocabulary FAIR;
- widen the scope beyond RDF-based vocabularies;
- find relations between the vocabularies and the dataset registries used by these communities, to see which vocabularies were used in which datasets (and derive statistics about vocabulary use), collect user reviews, and provide recommendations that encourage vocabulary reuse.

## Development roadmap
The project started during the CLARIAH Plus project and continues under SSHOC.nl (3).
Previous CLARIAH development roadmap: https://github.com/orgs/CLARIAH/projects/3 (to be updated)
Current development roadmap: (forthcoming)

# 2. How-to guides
## Using the prototype:

### Searching/browsing
To search and browse, use the registry's web portal at https://registry.vocabs.dev.clariah.nl/ (this is a development version). The portal is a search and browsing interface for the selected vocabularies. Use the search box to find vocabularies by keywords in their titles or descriptions, and use the facets to filter the results by type of vocabulary and other characteristics.
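
To make the search-and-facet behaviour described above concrete, here is a minimal sketch in Python. It is not the registry's implementation: the `VocabRecord` class, its fields, the sample records, and the `search` function are all hypothetical and exist only to illustrate keyword matching on titles and descriptions combined with a single facet filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabRecord:
    """Hypothetical, simplified record; not the registry's real data model."""
    title: str
    description: str
    vocab_type: str  # facet value, e.g. "thesaurus" or "ontology"


def search(records, keyword, vocab_type=None):
    """Return records whose title or description contains `keyword`
    (case-insensitive), optionally restricted to one facet value."""
    needle = keyword.casefold()
    hits = []
    for rec in records:
        text = f"{rec.title} {rec.description}".casefold()
        if needle not in text:
            continue  # keyword not found in title or description
        if vocab_type is not None and rec.vocab_type != vocab_type:
            continue  # record is filtered out by the facet
        hits.append(rec)
    return hits


# Invented sample data, for illustration only.
RECORDS = [
    VocabRecord("Place names", "Historical place names thesaurus", "thesaurus"),
    VocabRecord("Occupations", "Classification of historical occupations",
                "classification scheme"),
    VocabRecord("Events model", "Ontology for historical events", "ontology"),
]

if __name__ == "__main__":
    for rec in search(RECORDS, "historical", vocab_type="ontology"):
        print(rec.title)
```

In the portal the same two steps happen through the search box and the facet panel; the sketch only shows how a keyword query narrows the set first and a facet narrows it further.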

### Rating a vocabulary
(forthcoming)

### Getting recommendations
(forthcoming)

## Publishing a vocabulary
- If your vocabulary is already available on the Web, you can suggest it using the "Register a new vocabulary" button. The form is sent to the curators of the project, and you will be notified when the vocabulary is made available or, if it is not accepted, why.
- If your vocabulary is not yet on the Web (e.g., you or your institution has created a vocabulary and you don't know how to publish it), you can get advice: (forthcoming)

## Installing and contributing to the registry's code base
- If you are interested in installing the registry or understanding its code, see the section below ("Development set up").

# 3. Development set up

## 1. Intro
The vocabulary registry has different components:

![FAIR vocabulary registry architecture](https://github.com/CLARIAH/vocab-registry/blob/6-update-readme-file/documentation/cac.png?raw=true)
(Source: Meijer & Windhouwer, 2024)

A complete description of the architecture can be found in the paper by Meijer & Windhouwer (2024). The components are summarized below:

- The Editor: built on a CMDI (Component MetaData Infrastructure, developed within CLARIN) profile; it is used to add descriptive metadata to the vocabularies.
  - See the code at this GitHub repository: (forthcoming)
  - The CMDI profile can also be accessed as a FAIR vocabulary via the registry: (forthcoming; for now, you can see the data model here: https://github.com/CLARIAH/vocab-registry/blob/ec430d55d5c76345e4f25b5726ea84600c96d522/documentation/model.plantuml)
  - The editing process is done semi-automatically.

- The Workers (see the code at this GitHub repository: https://github.com/CLARIAH/vocab-workers): Python-based applications that each perform one of the tasks that build the registry: downloading and caching the vocabulary, summarizing it, storing it in a SPARQL store, converting it to RDF, uploading it to SKOSMOS (if it is a SKOS vocabulary), documenting it, and checking whether it is also registered in other vocabulary registries.

- The interface (see the code at this GitHub repository: https://github.com/CLARIAH/vocab-registry). It has two components:
  - a) A Python layer for the API
  - b) A public interface built with React, a JavaScript library

- A vocabulary recommender (the code is currently in a private repository)

- The data (vocabulary records, cache) is temporarily stored in a private GitLab repository during the development phase: https://code.huc.knaw.nl/tsd/clariah/vocab-registry-data. In the initial phase, vocabularies were uploaded automatically from two sources: YALC and Awesome Humanities (more details forthcoming).

- The mappings for customizing the indexes in Elasticsearch are available in this repository: (forthcoming)
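
The Workers item in the list above describes a chain of tasks, one of which (uploading to SKOSMOS) applies only to SKOS vocabularies. The sketch below illustrates that idea in Python under stated assumptions: the `VocabJob` class, the step names, the `make_step` helper, and the execution order (taken from the order of the list above) are hypothetical and do not reflect the actual code in the vocab-workers repository. Each step is a placeholder that only records that it ran.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class VocabJob:
    """Hypothetical job state handed from one worker to the next."""
    identifier: str
    is_skos: bool = False
    log: List[str] = field(default_factory=list)


def make_step(name: str) -> Callable[[VocabJob], None]:
    """Build a placeholder worker that only records that it ran."""
    def step(job: VocabJob) -> None:
        job.log.append(name)
    return step


def always(job: VocabJob) -> bool:
    return True


def only_skos(job: VocabJob) -> bool:
    return job.is_skos


# (worker, condition) pairs; names and order are assumptions of this sketch.
PIPELINE: List[Tuple[Callable[[VocabJob], None], Callable[[VocabJob], bool]]] = [
    (make_step("cache"), always),
    (make_step("summarize"), always),
    (make_step("store_in_sparql"), always),
    (make_step("convert_to_rdf"), always),
    (make_step("upload_to_skosmos"), only_skos),  # SKOS vocabularies only
    (make_step("document"), always),
    (make_step("lookup_other_registries"), always),
]


def run(job: VocabJob) -> VocabJob:
    """Run every worker whose condition holds for this job, in order."""
    for step, applies in PIPELINE:
        if applies(job):
            step(job)
    return job


if __name__ == "__main__":
    print(run(VocabJob("example-skos", is_skos=True)).log)
    print(run(VocabJob("example-other")).log)
```

Pairing each worker with a condition keeps the SKOS-only step declarative; a real deployment would more likely run the workers as separate processes, which this sketch does not attempt to model.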

## 2. Set up the vocabulary workers
(forthcoming)

## 3. Set up the vocabulary registry's API and public interfaces
(forthcoming)


# REFERENCES
- Meijer, K. and Windhouwer, M. (2024). The CLARIAH FAIR Vocabulary Registry. CLARIN Annual Conference 2024 (forthcoming proceedings).
- Zeng, M.L. (2008). Knowledge Organization Systems (KOS). Knowledge Organization, 35, 160-182. https://doi.org/10.5771/0943-7444-2008-2-3-160

# FOOTNOTES
- (1) CLARIAH: Common Lab Research Infrastructure for the Arts and Humanities (https://www.clariah.nl/)
- (2) CLARIN: Common Language Resources and Technology Infrastructure (https://www.clarin.eu/)
- (3) SSHOC.nl: Digital Infrastructure for Social Sciences and Humanities (https://sshoc.nl/)

# CREDITS
- Ideation and project management: Menzo Windhouwer (lead software engineer at the KNAW Humanities Cluster)
- Ideation and development: Kerim Meijer (senior software engineer at the KNAW Humanities Cluster)
- Associate developer: Meindert Kroese (KNAW Humanities Cluster)
- Associate developer trainee and user researcher: Liliana Melgar (KNAW Humanities Cluster)
- Other project members: (forthcoming)
- This project is part of the CLARIAH infrastructure. It is funded by CLARIAH Plus and SSHOC.nl, in collaboration with DANS (forthcoming).
103 changes: 103 additions & 0 deletions documentation/cac.drawio
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
<mxfile host="Electron" modified="2024-03-14T13:02:22.223Z" agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/21.2.1 Chrome/112.0.5615.87 Electron/24.1.2 Safari/537.36" etag="spbIbyiBrksS3jnYwhjk" version="21.2.1" type="device">
<diagram name="Pagina-1" id="O36NFVqc7uo9PPjtxkhS">
<mxGraphModel dx="1434" dy="854" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="827" pageHeight="1169" math="0" shadow="0">
<root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="pIYPyGjyVhugXnKtoXwO-1" value="Editor" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="180" y="300" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-2" value="Workers" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="400" y="300" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-4" value="SPARQL" style="shape=cylinder3;whiteSpace=wrap;html=1;boundedLbl=1;backgroundOutline=1;size=15;" vertex="1" parent="1">
<mxGeometry x="300" y="190" width="60" height="80" as="geometry" />
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-5" value="CMDI" style="shape=note;whiteSpace=wrap;html=1;backgroundOutline=1;darkOpacity=0.05;spacingTop=10;" vertex="1" parent="1">
<mxGeometry x="210" y="190" width="60" height="80" as="geometry" />
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-6" value="Cache" style="shape=note;whiteSpace=wrap;html=1;backgroundOutline=1;darkOpacity=0.05;spacingTop=10;" vertex="1" parent="1">
<mxGeometry x="400" y="190" width="60" height="80" as="geometry" />
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-7" value="Docs" style="shape=note;whiteSpace=wrap;html=1;backgroundOutline=1;darkOpacity=0.05;spacingTop=10;" vertex="1" parent="1">
<mxGeometry x="490" y="190" width="60" height="80" as="geometry" />
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-8" value="Registry" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="180" y="100" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-9" value="Static file server" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="460" y="100" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-12" value="" style="endArrow=classic;startArrow=classic;html=1;rounded=0;entryX=0.5;entryY=1;entryDx=0;entryDy=0;entryPerimeter=0;" edge="1" parent="1" source="pIYPyGjyVhugXnKtoXwO-1" target="pIYPyGjyVhugXnKtoXwO-5">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="390" y="210" as="sourcePoint" />
<mxPoint x="440" y="160" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-17" value="" style="endArrow=classic;html=1;rounded=0;exitX=1;exitY=0.5;exitDx=0;exitDy=0;entryX=0;entryY=0.5;entryDx=0;entryDy=0;" edge="1" parent="1" source="pIYPyGjyVhugXnKtoXwO-1" target="pIYPyGjyVhugXnKtoXwO-2">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="390" y="210" as="sourcePoint" />
<mxPoint x="440" y="160" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-19" value="" style="endArrow=classic;html=1;rounded=0;entryX=0.5;entryY=1;entryDx=0;entryDy=0;entryPerimeter=0;exitX=0.25;exitY=0;exitDx=0;exitDy=0;" edge="1" parent="1" source="pIYPyGjyVhugXnKtoXwO-2" target="pIYPyGjyVhugXnKtoXwO-6">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="460" y="290" as="sourcePoint" />
<mxPoint x="450" y="160" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-20" value="" style="endArrow=classic;html=1;rounded=0;exitX=0.75;exitY=0;exitDx=0;exitDy=0;entryX=0.5;entryY=1;entryDx=0;entryDy=0;entryPerimeter=0;" edge="1" parent="1" source="pIYPyGjyVhugXnKtoXwO-2" target="pIYPyGjyVhugXnKtoXwO-7">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="400" y="210" as="sourcePoint" />
<mxPoint x="450" y="160" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-21" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="440" y="160" as="sourcePoint" />
<mxPoint x="440" y="160" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-22" value="" style="endArrow=classic;startArrow=classic;html=1;rounded=0;exitX=0;exitY=0;exitDx=30;exitDy=0;exitPerimeter=0;" edge="1" parent="1" source="pIYPyGjyVhugXnKtoXwO-5" target="pIYPyGjyVhugXnKtoXwO-8">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="390" y="210" as="sourcePoint" />
<mxPoint x="440" y="160" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-24" value="" style="endArrow=classic;html=1;rounded=0;exitX=0.145;exitY=0;exitDx=0;exitDy=4.35;exitPerimeter=0;entryX=0.75;entryY=1;entryDx=0;entryDy=0;" edge="1" parent="1" source="pIYPyGjyVhugXnKtoXwO-4" target="pIYPyGjyVhugXnKtoXwO-8">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="390" y="210" as="sourcePoint" />
<mxPoint x="440" y="160" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-27" value="" style="endArrow=classic;html=1;rounded=0;exitX=0;exitY=0;exitDx=45;exitDy=15;exitPerimeter=0;entryX=0.25;entryY=1;entryDx=0;entryDy=0;" edge="1" parent="1" source="pIYPyGjyVhugXnKtoXwO-6" target="pIYPyGjyVhugXnKtoXwO-9">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="390" y="210" as="sourcePoint" />
<mxPoint x="440" y="180" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-28" value="" style="endArrow=classic;html=1;rounded=0;exitX=0;exitY=0;exitDx=45;exitDy=15;exitPerimeter=0;entryX=0.75;entryY=1;entryDx=0;entryDy=0;" edge="1" parent="1" source="pIYPyGjyVhugXnKtoXwO-7" target="pIYPyGjyVhugXnKtoXwO-9">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="390" y="210" as="sourcePoint" />
<mxPoint x="440" y="160" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-29" value="" style="endArrow=classic;html=1;rounded=0;entryX=0.855;entryY=1;entryDx=0;entryDy=-4.35;entryPerimeter=0;exitX=0;exitY=0;exitDx=0;exitDy=0;" edge="1" parent="1" source="pIYPyGjyVhugXnKtoXwO-2" target="pIYPyGjyVhugXnKtoXwO-4">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="420" y="300" as="sourcePoint" />
<mxPoint x="440" y="180" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-30" value="Recommender" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="320" y="100" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="pIYPyGjyVhugXnKtoXwO-31" value="" style="endArrow=classic;html=1;rounded=0;exitX=0.855;exitY=0;exitDx=0;exitDy=4.35;exitPerimeter=0;entryX=0.5;entryY=1;entryDx=0;entryDy=0;" edge="1" parent="1" source="pIYPyGjyVhugXnKtoXwO-4" target="pIYPyGjyVhugXnKtoXwO-30">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="340" y="200" as="sourcePoint" />
<mxPoint x="280" y="170" as="targetPoint" />
</mxGeometry>
</mxCell>
</root>
</mxGraphModel>
</diagram>
</mxfile>
Binary file added documentation/cac.png