-
The current approach would be:
This would create statements from all relations, where the source of the relation is the subject, the relation itself is the predicate, and the target of the relation is the object.

If you use a local knowledge base that you build in INCEpTION, the IRIs of its concepts are auto-generated with random identifiers. If you import an externally built knowledge base, you usually have more descriptive, human-readable IRIs.

[Screenshot: Example text annotated with entities and relations]

In either case, you could merge the N-Triples statements mentioned above with your knowledge base to enrich it with information about the subjects.

[Screenshot: Example concept enriched based on the N-Triples generated from the annotations using the script snippet above, after importing the results back into the INCEpTION knowledge base]

Here is the toy project from which I generated the screenshots above.

I hope that in future versions of INCEpTION we can have a project template and an associated data exporter, so that an external script is no longer needed, at least for simple cases. Feedback and suggestions are welcome.
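To make the source/relation/target mapping concrete, here is a minimal sketch in plain Python. It is not the script from this thread. The `Span` and `Relation` classes and the `example.org` IRIs are hypothetical stand-ins for whatever you read out of your INCEpTION export, and the sketch assumes each span and relation already carries the IRI it was linked to in the knowledge base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for a span annotation linked to a KB concept."""
    text: str
    iri: str


@dataclass(frozen=True)
class Relation:
    """Hypothetical stand-in for a relation annotation linked to a KB property."""
    source: Span
    target: Span
    iri: str


def to_ntriples(relations):
    """Render one N-Triples line per relation: source, relation, target."""
    lines = []
    for rel in relations:
        # Skip relations where any of the three annotations is not linked to an IRI.
        if not (rel.source.iri and rel.iri and rel.target.iri):
            continue
        lines.append(f"<{rel.source.iri}> <{rel.iri}> <{rel.target.iri}> .")
    return "\n".join(lines)


# Made-up example data; the IRIs are placeholders, not real KB identifiers.
bus = Span("Citaro", "http://example.org/kb/Citaro")
maker = Span("Mercedes-Benz", "http://example.org/kb/MercedesBenz")
unlinked = Span("some bus", "")

relations = [
    Relation(bus, maker, "http://example.org/kb/manufacturedBy"),
    Relation(unlinked, maker, "http://example.org/kb/manufacturedBy"),
]

print(to_ntriples(relations))
```

Relations whose source, target or predicate has no IRI are dropped here. Depending on your data, you might prefer to report them instead so that unlinked annotations can be fixed.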
-
In #4549 you wrote:
Some time back, I had implemented a CAS-to-RDF binding. The RDF data produced by this looks somewhat like this;
I think it would be fairly straightforward to add this RDF binding to INCEpTION if it helps anybody (in this case, you). The format is a complete CAS <-> RDF conversion. From the perspective of the CAS, tokens and sentences are also just annotations, so those would be included in the RDF format as well. The benefit over NIF would be that this binding should be able to represent all of the annotation data from INCEpTION, not just the handful of layers supported by NIF.

That said, this RDF binding probably gives you a similar kind of access to the CAS data as your JSON-LD approach, except that things do in fact contain proper IRIs in the right places instead of plain numbers.

Now let's take a step back and consider the idea again. The idea is that we essentially want to create triples by annotating a subject and an object and linking them with a relation that serves as the predicate, and then to export these annotations in an RDF format, right?

The problem here is that in order to do this kind of annotation, you have to create a custom span layer and a custom relation layer, each with features that allow linking them against the KB. If INCEpTION instead came with a set of predefined layers for this purpose, it would probably be rather easy to implement a format that is aware of exactly these predefined layers and can write them out in a simple RDF format, excluding tokens and sentences. Does that make sense?

So I imagine we could set up a project template for, say, "statement annotation" that comes with two layers, e.g. "Resource" and "Property". Both would have a feature "iri" pointing to a knowledge base. Maybe the "Resource" layer would additionally have a feature "literal" in case it should not link to a KB but rather represent a value (e.g. a year, monetary value, measurement, etc.).
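As a rough illustration of how an exporter could treat the proposed "iri" and "literal" features, here is a small Python sketch. None of this exists in INCEpTION. The function names and the `example.org` IRIs are made up for the example, and the sketch assumes that a "Resource" in object position has either an IRI or a literal value.

```python
def escape_literal(value: str) -> str:
    """Escape the characters that must not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def object_term(iri=None, literal=None):
    """Render the object of a statement: an IRI if linked, else a plain literal."""
    if iri:
        return f"<{iri}>"
    if literal is not None:
        return f'"{escape_literal(literal)}"'
    return None  # neither feature is set


def statement(subject_iri, property_iri, object_iri=None, object_literal=None):
    """Build one N-Triples line, or None if the statement is incomplete."""
    obj = object_term(object_iri, object_literal)
    if not subject_iri or not property_iri or obj is None:
        return None
    return f"<{subject_iri}> <{property_iri}> {obj} ."


# Hypothetical example: a "Resource" used as a value rather than a KB link.
print(statement(
    "http://example.org/kb/Citaro",
    "http://example.org/kb/introducedIn",
    object_literal="1997",
))
```

The sketch writes plain string literals only. Typed literals (e.g. `xsd:gYear`) would need an additional feature or a convention for the datatype.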
That would basically be what I did in the Python script - but without the script - because with predefined layers, INCEpTION would know the semantics of the layers and could directly provide a suitable export format. What do you think is more interesting:
-
I have a personal project where I publish a large number of scanned PDF documents containing information from the public transportation industry (technical details of buses, manufacturers, etc.).
What I want to do is to publish all that information as RDF Linked Data. Short of manually copying and pasting to Turtle, or something, I was looking for a PDF annotation tool which would let me label entities, their relations and properties in a form close enough that I can more easily produce RDF in my desired ontology/structure.
INCEpTION comes very close, but I failed to find the right functionality for exporting the annotations in a way that is useful to me. What are my options?