index nanopubs with species interaction claims on BDJ via KnowledgePixels #923
fyi @tkuhn - any ideas to help index this? |
https://np.petapico.org/RAOgLBuvJRusIKPJyhXbx7sMI1aKj_AI0l1oG6XXsO4pU
|
See related publications - Many of which are included in iNaturalist https://globalbioticinteractions.org/datasets available via https://depot.globalbioticinteractions.org/reviews/globalbioticinteractions/inaturalist/nanopub.trig.gz . |
Here's an example from the existing inaturalist nanopub collection:
with result.trig being:
with associated iNaturalist observation at https://www.inaturalist.org/observations/28845843 With GloBI's view - |
Another example of GloBI generated nanopubs -
|
With "index" you mean to use these nanopublications in tools like https://www.globalbioticinteractions.org? Or the other way round, to feed the nanopublications you are already generating into the nanopub ecosystem? Or both? :) |
Ideally both. And, to get started, I'd want to be able to find nanopublications in Pensoft associated resources that make statements involving species interactions (or biotic interactions). In other words:
|
@tkuhn curious to hear your thoughts on the GloBI<>Pensoft Nanopub integration proposal . |
Yes, makes sense to first focus on the Nanopub>GloBI direction. You can, for example, get all taxon-taxon relations according to the templates used at Pensoft with this API call (add It executes in the back this SPARQL query: https://github.com/knowledgepixels/bdj-nanopub-api/blob/main/get-taxontaxon-nanopubs.rq Does that match what you are looking for? We can adjust the query to get out what you need, basically. That would cover points 1 and 2 in your list, as nanopublications are already versioned with their hash-based Trusty URI (unless the GloBI index requires a different kind of versioning). The above API call returns all nanopublications that match the template as defined for the Pensoft pilot. But these nanopublications are otherwise not necessarily associated with Pensoft. Some will be reviewed and formally accepted at Pensoft (coming soon), others might be associated with upcoming submissions there, and others might have no link to Pensoft other than using the same or similar nanopub structure. |
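The point about hash-based Trusty URIs doubling as version identifiers could be sketched as follows. This is only an illustration, not GloBI or Knowledge Pixels code: it assumes the artifact code is the trailing `RA` prefix plus 43 base64url characters (as in the nanopub links in this thread), and the helper names `artifact_code` and `unseen` are hypothetical.

```python
import re
from typing import Iterable, Iterator, Optional, Set

# Assumption: the artifact code is "RA" followed by 43 base64url characters
# at the very end of the nanopub URI.
TRUSTY_CODE = re.compile(r"(RA[A-Za-z0-9_-]{43})$")


def artifact_code(nanopub_uri: str) -> Optional[str]:
    """Return the trailing artifact code of a nanopub URI, or None if absent."""
    match = TRUSTY_CODE.search(nanopub_uri.strip())
    return match.group(1) if match else None


def unseen(uris: Iterable[str], indexed_codes: Set[str]) -> Iterator[str]:
    """Yield only the URIs whose artifact code has not been indexed yet."""
    for uri in uris:
        code = artifact_code(uri)
        if code is not None and code not in indexed_codes:
            yield uri
```

Because the code is derived from the content, an indexer that remembers the codes it has already seen would not need a separate version number to detect new nanopublications.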
@tkuhn thanks for your prompt reply and detailed examples. Am eager to get this integration up and running. I tried the api call https://grlc.petapico.org/api-git/knowledgepixels/bdj-nanopub-api/get-taxontaxon-nanopubs in a web browser and got a timeout. Is that expected? |
Yeah, sorry, the petapico.org server keeps misbehaving. I should have given you this one: https://grlc.knowledgepixels.com/api-git/knowledgepixels/bdj-nanopub-api/get-taxontaxon-nanopubs |
@tkuhn thanks! I got some results now - |
A first pass at index configuration is available via https://github.com/globalbioticinteractions/knowledgepixels . For some reason, the URL https://grlc.knowledgepixels.com/api-git/knowledgepixels/bdj-nanopub-api/get-taxontaxon-nanopubs.csv returns 200 OK in curl and in a web browser. However, when using elton's HTTP client, the server responds with a 500 error. What can I do to help troubleshoot this likely HTTP request header issue? |
See related issue globalbioticinteractions/knowledgepixels#1 . |
Hmm, I don't know what could be the reason that the elton client is getting a 500 error. I am seeing some Can you maybe see the precise request with HTTP headers that elton is sending? |
As far as I can tell, the request header is empty - As obtained via Line 14 in f215052
|
@tkuhn any update on why knowledgepixel crashes on retrieval of content via Elton? https://grlc.knowledgepixels.com/api-git/knowledgepixels/bdj-nanopub-api/get-taxontaxon-nanopubs.csv As far as I can tell, Elton works just fine with non-knowledgepixel servers. I suspect a server-side processing error of sorts. Can you confirm? |
It seems to occur when the Accept header is missing. Is it possible to add `Accept: */*` on your side? On my side this is based on the third-party tool grlc, and I am not sure how easy it would be to change this behavior... |
@tkuhn Thanks for having a look and I'll make an effort to add the accept header. Perhaps worth a bug report for grlc ? Seems weird to have things crash on a missing Accept header. |
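For anyone hitting the same 500 error from another client, the workaround discussed above (sending an explicit `Accept: */*` header) might look like this in Python. This is a sketch using the standard library only; it is not elton's actual HTTP client code, and the helper names `build_request` and `fetch` are made up for illustration.

```python
from urllib.request import Request, urlopen

ENDPOINT = (
    "https://grlc.knowledgepixels.com/api-git/knowledgepixels/"
    "bdj-nanopub-api/get-taxontaxon-nanopubs.csv"
)


def build_request(url: str = ENDPOINT) -> Request:
    """Build a GET request that always carries an explicit Accept header."""
    return Request(url, headers={"Accept": "*/*"})


def fetch(url: str = ENDPOINT, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw response body."""
    with urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Keeping the request construction separate from the network call makes it easy to inspect exactly which headers are sent, which is the information that was needed to diagnose the issue here.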
related grlc issue - CLARIAH/grlc#402 . |
Now, with elton v0.12.12, I can do -
to produce a table (abbreviated for display)
|
and, following the nanopub . . . @tkuhn very cool! via
got
or via
got
|
About the structure of the assertion . . . I was unable to find the definition of some of the biolink terms (e.g.,
|
Also, I noticed that on https://grlc.knowledgepixels.com/api-git/knowledgepixels/bdj-nanopub-api/get-taxontaxon-nanopubs , there's 2 claims with "source" DOIs. Others appear to be empty. However, in the nanopub assertion there's a neat statement for at least one of the DOI-less ones. e.g., from http://purl.org/np/RAlfH1ba32D9P9ODv0_pade_yQ8zmPmjPY6CxEXqyj_N8
but for some reason, https://doi.org/10.3897/zookeys.1181.107496 did not appear in the table version. Curious to hear your thoughts on keeping the table version integration, or perhaps let GloBI be a little more "semantic". I can see benefits either way. . . curious to hear your thoughts on how to let this integration mature. |
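One way to spot the DOI-less claims mentioned above in the table version is to filter the CSV on the "source" column. The sketch below is an assumption-laden illustration: the `np` column name and the second sample row are hypothetical placeholders, and only the first nanopub link is taken from this thread.

```python
import csv
import io
from typing import Dict, List


def rows_without_source(csv_text: str, source_column: str = "source") -> List[Dict[str, str]]:
    """Return the rows whose source column is missing or blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if not (row.get(source_column) or "").strip()]


# Hypothetical sample: column names and the second row are placeholders.
SAMPLE = (
    "np,source\n"
    "http://purl.org/np/RAlfH1ba32D9P9ODv0_pade_yQ8zmPmjPY6CxEXqyj_N8,\n"
    "http://example.org/np/placeholder,https://doi.org/10.0000/placeholder\n"
)

missing = rows_without_source(SAMPLE)
```

Rows flagged this way could then be resolved against the nanopub assertion itself, which is where the more "semantic" route would pick up the publication link.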
This seems to be an error at Biolink. This term is defined here: https://biolink.github.io/biolink-model/OrganismTaxonToOrganismTaxonAssociation/ And this page includes a link to the URI you mention above, leading to the same 404 error. We are using some other terms we had to define in our own namespace, a number of which we still have to properly define. So you might run into some undefined ones that are on us. |
@tkuhn thanks for clarifying the background of the biolink term URIs used. Note that a first version of the knowledgepixel species interaction claims have been indexed by GloBI for instance, http://purl.org/np/RA7rvl83o5zWgX7xANozswLg2EQy9EpDDsZ-nACu2OYkc
, was indexed as Note that name resolving of taxon ids like @tkuhn What was your motivation to point to https://www.checklistbank.org/dataset/9880/taxon/6H9MK ? It appears that the taxon link points to a specific version of a taxonomic checklist instead of pointing to some (aspirational) persistent identifier. |
@tkuhn Also, I noticed that you are using the relation "trophically interacts with" . I would expect a more directional relationship like "eats". Can you please elaborate? |
We chose checklistbank.org because it seemed to be the closest to a universal repository with larger coverage than just catalogueoflife.org, we could get an export that we can use for autocomplete, and we have to deal with just one URI scheme. These URIs seemed as persistent as the catalogueoflife.org ones, but I might be wrong about that. Ultimately we followed the advice by @lyubomirpenev. |
I didn't choose it, as it's not my nanopublication :). The given user chose it from a dropdown, which also had "eats" as an option. |
That's the template that the user of the nanopublication above (http://purl.org/np/RA7rvl83o5zWgX7xANozswLg2EQy9EpDDsZ-nACu2OYkc) used to publish it: https://nanodash.petapico.org/publish?template=http://purl.org/np/RAh16oLqLJKo8I8R2CebR1n8Dwv95KL_H-azFfGt2FGW0 |
Ha! Perhaps the option should be removed as it doesn't really imply any directionality according to definition: "ObjectProperty: trophically interacts with Perhaps this is more of an organizing term, rather than one that should be used directly. Any chance I can convince you (or others) to remove this from the dropdown? |
I tend to agree. We selected the terms based on a hand-curated list by Pensoft, so I am not sure whether they had a good reason to have it on the list. The full list in nanopub format is here (in the assertion): http://purl.org/np/RAodaWZBY-yDEtl9reYazBfI-YVD5L4zPh8RrVFS0kbEo This is where the dropdown gets its values from. Are there any other candidates for removal? @lyubomirpenev, any objections against removing "trophically interacts with"? |
Most important reason: The ChecklistBank one-stop URL is richer in valid
names (includes nomenclators!) than the latest version of CoL. Looking
into the future, it will be ChecklistBank - e.g. via something like
"Catalogue of Life Expanded" - which will provide the most complete list
of names with their IDs, unless CoL changes its policies to accept all
validly published names immediately upon publication (which is unlikely
to happen in the foreseeable future).
While CoL is actually a well-curated taxonomic database, CoL Expanded (a
new feature of ChecklistBank) should be the place to look for the most
complete list of validly published names, including the Catalogue of
Life itself.
--
Pensoft logo <https://pensoft.net>
Prof. Dr. Lyubomir Penev
Managing Director
Pensoft Publishers <https://pensoft.net>
Phone: +359-2-8704281
12 Prof. Georgi Zlatarski Street
1700 Sofia, Bulgaria
<https://twitter.com/Pensoft> <https://www.facebook.com/Pensoft/>
<https://www.linkedin.com/company/pensoft-publishers/> Blog
<https://blog.pensoft.net>
|
Thanks for elaborating. I consider checklist bank more of a registry of checklists, than some kind of authoritative taxonomic resource. So, yes, the coverage is likely to feel more complete, and you might select uncurated checklists. Also, the checklists registered with checklist bank are updated, and they'll get a new URI for each version. Perhaps a good point of discussion: how to link names to taxonomic resources such that they provide a bridge to the many taxonomic resources associated with that particular name? Right now, it appears as if the user specifically picked the checklist, whereas the actual selection was automated using some algorithm. |
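To make the concern about checklist-specific links concrete, the taxon URL discussed above can be split into its parts. The sketch below assumes the `/dataset/<key>/taxon/<id>` path layout seen in the one example link from this thread; the function and type names are hypothetical and this is not an official ChecklistBank client.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class ChecklistBankTaxon(NamedTuple):
    dataset_key: str  # identifies one specific checklist in ChecklistBank
    taxon_id: str     # identifier of the taxon within that checklist


def parse_checklistbank_url(url: str) -> Optional[ChecklistBankTaxon]:
    """Split a ChecklistBank taxon URL into dataset key and taxon id."""
    parts = urlparse(url)
    if parts.netloc not in {"www.checklistbank.org", "checklistbank.org"}:
        return None
    segments = [s for s in parts.path.split("/") if s]
    # Assumed layout: /dataset/<key>/taxon/<id>
    if len(segments) == 4 and segments[0] == "dataset" and segments[2] == "taxon":
        return ChecklistBankTaxon(segments[1], segments[3])
    return None
```

For https://www.checklistbank.org/dataset/9880/taxon/6H9MK this yields dataset key `9880` and taxon id `6H9MK`, which shows that the link carries a reference to one particular checklist alongside the name itself.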
"Trophically interacts with" may have a much wider biological meaning
than just "eats". We need to discuss the whole list before we start
removing or adding anything there.
|
I agree that a discussion on how to represent the terms would be in order before handpicking them. My main concern is that folks may be picking the more convenient option for some reason (closest in the drop-down, more familiar language) instead of picking the one that best aligns with what an author may have expressed in the annotated work. Perhaps some usage examples may help folks to decide what to choose? (e.g., guidelines - be as specific as you can be, using directional interaction types when suitable). |
@lyubomirpenev thanks for elaborating
Yes, and I assume that there's many possible matches within checklist bank for a single name. I wonder how to make sure that a single name URL is not interpreted as "the" single name URL. |
This can be ensured by the versioning history which, as far as I know, is maintained by CLB. Otherwise, the basic idea of having "stable CLB PIDs" for names that are or will be linked to the COL canonical name PIDs is the feature we want to develop in BiCIKL+ (if successful). There is a definite need for such a service, and CLB knows about that. In the end, it is better to have an ID for a name based on an ID of the taxon name usage in a particular ID, than to have neither an ID nor a reference link for a name.
|
Here are some responses by Andrey Frolov, who created the nanopublications we discussed above ("trophically interacts with") and who gave me permission to paste his explanation for the choice here:
|
@tkuhn thanks for sharing Andrey Frolov's response. As I was reading it, I was wondering whether it would have helped to use "body part" as "feces" . Or perhaps be more descriptive like: beetle eats the feces produced by a lemur, where feces is an abiotic thing, and "producing" would be some other verb/predicate. Great how these nanopublications help to facilitate these discussions! |
species interaction claims on BDJ via KnowledgePixels are actively indexed by GloBI. See e.g., |
via https://bdj.pensoft.net/nanopublications