Dealing with changes in identifiers over time, adding provenance for local IDs #33

tr325 · 2020-05-27T13:57:51Z

Identifier fields are (rightly) required throughout. However, at the start of a project a Dataset will not yet have a global identifier like a DOI, so implementing systems will need to provide a local identifier with type other. Similarly, if researchers have not yet signed up for an ORCID iD, a local ID will have to be supplied.

For these local identifiers to be useful when systems integrate, it would be helpful if their provenance could be recorded, possibly as another entry in each of the something_id fields, with cardinality 0..1. Tracking the provenance of each identifier individually will enable integrating systems to re-use externally generated identifiers in the document rather than reassign them (for example, if a DMP is constructed from multiple cooperating systems).

As the DMP should be a living document, local IDs should probably then be replaced with global ones when they become available (e.g. when a DOI is created for a dataset). If an integrating system needs to track these changes and match the IDs, the record history of the DMP could be used to do so.
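To make the replacement step concrete, here is a minimal Python sketch. It assumes an identifier object with type and identifier keys plus a hypothetical provenance key (not part of the standard as it stands); the function name and the DOI value are placeholders for illustration only.

```python
from typing import Optional


def promote_identifier(current_id: dict, doi: Optional[str]) -> dict:
    """Return the identifier object to publish in the DMP.

    While no DOI exists, the local identifier (including its hypothetical
    "provenance" entry) is kept unchanged. Once a DOI is available, it
    replaces the local identifier entirely.
    """
    if doi is None:
        return dict(current_id)  # copy, so the caller's object is not shared
    return {"type": "doi", "identifier": doi}


# Local identifier issued by a system called "system_a" (illustrative name).
local = {"type": "other", "provenance": "system_a", "identifier": "123"}

before = promote_identifier(local, None)
after = promote_identifier(local, "https://doi.org/10.1234/placeholder")
```

Note that the sketch drops the local ID once the DOI exists, so matching old and new IDs would rely on the DMP's record history, as described above.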


briri · 2020-05-27T16:21:30Z

➕ for this. We have been thinking about it as well. It would also be nice to supply these system-specific identifiers when they call the API in the event that they do not have the time or resources to update their code to capture things like ORCIDs, etc. (perhaps helping lower the bar to adoption of this standard).

I was originally leaning towards something like:

  {
    "type": "other",
    "provenance": "system_a",
    "identifier": 123
  }

It could also be possible to supply the name of the system in the type attribute:

  {
    "type": "system_a",
    "identifier": 123
  }
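A consuming system could treat the two shapes the same way. The following Python sketch is only an illustration: the STANDARD_TYPES set is my assumption about the allowed type values, and split_provenance is a hypothetical helper, not anything defined by the standard.

```python
# Assumed set of type values defined by the standard; adjust as needed.
STANDARD_TYPES = {"handle", "doi", "ark", "url", "other"}


def split_provenance(id_obj: dict) -> tuple:
    """Return (provenance, identifier) for either proposed shape.

    Shape 1: {"type": "other", "provenance": "system_a", "identifier": 123}
    Shape 2: {"type": "system_a", "identifier": 123}
    Identifiers with a standard type and no provenance yield None.
    """
    if "provenance" in id_obj:
        return id_obj["provenance"], id_obj["identifier"]
    if id_obj["type"] not in STANDARD_TYPES:
        # Shape 2: the type attribute carries the system name.
        return id_obj["type"], id_obj["identifier"]
    return None, id_obj["identifier"]


shape_1 = split_provenance({"type": "other", "provenance": "system_a", "identifier": 123})
shape_2 = split_provenance({"type": "system_a", "identifier": 123})
global_id = split_provenance({"type": "doi", "identifier": "10.1234/placeholder"})
```

The second shape is more compact, but it means a consumer has to know the full list of standard types to tell a system name apart from a type.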

briri · 2020-05-27T16:37:16Z

If we allow for this, we could do other interesting things as well, like providing a callback URL for updates. Something along the lines of the HATEOAS pattern.

For example, a researcher creates a DMP in some tool and designates a specific repository. The DMP system sends the maDMP JSON to the repository system along with the callback URL. The repository system (at some point in the future) receives a dataset from the researcher. The repository system could then use that callback URL to send the DMP system the dataset's DOI.
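The flow above can be sketched in Python as an in-memory simulation with no real HTTP. Everything named here is hypothetical: the callback_url key, the URL layout, the class and method names, and the DOI value are assumptions for illustration.

```python
class DmpSystem:
    """Holds DMPs and accepts identifier updates via a callback."""

    def __init__(self):
        self.dmps = {}

    def create_dmp(self, dmp_id: str, local_dataset_id: str) -> dict:
        self.dmps[dmp_id] = {
            "dataset_id": {"type": "other", "identifier": local_dataset_id}
        }
        # Message sent to the repository: the DMP plus a callback URL.
        return {
            "dmp": self.dmps[dmp_id],
            "callback_url": f"https://dmp.example.org/dmps/{dmp_id}/dataset_id",
        }

    def handle_callback(self, url: str, doi: str) -> None:
        # The DMP id is the second-to-last path segment of the callback URL.
        dmp_id = url.rstrip("/").split("/")[-2]
        self.dmps[dmp_id]["dataset_id"] = {"type": "doi", "identifier": doi}


class Repository:
    """Remembers callback URLs until the dataset is deposited."""

    def __init__(self):
        self.pending = {}

    def register(self, message: dict) -> None:
        local_id = message["dmp"]["dataset_id"]["identifier"]
        self.pending[local_id] = message["callback_url"]

    def deposit(self, local_id: str, doi: str, send) -> None:
        # "send" stands in for an HTTP request to the callback URL.
        send(self.pending.pop(local_id), doi)


dmp_system = DmpSystem()
repository = Repository()

repository.register(dmp_system.create_dmp("42", "local-123"))
repository.deposit("local-123", "10.1234/placeholder", dmp_system.handle_callback)
```

In a real integration the send step would be an authenticated request, and the DMP system would need to decide whether to keep the old local ID alongside the DOI.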

cpina · 2021-01-19T11:38:19Z

I just joined the call earlier on, sorry if I'm missing some context from the wider project on this ticket...

Question: should identifiers expire? But still be recorded for data provenance?

E.g. have a primary identifier and also previous (unused?) identifiers. Or each identifier to have a timespan (created_date, deleted_date... or created_date, replaced_date, replaced_by). I've just found a similar approach: #34 (comment)
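As a rough Python sketch of the timespan idea: the field names created_date, replaced_date and replaced_by come from the suggestion above, while the class, the helper functions and the example values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class IdentifierRecord:
    type: str
    identifier: str
    created_date: date
    replaced_date: Optional[date] = None
    replaced_by: Optional[str] = None


def replace_identifier(history: List[IdentifierRecord], new_type: str,
                       new_identifier: str, when: date) -> None:
    """Retire every still-active identifier and append the new one."""
    for record in history:
        if record.replaced_date is None:
            record.replaced_date = when
            record.replaced_by = new_identifier
    history.append(IdentifierRecord(new_type, new_identifier, when))


def primary(history: List[IdentifierRecord]) -> Optional[IdentifierRecord]:
    """The primary identifier is the one that has not been replaced."""
    active = [r for r in history if r.replaced_date is None]
    return active[-1] if active else None


history = [IdentifierRecord("other", "local-123", date(2020, 5, 27))]
replace_identifier(history, "doi", "10.1234/placeholder", date(2021, 1, 19))
current = primary(history)
```

Previous identifiers stay in the list for provenance, while only the unreplaced one is treated as primary.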

paulwalk · 2021-04-19T14:34:58Z

I think that there is a fundamental issue to be discussed first:

Should the DMP standard attempt to convey "history" or, rather, simply be a mechanism for conveying the current known state of the DMP?

If we decide that we want to convey a historic record of DMP changes too, then we need to model this very carefully, and we need to anticipate a significant growth in complexity. Provenance might need to be recorded for anything that could change (note, not only IDs).

I strongly recommend that we do not assume a need to record revision history, without carefully considering the consequences.

TomMiksa mentioned this issue May 28, 2020

Datasets with multiple and no (yet) IDs #34

Open

hmpf mentioned this issue Jun 3, 2020

grant_id should not be required #29

Closed

TomMiksa assigned peterneish, paulwalk and TomMiksa Aug 28, 2020

TomMiksa added the decision (Decision to be taken that aligns the approach) and duplicate labels Aug 28, 2020
