What purpose does the "File" data type in the source selection page now serve? #295

Open
atomrab opened this issue Jul 20, 2020 · 7 comments

@atomrab

atomrab commented Jul 20, 2020

In the current version of the client, I can identify two pathways to end up with a PeriodO collection file: by creating a backup for a local collection, and by downloading the JSON for a period definition or authority. I am very glad to have the first, which makes me feel much less worried about my cache being cleared, and I've successfully loaded a backup file. But a backup file only loads through the "restore backup" button, and won't validate as a PeriodO collection if you try to load it as a "File" data type. Neither will a JSON download of an authority or period from the client. The only other files that I can think of that might be out there are file exports from the old client, or saved patches -- but neither of those will validate as the "File" data type, either.

Should we plan for a workflow where users share their local data sources by creating backups, which recipients then load through the "restore backup" function? In that case, we should probably drop the "File" data type, unless the idea is to eventually provide a template people can use to adapt existing JSON to the PeriodO model. Or is there a plan for another export format that will validate as a PeriodO data source when loaded as a file?

@ptgolden
Member

You're right, it is confusing to have both that and a "restore from backup" function. The backup storage format is different from a plain dataset because it preserves the full history of changes to a local data source, but the "restore" function should still work with a valid dataset. Ryan and I talked about having that be the case after you raised #278, but I forgot to make a note of it. Thanks for reminding me.

There are a couple things to think about, though...

  • Should we keep the read-only file-type data source? Or should it be the case that you just load a backed-up data source (whether it's a "true backup" including history, or just a plain dataset) and it's editable?

  • If we don't keep that read-only type, I think we should probably just change the backend selection to a radio button instead of a select, since there will only be two options.

  • Or, I could get rid of the "restore from backup" section, and just make "restore from backup" one of the options in the "type" dropdown.

What do you both think?

@atomrab
Author

atomrab commented Jul 21, 2020

I like having the read-only version; it's not that it's confusing about which option to select, but that there currently doesn't seem to be any way to generate a file that can be loaded as the "File" datatype. You can't load an old file, you can't load a backup, you can't load a patch, and you can't load an export. So what can you load here? If the answer is "nothing", then we should definitely get rid of it. But I'm worried that I'm missing something -- that this is still important for some other reason I've overlooked.

My main use-case for the "File" datatype was checking datasets created by students or contributors before they were submitted as patches. But that was back when the patch review interface was less elegant, and I haven't really needed this since -- when I do, it usually involves editing. So I'm fine with using "restore backup" to pass local collections around for review/feedback. In some ways, it's even better, because then the original provenance would be preserved if I make a couple of minor edits and then submit the patch myself (right?).

I'm fine with either the radio button or the inclusion of "restore from backup" as a type.

@rybesh
Member

rybesh commented Jul 23, 2020

> there currently doesn't seem to be any way to generate a file that can be loaded as the "File" datatype

This is true as far as the client goes. But if you download the PeriodO dataset from http://n2t.net/ark:/99152/p0d.json you get a JSON file that could be loaded as a file data source.

I guess the question is what we want to result from loading from a file. Restoring from a backup creates or restores an editable in-browser database. Loading a dataset file as a data source creates a read-only browsable data source. Do we need that distinction? I think Patrick and I had discussed eliminating the distinction, and just having an option, when creating an in-browser data source, to populate it either from a database backup file (includes history) or from a dataset file (no history).

But if we did that it would also eliminate the concept of "read-only data source from a file" (and we would just have web data sources or local in-browser data sources).

Adam, could you say a bit more about why you like having a read-only data source?

@atomrab
Author

atomrab commented Jul 23, 2020

Actually, I never really did. It was just what we had to share local collections in the early days. I usually had to import the file into a new local IDB to make necessary changes, and then send that back as a file to the submitter, who then had to import that file into their own new local IDB to submit it again. I'm perfectly happy to use database backup files for these transactions; it would actually simplify things, because someone could send me one, I could fix it and send it back, and then they could submit it, without all the extra importing.

Especially if the only JSON file that can be loaded as read-only is the PeriodO dataset itself, I don't see why we need the read-only-data-source-from-a-file option at this point, unless we are considering a use-case where someone desperately needs to have a local copy of the whole dataset for offline work but is also going to carelessly change things in it all over the place and then submit the whole thing as a giant patch accidentally. Right?

@atomrab
Author

atomrab commented Jul 23, 2020

I think the question still stands, though: if we allow an in-browser data source to be populated from a dataset file, is there any way to generate such a file, apart from downloading the entire PeriodO dataset?

@rybesh
Member

rybesh commented Jul 23, 2020

I think what we're proposing is that we

  1. Remove the concept of a "file data source"
  2. Have an option when creating a local data source to populate it from a file
  3. Accept either a backup file (with history) or a dataset file (no history) as that file.

The client can create backup files but not dataset files.
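
The proposal above could be sketched in TypeScript roughly as follows. This is only an illustration under stated assumptions: the thread does not specify either file format, so the `dataset` and `patches` keys of a backup file, the top-level `authorities` key of a dataset file, and the names `DatasetFile`, `PopulateSource`, and `classifyFile` are all hypothetical, not the client's actual code.

```typescript
// ASSUMPTION: a dataset file is a JSON object with a top-level "authorities" map.
type DatasetFile = { authorities: Record<string, unknown> };

// What a new local (in-browser) data source would be populated with.
// "backup" carries history; "dataset" starts with an empty history.
interface PopulateSource {
  kind: "backup" | "dataset";
  dataset: DatasetFile;
  history: unknown[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isDatasetFile(v: unknown): v is DatasetFile {
  return isRecord(v) && isRecord(v.authorities);
}

// Decide whether an uploaded file is a backup or a plain dataset.
// ASSUMPTION: a backup wraps the dataset under "dataset" and stores its
// change history as an array under "patches" (hypothetical key names).
function classifyFile(text: string): PopulateSource {
  const parsed: unknown = JSON.parse(text);
  if (isRecord(parsed)) {
    const { dataset, patches } = parsed;
    if (isDatasetFile(dataset) && Array.isArray(patches)) {
      return { kind: "backup", dataset, history: patches };
    }
  }
  if (isDatasetFile(parsed)) {
    return { kind: "dataset", dataset: parsed, history: [] };
  }
  throw new Error("Not a recognized backup or dataset file");
}
```

With something like this, the source selection page would need only one file input: either kind of file yields an editable local data source, and the only difference is whether any history comes along with it.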

@atomrab
Author

atomrab commented Jul 24, 2020

I'm fine with this. So the use-case for a dataset file would be a PeriodO-compatible JSON file that a user had made independently? Do we provide a template or format that would explain how such a file should be structured? If not, does it make sense to do so? I'm thinking of something like the old Pelagios cookbook, or the Linked Places format the WHG is using.


3 participants