<!DOCTYPE html>
<html>
<head>
<title>eprinttools - TODO.html</title>
<link href='https://fonts.googleapis.com/css?family=Open+Sans' rel='stylesheet' type='text/css'>
<link rel="stylesheet" href="/css/site.css">
</head>
<body>
<header>
<a href="http://library.caltech.edu" title="link to Caltech Library Homepage"><img src="/assets/liblogo.gif" alt="Caltech Library logo"></a>
</header>
<nav>
<ul>
<li><a href="/">Home</a></li>
<li><a href="README.html">README</a></li>
<li><a href="LICENSE">LICENSE</a></li>
<li><a href="install.html">INSTALL</a></li>
<li><a href="user-manual.html">User Manual</a></li>
<li><a href="search.html">Search Docs</a></li>
<li><a href="about.html">About</a></li>
<li><a href="https://github.com/caltechlibrary/eprinttools">GitHub</a></li>
</ul>
</nav>
<section>
<h1 id="action-items">Action items</h1>
<h2 id="next">Next</h2>
<p>Transitional support for feeds</p>
<ul class="task-list">
<li><label><input type="checkbox" />Simplified record (Invenio-RDM
record) to EPrint XML/EPrint JSON</label>
<ul class="task-list">
<li><label><input type="checkbox" />implement libdataset for v2 dataset
collection</label></li>
<li><label><input type="checkbox" />convert feeds’ datasets to v2 ds
collections</label></li>
<li><label><input type="checkbox" />CaltechDATA records are a good test
case for mapping Invenio-RDM to JSON that will drive EPrints templates
for feeds (v1 of feeds is EPrints centric)</label></li>
<li><label><input type="checkbox" />I need to render EPrint XML as
Invenio-RDM records (simplified records) first and load them into the test
instance of CaltechAUTHORS running as an RDM-based repository</label></li>
<li><label><input type="checkbox" />I can then take the Invenio-RDM
records returned and turn those into EPrint XML/JSON for integration
into existing feeds</label></li>
</ul></li>
</ul>
<p>This is for the simplified eprinttools codebase.</p>
<ul class="task-list">
<li><label><input type="checkbox" checked="" />Make a list of the
missing page types I need to render to match feeds</label></li>
<li><label><input type="checkbox" />Render json types</label>
<ul>
<li>Misc. JSON files
<ul class="task-list">
<li><label><input
type="checkbox" />caltechauthors-grid.json</label></li>
<li><label><input type="checkbox" />caltechdata-grid.json</label></li>
<li><label><input type="checkbox" />caltechthesis-grid.json</label></li>
<li><label><input type="checkbox" />directory_info.json (only found for
people)</label></li>
<li><label><input type="checkbox" checked="" />index.json</label></li>
<li><label><input type="checkbox" />pagefind-entry.json</label></li>
<li><label><input type="checkbox" checked="" />group.json</label></li>
<li><label><input type="checkbox"
checked="" />group_list.json</label></li>
<li><label><input type="checkbox" />people.json</label></li>
<li><label><input type="checkbox"
checked="" />people_list.json</label></li>
</ul></li>
<li>CaltechTHESIS derived
<ul class="task-list">
<li><label><input type="checkbox" />advisor-bachelors.json (needed in
groups only)</label></li>
<li><label><input type="checkbox" />advisor-combined.json (needed in
groups only)</label></li>
<li><label><input type="checkbox" />advisor-engd.json (needed in groups
only)</label></li>
<li><label><input type="checkbox" />advisor-masters.json (needed in
groups only)</label></li>
<li><label><input type="checkbox" />advisor-other.json (needed in groups
only)</label></li>
<li><label><input type="checkbox" />advisor-phd.json (needed in groups
only)</label></li>
<li><label><input type="checkbox" />advisor-senior_major.json (needed in
groups only)</label></li>
<li><label><input type="checkbox" />advisor-senior_minor.json (needed in
groups only)</label></li>
<li><label><input type="checkbox" checked="" />advisor.json
(person/group)</label></li>
<li><label><input type="checkbox" />bachelors.json
(person/group)</label></li>
<li><label><input type="checkbox" />engd.json
(person/group)</label></li>
<li><label><input type="checkbox" />masters.json
(person/group)</label></li>
<li><label><input type="checkbox" />phd.json (person/group)</label></li>
<li><label><input type="checkbox" />senior_major.json
(person/group)</label></li>
<li><label><input type="checkbox" />senior_minor.json
(person/group)</label></li>
</ul></li>
<li>CaltechAUTHORS derived
<ul class="task-list">
<li><label><input type="checkbox" />combined.json (from CaltechAUTHORS),
should be renamed combined_authors.json</label></li>
<li><label><input type="checkbox" />pub_types.json</label></li>
<li><label><input type="checkbox" />article.json</label></li>
<li><label><input type="checkbox" />audiovisual.json</label></li>
<li><label><input type="checkbox" />book.json</label></li>
<li><label><input type="checkbox" />book_section.json</label></li>
<li><label><input type="checkbox" />collection.json</label></li>
<li><label><input type="checkbox" />conference_item.json</label></li>
<li><label><input type="checkbox" />dataset.json</label></li>
<li><label><input type="checkbox" />image.json</label></li>
<li><label><input
type="checkbox" />interactiveresource.json</label></li>
<li><label><input type="checkbox" />model.json</label></li>
<li><label><input type="checkbox" />monograph.json</label></li>
<li><label><input type="checkbox" />object_types.json</label></li>
<li><label><input type="checkbox" />patent.json</label></li>
<li><label><input type="checkbox" />software.json</label></li>
<li><label><input type="checkbox" />teaching_resource.json</label></li>
<li><label><input type="checkbox" />text.json</label></li>
<li><label><input type="checkbox" />thesis.json</label></li>
<li><label><input type="checkbox" />video.json</label></li>
<li><label><input type="checkbox" />workflow.json</label></li>
</ul></li>
<li>CaltechDATA
<ul class="task-list">
<li><label><input type="checkbox" />combined_data.json</label></li>
<li><label><input type="checkbox" />software.json</label></li>
<li><label><input type="checkbox" />data.json</label></li>
<li><label><input type="checkbox" />teaching_resource.json</label></li>
<li><label><input type="checkbox" />data_object_types.json</label></li>
<li><label><input type="checkbox" />data_pub_types.json</label></li>
<li><label><input type="checkbox" />data_types.json</label></li>
<li><label><input type="checkbox" />image.json</label></li>
</ul></li>
</ul></li>
<li><label><input type="checkbox" />Render keys types</label></li>
<li><label><input type="checkbox" />Render Markdown types</label></li>
<li><label><input type="checkbox" />Render include types</label></li>
<li><label><input type="checkbox" />Render BibTeX types</label></li>
<li><label><input type="checkbox" />Render RSS types</label></li>
<li><label><input type="checkbox" />Make a list of the new page types
reflecting use of simplified records</label></li>
</ul>
<h2 id="bugs">Bugs</h2>
<ul class="task-list">
<li><label><input type="checkbox" />ep3datasets renders EPrints JSON
objects without primary_object being set</label></li>
<li><label><input type="checkbox" checked="" />Cleanup eprint content
for public views</label>
<ul class="task-list">
<li><label><input type="checkbox" checked="" />Does sanitization happen
at rendering of JSON/Markdown documents or when harvesting the
content?</label>
<ul>
<li>Sanitization happens when we render content; this lets us do one
harvest for both dark and public archives</li>
</ul></li>
<li><label><input type="checkbox" checked="" />Cleanup email content
fields</label></li>
<li><label><input type="checkbox" checked="" />Remove
“notes”</label></li>
</ul></li>
<li><label><input type="checkbox" />Still debugging mapping the
advisor_id, thesis_id and authors_id to person_id for aggregation tables
and people feed generation</label>
<ul>
<li>caltechthesis record 15078 is showing up with a local group of
“Scott Cushing”, who is actually a committee member, not a local
group.</li>
</ul></li>
<li><label><input type="checkbox" />Are messy people identifiers in
EPrints preventing a simple mapping to a single person id? When the
EPrint record is read in, the ID should be crosswalked to the
cl_people_id value.</label></li>
<li><label><input type="checkbox" />If feeds are “public only” then I
need to strip email addresses from the JSON objects.</label></li>
<li><label><input type="checkbox" />Feeds generated as
REPO_NAME-RECORD_TYPE.json should be named by record type only, but
before I add this I need to see if there is any case where theses in
CaltechAUTHORS need to be itemized along with theses in
CaltechTHESIS</label></li>
<li><label><input type="checkbox" checked="" />The updated value retrieved
from the database isn’t converting correctly into a time.Time object in Go.
Need to figure out the best way to correct this</label></li>
<li><label><input type="checkbox" checked="" />Aggregation
group_list.json has an empty “combined” mapped when there is no eprintid
for the specific group in the repository</label></li>
<li><label><input type="checkbox" />each index.html under people and
group should have a corresponding index.json that is used by Pandoc to
render index.md that then renders index.html,
include.include</label></li>
<li><label><input type="checkbox" checked="" />Issue 40, SQL reference
document_relation_type table issues</label></li>
<li><label><input type="checkbox" />Issue 41, Add related URL as DOI
value (really make eprints show this as a linked field in the display,
don’t do that in the data structure)</label></li>
<li><label><input type="checkbox" checked="" />Issue 44, Funders are
coming up as “UNSPECIFIED”</label></li>
<li><label><input type="checkbox" />Issue 45, Related URLs are coming in
as “UNSPECIFIED”</label></li>
<li><label><input type="checkbox" />Issue 47, Need to strip HTML from
Abstract field</label></li>
<li><label><input type="checkbox" />Issue 48, Imported EPrint doesn’t
show up in review buffer</label>
<ul class="task-list">
<li><label><input type="checkbox" />in release 1.1.1-next datestamp
isn’t set, example eprintid 111912</label></li>
<li><label><input type="checkbox" checked="" />I might be setting the
wrong event_status (e.g. buffer or inbox)</label></li>
<li><label><input type="checkbox" checked="" />I need to confirm all
timestamp fields and the datestamp field are being set correctly</label></li>
</ul></li>
<li><label><input type="checkbox" checked="" />Issue 49, Field defaults
on import including resolver URL and collection</label></li>
<li><label><input type="checkbox" />Issue 50, Verify why imported and
published EPrints don’t show in recent additions (is this an issue with
generated views or with a datestamp not getting set
correctly?).</label></li>
</ul>
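<p>The note above that sanitization happens at render time, so one harvest can serve both dark and public archives, can be sketched roughly as follows. This is a minimal illustration only: the <code>Record</code> type, its field names, and the <code>Sanitize</code> and <code>Render</code> functions are hypothetical and are not taken from the eprinttools codebase.</p>

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Record is a hypothetical, trimmed-down stand-in for a harvested EPrint
// record. The real eprinttools structures have many more fields.
type Record struct {
	EPrintID     int    `json:"eprintid"`
	Title        string `json:"title"`
	ContactEmail string `json:"contact_email,omitempty"`
	Note         string `json:"note,omitempty"`
}

// Sanitize returns a copy of rec with email and note content cleared.
// rec is passed by value, so the harvested record is left untouched and
// the same harvest can still feed a dark archive.
func Sanitize(rec Record) Record {
	rec.ContactEmail = ""
	rec.Note = ""
	return rec
}

// Render encodes rec as JSON, sanitizing first when publicOnly is true.
func Render(rec Record, publicOnly bool) (string, error) {
	if publicOnly {
		rec = Sanitize(rec)
	}
	src, err := json.Marshal(rec)
	if err != nil {
		return "", err
	}
	return string(src), nil
}

func main() {
	rec := Record{
		EPrintID:     1,
		Title:        "Example",
		ContactEmail: "someone@example.edu",
		Note:         "internal note",
	}
	public, _ := Render(rec, true)
	dark, _ := Render(rec, false)
	fmt.Println(public) // email and note omitted
	fmt.Println(dark)   // full record
}
```

<p>Keeping the harvested data complete and filtering only in the render step is one way to avoid maintaining two harvests; the actual field list to strip would need to match the items tracked in the bug list.</p>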
<h2 id="next-1">Next</h2>
<ul class="task-list">
<li><label><input type="checkbox" />Add deposit info to EPrintXML
output</label></li>
<li><label><input type="checkbox" />ioutil is deprecated, need to
update the code that uses it</label></li>
<li><label><input type="checkbox" checked="" />Need a means of filtering
for public EPrint records only</label>
<ul>
<li><code>is-public</code> end point added to ep3apid</li>
<li><code>?eprint_status=...</code> added for keys and keys by timestamp
ranges</li>
</ul></li>
<li><label><input type="checkbox" checked="" />Add Extended API support
to eputil command</label></li>
<li><label><input type="checkbox" />Implement Solr index record view for
Solr 8.9 ingest</label></li>
<li><label><input type="checkbox" />Add update end point to support
updating EPrints Metadata</label>
<ul class="task-list">
<li><label><input type="checkbox" />Figure out how historical diffs of
EPrints XML are generated in EPrints’ History tab</label></li>
<li><label><input type="checkbox" />Implement updates versioning the
EPrint Metadata record</label></li>
<li><label><input type="checkbox" />Implement file upload and manage
document versioning</label></li>
</ul></li>
</ul>
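<p>For the ioutil item above, the standard library replacements introduced in Go 1.16 are shown in the sketch below. The helper names <code>roundTrip</code> and <code>readAll</code> and the file name are made up for illustration; only the <code>os</code> and <code>io</code> calls are the point.</p>

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes data to a file in a fresh temporary directory and reads
// it back, using the Go 1.16+ replacements for the io/ioutil helpers.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "eprinttools-example-") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	name := filepath.Join(dir, "record.xml")
	if err := os.WriteFile(name, []byte(data), 0600); err != nil { // was ioutil.WriteFile
		return "", err
	}
	src, err := os.ReadFile(name) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}
	return string(src), nil
}

// readAll drains a reader into a string.
func readAll(r io.Reader) (string, error) {
	src, err := io.ReadAll(r) // was ioutil.ReadAll
	return string(src), err
}

func main() {
	got, err := roundTrip("<eprints/>")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)

	all, _ := readAll(strings.NewReader("streamed"))
	fmt.Println(all)
}
```

<p>These replacements keep the same arguments and return values, so the update is mostly a matter of changing the package qualifier and the import list.</p>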
<h2 id="completed">Completed</h2>
<ul class="task-list">
<li><label><input type="checkbox" checked="" />Implement an example
ep3apid Python API</label></li>
<li><label><input type="checkbox" checked="" />Implement a /version end
point displaying ep3apid version number</label></li>
<li><label><input type="checkbox" checked="" />Create an example service
file for running ep3apid as a service under SystemD (Linux)</label></li>
<li><label><input type="checkbox" checked="" />Create an example service
file for running ep3apid as a service under LaunchD (macOS)</label></li>
<li><label><input type="checkbox" checked="" />Need a Users end point to
get a list of users in the system and retrieve their numeric user
id</label></li>
<li><label><input type="checkbox" checked="" />the various related
tables that represent item lists don’t have the same row count so I need
to explicitly query for eprintid, pos or do JOIN and handle the NULL
column cases.</label></li>
<li><label><input type="checkbox" checked="" />Fix
lemurprints-import-api-16 through 21 examples, re-import with
./bin/doi2eprintxml tool</label></li>
<li><label><input type="checkbox" checked="" />Add script to generate
“lemurprints” database with support for all fields present across our
repositories so I can do robust testing and generate appropriate
testdata</label>
<ul class="task-list">
<li><label><input type="checkbox" checked="" />Include all fields and
tables in caltechauthors</label></li>
<li><label><input type="checkbox" checked="" />Include all fields and
tables in caltechthesis</label></li>
<li><label><input type="checkbox" checked="" />Include all fields and
tables in caltechconf</label></li>
<li><label><input type="checkbox" checked="" />Include all fields and
tables in caltechcampuspubs</label></li>
<li><label><input type="checkbox" checked="" />Include all fields and
tables in calteches</label></li>
<li><label><input type="checkbox" checked="" />Include all fields and
tables in caltechoh</label></li>
<li><label><input type="checkbox" checked="" />Include all fields and
tables in caltechln</label></li>
<li><label><input type="checkbox" checked="" />Export selected records
from production, sanitize them and write import tests against the lemurprints
test database</label></li>
<li><label><input type="checkbox" checked="" />Fetch DOIs of records
found in EPrints and use them to test in lemurprints</label></li>
</ul></li>
<li><label><input type="checkbox" checked="" />Add create end points to
support importing EPrint XML metadata into eprints</label>
<ul class="task-list">
<li><label><input type="checkbox" checked="" />Implement
SQLReadEPrint</label></li>
<li><label><input type="checkbox" checked="" />Implement
SQLCreateEPrint</label></li>
<li><label><input type="checkbox" checked="" />Implement ImportEPrint
for importing EPrint XML metadata</label></li>
<li><label><input type="checkbox" checked="" />Implement a method that
takes a table/column map and EPrint structure then renders an INSERT or
REPLACE sequence to create or update an EPrint record</label></li>
<li><label><input type="checkbox" checked="" />Implement a method that
takes a table/column map and EPrint structure and updates the EPrint
structure from a sequence of SELECT statements</label></li>
</ul></li>
<li><label><input type="checkbox" checked="" />Split clsrules into
separate options to allow for more specific control</label></li>
<li><label><input type="checkbox" checked="" />Add end point for
<code>/{REPO_ID}/year</code> (list years that have eprint records with a
“published” date type)</label></li>
<li><label><input type="checkbox" checked="" />Add end point for
<code>/{REPO_ID}/year/{YEAR}</code> lists eprint records published in
that year</label></li>
<li><label><input type="checkbox" checked="" />Implement a method to
show which tables a repository instance has and the column names in each
table</label>
<ul class="task-list">
<li><label><input type="checkbox" checked="" />Implement a startup data
structure that captures the <code>/repository/</code> end point data so
that the table/column map can be used to build the SQL queries needed to read,
create, and update an EPrint record</label></li>
<li><label><input type="checkbox" checked="" />Implement
<code>/repository/&lt;REPO_ID&gt;</code> end point with
<code>map[string][]string{}</code> output</label></li>
</ul></li>
<li><label><input type="checkbox" checked="" />doi2eprintxml list of DOI
should allow for pipe separator and URL to object and handle it like
Acacia does</label></li>
<li><label><input type="checkbox" checked="" />doi2eprintxml needs to
fetch the object URL and save results alongside the generated EPrints
XML</label>
<ul>
<li>added with a -D,-download option in doi2eprintxml.</li>
</ul></li>
<li><label><input type="checkbox" checked="" />Added created (datestamp)
end point for feeds</label></li>
<li><label><input type="checkbox" checked="" />Implement Simplified JSON
record based on</label>
<ul>
<li>https://inveniordm.docs.cern.ch/reference/metadata/</li>
<li>https://github.com/caltechlibrary/caltechdata_api/blob/ce16c6856eb7f6424db65c1b06de741bbcaee2c8/tests/conftest.py#L147</li>
</ul></li>
<li><label><input type="checkbox" checked="" />Add simplified JSON
output option to</label>
<ul class="task-list">
<li><label><input type="checkbox" checked="" />eputil</label></li>
<li><label><input type="checkbox" checked="" />epfmt</label></li>
<li><label><input type="checkbox"
checked="" />doi2eprintxml</label></li>
</ul></li>
</ul>
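<p>The <code>map[string][]string{}</code> output mentioned for the <code>/repository/&lt;REPO_ID&gt;</code> end point can be pictured with the sketch below, which maps table names to column names and encodes the result as JSON. The function name <code>tableMapJSON</code> and the table and column names are illustrative assumptions, not the actual ep3apid response.</p>

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tableMapJSON encodes a table-name to column-names map as JSON.
// encoding/json writes map keys in sorted order, so output is stable.
func tableMapJSON(tables map[string][]string) (string, error) {
	src, err := json.Marshal(tables)
	if err != nil {
		return "", err
	}
	return string(src), nil
}

func main() {
	// Hypothetical table/column map for a single repository.
	tables := map[string][]string{
		"eprint":               {"eprintid", "title", "datestamp"},
		"eprint_creators_name": {"eprintid", "pos", "creators_name_family"},
	}
	out, err := tableMapJSON(tables)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

<p>A map of this shape is enough to drive query building: each key names a table and each slice lists the columns to include in the SELECT, INSERT, or REPLACE statements.</p>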
<h2 id="someday-maybe">Someday, Maybe</h2>
<ul class="task-list">
<li><label><input type="checkbox" />Add end point to recreate Person A-Z
list</label></li>
<li><label><input type="checkbox" />Add end point for
subjects</label></li>
<li><label><input type="checkbox" />Add end point for events
(Conferences)</label></li>
<li><label><input type="checkbox" />Add end point for
collection</label></li>
<li><label><input type="checkbox" />Add end point for
publication</label></li>
<li><label><input type="checkbox" />Add end point for
place_of_pub</label></li>
<li><label><input type="checkbox" />Add end point for issn</label></li>
<li><label><input type="checkbox" />Add end point for Person (Person
IDs)</label></li>
<li><label><input type="checkbox" />Add end point for Authors
(creators)</label></li>
<li><label><input type="checkbox" />Add end point for
Editors</label></li>
<li><label><input type="checkbox" />Add end point for
contributors</label></li>
<li><label><input type="checkbox" />Add end point for types</label></li>
<li><label><input type="checkbox" />Add end point for
corp_creators</label></li>
<li><label><input type="checkbox" />Add end point for
issuing_body</label></li>
</ul>
</section>
<footer>
<span>© 2021 <a href="https://www.library.caltech.edu/copyright">Caltech Library</a></span>
<address>1200 E California Blvd, Mail Code 1-32, Pasadena, CA 91125-3200</address>
<span><a href="mailto:[email protected]">Email Us</a></span>
<span>Phone: <a href="tel:+1-626-395-3405">(626)395-3405</a></span>
</footer>
<!-- START: PrettyFi from https://github.com/google/code-prettify -->
<script>
/* We want to add the class "prettyprint" to all the pre elements */
var pre_list = document.querySelectorAll("pre");
pre_list.forEach(function(elem) {
elem.classList.add("prettyprint");
elem.classList.add("linenums");/**/
elem.classList.add("json"); /**/
});
</script>
<style>
li.L0, li.L1, li.L2, li.L3, li.L4, li.L5, li.L6, li.L7, li.L8, li.L9
{
color: #555;
list-style-type: decimal;
}
</style>
<link rel="stylesheet" type="text/css" href="/css/prettify.css">
<script src="https://cdn.jsdelivr.net/gh/google/code-prettify@master/loader/run_prettify.js"></script>
<!-- END: PrettyFi from https://github.com/google/code-prettify -->
</body>
</html>