Number of items inconsistent #440

Closed
laurensorensen opened this issue Jan 5, 2023 · 11 comments

@laurensorensen
Collaborator

The counter under the "collection" box on the left sidebar says "9554" total items, but the number I have from our original dataset and entered into the EAD is 9920. Not sure why the counts are inconsistent.
[Screenshot: Screen Shot 2023-01-05 at 11 54 18 AM]

@ggeisler
Contributor

ggeisler commented Jan 5, 2023

If you do a wildcard/blank search on the site you get 9861 documents. One of those is the collection itself, so if you subtract that (because it isn't a component itself, but the parent of all components) you get 9860, which matches the "Total components" number in the more info panel.

The More info panel says there are 9554 online items. If you take the 9860 components and subtract all of the manually created levels that don't actually have documents (i.e., you can't go to an item detail page for them: Record group, Series, Subseries) you get close to 9554 (9558, which is the number of Level -> Items we show in the Level facet). So it seems like maybe there are 4 items that are valid components but for whatever reason don't have an item detail page and thus aren't considered online items. (I'm just speculating, but the numbers seem to work for that explanation.)
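The arithmetic in the two paragraphs above can be laid out as a quick sanity check. This is only a sketch: the variable names are illustrative, and the number of non-item levels is derived here by subtraction rather than reported anywhere in the thread.

```python
# Counts quoted in the comment above.
search_results = 9861   # wildcard/blank search on the site
level_items = 9558      # "Level -> Items" count in the Level facet
online_items = 9554     # "online items" in the More info panel

# The collection record is the parent of all components, so drop it.
total_components = search_results - 1

# Derived (not stated in the thread): components that are Record group,
# Series, or Subseries levels rather than items.
non_item_levels = total_components - level_items

# Items that appear in the facet but are not counted as online items.
unexplained = level_items - online_items

print(total_components, non_item_levels, unexplained)
```

Under these numbers the only gap left unexplained on the site itself is the handful of items between the facet count and the online-items count.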

Why those numbers are off from the number in the original dataset and entered into the EAD, I have no idea. But it does seem like the numbers on the site itself are more or less internally consistent. So maybe the count in the EAD needs updating?

@marlo-longley
Collaborator

@ggeisler thank you for this analysis!

@marlo-longley
Collaborator

marlo-longley commented Jan 6, 2023

@laurensorensen Hi Lauren, looking into this now. It'd be helpful if you could explain exactly where the number 9920 came from -- what software or view created this count? Thanks!

@laurensorensen
Collaborator Author

Ok, I am unable to find where I got the 9920 number from. Sorry about that! I thought it was from the original inventory, but that is only 9627 rows.
There are also these 12 items that were in the inventory but not delivered from ICJ (noted in inventory that they were not delivered):
H-2785
H-4170
H-5245
H-5246
H-5260_0025A_1
H-5260_0171A_1
H-5260_0172A_1
H-5260_0173A_1
H-5260_0174A_1
H-5260_0187A_1
H-5260_2797A_1
H-5260_2799A_1

Currently going through and trying to see if there are any more that weren't delivered.
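One possible way to look for further undelivered items is a set difference between the inventory identifiers and the identifiers actually received. The sketch below is hypothetical: the function name and the `X-0001`/`X-0002` identifiers are placeholders invented for illustration, and only `H-2785` and `H-4170` come from the list above.

```python
def undelivered(inventory_ids, delivered_ids):
    """Return identifiers listed in the inventory but absent from the delivery, sorted."""
    return sorted(set(inventory_ids) - set(delivered_ids))


# Toy data: X-0001 and X-0002 are made-up placeholders standing in for
# delivered items; H-2785 and H-4170 are taken from the list above.
inventory = ["X-0001", "H-2785", "X-0002", "H-4170"]
delivered = ["X-0001", "X-0002"]

missing = undelivered(inventory, delivered)
print(missing)
```

In practice the two lists would be read from the inventory spreadsheet and from whatever listing exists of the delivered files; neither file layout is specified in this thread.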

@laurensorensen
Collaborator Author

Still a little confused and trying to sort this out.

  • items_only.csv: 9615 items
  • from my original spreadsheet from ICJ (w/o "missing" items): 9603
  • right now in NTA site app: 9554

@laurensorensen
Collaborator Author

I downloaded all the CSVs from the series GitHub page and got more results that seem inconsistent.
Exhibits: 3494
Audio: 4590
Docbooks: 394
Emptyfolders: 150
Finalpleas: 113
Indictments: 3
Judgments: 13
Minutes: 1
Lists: 72
Commissiontrans: 236
Court transcripts: 740
Rules: 13
Statements: 59
Trial briefs: 39

Total: 9917
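As a check on the total, the per-series row counts listed above can be summed directly. A minimal sketch; the dictionary name is illustrative, and the figures are copied from this comment.

```python
# Row counts per series CSV, as listed in the comment above.
series_counts = {
    "Exhibits": 3494,
    "Audio": 4590,
    "Docbooks": 394,
    "Emptyfolders": 150,
    "Finalpleas": 113,
    "Indictments": 3,
    "Judgments": 13,
    "Minutes": 1,
    "Lists": 72,
    "Commissiontrans": 236,
    "Court transcripts": 740,
    "Rules": 13,
    "Statements": 59,
    "Trial briefs": 39,
}

total = sum(series_counts.values())

# Gap between this total and the 9920 figure quoted at the top of the issue.
gap_to_ead = 9920 - total

print(total, gap_to_ead)
```

The sum confirms the stated total and shows it falls 3 short of the 9920 entered in the EAD.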

@laurensorensen
Collaborator Author

Realizing that the above includes Record Group, sub-series etc. Going to try and do a count and revise without these added.

@laurensorensen
Collaborator Author

Total number of sub-series and record groups is 71. Maybe we can discuss next week? @marlo-longley @thatbudakguy

@laurensorensen
Collaborator Author

laurensorensen commented Jan 9, 2023

Hi again. I'm not the best at math, but I think what I have below is fairly accurate. I'm not sure how to fix or evaluate this, but I feel it's important that the numbers be consistent. Are there any logs that still exist from when the CSVs were uploaded to ASpace? Sorry I didn't ask for the logs at the time of import.

9626 total items in Argo (documents, audio, film, image)
9558 total items according to extent field in ASpace (by series) (edited, missed Trial briefs and Rules previously)
9620 total items from ICJ (original data plus H-2785)
9554 total items in Arclight now
9616 total items in items_only

(after adding H-2785)
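To make the discrepancies easier to see, the five counts above can be compared against the Arclight figure. A rough sketch; the labels are shorthand for the sources named in the list, and choosing Arclight as the baseline is an assumption made here for illustration.

```python
# Item counts per system, as listed above.
counts = {
    "Argo": 9626,
    "ASpace extent": 9558,
    "ICJ": 9620,
    "Arclight": 9554,
    "items_only": 9616,
}

baseline = counts["Arclight"]

# How many more items each source reports than Arclight does.
diffs = {name: n - baseline for name, n in counts.items()}

for name, delta in sorted(diffs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {counts[name]} ({delta:+d} vs Arclight)")
```

Seen this way, the ASpace extent count is within 4 of Arclight, while Argo, ICJ, and items_only each sit 60 to 70 items above it.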

I asked Geoff about the appearance of 9626 items in Argo versus 9620 in original spreadsheet.

@laurensorensen
Collaborator Author

Closing in favor of #449
