-
Having implemented parent/child records in Solr, I can say that you almost never need this. If you need structured data in a Solr record, it's better to store it as a JSON string and use a JSON field transformer to transform it.

Solr returns documents in response to searches. If you want to return a document but your database has it in some sort of structure, it's better to "hoist" the data into a single document to retrieve it. So if you have, for example, a book and page database, and want to return a book in response to a text query on page text, it's better to put all pages' text on the book record, rather than to keep the text on each page. If you want to return individual page documents, though, you can simply add a field that has the parent "book" ID, and retrieve that document if needed.

Remember: Solr is a search engine. It is different from what you might expect as a database. There is no need to model the data you put in; you need to be more careful to model the results you want to get out.
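To make the "hoisting" idea concrete, here is a minimal sketch in Python of flattening a book with nested pages into Solr-ready documents. The field names (`page_text`, `pages_json`, `book_id`, `type`) and the sample book are hypothetical, chosen only for illustration; they are not taken from any existing schema in this project.

```python
import json


def build_solr_docs(book):
    """Flatten one book (with nested pages) into flat, Solr-ready dicts.

    Returns a book document with all page text hoisted onto it, plus one
    document per page that points back to its parent via `book_id`.
    All field names here are assumptions made for this sketch.
    """
    book_doc = {
        "id": book["id"],
        "type": "book",
        "title": book["title"],
        # Hoist every page's text onto the book record (multi-valued field),
        # so a text query can match and return the book directly.
        "page_text": [page["text"] for page in book["pages"]],
        # Keep the original nested structure as an opaque JSON string.
        "pages_json": json.dumps(book["pages"]),
    }
    page_docs = [
        {
            "id": f'{book["id"]}_page_{page["number"]}',
            "type": "page",
            # Plain reference to the parent; no nested documents involved.
            "book_id": book["id"],
            "text": page["text"],
        }
        for page in book["pages"]
    ]
    return book_doc, page_docs


# Hypothetical sample record.
book = {
    "id": "book1",
    "title": "Example Antiphonary",
    "pages": [
        {"number": 1, "text": "first page text"},
        {"number": 2, "text": "second page text"},
    ],
}
book_doc, page_docs = build_solr_docs(book)
```

With documents shaped like this, a query on `page_text` returns the book, while a query restricted to `type:page` returns individual pages whose `book_id` can be used to fetch the parent if needed.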
-
@dchiller, @homework36 and I had a meeting to discuss the structure of JSON-LD and how the design of the JSON-LD affects the search.
We covered several key points (feel free to edit and add more below):
JSON-LD Format: We explored the question of whether there's a need to alter the format of JSON-LD data. We confirmed that the current nested structure works well and that there's no immediate requirement to flatten the data for search purposes.
Child-Parent Relationship in Solr: We discussed potential approaches to implementing a child-parent relationship in Solr. This relationship would aid in searching for nested data structures, making it easier to retrieve relevant information.
Focus on Search: We emphasized that our primary concern is optimizing the search experience for users. This means that we don't require a full, flattened copy of the databases at this point; our focus is on improving search functionality.
Handling Reconciled Data: We addressed the distinction between data provided by the database and data that has been reconciled with Wikidata. **It's important to differentiate between these two sources of information in our system to ensure accuracy and traceability. Our goal is to prioritize results that are returned from database-provided Wikidata.**
We are still thinking about how to implement this strategy.
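One possible direction, sketched in Python under stated assumptions: record where each Wikidata identifier came from in a dedicated field, then boost the database-provided ones at query time. The field names `wikidata_id` and `wikidata_source`, the helper names, and the boost value are all hypothetical; `defType=edismax` and its `bq` (boost query) parameter are standard Solr query parameters. This is not a decided design, only an illustration of the distinction discussed above.

```python
def tag_source(doc, wikidata_id, reconciled):
    """Return a copy of `doc` that records where its Wikidata ID came from.

    `wikidata_source` is a hypothetical field: "database" when the source
    database supplied the ID, "reconciled" when we matched it ourselves.
    """
    tagged = dict(doc)
    tagged["wikidata_id"] = wikidata_id
    tagged["wikidata_source"] = "reconciled" if reconciled else "database"
    return tagged


def search_params(query, boost=2.0):
    """Build Solr query parameters that favour database-provided IDs.

    The boost value is arbitrary and would need tuning against real data.
    """
    return {
        "q": query,
        "defType": "edismax",
        # Boost query: documents matching it score higher, others still match.
        "bq": f"wikidata_source:database^{boost}",
    }


# Hypothetical documents and query.
from_db = tag_source({"id": "src1"}, "Q1339", reconciled=False)
from_recon = tag_source({"id": "src2"}, "Q1339", reconciled=True)
params = search_params("bach")
```

Because the boost only affects ranking, reconciled records remain searchable, and the `wikidata_source` field keeps the two origins traceable.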