Creation of derivatives fails for a PDF of over 300 MB. The logs show a heap memory error on Solr, which suggests that the problem happens while the PDF is being indexed. (At least, keyword searches on terms in the full text of the PDF return no results.)
It's useful to note that, at least in this case, it's not possible to add a file of this size through the GWSS UI, since Nginx (as currently configured) doesn't allow such a large upload. So the error occurs during the Bulkrax ingest.
It would be useful to determine whether the failure is triggered by the size of the PDF or by something else. If by size, perhaps include a check in the ingest process to skip derivative creation for files that are too large? (Eventually, the job fails completely, but Sidekiq keeps retrying it for a while, which in one case seems to have caused issues with the Solr instance that prevented other works from being indexed properly.)
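As a rough illustration of the kind of size check suggested above, here is a minimal standalone sketch that flags oversized files before an ingest is started. It is not Bulkrax or Hyrax code. The function name `oversized_files` and the `MAX_DERIVATIVE_BYTES` threshold are hypothetical, and the 300 MB value is only taken from the one failing case reported here, not from a known limit.

```python
from pathlib import Path

# Hypothetical threshold: this issue only reports a failure for one PDF over 300 MB,
# so the real cutoff (if size is the cause at all) is still unknown.
MAX_DERIVATIVE_BYTES = 300 * 1024 * 1024


def oversized_files(paths, limit=MAX_DERIVATIVE_BYTES):
    """Return (path, size_in_bytes) pairs for files larger than `limit` bytes."""
    flagged = []
    for path in map(Path, paths):
        size = path.stat().st_size
        if size > limit:
            flagged.append((path, size))
    return flagged
```

Running something like this over the files in an import batch would at least show which ones to test separately, which may help establish whether size is really the trigger.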
Troubleshooting steps
Check container memory usage while the job runs (docker stats).
Check Java settings on the Solr instance.
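For the first step, a small script could take memory snapshots while the job runs, rather than watching the terminal. The sketch below assumes the `docker` CLI is available on the host and relies on `docker stats --no-stream --format "{{json .}}"`, which prints one JSON object per container with `Name` and `MemPerc` fields. The function names are made up for illustration, and it assumes every listed container reports a numeric percentage.

```python
import json
import subprocess


def parse_docker_stats(output):
    """Parse `docker stats --format '{{json .}}'` output into {container_name: mem_percent}."""
    usage = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        # MemPerc is a string such as "91.50%"; strip the sign and convert.
        usage[row["Name"]] = float(row["MemPerc"].rstrip("%"))
    return usage


def sample_memory():
    """Take one snapshot of per-container memory usage (requires the docker CLI)."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{json .}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_docker_stats(result.stdout)
```

Calling `sample_memory()` every few seconds during the ingest and logging the result should show whether the Solr container approaches its memory limit before the heap error appears.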