Adding OCR to ADOs
After uploading a book, Archipelago runs OCR in the background to create extracted text for every book item. Although hydroponics does this automatically, it can take days to complete. You can speed up the process by running specific tasks on specific ADOs.
Go to /search-and-replace.
In the search bar, type the name of the ADO you wish to run OCR for. Select the item from the search results. Be sure to select “trigger strawberry runners process/reprocess for archipelago digital objects content item”. Then click “Apply to Selected Items”.
On the confirmation page, check the “pager” and “text” boxes. Then click “Apply”.
Go to /admin/config/system/queue-ui.
Check the box “Strawberry Runners Process via Cron Queue Worker”. Then click “Apply to selected items”.
Once that finishes running, check the box “Strawberry Runners Process on Background Queue Worker”. Then click “Apply to selected items”.
Once all the queues are empty except “Archipelago Temporary File Composter Queue Worker”, check that box. Select “Remove leases” in the Action drop-down menu. Then click “Apply to selected items”.
Select “Archipelago Temporary File Composter Queue Worker” again. Select “Clear” in the Action drop-down menu. Then click “Apply to selected items”.
Go to rancher-devpre.
Select dev-cluster.
Under Workloads, select esmero-php.
Click the three dots at the right side of the row.
Click “Execute Shell”.
Once inside the shell, enter cd /tmp.
Enter du -sh.
Enter ls -la.
Enter rm -rf *. This deletes all the completed tasks.
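The shell steps above can be sketched as one small helper. This is only an illustration, assuming the container provides a POSIX shell. The function name clear_dir_contents is hypothetical and not part of Archipelago or Rancher. It stops if the directory change fails, so rm -rf never runs from an unexpected location.

```shell
#!/bin/sh
# Hypothetical helper (not part of Archipelago): wraps the manual cleanup steps.
# Usage: clear_dir_contents DIRECTORY
clear_dir_contents() {
    target="${1:?usage: clear_dir_contents DIRECTORY}"
    (
        # Run in a subshell so the caller's working directory is unchanged.
        # Abort rather than delete from the wrong place if cd fails.
        cd "$target" || exit 1
        du -sh .      # total size before cleanup
        ls -la        # list what is about to be removed
        rm -rf -- *   # same as rm -rf *; hidden dot-files are left in place
    )
}
```

Inside the esmero-php shell, the equivalent of the manual steps would be clear_dir_contents /tmp. As with the manual command, everything visible in the directory is removed, so confirm the listing before relying on it.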