Description
I suspect that this may end up as a 'won't fix' but I am noting the issue here so that, if that is the case, we can record the decision. This problem was observed in a preproduction IDS where the connected ICAT has a subset of the production database.
In normal operation, the IDS puts data in its cache. If it finds that the volume of data in its cache has exceeded the high threshold, it asks the storage plugin to walk the filesystem and return the list of 'old' files which, if deleted, would free up enough space to take the cache below its low threshold. It then looks up the file locations in ICAT and loops over the results to request that the files be archived. (Any that are currently requested won't be archived because of the logic in the deferredOpsQueue.)
So, if none of the 'old' files are found in ICAT, then it never frees up any space. What it doesn't do (and it is debatable whether it should) is try to delete more data to compensate for the files that it skipped.
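To make the failure mode concrete, here is a minimal sketch of one tidy pass as described above. It is not the IDS code (which is Java); `CachedFile`, `files_to_free`, and `tidy_once` are hypothetical names, and the ICAT lookup is stood in for by a plain set of known locations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedFile:
    location: str
    size: int  # bytes


def files_to_free(files_oldest_first, used, low_threshold):
    """Stand-in for the storage plugin: pick the oldest files until
    deleting them would take usage below the low threshold."""
    selected, remaining = [], used
    for f in files_oldest_first:
        if remaining < low_threshold:
            break
        selected.append(f)
        remaining -= f.size
    return selected


def tidy_once(files_oldest_first, known_to_icat, high_threshold, low_threshold):
    """Return the number of bytes one tidy pass would actually free."""
    used = sum(f.size for f in files_oldest_first)
    if used <= high_threshold:
        return 0  # nothing to do
    candidates = files_to_free(files_oldest_first, used, low_threshold)
    # Only files that ICAT knows about are archived; the rest are skipped
    # and nothing is selected to make up for them.
    return sum(f.size for f in candidates if f.location in known_to_icat)


# Four old files unknown to ICAT, followed by two newer known files.
cache = [CachedFile(f"orphan{i}", 30) for i in range(4)]
cache += [CachedFile("known0", 30), CachedFile("known1", 30)]
known = {"known0", "known1"}

freed = tidy_once(cache, known, high_threshold=150, low_threshold=100)
```

In this made-up scenario the cache holds 180 bytes against a high threshold of 150, the three oldest files are selected, none of them is in ICAT, and `freed` is 0. Every subsequent pass selects the same three files, so the cache never shrinks.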
I think the behaviour here may not be optimal but I'm not sure the use case of lots of unknown data sitting in the IDS cache was foreseen or even if it should be accommodated. Could this problem arise in a production environment?
Solutions to this are complicated because the Tidier delegates finding files to the storage plugin and archiving/deleting files to the deferredOpsQueue. Some possible approaches:
- We issue a warning when an 'old' file is not found in ICAT (and therefore the space it occupies won't be freed).
- We aggregate the size of all the skipped files in the loop and, if this total is >= the difference between the thresholds, then we know that the disk space will never be freed. We issue an error in this case as the Tidier can no longer prevent the disk from running out of space.
- We document in the installation instructions that the main storage should be empty or contain only files known to ICAT.
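The first two approaches could be combined roughly as follows. This is only a sketch under assumptions: `check_skipped` and the logger name are hypothetical, the candidates are given as `(location, size)` pairs, and the ICAT lookup is again a plain set. The real change would live in the Java Tidier.

```python
import logging

logger = logging.getLogger("tidier_sketch")  # hypothetical logger name


def check_skipped(candidates, known_to_icat, high_threshold, low_threshold):
    """Warn for each 'old' file missing from ICAT and report whether the
    skipped total is at least the gap between the two thresholds.

    candidates: iterable of (location, size_in_bytes) pairs.
    Returns (skipped_bytes, cannot_recover).
    """
    skipped = 0
    for location, size in candidates:
        if location not in known_to_icat:
            logger.warning(
                "%s not found in ICAT; its %d bytes will not be freed",
                location, size,
            )
            skipped += size

    cannot_recover = skipped >= high_threshold - low_threshold
    if cannot_recover:
        logger.error(
            "Skipped %d bytes, at least the %d byte gap between thresholds; "
            "the Tidier can no longer prevent the disk from filling up",
            skipped, high_threshold - low_threshold,
        )
    return skipped, cannot_recover


result = check_skipped(
    [("orphan0", 30), ("orphan1", 30), ("orphan2", 30)],
    known_to_icat=set(),
    high_threshold=150,
    low_threshold=100,
)
```

Returning the flag as well as logging it leaves the caller free to decide what to do next, which sidesteps the question of whether the Tidier itself should go on to select more files.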