Delete LP Images and repos when storage is low
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Solum | Triaged | Wishlist | Unassigned |
Bug Description
When building a DU based on a custom LP, we download the LP image to the worker node and build the DU image on top of it.
Once the DU is built, lingering LP images will cause the worker to run out of storage if they are never pruned. The same is true for cloned source repos. Temporary fixes that delete all LP images and repositories adversely affect the performance of subsequent builds:
https:/
https:/
The cloned repo and DU images are reused (updated with a git pull and a docker pull as needed) in subsequent builds to keep the build process fast. If you delete them before going back to the work queue, you lose this speed benefit.
Please implement a more elegant solution that only evicts old DU images and repos if available storage drops below a configured threshold. Only prune cloned repos and DU images (one of each in sequence) if there is a storage shortage. I'd like to see code that checks for >XX% storage utilization and uses a simple eviction algorithm to delete the oldest repo and the oldest DU image repeatedly until utilization falls under XX%. The XX value should be configured in solum.conf. I suggest the initial default be set to 80%.
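To make the threshold check concrete, here is a minimal sketch in Python. It is only an illustration: the `[worker]` section and the `storage_threshold_percent` option are hypothetical names, and the standard-library `configparser` stands in for whatever configuration mechanism Solum actually uses to read solum.conf.

```python
import configparser
import shutil

DEFAULT_THRESHOLD_PERCENT = 80.0  # initial default suggested in this report


def load_threshold(conf_text):
    """Read the threshold from INI-style text.

    The section and option names are hypothetical, chosen for illustration.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getfloat("worker", "storage_threshold_percent",
                           fallback=DEFAULT_THRESHOLD_PERCENT)


def utilization_percent(path):
    """Return the used percentage of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report no capacity; treat them as empty.
        return 0.0
    return 100.0 * usage.used / usage.total


def storage_is_short(path, threshold_percent):
    """True when utilization is above the configured threshold."""
    return utilization_percent(path) > threshold_percent
```

A worker could call `storage_is_short()` on the directory that holds its cached repos and images before taking the next job, and only start evicting when it returns `True`.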
If we only delete DU images or repos, and not both together, we will end up with an accumulation of one or the other. That's why we should delete one of each until our storage utilization falls back into an acceptable range.
Note that if the storage constraint is because of something else on the filesystem, you may delete all DU images, and all cached repos, and still not be under the storage constraint. In that case, log a warning and attempt the build.
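The eviction behaviour requested in this report could look roughly like the sketch below. It is not Solum code: the function name, the `(last_used, name)` tuples, and the injected `get_utilization`, `delete_repo` and `delete_image` callables are all assumptions made so the logic can be shown (and tested) without touching a real disk or Docker daemon.

```python
import logging

LOG = logging.getLogger(__name__)


def evict_until_under(threshold_percent, get_utilization, repos, images,
                      delete_repo, delete_image):
    """Delete the oldest repo and oldest DU image, one of each per round.

    `repos` and `images` are lists of (last_used_timestamp, name) tuples.
    Returns True if utilization is no longer above the threshold, or
    False if everything was evicted and storage is still short.
    """
    repos = sorted(repos)    # oldest first
    images = sorted(images)  # oldest first
    while get_utilization() > threshold_percent:
        if not repos and not images:
            # Something else is filling the filesystem; the caller
            # should still attempt the build.
            LOG.warning("Storage utilization is still above %s%% after "
                        "evicting all cached repos and DU images",
                        threshold_percent)
            return False
        if repos:
            delete_repo(repos.pop(0)[1])
        if images:
            delete_image(images.pop(0)[1])
    return True
```

Removing one repo and one image per round keeps either kind from accumulating, and the `False` return lets the caller log and carry on with the build rather than fail it.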
Tags added: solum-worker