what's probably happening is it's hitting ComputeResourcesUnavailable which triggers a reschedule, but since there is nowhere to reschedule to, it fails and is set to error.
yeah, it gets into rt.resize_claim, which does the claim test and raises ComputeResourcesUnavailable. it can't reschedule b/c it's a resize to same host / single node, and that all happens within the _error_out_instance_on_exception context manager, so the instance is put in error state.
so i guess you'd have to handle ComputeResourcesUnavailable in _error_out_instance_on_exception and not set the instance to error state.
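fwiw, a rough sketch of what that could look like. this is a standalone mock, not nova's actual code: the exception class, FakeInstance, the state strings and the toy resize_claim are all stand-ins made up for illustration, and the real _error_out_instance_on_exception in the compute manager has a different signature. the only idea it shows is special-casing the failed claim so the instance keeps its original vm_state instead of going to error.

```python
from contextlib import contextmanager


class ComputeResourcesUnavailable(Exception):
    """Hypothetical stand-in for nova's ComputeResourcesUnavailable."""


class FakeInstance:
    """Hypothetical minimal instance: just the two state fields we care about."""

    def __init__(self, vm_state="active"):
        self.vm_state = vm_state
        self.task_state = "resize_prep"


@contextmanager
def error_out_instance_on_exception(instance):
    """Mock of the context manager, with the failed claim special-cased."""
    original_vm_state = instance.vm_state
    try:
        yield
    except ComputeResourcesUnavailable:
        # The claim failed before anything changed on the host, so restore
        # the original vm_state and clear the task state instead of erroring.
        instance.vm_state = original_vm_state
        instance.task_state = None
        raise
    except Exception:
        # Any other failure still puts the instance in error state.
        instance.vm_state = "error"
        instance.task_state = None
        raise


def resize_claim(instance, free_mb, wanted_mb):
    """Toy claim test: fail when the new flavor does not fit on the host."""
    if wanted_mb > free_mb:
        raise ComputeResourcesUnavailable("not enough memory for resize")


if __name__ == "__main__":
    inst = FakeInstance()
    try:
        with error_out_instance_on_exception(inst):
            resize_claim(inst, free_mb=512, wanted_mb=2048)
    except ComputeResourcesUnavailable:
        pass
    print(inst.vm_state, inst.task_state)
```

the exception is still re-raised so the caller can report the failed resize, the instance just isn't left in error for what is basically a "no room" answer.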