@John,
> I thought we already had code at the go-openstack (goose) level to retry requests.
> I know we implemented support for openstack telling us "come back again in 30s" sort
> of responses. It may be that there is another failure that we need to handle at a
> different level.
Looking at logsink.log, it does look like juju retried the call to create the LXC container several times.
> One problem with something like EOF, is you may not actually know whether the request
> succeeded but didn't tell you it did, or failed. So you might have resources that
> were allocated but you didn't get identified.
Indeed, as an outside observer, this looks like a difficult problem.
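
To make the EOF ambiguity concrete, here is a minimal sketch in Go of one way a caller could cope with it: after an EOF, ask the server whether the resource exists before retrying the create. This is only an illustration, not juju's or goose's actual code. `Creator`, `createWithRetry`, `flakyServer`, and the container name are all hypothetical names made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Creator is a hypothetical stand-in for a container API client.
type Creator interface {
	Create(name string) error
	Exists(name string) (bool, error)
}

// createWithRetry calls Create up to attempts times. An io.EOF is treated as
// ambiguous: the request may have succeeded without the response arriving,
// so we check for the resource before trying again.
func createWithRetry(c Creator, name string, attempts int) error {
	var lastErr error
	for i := 0; i < attempts; i++ {
		err := c.Create(name)
		if err == nil {
			return nil
		}
		lastErr = err
		if !errors.Is(err, io.EOF) {
			return err // a definite failure; do not retry blindly
		}
		// Ambiguous outcome: see whether the create actually went through.
		if ok, checkErr := c.Exists(name); checkErr == nil && ok {
			return nil
		}
	}
	return fmt.Errorf("creating %q failed after %d attempts: %w", name, attempts, lastErr)
}

// flakyServer simulates a server that performs the create but drops the
// first response, so the client only sees io.EOF.
type flakyServer struct {
	created map[string]bool
	dropped bool
	calls   int
}

func (s *flakyServer) Create(name string) error {
	s.calls++
	s.created[name] = true
	if !s.dropped {
		s.dropped = true
		return io.EOF
	}
	return nil
}

func (s *flakyServer) Exists(name string) (bool, error) {
	return s.created[name], nil
}

func main() {
	s := &flakyServer{created: map[string]bool{}}
	err := createWithRetry(s, "example-container", 3)
	fmt.Println(err, s.calls)
}
```

In this simulation the create is issued only once, because the existence check finds the container after the dropped response. It only works if the caller picks the name (or some other identifier) up front; if the server assigns the identifier, there is nothing to look up afterwards, which is the "allocated but not identified" case described above.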
I'll note that this is very rare. I've only seen it twice in hundreds of deployments with juju 2.1.x, and it appears to require something impacting the state of the LXD service.