Autopilot Log Analyser

Comment 15 for bug 1655716

Francis Ginther (fginther) wrote on 2017-04-10:

@John,

> I thought we already had code at the go-openstack (goose) level to retry requests.
> I know we implemented support for openstack telling us "come back again in 30s" sort
> of responses. It may be that there is another failure that we need to handle at a
> different level.

Looking at logsink.log, it does look like juju retried the call to create the lxc several times.

> One problem with something like EOF, is you may not actually know whether the request
> succeeded but didn't tell you it did, or failed. So you might have resources that
> were allocated but you didn't get identified.

Indeed, as an outside observer, this looks like a difficult problem.

I'll note that this is very rare. I've only seen it twice in hundreds of deployments with juju 2.1.x, and it appears to require something impacting the state of the LXD service.
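
To make the EOF ambiguity quoted above concrete, here is a minimal sketch in Go of one way a caller could deal with it: before retrying a create that ended in EOF, check whether the earlier attempt actually took effect. Everything here is hypothetical (fakeCloud, Create, Exists, createOnce and the container name are made up for illustration). It is not the juju, goose or LXD API, and it assumes the caller picks the container name, so an existence check by name is meaningful.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// fakeCloud is a hypothetical stand-in for a container backend.
// It is not the LXD or goose client API.
type fakeCloud struct {
	containers  map[string]bool
	dropReplies int // how many Create replies get lost after the work is done
	creates     int // how many Create calls the backend received
}

// Create always creates the container, but may lose the reply and
// report io.EOF, which is the ambiguous case discussed in the comment.
func (c *fakeCloud) Create(name string) error {
	c.creates++
	c.containers[name] = true
	if c.dropReplies > 0 {
		c.dropReplies--
		return io.EOF
	}
	return nil
}

// Exists reports whether a container with the given name is present.
func (c *fakeCloud) Exists(name string) bool {
	return c.containers[name]
}

// createOnce retries Create on EOF, but only after checking that the
// previous attempt did not already succeed without telling us.
func createOnce(c *fakeCloud, name string, attempts int) error {
	err := errors.New("no attempts made")
	for i := 0; i < attempts; i++ {
		err = c.Create(name)
		if err == nil {
			return nil
		}
		if !errors.Is(err, io.EOF) {
			return err // a different failure: do not retry blindly
		}
		if c.Exists(name) {
			return nil // the reply was lost, but the resource is there
		}
	}
	return fmt.Errorf("create %q: giving up after %d attempts: %w", name, attempts, err)
}

func main() {
	c := &fakeCloud{containers: map[string]bool{}, dropReplies: 1}
	err := createOnce(c, "demo-container", 3)
	fmt.Println(err, c.creates)
}
```

In this sketch the first reply is lost, the existence check finds the container, and no second create is issued. A real backend may not allow lookup by a caller-chosen name, in which case some other way of recognising the orphaned resource would be needed.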