environment unstable after 1.25.8 upgrade
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Fix Released | High | Andrew Wilkins |
2.1 | Fix Released | High | Andrew Wilkins |
juju-core | Fix Released | Critical | Andrew Wilkins |
1.25 | Fix Released | Critical | Andrew Wilkins |
Bug Description
We recently upgraded a few environments from juju 1.25.6 to 1.25.8 and we started experiencing problems on some of them.
What we know so far:
* the problems only affect bigger environments, with 8-10 machines or more. Smaller environments look stable
* on the problematic environments jujud uses lots of memory on node 0, for example nearly 1GB RES on a bootstrap node with 2GB RAM
* we see "lost" agents occasionally. It's intermittent; sometimes environments are fine for hours
* occasionally hooks end up in an error state, and we see errors like this in the logs:
2016-11-29 09:30:29 ERROR juju.api.watcher watcher.go:84 error trying to stop watcher: connection is shut down
2016-11-29 09:30:29 ERROR juju.api.watcher watcher.go:84 error trying to stop watcher: connection is shut down
2016-11-29 09:30:29 ERROR juju.worker.
2016-11-29 09:30:29 ERROR juju.api.watcher watcher.go:84 error trying to stop watcher: connection is shut down
2016-11-29 09:30:29 ERROR juju.api.watcher watcher.go:84 error trying to stop watcher: connection is shut down
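To get a feel for how often these errors occur, the log lines can be tallied per minute. The following is only a minimal sketch: it assumes every complete line follows the `<date> <time> <LEVEL> <module> <file>:<line> <message>` layout seen above, and the names `LOG_LINE` and `count_errors` are made up for illustration, not part of Juju.

```python
import re
from collections import Counter

# Assumed layout, inferred from the lines quoted in this report:
# "<date> <time> <LEVEL> <module> <file>:<line> <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<module>\S+) (?P<source>\S+:\d+) (?P<message>.*)$"
)


def count_errors(lines):
    """Count ERROR lines, keyed by (minute, module, message).

    Lines that do not match the assumed layout (for example truncated
    ones) are skipped rather than guessed at.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None or match.group("level") != "ERROR":
            continue
        # Truncate "HH:MM:SS" to "HH:MM" to bucket by minute.
        minute = match.group("date") + " " + match.group("time")[:5]
        counts[(minute, match.group("module"), match.group("message"))] += 1
    return counts
```

It could be run over a machine log with something like `count_errors(open(path))`, where `path` points at the bootstrap node's log file (the exact location depends on the deployment and is not stated in this report).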
juju version is 1.25.8, running on amd64 trusty guests.
I uploaded logs from the bootstrap node here:
https:/
description: | updated |
Changed in juju-core: | |
importance: | Undecided → Critical |
description: | updated |
Changed in juju-core: | |
status: | New → Triaged |
milestone: | none → 1.25.9 |
Changed in juju-core: | |
status: | Triaged → In Progress |
assignee: | nobody → Andrew Wilkins (axwalk) |
milestone: | 1.25.9 → none |
importance: | Critical → High |
Changed in juju-core: | |
status: | In Progress → Fix Committed |
Changed in juju-core: | |
milestone: | none → 1.25.9 |
Changed in juju: | |
status: | New → Fix Committed |
importance: | Undecided → High |
milestone: | none → 2.0.3 |
assignee: | nobody → Andrew Wilkins (axwalk) |
no longer affects: | juju-core/2.0 |
Changed in juju-core: | |
importance: | High → Critical |
Changed in juju-core: | |
milestone: | 1.25.9 → none |
Changed in juju-core: | |
status: | Fix Committed → Fix Released |
tags: | added: canonical-is |
Changed in juju: | |
status: | Fix Committed → Fix Released |
Looking through the logs. This looks like a memory issue, but I will have one of the Juju Core folks dive into this as soon as possible. While we investigate, if there is any way to move the controller to an instance with more memory and see whether things settle, that would be appreciated.
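To check whether jujud's memory use settles after such a change, its resident set size can be sampled over time. Below is a minimal sketch, assuming a Linux host where `/proc/<pid>/status` exposes a `VmRSS` line; the helper names `rss_kib` and `read_rss_kib` are hypothetical, and finding the jujud PID is left to the operator.

```python
def rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:    1048576 kB".
            return int(line.split()[1])
    return None


def read_rss_kib(pid):
    """Read the current resident set size (KiB) of a process by PID."""
    with open(f"/proc/{pid}/status") as status_file:
        return rss_kib(status_file.read())
```

Calling `read_rss_kib(pid)` periodically and logging the result would show whether memory keeps growing or levels off.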