HA bootstrap mode causes machines stuck in agent-state pending

Bug #1381340 reported by Brad Marshall
This bug affects 1 person
Affects    Status     Importance  Assigned to  Milestone
juju-core  Won't Fix  Medium      Unassigned

Bug Description

We have an environment using juju 1.20.9-0ubuntu1~14.04.1~juju1 and MAAS 1.5.4+bzr2294-0ubuntu1.1 on trusty that seems to be having issues with HA bootstrap mode. We are specifically looking at 16 nodes here.

When I bootstrap without HA mode, I'm able to add all the machines into the juju environment cleanly.

When using HA bootstrap mode, the nodes don't all add cleanly: only between 1 and 3 of them reach agent-state started, and the rest are stuck in agent-state pending.

The failure turned out to be an MTU misconfiguration, but that could not be determined from the log. Meaningful messages in the juju log, such as warnings or errors like 'read timeout from tcp connection', would have helped a lot.

Brad Marshall (brad-marshall) wrote :

In our staging environment, which has fewer nodes, we don't seem to have this issue. It is using the same version of juju-core, but MAAS is version 1.5.2+bzr2282-0ubuntu0.2.

James Troup (elmo) wrote :

Sorry, this appears to have been a local problem caused by MTU mismatch between switch and host. Please don't investigate (further); we'll confirm and close shortly.

Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Incomplete
JuanJo Ciarlante (jjo) wrote :

As we found later, this was most likely due to an MTU misconfiguration: the hosts' interfaces were set to mtu=9000 while the switches were not fully set up for it. After getting them properly set up, this deployment went OK.
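For anyone hitting something similar, a rough way to check whether a path really carries the configured MTU is to ping with the don't-fragment bit set and a payload sized to fill the MTU exactly. The sketch below only builds such a command. It assumes IPv4, Linux iputils ping (`-M do`, `-s`), and a hypothetical hostname; it is not something juju or MAAS runs.

```python
# Sketch: build a don't-fragment ping probe sized for a given MTU.
# Assumes IPv4 and Linux iputils ping; the hostname is hypothetical.

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu):
    """Return the largest ICMP echo payload that fits in one packet of size mtu."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError("MTU %d is too small for an ICMP echo" % mtu)
    return mtu - overhead


def ping_probe_command(host, mtu, count=3):
    """Return an argv list for a ping that fails if the path MTU is below mtu."""
    payload = icmp_payload_for_mtu(mtu)
    # -M do sets the don't-fragment bit, so oversized packets are dropped
    # or rejected instead of being silently fragmented.
    return ["ping", "-M", "do", "-c", str(count), "-s", str(payload), host]


print(" ".join(ping_probe_command("node01.example", 9000)))
# prints: ping -M do -c 3 -s 8972 node01.example
```

If the probe at 9000 fails while the same probe at 1500 succeeds, something on the path (a switch, in our case) is probably not passing jumbo frames.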

Nevertheless, meaningful messages in the juju log would have helped a lot, e.g. warnings or errors such as 'read timeout from tcp connection'.
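To illustrate the kind of message meant here, a minimal sketch of surfacing a TCP read timeout as a warning follows. It is generic Python, not juju code (juju itself is written in Go); the logger name and the helper `read_with_timeout` are made up for the example.

```python
# Sketch: log a warning when a TCP read times out instead of failing silently.
# The logger name and helper function are hypothetical, not part of juju.
import logging
import socket

logger = logging.getLogger("agent-sketch")


def read_with_timeout(conn, nbytes, timeout_s):
    """Read up to nbytes from conn; log a warning and return None on timeout."""
    conn.settimeout(timeout_s)
    try:
        return conn.recv(nbytes)
    except socket.timeout:
        host, port = conn.getpeername()[:2]
        logger.warning(
            "read timeout from tcp connection %s:%s after %.1fs",
            host, port, timeout_s,
        )
        return None
```

With an MTU mismatch, small packets typically get through while large ones are dropped, so a connection can open fine and then stall on a larger read. A warning like this would at least point at the network.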

Curtis Hovey (sinzui)
tags: added: logging ui
description: updated
Changed in juju-core:
status: Incomplete → Triaged
importance: Undecided → High
importance: High → Medium
Changed in juju-core:
status: Triaged → Won't Fix