[2.2] Duplicate communication is occurring to the same rack controller
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Fix Released | Critical | Blake Rouse | |
2.2 | Fix Released | Critical | Blake Rouse | |
Bug Description
I'm adding a machine, which appears to work fine, but MAAS returns a 401 error:
Command failed: maas-cli root machines create hostname=taillow power_type=ipmi architecture=
b"Authorization Error: 'Nonce already used: afe9692868de46b
from regiond.log:
2017-07-14 21:20:13 regiond: [info] 10.245.208.32 POST /MAAS/api/
from maas.log:
Jul 14 21:19:43 lairon maas.api: [info] taillow: Enlisted new machine
Jul 14 21:19:43 lairon maas.node: [info] taillow: Status transition from NEW to COMMISSIONING
Jul 14 21:19:47 lairon maas.power: [info] Changed power state (on) of node: spinda (xneqy8)
Jul 14 21:20:14 lairon maas.power: [info] Changing power state (on) of node: taillow (6gtqax)
Jul 14 21:20:14 lairon maas.node: [info] taillow: Commissioning started
Jul 14 21:20:27 lairon maas.power: [info] Changed power state (on) of node: taillow (6gtqax)
This is very similar to bug 1702751 except the node is in commissioning state in this one, and I don't see any errors anywhere.
It's worth pointing out we're using HA rack controllers. I get the feeling that this is somehow related to it taking a while to contact the BMC for the first time.
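For readers unfamiliar with the "Nonce already used" message, the following is a minimal sketch of a generic OAuth-1.0-style replay check, assuming the 401 comes from that kind of check. It is not MAAS's implementation: `NonceStore`, `handle_request`, the 300-second window, and the example keys are all hypothetical names and values chosen for illustration. It only shows why a second arrival of the same signed request is rejected even though the first one succeeded; it makes no claim about where a duplicate originated in this bug.

```python
import time


class NonceStore:
    """Hypothetical store that remembers (consumer_key, token, nonce) triples
    for a limited time window. Not taken from the MAAS source tree."""

    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds
        self._seen = {}  # (consumer_key, token, nonce) -> time first seen

    def check_and_record(self, consumer_key, token, nonce, now=None):
        """Return True if the nonce is new (and record it), False on replay."""
        now = time.time() if now is None else now
        # Forget nonces that fell out of the acceptance window.
        cutoff = now - self.window_seconds
        self._seen = {k: t for k, t in self._seen.items() if t >= cutoff}
        key = (consumer_key, token, nonce)
        if key in self._seen:
            return False
        self._seen[key] = now
        return True


def handle_request(store, consumer_key, token, nonce, now=None):
    """Hypothetical request handler: reject a replayed nonce with a 401."""
    if not store.check_and_record(consumer_key, token, nonce, now=now):
        return 401, "Nonce already used: %s" % nonce
    return 200, "OK"


if __name__ == "__main__":
    store = NonceStore()
    # First delivery of a signed request is accepted.
    print(handle_request(store, "consumer", "token", "abc123", now=1000.0))
    # The same signed request arriving a second time is rejected.
    print(handle_request(store, "consumer", "token", "abc123", now=1001.0))
```

Under this model the first request does its work and the second is refused, which would be consistent with a machine being enlisted while the client still sees an authorization error.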
Related branches
- Blake Rouse (community): Approve
- Diff: 53 lines (+14/-7), 2 files modified
  src/maasserver/rpc/regionservice.py (+2/-3)
  src/maasserver/rpc/tests/test_regionservice.py (+12/-4)
- Mike Pontillo (community): Approve
- Diff: 53 lines (+14/-7), 2 files modified
  src/maasserver/rpc/regionservice.py (+2/-3)
  src/maasserver/rpc/tests/test_regionservice.py (+12/-4)
description: updated
tags: added: cdo-qa-blocker
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
milestone: 2.3.0 → 2.3.0alpha1
Changed in maas:
status: Fix Committed → Fix Released
newell is helping me look at this. At his suggestion, I stopped the maas-rackd service on my second rack controller and tried adding the node again. It proceeds much more quickly: nodes are added faster and the add succeeds every time. It seems that having the second rack controller enabled introduces a delay that causes the failure.