backtrace on console 3-5 minutes after HA test completes
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Heat | Fix Released | Medium | Steven Hardy |
Grizzly | Fix Released | Medium | Steven Hardy |
Bug Description
[root@bigiron heat]# ./run_tests.sh -a tag='HA'
Running tests
HaFunctionalTest
test_instance
OK 222.77
Failure
runTest SKIP: unable to import cfn helper, skipping
Ran 2 tests in 223.181s
OK (SKIP=1)
Slowest 1 tests took 222.77 secs:
222.77 test_instance (heat.tests.
This seems to work correctly. Then, after about 3-5 minutes, the engine prints the following backtrace to stdout:
Traceback (most recent call last):
  File "/usr/lib/
    func(*args, **kwargs)
  File "/usr/lib/
    self.state_
  File "/usr/lib/
    stack.update_
AttributeError: 'NoneType' object has no attribute 'update_and_save'
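For illustration, here is a minimal sketch of the race this traceback suggests: a periodic task looks the stack up after it has been deleted, gets None back, and calls update_and_save on it. The db_api.stack_get call, import path, and handler shape are assumptions for the sketch; only update_and_save comes from the traceback itself.

```python
# Hypothetical reconstruction of the race behind the first traceback.
# db_api.stack_get, the import path, and the handler shape are assumed;
# only update_and_save comes from the traceback itself.
from heat.db import api as db_api  # assumed import path

def periodic_timeout(context, stack_id):
    # Once the HA test has torn the stack down, this lookup returns None.
    stack = db_api.stack_get(context, stack_id)
    if stack is None:
        # Guarding here avoids the crash; without the guard the next line
        # raises: AttributeError: 'NoneType' object has no attribute
        # 'update_and_save'
        return
    stack.update_and_save({'status': 'FAILED'})
```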
The engine.log then contains:
2012-09-06 13:50:29 DEBUG [heat.manager] Running periodic task EngineManager.
2012-09-06 13:50:55 ERROR [heat.engine.
Traceback (most recent call last):
  File "/usr/lib/
    self.handle_
  File "/usr/lib/
    meta = handle.metadata
  File "/usr/lib/
    rs = db_api.
  File "/usr/lib/
    return IMPL.resource_
  File "/usr/lib/
    raise NotFound("resource with id %s not found" % resource_id)
NotFound: resource with id 26 not found
2012-09-06 13:50:55 ERROR [heat.engine.
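The second traceback is the same race seen from the other side: the periodic task reads a resource's metadata after the row has been removed. A defensive sketch follows; resource_get and NotFound appear in the traceback, but the import paths and the metadata attribute name are assumptions.

```python
# Sketch only: treat a vanished resource row as "no metadata" instead of
# letting NotFound escape the periodic task. db_api.resource_get and
# NotFound appear in the traceback; the import paths and the metadata
# attribute name are assumptions.
from heat.common.exception import NotFound  # assumed import path
from heat.db import api as db_api  # assumed import path

def safe_metadata(context, resource_id):
    try:
        rs = db_api.resource_get(context, resource_id)
    except NotFound:
        # The resource was deleted out from under the periodic task;
        # report "no metadata" rather than crashing.
        return None
    return rs.rsrc_metadata  # attribute name assumed
```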
shardy:
======
I think this is the same problem described in #261, and it is related to #264.
The reason we're seeing this in the test is that utils.Instance.
I guess we need to look at the best fix for #261, and I have a fix (or at least an improvement) for #264, but to fix this specific issue with the tests we just need to add another check to ensure the stack is CREATE_COMPLETE before we do the tests (see the polling sketch after this comment), hence avoiding any risk of deleting while still in the CREATE_IN_PROGRESS state.
Actually, a correction to the above: utils.Stack.create does check for CREATE_COMPLETE, so I'm guessing this problem only occurs with the IHA test (not the HA one). Steve, can you confirm that is the case?
I think the problem is as described above, only the waitcondition ends up still in progress after the IHA instance replacement has happened.
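A minimal sketch of the test-side fix described above, assuming a client object with a describe_stack call and a stack_status attribute (illustrative names, not Heat's actual test utilities):

```python
# Illustrative test-side fix: poll until the stack reaches CREATE_COMPLETE
# before tearing it down, so the test never deletes a stack that is still
# CREATE_IN_PROGRESS. The client, describe_stack, and stack_status names
# are invented for this sketch.
import time

def wait_for_create_complete(client, stack_name, timeout=600, interval=5):
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = client.describe_stack(stack_name).stack_status
        if status == 'CREATE_COMPLETE':
            return
        if status == 'CREATE_FAILED':
            raise RuntimeError('stack %s failed to create' % stack_name)
        time.sleep(interval)
    raise RuntimeError('timed out waiting for stack %s' % stack_name)
```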
Changed in heat:
status: New → Triaged
importance: Undecided → Medium
Changed in heat:
milestone: none → grizzly-3
Changed in heat:
status: Fix Committed → Fix Released
Changed in heat:
milestone: grizzly-3 → 2013.1
So I'm pretty sure the root cause of this is that the waitcondition greenthread didn't get cancelled on stack delete. There have been several fixes in that area recently (in particular bug 1096150), so I think this is fixed, but I'm doing some testing to confirm, and also to check the behavior if you delete a stack while an IHA replacement is in progress.
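For reference, a rough illustration of the cancellation idea using plain eventlet. The StackWatch class is invented for this sketch; Heat's actual fix (bug 1096150 and related changes) is more involved.

```python
# Rough illustration only: track the greenthreads a stack spawns (e.g. for
# a WaitCondition) and kill them on delete, so none can wake up later and
# touch deleted DB rows. Class and method names are invented for the
# sketch and do not reflect Heat's real code.
import eventlet

class StackWatch(object):
    def __init__(self):
        self._threads = []

    def spawn(self, func, *args, **kwargs):
        # Remember the greenthread so it can be cancelled later.
        self._threads.append(eventlet.spawn(func, *args, **kwargs))

    def delete(self):
        # Cancel outstanding greenthreads before removing the DB rows
        # they reference.
        for gt in self._threads:
            gt.kill()
        self._threads = []
```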