config-changed hook timeout fails strangely
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
pyjuju |
Triaged
|
Low
|
Unassigned |
Bug Description
Consider the opengrok charm, which to maintain idempotent behavior, must download
any source code defined by the user (multiple projects) and then generate a cross-reference index db.
For smaller projects this isn't a big deal but when you get to something like the Linux
kernel, you can easily spend 30 mins to clone and index the initial import. The config-changed
hook drops dead after about 7 mins.
This is related to https:/
fails to report the actual failure correctly to begin with.
To reproduce:
$ juju deploy opengrok
$ juju expose opengrok
# reconfigure
cat > kernel.yaml << EOF
# vim:ts=
opengrok:
og_content: '{
"repos":[
{
"url" : "lp:juju",
"alias" : "juju"
},
{
"url" : "git://
"alias" : "linux-2.6"
}
]
}'
EOF
$ juju set opengrok --config kernel.yaml
Everything is hunky dory until 09:33:12,876
2012-11-26 09:33:12,876 unit:opengrok/0: hook.output DEBUG: hook config-changed exited, exit code Traceback (most recent call last):
Failure: juju.errors.
Config changed started at 09:26:19,117
Problem with this error message is that it makes it look like there's a error in the actual hook instead of
juju itself, why can't it tell me that a timer expired? On further inspection the charm is clearly doing it's
job.
$ juju ssh opengrok/0
$ sudo tail -f /var/log/
which is running on behalf of:
$ initctl status opengrok-index
opengrok-index start/running, process 27974
clearly still running and spawned by.
hooks/py-
def update_
....
print(
p = Popen(['initctl', 'start', 'opengrok-index'])
p.wait()
updated = True
print(
Again, since the same hook could be executed in rapid succession I must serialize everything. The
upstart job handles configuration changes and index updates. To get around this I would need to
create a command queue that handles this asynchronously which dramatically
increases the complexity of the charm.
What I would like to see is a feature that allows me to info juju that the hook is still alive and
doing real work, like a progress bar api. I'd also like to see better error reporting in this instance,
juju isn't even reporting it's own stack trace (stderr) to juju debug-log.
# config-changed log snippet
2012-11-26 09:26:18,812 unit:opengrok/0: statemachine DEBUG: unitworkflowstate: execute action do_configure
2012-11-26 09:26:18,824 unit:opengrok/0: hook.output DEBUG: Cached relation hook contexts: []
2012-11-26 09:26:18,830 unit:opengrok/0: hook.executor DEBUG: Running hook: /var/lib/
2012-11-26 09:26:19,117 unit:opengrok/0: unit.hook.api INFO: Entering py-config-changed
2012-11-26 09:26:19,707 unit:opengrok/0: unit.hook.api INFO: matched a bzr url lp:juju
2012-11-26 09:26:19,910 unit:opengrok/0: unit.hook.api INFO: Skipping project /var/opengrok/
2012-11-26 09:26:20,102 unit:opengrok/0: unit.hook.api INFO: matched a git url git://git.
2012-11-26 09:26:20,300 unit:opengrok/0: unit.hook.api INFO: Checking out git branch git://git.
2012-11-26 09:26:20,512 unit:opengrok/0: unit.hook.api INFO: Waiting for all branch checkouts to finish...
2012-11-26 09:26:22,116 unit:opengrok/0: hook.output INFO: Cloning into '/var/opengrok/
2012-11-26 09:32:05,873 unit:opengrok/0: unit.hook.api INFO: Branch checkouts complete
2012-11-26 09:32:06,168 unit:opengrok/0: unit.hook.api INFO: Preparing to update configuration and index synchronously
2012-11-26 09:32:06,436 unit:opengrok/0: unit.hook.api INFO: Project configuration and index update...
2012-11-26 09:33:12,707 unit:opengrok/0: unit.lifecycle DEBUG: relation resolved changed
2012-11-26 09:33:12,876 unit:opengrok/0: juju.agents.unit INFO: No upgrade flag set.
2012-11-26 09:33:12,876 unit:opengrok/0: hook.output DEBUG: hook config-changed exited, exit code Traceback (most recent call last):
Failure: juju.errors.
.
2012-11-26 09:33:12,877 unit:opengrok/0: hook.executor DEBUG: Hook error: /var/lib/
2012-11-26 09:33:12,877 unit:opengrok/0: unit.lifecycle INFO: processing relation resolved changed
2012-11-26 09:33:13,105 unit:opengrok/0: statemachine DEBUG: unitworkflowstate: executing error transition error_configure, Error processing '/var/lib/
I keep my own logger for config changed that owns both STDOUT and STDERR *and* pushes both to juju-log, no errors
are reported here, no exceptions, it exits fine.
ubuntu@
2012-11-26 14:26:18,
2012-11-26 14:26:19,
2012-11-26 14:26:19,
2012-11-26 14:26:19,
2012-11-26 14:26:20,
2012-11-26 14:26:20,
2012-11-26 14:32:05,
2012-11-26 14:32:05,
2012-11-26 14:32:06,
2012-11-26 15:35:09,
2012-11-26 15:35:11,
Changed in juju: | |
milestone: | none → 0.8 |
Changed in juju: | |
importance: | Undecided → Low |
status: | New → Triaged |