juju run doesn't work after upgrade to 1.20.11

Bug #1392745 reported by Curtis Hovey
This bug affects 3 people
Affects      Status         Importance   Assigned to   Milestone
juju-core    Fix Released   Critical     Tim Penhey
1.20         Fix Released   Critical     Tim Penhey
1.21         Fix Released   Critical     Tim Penhey

Bug Description

Both Juju CI and a production env on bootstack cannot use juju run. We see output like this:

- MachineId: "6"
  ReturnCode: 255
  Stderr: "Warning: Identity file /var/lib/juju/system-identity not accessible: No
    such file or directory.\nPermission denied (publickey).\r\n"
  Stdout: ""

Most machines do *not* have /var/lib/juju/system-identity. We do not know how to recreate it.

This might be a red herring, because in Juju CI's case machine 6 does have system-identity:
ls -l /var/lib/juju/system-identity
-rw------- 1 root root 1679 Apr 2 2014 /var/lib/juju/system-identity

We have confirmed that we can ssh to the machines and execute scripts on them, but we don't get the juju context provided by juju run.

Curtis Hovey (sinzui)
tags: added: regression
JuanJo Ciarlante (jjo)
tags: added: canonical-bootstack
Curtis Hovey (sinzui)
summary: - juju run doen't after upgrade to 1.20.11
+ juju run doesn't after upgrade to 1.20.11
JuanJo Ciarlante (jjo) wrote :

I got hit by this also, in my case for a 1.18.4 -> 1.19.4 -> 1.20.11 upgrade.

As an additional data point, the file was there on node0 before the upgrade
(while still at 1.18.4):

root@node0:~# locate system-identity
/var/lib/juju/system-identity
root@node0:~# ls -l /var/lib/juju/system-identity
ls: cannot access /var/lib/juju/system-identity: No such file or directory

Curtis Hovey (sinzui) wrote :

I can exec the juju-run symlink via ssh and see that it works:
    ssh <email address hidden> "juju-run unit-jenkins-0 id"
    uid=0(root) gid=0(root) groups=0(root)

    ssh <email address hidden> "juju-run unit-jenkins-0 ls"
    README.md
    TODO.md
    config.yaml
    copyright
    hooks
    icon.svg
    metadata.yaml
    revision
    templates

Ian Booth (wallyworld)
Changed in juju-core:
assignee: nobody → Tim Penhey (thumper)
Curtis Hovey (sinzui) wrote :

We can use juju ssh to select a single machine or unit, but not all machines or all units in a service:
    juju ssh 6 juju-run unit-jenkins-0 id
    uid=0(root) gid=0(root) groups=0(root)

And we can run hook tools:
    juju ssh 6 juju-run unit-jenkins-0 config-get
    plugins: description-setter build-failure-analyzer credentials git git-client rebuild
      scm-api scripttrigger ssh-credentials
    plugins-check-certificate: "yes"
    plugins-site: http://trusted-ingredients.s3.amazonaws.com/jenkins-1.424.6

Curtis Hovey (sinzui) wrote :

I removed the beta3 milestone because we have shown the bug is a regression going back to 1.20. We do want a fix made to 1.21, but as this is an existing issue in 1.20, it does not need to block 1.21-beta3. The goal is to develop a fix and backport it to 1.21, so that users get the fix when they upgrade.

Tim Penhey (thumper) wrote :

Right, after investigating yesterday with Ian, here is what we think the problem is. The upgrade steps to 1.18 added the system identity. Somewhere after that, and we think it is in the move to 1.20, the system identity was stored in mongo, and the machine agents wrote out the system identity file when they started. The catch was that if the stored system identity was empty, the agent would remove the file. There was no migration step to put the contents of the system identity file into the database, so any upgraded system would lose the ability to use 'juju run'.
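
The sequence described above can be sketched in a few lines. This is a minimal Python illustration only: it is not Juju's code (Juju itself is written in Go) and it is not the actual fix. The function names write_system_identity, migrate_system_identity and simulate_upgrade, the placeholder key text, and the plain string standing in for the value stored in mongo are all assumptions made for the example.

```python
import os
import tempfile


def write_system_identity(path, identity):
    """Hypothetical stand-in for what a machine agent does on start-up."""
    if not identity:
        # The catch: an empty stored value removes any existing file.
        if os.path.exists(path):
            os.remove(path)
        return
    # Otherwise write the stored value out, readable by the owner only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(identity)


def migrate_system_identity(path, stored_identity):
    """Hypothetical upgrade step: seed the stored value from the file on disk."""
    if stored_identity:
        return stored_identity
    if os.path.exists(path):
        with open(path) as f:
            return f.read()
    return ""


def simulate_upgrade(with_migration):
    """Return True if the identity file survives the first agent start."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "system-identity")
        # State before the upgrade: the file exists on disk.
        with open(path, "w") as f:
            f.write("PLACEHOLDER PRIVATE KEY")
        # State after the upgrade: nothing has been stored in the database.
        stored = ""
        if with_migration:
            stored = migrate_system_identity(path, stored)
        write_system_identity(path, stored)
        return os.path.exists(path)


if __name__ == "__main__":
    print("file survives without migration:", simulate_upgrade(False))
    print("file survives with migration:", simulate_upgrade(True))
```

Without the migration step the file is deleted on the first agent start, which matches the "Identity file /var/lib/juju/system-identity not accessible" warning in the bug description. With a migration step of this kind, the stored value is populated first and the file is rewritten rather than removed.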

Curtis Hovey (sinzui)
no longer affects: juju-core/1.21
Tim Penhey (thumper)
Changed in juju-core:
status: Triaged → In Progress
Tim Penhey (thumper)
Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released