ppc64el - jujud: Syntax error: word unexpected (expecting ")")

Bug #1420049 reported by Ryan Beisner
This bug affects 2 people
Affects    Status        Importance  Assigned to     Milestone
juju-core  Fix Released  High        Andrew Wilkins
1.22       Fix Released  High        Andrew Wilkins

Bug Description

jujud: Syntax error: word unexpected (expecting ")")

When deploying to ppc64el-hosted lxc containers, with the bootstrap node being amd64, all lxc units time out in a pending state.

Inspection of the juju unit logs in each lxc container shows only the following:
/var/lib/juju/tools/machine-1-lxc-1/jujud: 1: /var/lib/juju/tools/machine-1-lxc-1/jujud: Syntax error: word unexpected (expecting ")")

See http://paste.ubuntu.com/10145407/ for 1 entire unit log.

The use case is:
We have one service which needs to be on amd64, one service which needs to be on ppc64el natively, and several services which need to be in lxc containers on the ppc64el host. Namely, using ppc64el as the nova-compute and OpenStack api services node, and the amd64 node for bootstrap + neutron-gateway.

The same bundle and charms deploy successfully when both hosts are amd64.

See attached additional info.

This scenario works in 1.20.x.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Raw upstart init file tarball attached for review.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

FYI, for continuity from #juju-dev conversation:
<davecheney> [18:30:16] best I can tell the error is coming from upstart
<davecheney> [18:34:44] can you make an issue and attach the raw file from upstart
<davecheney> [18:34:46] the one you pasted
<davecheney> [18:34:50] it has to be the raw file
<davecheney> [18:35:06] i suspect there is some control characters or other whitespace crap in there that is throwing off upstart

Revision history for this message
Curtis Hovey (sinzui) wrote :

Which version of juju is this?

tags: added: deploy
Revision history for this message
Ryan Beisner (1chb1n) wrote :

Oh sorry, forgot that important detail. This is from the perspective of the admin machine which performs the juju bootstrap and deploy.

ubuntu@beisner-bastion:~/tmp$ apt-cache policy juju
juju:
  Installed: 1.21.1-0ubuntu1~14.04.1~juju1
  Candidate: 1.21.1-0ubuntu1~14.04.1~juju1
  Version table:
 *** 1.21.1-0ubuntu1~14.04.1~juju1 0
        500 http://ppa.launchpad.net/juju/stable/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     1.20.11-0ubuntu0.14.04.1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty-updates/universe amd64 Packages
     1.18.1-0ubuntu1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages

Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.23
tags: added: regression
Ryan Beisner (1chb1n)
summary: - jujud: Syntax error: word unexpected (expecting ")")
+ ppc64el - jujud: Syntax error: word unexpected (expecting ")")
Revision history for this message
Ryan Beisner (1chb1n) wrote :

Re-tested with juju/devel. The same error occurs in the ppc64el-hosted lxc container unit logs.

ubuntu@beisner-bastion:~/tmp$ apt-cache policy juju
juju:
  Installed: 1.22-beta2-0ubuntu1~14.04.1~juju1
  Candidate: 1.22-beta2-0ubuntu1~14.04.1~juju1
  Version table:
 *** 1.22-beta2-0ubuntu1~14.04.1~juju1 0
        500 http://ppa.launchpad.net/juju/devel/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     1.21.1-0ubuntu1~14.04.1~juju1 0
        500 http://ppa.launchpad.net/juju/stable/ubuntu/ trusty/main amd64 Packages
     1.20.11-0ubuntu0.14.04.1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty-updates/universe amd64 Packages
     1.18.1-0ubuntu1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages

Ryan Beisner (1chb1n)
description: updated
Revision history for this message
Tim Penhey (thumper) wrote :

On the machine that is hosting the lxc containers, there is a directory "/var/lib/juju/containers".

There will be a directory for each container. That directory contains the cloud-init file and the logging from the lxc commands. Can you please attach those files?

Thanks.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

@thumper The environment is torn down at the moment, will re-deploy and collect tomorrow.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Redeployed with juju/stable, collected /var/lib/juju/containers, see attached tgz.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Also attaching /var/log/juju. Thanks again.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

^ Both from the ppc64el host of course.

Revision history for this message
Tim Penhey (thumper) wrote :

OK, looking at those logs it is clear that something is screwed up, but I can't quite tell what.

cloud-init is fine, except that the juju machine agent is repeatedly failing.

AFAICT the upstart script is correct, but I'd like to try some things.

Can you ping me on IRC around start of day NZ time? I'm on from about 20:00 UTC.

Revision history for this message
Tim Penhey (thumper) wrote :

Actually it looks like the LXC container is being started with amd64 tools, not power ones. This is an issue.
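
One way to confirm this on the host is to read the ELF header of the jujud binary in the container's tools directory. The sketch below is a hypothetical helper (the name elf_arch is not part of juju) that reads the e_machine field with od. It relies on the standard ELF machine constants 62 (x86-64), 21 (64-bit PowerPC) and 183 (AArch64). When the kernel refuses to execute a binary built for another architecture, the shell typically falls back to interpreting the file as a script, which would explain the "Syntax error" lines in the unit log.

```shell
#!/bin/sh
# Hypothetical helper (not part of juju): print the architecture an ELF
# binary was built for, using only od and tr.
elf_arch() {
    file=$1
    # Every ELF file starts with the bytes 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$file" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # Offset 5 (EI_DATA): 1 = little-endian, 2 = big-endian.
    data=$(od -An -tu1 -j5 -N1 "$file" | tr -d ' \n')
    # Offsets 18-19 hold e_machine in the file's own byte order.
    set -- $(od -An -tu1 -j18 -N2 "$file")
    if [ "$data" = "2" ]; then
        machine=$(( $1 * 256 + $2 ))
    else
        machine=$(( $2 * 256 + $1 ))
    fi
    case "$machine" in
        62)  echo "amd64" ;;
        21)  if [ "$data" = "1" ]; then echo "ppc64el"; else echo "ppc64"; fi ;;
        183) echo "arm64" ;;
        *)   echo "unknown($machine)" ;;
    esac
}

# Example (path taken from the unit log in the bug description):
#   elf_arch /var/lib/juju/tools/machine-1-lxc-1/jujud
```

If the helper prints amd64 for a binary inside a container on the ppc64el host, the wrong tools were unpacked there.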

Revision history for this message
Dave Cheney (dave-cheney) wrote : Re: [Bug 1420049] Re: ppc64el - jujud: Syntax error: word unexpected (expecting ")")

That is because machine-0 is amd64, tools selection must be fixated on
that version even though constraints force the provider to create
ppc64le machines.


Ian Booth (wallyworld)
no longer affects: juju-core/1.21
Revision history for this message
Curtis Hovey (sinzui) wrote :

Does this scenario work with 1.20.x?

Revision history for this message
Ryan Beisner (1chb1n) wrote :

This use case does indeed succeed with juju 1.20.x. See attached txt. Will also attach new tarballs.

Curtis Hovey (sinzui)
description: updated
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

An important log seems to be missing here: machine-0.log from the amd64 machine, showing the provisioning failures and which tools are selected. It would also help to set logging-config: <root>=TRACE in environments.yaml and reproduce the issue; the logs will then be much more useful (at the moment machine-1.log seems to be mostly empty, because logging-config is set to <root>=WARNING as soon as it starts).

Regarding the second case (1.20.x): your juju client binary is 1.20.11, but your agent versions all show 1.20.14, which most likely means you've used --metadata-source or --upload-tools or something like that as arguments to juju bootstrap. If you're using custom images and metadata, then it's likely there's an issue with them which led to selection of the wrong tools.

Revision history for this message
Curtis Hovey (sinzui) wrote :

I think the mismatched 1.20.x versions were caused by the juju client's optimistic behaviour of selecting the highest micro version of jujud. Since 1.20.14 is the last of the 1.20 releases, anyone bootstrapping with any 1.20 client will get it when working with public streams.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

FYI - The commands for reproducing are in the attachment https://launchpadlibrarian.net/197983164/juju-lxc-ppc64el-1.20.x.txt.

Should we redeploy and collect machine-0.log? If so, which juju version should we use?

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

The issue should be fixed in 1.22 by https://github.com/juju/juju/pull/1635. However, the same PR introduced failures in the ppc64 tests (see bug 1423950), so until these are fixed #1635 should NOT be ported to trunk to fix this bug.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Tested with juju-core 1.22-beta4-0ubuntu1~14.04.1~juju1 debs. Not quite resolved, but the behavior has changed. See attachment.

Units no longer get stuck in 'pending' state; they are now getting stuck in an 'allocating' state.

The unit log for lxc machines now shows:
ubuntu@gregory-ppc64:/var/log/juju$ sudo cat machine-1-lxc-0.log
/var/lib/juju/tools/machine-1-lxc-0/jujud: 3: /var/lib/juju/tools/machine-1-lxc-0/jujud: Syntax error: Unterminated quoted string
/var/lib/juju/tools/machine-1-lxc-0/jujud: 3: /var/lib/juju/tools/machine-1-lxc-0/jujud: Syntax error: Unterminated quoted string
/var/lib/juju/tools/machine-1-lxc-0/jujud: 3: /var/lib/juju/tools/machine-1-lxc-0/jujud: Syntax error: Unterminated quoted string

Let me know if we should provide any other info. The environment is still up as shown.

Thanks again!

Revision history for this message
Andrew Wilkins (axwalk) wrote :

Ryan, it's not enough to use the updated client; Juju is still locating the old beta3 tools when bootstrapping and adding machines. You can check this by looking at the "agent-version" in the latest "juju status" output.
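
A quick way to make that check, sketched here on the assumption that the status output is the YAML form pasted elsewhere in this report, is to count the distinct agent-version values. The function name agent_versions is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarise the agent-version values found in
# YAML-formatted "juju status" output read from stdin.
agent_versions() {
    # Keep only the value after "agent-version:", then count duplicates.
    sed -n 's/^[[:space:]]*agent-version:[[:space:]]*//p' | sort | uniq -c
}

# Example:
#   juju status --format yaml | agent_versions
```

A single line of output means every started agent runs the same version; machines or units that are still pending have no agent-version and are not counted.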

I guess CI hasn't published beta4 to the devel stream yet because of the test failure that Dimiter mentioned above.

Changed in juju-core:
assignee: nobody → James Tunnicliffe (dooferlad)
Changed in juju-core:
status: Triaged → In Progress
Changed in juju-core:
status: In Progress → Fix Committed
status: Fix Committed → In Progress
Revision history for this message
Dimiter Naydenov (dimitern) wrote :
Changed in juju-core:
status: In Progress → Fix Committed
Revision history for this message
Ryan Beisner (1chb1n) wrote :

The issue persists in 1.22beta4:

juju-core:
  Installed: 1.22-beta4-0ubuntu1~14.04.1~juju1
  Candidate: 1.22-beta4-0ubuntu1~14.04.1~juju1
  Version table:
 *** 1.22-beta4-0ubuntu1~14.04.1~juju1 0
        500 http://ppa.launchpad.net/juju/devel/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     1.21.3-0ubuntu1~14.04.1~juju1 0
        500 http://ppa.launchpad.net/juju/stable/ubuntu/ trusty/main amd64 Packages
     1.20.11-0ubuntu0.14.04.1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty-updates/universe amd64 Packages
     1.18.1-0ubuntu1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages

environment: maas
machines:
  "0":
    agent-state: started
    agent-version: 1.22-beta3
    dns-name: innocent-caption.dellstack
    instance-id: /MAAS/api/1.0/nodes/node-3f9822fc-823c-11e4-9105-d4bed9a84493/
    series: trusty
    containers:
      0/lxc/0:
        agent-state: started
        agent-version: 1.22-beta3
        dns-name: 10.245.172.254
        instance-id: juju-machine-0-lxc-0
        series: trusty
        hardware: arch=amd64
    hardware: arch=amd64 cpu-cores=8 mem=8192M
    state-server-member-status: has-vote
  "1":
    agent-state: started
    agent-version: 1.22-beta3
    dns-name: gregory-ppc64.dellstack
    instance-id: /MAAS/api/1.0/nodes/node-11c03686-9d7f-11e4-91da-d4bed9a84493/
    series: trusty
    containers:
      1/lxc/0:
        agent-state: pending
        instance-id: juju-machine-1-lxc-0
        series: trusty
        hardware: arch=ppc64el
      1/lxc/1:
        agent-state: pending
        instance-id: juju-machine-1-lxc-1
        series: trusty
        hardware: arch=ppc64el
      1/lxc/2:
        agent-state: pending
        instance-id: juju-machine-1-lxc-2
        series: trusty
        hardware: arch=ppc64el
      1/lxc/3:
        agent-state: pending
        instance-id: juju-machine-1-lxc-3
        series: trusty
        hardware: arch=ppc64el
    hardware: arch=ppc64el cpu-cores=152 mem=130552M
services:
  keystone-lxc-ppc64el:
    charm: cs:trusty/keystone-14
    exposed: false
    relations:
      cluster:
      - keystone-lxc-ppc64el
    units:
      keystone-lxc-ppc64el/0:
        agent-state: allocating
        machine: 1/lxc/2
  mongodb:
    charm: cs:trusty/mongodb-16
    exposed: false
    relations:
      replica-set:
      - mongodb
    units:
      mongodb/0:
        agent-state: started
        agent-version: 1.22-beta3
        machine: "1"
        open-ports:
        - 27017/tcp
        - 27019/tcp
        - 27021/tcp
        - 28017/tcp
        public-address: gregory-ppc64.dellstack
  mysql:
    charm: cs:trusty/mysql-20
    exposed: false
    relations:
      cluster:
      - mysql
    units:
      mysql/0:
        agent-state: started
        agent-version: 1.22-beta3
        machine: "0"
        public-address: innocent-caption.dellstack
  mysql-lxc-ppc64el:
    charm: cs:trusty/mysql-20
    exposed: false
    relations:
      cluster:
      - mysql-lxc-ppc64el
    units:
      mysql-lxc-ppc64el/0:
        agent-state: allocating
        machine: 1/lxc/1
  nfs-lxc-amd64:
    charm: cs:trusty/nfs-0
    exposed: ...


Revision history for this message
Ryan Beisner (1chb1n) wrote :

Ahh bugger. I see the agent is still 1.22beta3. Why is that?

Revision history for this message
Andrew Wilkins (axwalk) wrote :

Did you try setting "agent-stream: devel" in your environ config? I believe that is necessary to pick up the right agent version. Not sure how you'd get beta3 without it though.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Yes, the enviro definition is as follows:

  maas:
    type: maas
    maas-server: 'http://10.245.168.2:80/MAAS/'
    maas-oauth: XXXXXX
    admin-secret: XXXXXX
    agent-stream: devel


Revision history for this message
Ryan Beisner (1chb1n) wrote :

FWIW, during bootstrap, I'm seeing this:

WARNING failed to find 1.22-beta4 tools, will attempt to use 1.22-beta3

Revision history for this message
Andrew Wilkins (axwalk) wrote :

I saw on the juju mailing list that the beta4 images were missing from streams, and that's been rectified now. Can you please retry and let us know how it goes?

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Behavior has changed. This is what I'm seeing currently:

* 1.22beta4 tools are indeed being used now.

* amd64 lxc deploy succeeds as expected; ppc64el lxc deploy fails in a different way.

* ppc64el lxc unit agent-states are stuck in 'allocating' status.

* Machine 1 (ppc64el) logs show that the 1.22beta4 ppc64el tools were used to deploy a charm successfully (not in lxc).

* Machine 1 logs also show that juju thinks there are no ppc64el tools available for the charm deployed to lxc on the ppc64el host. However, tree output of /var/lib/juju shows the ppc64el 1.22beta4 tools available.

* ppc64el machine 1 has no /var/lib/juju/containers as we never reach lxc instantiation.

* See new attachments.

* General info:

## 1.22beta4 tools are now used:

2015-03-03 04:36:57 INFO juju.environs.bootstrap bootstrap.go:178 newest version: 1.22-beta4
2015-03-03 04:36:57 INFO juju.environs.bootstrap bootstrap.go:206 picked bootstrap tools version: 1.22-beta4

    units:
      mongodb/0:
        agent-state: started
        agent-version: 1.22-beta4

## package versions
juju:
  Installed: 1.22-beta4-0ubuntu1~14.04.1~juju1
--
juju-core:
  Installed: 1.22-beta4-0ubuntu1~14.04.1~juju1
--
juju-deployer:
  Installed: 0.4.3-0ubuntu1~ubuntu14.04.1~ppa1
--
juju-local:
  Installed: 1.22-beta4-0ubuntu1~14.04.1~juju1
--
juju-mongodb:
  Installed: 2.4.9-0ubuntu3
--
juju-quickstart:
  Installed: 1.6.0+bzr115+ppa30~ubuntu14.04.1
--
python-jujuclient:
  Installed: 0.50.1-2

Revision history for this message
Ryan Beisner (1chb1n) wrote :

FYI, reproducer for the latest iteration ^

#!/bin/bash -e
juju switch maas
juju bootstrap --debug --constraints arch=amd64
juju --debug deploy --constraints arch=ppc64el mongodb
juju --debug deploy --to lxc:0 nfs nfs-lxc-amd64
juju --debug deploy --to lxc:1 nfs nfs-lxc-ppc64el

Revision history for this message
Ryan Beisner (1chb1n) wrote :

PS, May I add... the new tabular juju stat output format is quite nice!

Revision history for this message
Andrew Wilkins (axwalk) wrote :

Thanks for showing the commands, that highlights the problem quite clearly. You're setting an arch constraint for the environment; we should be ignoring this for LXC containers.

You can work around this for the moment by issuing "juju set-constraints" after bootstrapping to clear the constraints. Or set arch=ppc64el in constraints when deploying nfs-lxc-ppc64el.
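
The second workaround could look roughly like the sketch below, which reuses the flags from the reproducer and repeats the arch constraint on the deploy that targets the ppc64el host. It has not been verified in this thread; the function name and the JUJU override variable are hypothetical and exist only so the command can be previewed without a live environment.

```shell
#!/bin/sh
# JUJU can be overridden (for example JUJU=echo) to preview the command
# instead of running it. This variable is an illustration, not a juju feature.
JUJU=${JUJU:-juju}

# Deploy nfs into a new lxc container on the given machine, stating the
# architecture explicitly so the environment-level arch=amd64 constraint
# set at bootstrap is not applied to the container.
deploy_nfs_to_ppc64el_lxc() {
    # $1 is the machine number of the ppc64el host (machine 1 in this report).
    "$JUJU" deploy --constraints arch=ppc64el --to "lxc:$1" nfs nfs-lxc-ppc64el
}

# Example:
#   deploy_nfs_to_ppc64el_lxc 1
```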

Revision history for this message
Andrew Wilkins (axwalk) wrote :

So thinking about that a bit more, I don't think we should ignore the arch constraint. If you've set a constraint, we should honour it and either produce a machine with that architecture, or fail. We should just be failing earlier.

Revision history for this message
Andrew Wilkins (axwalk) wrote :

wallyworld convinced me it's okay to ignore the arch. I'll get onto it.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

FYI: We have to set the amd64 arch constraint at the time of bootstrap so that a ppc64el host isn't arbitrarily chosen as the bootstrap node by MAAS.

The expected behavior in this scenario is for juju lxc deployments to mirror the behavior of non-lxc deployments. Since it is working in this case for non-lxc units, it should also be made to work for lxc units.

PS the reproducer script has been in the bug since comment #1, though I've cut it down for simplification.

Andrew Wilkins (axwalk)
Changed in juju-core:
status: Fix Committed → Triaged
assignee: James Tunnicliffe (dooferlad) → Andrew Wilkins (axwalk)
Andrew Wilkins (axwalk)
Changed in juju-core:
status: Triaged → In Progress
Andrew Wilkins (axwalk)
Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.23 → 1.23-beta1
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Released → In Progress
Revision history for this message
Curtis Hovey (sinzui) wrote :

beisner: sinzui, things are looking up with those 1.23-alpha1-0ubuntu1~14.04.1~juju1 debs. fyi, also throwing the same at bug 1420049

Revision history for this message
Andrew Wilkins (axwalk) wrote :

It's not clear from that comment that the issue is still present. I read that as Ryan testing the fix with the latest code (I assume that deb version meant to read "beta1" and not "alpha1"). Ryan, would you please clarify if the problem persists with beta1?

Revision history for this message
Ian Booth (wallyworld) wrote :

This was moved from Fix Committed back to In Progress for 1.23. I checked the code, and both the 1.22 and 1.23 branches have had the fix applied. If it works for one, it will also work for the other.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

This appears to be resolved with 1.22beta6 (looks like it picked up 1.22beta5 tools). See attachment.

Ian Booth (wallyworld)
Changed in juju-core:
status: In Progress → Fix Committed
Revision history for this message
Sean Feole (sfeole) wrote :

Confirmed, fixed in 1.22beta5

  "1":
    agent-state: started
    agent-version: 1.22-beta5
    dns-name: 10.228.12.125
    instance-id: manual:10.228.12.125
    series: vivid
    containers:
      1/lxc/0:
        agent-state: started
        agent-version: 1.22-beta5
        dns-name: 10.0.3.134
        instance-id: juju-machine-1-lxc-0
        series: vivid
        hardware: arch=arm64
      1/lxc/1:
        agent-state: started
        agent-version: 1.22-beta5
        dns-name: 10.0.3.76
        instance-id: juju-machine-1-lxc-1
        series: vivid
        hardware: arch=arm64
    hardware: arch=arm64 cpu-cores=1 mem=64516M
services: {}

Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released