High CPU usage on 2.7 edge

Bug #1847145 reported by Kenneth Koski
This bug affects 2 people
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Ian Booth
Milestone: 2.7-beta1

Bug Description

I have deployed a 2.7 edge Juju controller to Microk8s, and I'm getting 300% CPU usage from this process:

    /var/lib/juju/tools/jujud machine --data-dir /var/lib/juju --controller-id 0 --log-to-stderr --show-log

I don't yet have any models/applications deployed to it, so it's just the controller. The controller works properly, apart from the high CPU usage. When I look at the container logs, I don't see anything that looks relevant to the high CPU usage:

2019-10-07 20:27:12 INFO juju.cmd supercommand.go:79 running jujud [2.7-beta1 gc go1.12.10]
2019-10-07 20:27:12 INFO juju.agent identity.go:28 removing system identity file
2019-10-07 20:27:12 WARNING juju.mongo open.go:160 mongodb connection failed, will retry: dial tcp 127.0.0.1:37017: connect: connection refused
2019-10-07 20:27:12 WARNING juju.mongo open.go:160 mongodb connection failed, will retry: dial tcp 127.0.0.1:37017: connect: connection refused
2019-10-07 20:27:13 WARNING juju.mongo open.go:160 mongodb connection failed, will retry: dial tcp 127.0.0.1:37017: connect: connection refused
2019-10-07 20:27:14 INFO juju.replicaset replicaset.go:58 Initiating replicaset with config: {
  Name: juju,
  Version: 1,
  Protocol Version: 1,
  Members: {
    {1 "localhost:37017" juju-machine-id:0 voting},
  },
}
2019-10-07 20:27:14 INFO juju.worker.peergrouper initiate.go:57 replica set initiated
2019-10-07 20:27:14 INFO juju.worker.peergrouper initiate.go:58 finished InitiateMongoServer
2019-10-07 20:27:14 INFO juju.cmd.jujud bootstrap.go:479 started mongo
2019-10-07 20:27:16 INFO juju.state open.go:144 using client-side transactions
2019-10-07 20:27:17 INFO juju.state logs.go:91 controller settings not found, early stage initialization assumed
2019-10-07 20:27:17 INFO juju.state state.go:467 starting standard state workers
2019-10-07 20:27:17 INFO juju.state state.go:474 creating cloud image metadata storage
2019-10-07 20:27:17 INFO juju.state state.go:480 started state for model-b95eca3b-c79f-4af4-8287-60ab3dc72d05 successfully
2019-10-07 20:27:17 INFO juju.state initialize.go:184 initializing controller model b95eca3b-c79f-4af4-8287-60ab3dc72d05
2019-10-07 20:27:18 INFO juju.state logs.go:169 creating logs collection for b95eca3b-c79f-4af4-8287-60ab3dc72d05, capped at 20 MiB
2019-10-07 20:27:18 INFO juju.agent.agentbootstrap bootstrap.go:506 creating cloud service for k8s controller "766bb11b-da56-4586-87f1-1d14e8734cc2"
2019-10-07 20:27:18 WARNING juju.cmd.jujud bootstrap.go:357 cannot set up Juju GUI: cannot fetch GUI info: GUI metadata not found
2019-10-07 20:27:18 INFO cmd supercommand.go:525 command finished
2019-10-07 20:27:18 INFO juju.cmd supercommand.go:79 running jujud [2.7-beta1 gc go1.12.10]
2019-10-07 20:27:19 INFO juju.worker.upgradesteps worker.go:72 upgrade steps for 2.7-beta1 have already been run.
2019-10-07 20:27:19 INFO juju.state open.go:144 using client-side transactions
2019-10-07 20:27:19 INFO juju.state state.go:467 starting standard state workers
2019-10-07 20:27:19 INFO juju.state state.go:474 creating cloud image metadata storage
2019-10-07 20:27:19 INFO juju.state state.go:480 started state for model-b95eca3b-c79f-4af4-8287-60ab3dc72d05 successfully
2019-10-07 20:27:19 INFO juju.cmd.jujud machine.go:1022 juju database opened
2019-10-07 20:27:19 INFO juju.worker.apiservercertwatcher manifold.go:192 new certificate addresses:
2019-10-07 20:27:19 INFO juju.worker.httpserver worker.go:293 listening on "[::]:17070"
2019-10-07 20:27:19 INFO juju.api apiclient.go:624 connection established to "wss://localhost:17070/model/b95eca3b-c79f-4af4-8287-60ab3dc72d05/api"
2019-10-07 20:27:20 INFO juju.state open.go:144 using client-side transactions
2019-10-07 20:27:20 INFO juju.worker.apicaller connect.go:158 [b95eca] "controller-0" successfully connected to "localhost:17070"
2019-10-07 20:27:20 INFO juju.apiserver.common password.go:74 setting mongo password for "controller-0"
2019-10-07 20:27:20 INFO juju.apiserver.common password.go:99 setting password for "controller-0"
2019-10-07 20:27:20 INFO juju.worker.caasupgrader upgrader.go:112 abort check blocked until version event received
2019-10-07 20:27:20 INFO juju.worker.caasupgrader upgrader.go:118 unblocking abort check
2019-10-07 20:27:20 INFO juju.api apiclient.go:624 connection established to "wss://localhost:17070/model/b95eca3b-c79f-4af4-8287-60ab3dc72d05/api"
2019-10-07 20:27:20 INFO juju.worker.migrationminion worker.go:139 migration phase is now: NONE
2019-10-07 20:27:20 INFO juju.worker.logger logger.go:118 logger worker started
2019-10-07 20:27:20 INFO juju.agent identity.go:28 removing system identity file
2019-10-07 20:27:26 INFO juju.worker.logger logger.go:118 logger worker started

Revision history for this message
Kenneth Koski (knkski) wrote :

When I deploy a number of charms to MicroK8s running on an AWS instance, the process also uses 1.5 GB of memory. I think this is responsible for some CI errors caused by a Kubeflow deploy timing out: when I later SSHed into the AWS instance, all of the pods were running properly, so it looks like the deploy was just very slow rather than broken.

Revision history for this message
Ian Booth (wallyworld) wrote :

It turns out pubsub was sending messages in a tight loop.

https://github.com/juju/juju/pull/10723

Changed in juju:
milestone: none → 2.7-beta1
assignee: nobody → Ian Booth (wallyworld)
importance: Undecided → High
status: New → In Progress
Ian Booth (wallyworld)
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released