Multi-node Sunbeam bootstrap on physical and libvirt machines fails

Bug #2038566 reported by Manuel Eurico Paula
This bug affects 1 person
Affects: OpenStack Snap
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hello, for the last two weeks I have been unable to successfully bootstrap MicroStack by following:
1. Single node installation: https://microstack.run/docs/single-node
2. Single node guided: https://microstack.run/docs/single-node-guided
3. Multi-node: https://microstack.run/docs/multi-node

after watching the MicroStack launch demo at https://www.youtube.com/watch?v=ifDtBM_EHPE

I was trying to set up MicroStack on a newly acquired physical test machine according to:
https://microstack.run/docs/enterprise-reqs

My hardware specs are in the attached file phys_hw.txt and meet the requirements.

The physical host network configuration can be found in the attached netplan.txt.
The disks for the OSDs are wiped using:
#!/bin/bash
# Wipe signatures and write a fresh GPT label on each OSD disk.
disks="sda sdb sdc sdd"
for d in $disks; do
  echo "wiping /dev/$d"
  sudo wipefs -af "/dev/$d"
  printf 'g\nw\n' | sudo fdisk "/dev/$d"   # g = new GPT table, w = write and quit
done
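
As a quick sanity check, the wipe can be verified and the /dev/sdX names mapped to the WWN-based paths that the bootstrap later offers as defaults; a minimal sketch using standard tools:

# Confirm no partitions or signatures remain on the OSD disks
lsblk -o NAME,SIZE,FSTYPE,PARTLABEL /dev/sda /dev/sdb /dev/sdc /dev/sdd
sudo wipefs -n /dev/sda          # -n: report signatures without erasing anything

# Map /dev/sdX names to the persistent /dev/disk/by-id/wwn-* paths
ls -l /dev/disk/by-id/ | grep wwn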

Using the multi-node procedure, I have:
== Control plane networking
CIDR: 192.168.0.0/24
Gateway: 192.168.0.1
DHCP address range: 192.168.0.30-192.168.0.199
Control plane address range: 192.168.0.2-192.168.0.29
Interface: br0

== External networking
CIDR: 192.168.2.0/24
Gateway: 192.168.2.1
DHCP address range: 192.168.2.2-192.168.2.29
Floating IP address range: 192.168.2.30-192.168.0.199
Interface: br-ext
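
For reference, the bridge addressing can be cross-checked on the host; a minimal sketch, assuming br0 and br-ext are the bridges named above:

# Addresses actually assigned to the two bridges
ip -br addr show br0
ip -br addr show br-ext
# The default route should point at the control-plane gateway (192.168.0.1)
ip route show default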

Steps executed:
1. sudo snap install openstack --channel 2023.1
(no issues)

2. sunbeam prepare-node-script | bash -x && newgrp snap_daemon
(no issues)

3. sunbeam -v cluster bootstrap --role control --role compute --role storage | tee -a multi_bootstrap_log.txt
Management networks shared by hosts (CIDRs, separated by comma) (192.168.0.0/24): [default selection]
MetalLB address allocation range (supports multiple ranges, comma separated) (10.20.21.10-10.20.21.20): 192.168.0.2-192.168.0.29
Disks to attach to MicroCeph
(/dev/disk/by-id/wwn-0x58ce38ec01e26c31,/dev/disk/by-id/wwn-0x58ce38ec01e274ab,/dev/disk/by-id/wwn-0x58ce38ec01e26abc,/dev/disk/by-id/wwn-0x58ce38ec01ea6bb6): [default selection]

(hours later...)

                    ca-offer-url = "opstk2464/openstack.certificate-authority"
                    keystone-offer-url = "opstk2464/openstack.keystone"
                    ovn-relay-offer-url = "opstk2464/openstack.ovn-relay"
                    rabbitmq-offer-url = "opstk2464/openstack.rabbitmq"
                    , stderr=
           DEBUG Application monitored for readiness: ['certificate-authority', openstack.py:229
                    'keystone-mysql-router', 'glance-mysql', 'traefik', 'placement-mysql',
                    'neutron-mysql', 'keystone-mysql', 'cinder-mysql', 'horizon-mysql', 'nova-mysql',
                    'horizon-mysql-router', 'keystone', 'placement-mysql-router',
                    'cinder-ceph-mysql-router', 'placement', 'rabbitmq', 'horizon', 'cinder-ceph',
                    'glance', 'ovn-central', 'glance-mysql-router', 'nova-cell-mysql-router',
                    'ovn-relay', 'nova-mysql-router', 'neutron-mysql-router', 'nova-api-mysql-router',
                    'cinder-mysql-router', 'neutron', 'nova', 'cinder']
[15:15:27] WARNING Timed out while waiting for model 'openstack' to be ready openstack.py:240
           DEBUG Finished running step 'Deploying OpenStack Control Plane'. Result: ResultType.FAILED common.py:260
Error: Timed out while waiting for model 'openstack' to be ready
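
One way to keep waiting on the model after such a timeout, instead of re-running the whole bootstrap, is juju wait-for (sketch only; the query and the 30-minute timeout are illustrative):

# Block until the openstack model reports available, or give up after 30 minutes
juju wait-for model openstack --query='life=="alive" && status=="available"' --timeout=30m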

ubuntu@opstk2464:~$ juju models
Controller: sunbeam-controller

Model Cloud/Region Type Status Machines Cores Units Access Last connection
admin/controller* sunbeam/default manual available 1 12 4 admin just now
openstack sunbeam-microk8s/localhost kubernetes available 0 - 30 admin 3 hours ago

It reports that only the openstack model was created (besides the controller model).

juju status for the openstack model:

ubuntu@opstk2464:~$ juju status -m openstack
Model Controller Cloud/Region Version SLA Timestamp
openstack sunbeam-controller sunbeam-microk8s/localhost 3.2.0 unsupported 17:38:19Z

SAAS Status Store URL
microceph active local admin/controller.microceph

App Version Status Scale Charm Channel Rev Address Exposed Message
certificate-authority active 1 tls-certificates-operator latest/stable 22 10.152.183.23 no
cinder waiting 1 cinder-k8s 2023.1/stable 47 10.152.183.162 no installing agent
cinder-ceph waiting 1 cinder-ceph-k8s 2023.1/stable 38 10.152.183.90 no installing agent
cinder-ceph-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.196 no
cinder-mysql 8.0.34-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/candidate 99 10.152.183.83 no
cinder-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.19 no
glance active 1 glance-k8s 2023.1/stable 59 10.152.183.200 no
glance-mysql 8.0.34-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/candidate 99 10.152.183.64 no
glance-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.84 no
horizon active 1 horizon-k8s 2023.1/stable 56 10.152.183.46 no http://192.168.0.2/openstack-horizon
horizon-mysql 8.0.34-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/candidate 99 10.152.183.30 no
horizon-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.18 no
keystone active 1 keystone-k8s 2023.1/stable 125 10.152.183.216 no
keystone-mysql 8.0.34-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/candidate 99 10.152.183.164 no
keystone-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.168 no
neutron active 1 neutron-k8s 2023.1/stable 53 10.152.183.97 no
neutron-mysql 8.0.34-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/candidate 99 10.152.183.134 no
neutron-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.161 no
nova waiting 1 nova-k8s 2023.1/stable 48 10.152.183.240 no installing agent
nova-api-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.52 no
nova-cell-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.157 no
nova-mysql 8.0.34-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/candidate 99 10.152.183.34 no
nova-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.73 no
ovn-central active 1 ovn-central-k8s 23.03/stable 61 10.152.183.50 no
ovn-relay active 1 ovn-relay-k8s 23.03/stable 49 192.168.0.4 no
placement active 1 placement-k8s 2023.1/stable 43 10.152.183.48 no
placement-mysql 8.0.34-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/candidate 99 10.152.183.242 no
placement-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.68 no
rabbitmq 3.9.13 active 1 rabbitmq-k8s 3.9/stable 30 192.168.0.3 no
traefik 2.10.4 active 1 traefik-k8s 1.0/candidate 148 192.168.0.2 no

Unit Workload Agent Address Ports Message
certificate-authority/0* active idle 10.1.94.135
cinder-ceph-mysql-router/0* active idle 10.1.94.159
cinder-ceph/0* blocked idle 10.1.94.163 (workload) Error in charm (see logs): cannot perform the following tasks:
- Start service "cinder-volume" (cannot sta...
cinder-mysql-router/0* active idle 10.1.94.169
cinder-mysql/0* active idle 10.1.94.154 Primary
cinder/0* blocked idle 10.1.94.174 (workload) Error in charm (see logs): cannot perform the following tasks:
- Start service "cinder-scheduler" (cannot ...
glance-mysql-router/0* active idle 10.1.94.164
glance-mysql/0* active idle 10.1.94.140 Primary
glance/0* active idle 10.1.94.173
horizon-mysql-router/0* active idle 10.1.94.149
horizon-mysql/0* active idle 10.1.94.151 Primary
horizon/0* active idle 10.1.94.161
keystone-mysql-router/0* active idle 10.1.94.136
keystone-mysql/0* active idle 10.1.94.150 Primary
keystone/0* active idle 10.1.94.157
neutron-mysql-router/0* active idle 10.1.94.167
neutron-mysql/0* active idle 10.1.94.148 Primary
neutron/0* active idle 10.1.94.172
nova-api-mysql-router/0* active idle 10.1.94.168
nova-cell-mysql-router/0* active idle 10.1.94.165
nova-mysql-router/0* active idle 10.1.94.166
nova-mysql/0* active idle 10.1.94.155 Primary
nova/0* error idle 10.1.94.176 hook failed: "amqp-relation-changed"
ovn-central/0* active idle 10.1.94.177
ovn-relay/0* active idle 10.1.94.171
placement-mysql-router/0* active idle 10.1.94.156
placement-mysql/0* active idle 10.1.94.142 Primary
placement/0* active idle 10.1.94.158
rabbitmq/0* active idle 10.1.94.162
traefik/0* active idle 10.1.94.144

Offer Application Charm Rev Connected Endpoint Interface Role
certificate-authority certificate-authority tls-certificates-operator 22 0/0 certificates tls-certificates provider
keystone keystone keystone-k8s 125 0/0 identity-credentials keystone-credentials provider
ovn-relay ovn-relay ovn-relay-k8s 49 0/0 ovsdb-cms-relay ovsdb-cms provider
rabbitmq rabbitmq rabbitmq-k8s 30 0/0 amqp rabbitmq provider
ubuntu@opstk2464:~$

The deployment failed on the cinder-k8s and cinder-ceph-k8s applications, both stuck in 'waiting / installing agent'. The corresponding Juju units report:

cinder-ceph/0* blocked idle 10.1.94.163 (workload) Error in charm (see logs): cannot perform the following tasks: - Start service "cinder-volume" (cannot sta...
cinder/0* blocked idle 10.1.94.174 (workload) Error in charm (see logs): cannot perform the following tasks: - Start service "cinder-scheduler" (cannot ...
nova/0* error idle 10.1.94.176 hook failed: "amqp-relation-changed"

error on cinder-ceph/0: cannot start cinder-volume
error on cinder/0: cannot start cinder-scheduler
error on nova/0: hook failed: "amqp-relation-changed" (varies)
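
To dig further into the charm errors, the unit logs can be pulled from Juju; a sketch, with unit names taken from the status above:

# Replay the logs for the failing units only
juju debug-log -m openstack --replay --include cinder-ceph/0 --include cinder/0 --include nova/0
# Status history of a single failing unit
juju show-status-log -m openstack cinder-ceph/0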

But MicroCeph is running:

ubuntu@opstk2464:~$ sudo microceph status
MicroCeph deployment summary:
- opstk2464 (192.168.0.2)
  Services: mds, mgr, mon, osd
  Disks: 4
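
The Ceph side can be inspected a little further to rule out problems on the storage backend; a sketch using standard microceph/ceph commands:

sudo microceph disk list        # OSD disk enrolment
sudo microceph.ceph status      # overall cluster health (HEALTH_OK expected)
sudo microceph.ceph osd pool ls # pools available to cinder-ceph and glance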

According to https://microstack.run/docs/inspect:

openstack-hypervisor is not bootstrapped:
ubuntu@opstk2464:~$ juju status -m openstack-hypervisor
ERROR model sunbeam-controller:opstk2464/openstack-hypervisor not found
ubuntu@opstk2464:~$

ubuntu@opstk2464:~$ sudo systemctl status snap.openstack-hypervisor.*
ubuntu@opstk2464:~$
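
Whether the hypervisor snap is present at all can be checked with (sketch):

# An empty result would mean the openstack-hypervisor snap was never installed
snap list | grep openstack-hypervisor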

MicroK8s is set up:
ubuntu@opstk2464:~$ sudo microk8s status
microk8s is running
high-availability: no
  datastore master nodes: 127.0.0.1:19001
  datastore standby nodes: none
addons:
  enabled:
    dns # (core) CoreDNS
    ha-cluster # (core) Configure high availability on the current node
    helm # (core) Helm - the package manager for Kubernetes
    helm3 # (core) Helm 3 - the package manager for Kubernetes
    hostpath-storage # (core) Storage class; allocates storage from host directory
    metallb # (core) Loadbalancer for your Kubernetes cluster
    storage # (core) Alias to hostpath-storage add-on, deprecated
  disabled:
    cert-manager # (core) Cloud native certificate management
    community # (core) The community addons repository
    dashboard # (core) The Kubernetes dashboard
    host-access # (core) Allow Pods connecting to Host services smoothly
    ingress # (core) Ingress controller for external access
    mayastor # (core) OpenEBS MayaStor
    metrics-server # (core) K8s Metrics Server for API access to service metrics
    minio # (core) MinIO object storage
    observability # (core) A lightweight observability stack for logs, traces and metrics
    prometheus # (core) Prometheus operator for monitoring and logging
    rbac # (core) Role-Based Access Control for authorisation
    registry # (core) Private image registry exposed on localhost:32000

sudo microk8s inspect
(report is attached)
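
Since the MetalLB range was pointed at the control-plane range (192.168.0.2-192.168.0.29), the services that actually received external addresses can be cross-checked (sketch):

# traefik, rabbitmq and ovn-relay should show EXTERNAL-IPs from 192.168.0.2-29 (as in the juju status above)
sudo microk8s.kubectl get svc -n openstack -o wide | grep LoadBalancer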

ubuntu@opstk2464:~$ sudo microk8s.kubectl get pods --namespace openstack
NAME READY STATUS RESTARTS AGE
modeloperator-797bd6575b-jplvt 1/1 Running 0 3h36m
certificate-authority-0 1/1 Running 0 3h35m
ovn-relay-0 2/2 Running 0 3h33m
keystone-0 2/2 Running 0 3h34m
horizon-mysql-router-0 2/2 Running 0 3h34m
horizon-mysql-0 2/2 Running 0 3h34m
placement-mysql-0 2/2 Running 0 3h34m
cinder-mysql-router-0 2/2 Running 0 3h33m
glance-mysql-router-0 2/2 Running 0 3h33m
glance-mysql-0 2/2 Running 0 3h34m
neutron-mysql-router-0 2/2 Running 0 3h33m
nova-cell-mysql-router-0 2/2 Running 0 3h33m
nova-mysql-router-0 2/2 Running 0 3h33m
keystone-mysql-router-0 2/2 Running 0 3h34m
nova-api-mysql-router-0 2/2 Running 0 3h33m
nova-mysql-0 2/2 Running 0 3h34m
cinder-ceph-mysql-router-0 2/2 Running 0 3h33m
placement-mysql-router-0 2/2 Running 0 3h34m
ovn-central-0 4/4 Running 0 3h32m
rabbitmq-0 2/2 Running 0 3h33m
traefik-0 2/2 Running 0 3h34m
horizon-0 2/2 Running 0 3h33m
cinder-mysql-0 2/2 Running 0 3h34m
placement-0 2/2 Running 0 3h33m
keystone-mysql-0 2/2 Running 0 3h34m
glance-0 2/2 Running 0 3h33m
neutron-0 2/2 Running 0 3h33m
nova-0 4/4 Running 0 3h32m
cinder-ceph-0 2/2 Running 0 3h33m
cinder-0 3/3 Running 0 3h32m
neutron-mysql-0 2/2 Running 0 3h34m
ubuntu@opstk2464:~$

ubuntu@opstk2464:~$ sudo microk8s.kubectl get pod --namespace openstack -o jsonpath="{.spec.containers[*].name}" cinder-0
charm cinder-api cinder-scheduler
(cinder-api-cinder0.log & cinder-scheduler_cinder0.log attached)

ubuntu@opstk2464:~$ sudo microk8s.kubectl get pod --namespace openstack -o jsonpath="{.spec.containers[*].name}" cinder-ceph-0
charm cinder-volume
(cinder-volume_cinder-ceph-0.log attached)

ubuntu@opstk2464:~$ sudo microk8s.kubectl get pod --namespace openstack -o jsonpath="{.spec.containers[*].name}" nova-0
charm nova-api nova-conductor nova-scheduler
(nova-api_nova-0.log + nova-conductor_nova-0.log + nova-scheduler_nova-0.log attached)
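
For reference, the per-container logs can also be pulled directly with kubectl; a sketch, with container names taken from the jsonpath output above:

sudo microk8s.kubectl logs -n openstack cinder-0 -c cinder-scheduler --tail=200
sudo microk8s.kubectl logs -n openstack cinder-ceph-0 -c cinder-volume --tail=200
sudo microk8s.kubectl logs -n openstack nova-0 -c nova-conductor --tail=200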

No locked Terraform plans:
ubuntu@opstk2464:~$ sunbeam inspect plans
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Plan ┃ Locked ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ microceph-plan │ │
│ microk8s-plan │ │
│ openstack-plan │ │
│ sunbeam-machine-plan │ │
└──────────────────────┴────────┘

[Note: I also retried after the teardown step (step 12 of https://ubuntu.com/openstack/tutorials) and after wiping all OSD SATA disks with:

ubuntu@opstk2464:~$ cat reset_disks.sh
#!/bin/bash
# Wipe signatures and write a fresh GPT label on each OSD disk.
disks="sda sdb sdc sdd"
for d in $disks; do
  echo "wiping /dev/$d"
  sudo wipefs -af "/dev/$d"
  printf 'g\nw\n' | sudo fdisk "/dev/$d"   # g = new GPT table, w = write and quit
done

### Deploy using the edge channel:
sudo snap install openstack --edge

The problems remained (again, after a few hours).]
