Multi-node sunbeam bootstrap on physical and libvirt machines fails
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Snap | New | Undecided | Unassigned |
Bug Description
Hello, for two weeks I have not been able to successfully bootstrap MicroStack by following:
1. Single node installation: https:/
2. Single node guided: https:/
3. Multi-node: https:/
after seeing the Microstack launch demo at https:/
I was trying to set up MicroStack on a newly acquired physical test machine according to:
https:/
My hardware specs are in the attached file phys_hw.txt and meet the requirements.
The physical host network configuration can be found in netplan.txt.
Disks for OSD are wiped using:
#!/bin/bash
disks="sda sdb sdc sdd"
# wipefs clears filesystem/RAID signatures; fdisk 'g' then 'w' writes a fresh GPT label
for d in $disks; do echo "wipe disk /dev/$d"; sudo wipefs -af "/dev/$d"; printf 'g\nw\n' | sudo fdisk "/dev/$d"; done
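As a side note, a more thorough wipe can be done with sgdisk from the gdisk package, which zaps the GPT and protective MBR structures directly. This is only a sketch of an alternative, not part of the procedure I followed; the DRY_RUN guard is something I added so it only prints the commands by default:

```shell
#!/bin/bash
# Hypothetical alternative wipe using sgdisk (gdisk package).
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

disks="sda sdb sdc sdd"
for d in $disks; do
  run sudo wipefs -af "/dev/$d"        # clear filesystem/RAID signatures
  run sudo sgdisk --zap-all "/dev/$d"  # destroy GPT and protective MBR
done
```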
Using multi-node procedure for:
== Control plane networking I have:
CIDR: 192.168.0.0/24
Gateway: 192.168.0.1
DHCP addr range: 192.168.
Control plane addr range: 192.168.
Interface: br0
==External networking
CIDR: 192.168.2.0/24
Gateway: 192.168.2.1
DHCP addr range: 192.168.
Floating IP addr range: 192.168.
Interface: br-ext
Steps executed:
1. sudo snap install openstack --channel 2023.1
(no issues)
2. sunbeam prepare-node-script | bash -x && newgrp snap_daemon
(no issues)
3. sunbeam -v cluster bootstrap --role control --role compute --role storage | tee -a multi_bootstrap
Management networks shared by hosts (CIDRs, separated by comma) (192.168.0.0/24): [default selection]
MetalLB address allocation range (supports multiple ranges, comma separated) (10.20.
Disks to attach to MicroCeph
(/dev/disk/
(hours later...)
DEBUG Application monitored for readiness: ['certificate-
[15:15:27] WARNING Timed out while waiting for model 'openstack' to be ready openstack.py:240
DEBUG Finished running step 'Deploying OpenStack Control Plane'. Result: ResultType.FAILED common.py:260
Error: Timed out while waiting for model 'openstack' to be ready
ubuntu@opstk2464:~$ juju models
Controller: sunbeam-controller
Model Cloud/Region Type Status Machines Cores Units Access Last connection
admin/controller* sunbeam/default manual available 1 12 4 admin just now
openstack sunbeam-
Only the openstack model is reported as created.
juju status for the openstack model:
ubuntu@opstk2464:~$ juju status -m openstack
Model Controller Cloud/Region Version SLA Timestamp
openstack sunbeam-controller sunbeam-
SAAS Status Store URL
microceph active local admin/controlle
App Version Status Scale Charm Channel Rev Address Exposed Message
certificate-
cinder waiting 1 cinder-k8s 2023.1/stable 47 10.152.183.162 no installing agent
cinder-ceph waiting 1 cinder-ceph-k8s 2023.1/stable 38 10.152.183.90 no installing agent
cinder-
cinder-mysql 8.0.34-
cinder-mysql-router 8.0.34-
glance active 1 glance-k8s 2023.1/stable 59 10.152.183.200 no
glance-mysql 8.0.34-
glance-mysql-router 8.0.34-
horizon active 1 horizon-k8s 2023.1/stable 56 10.152.183.46 no http://
horizon-mysql 8.0.34-
horizon-
keystone active 1 keystone-k8s 2023.1/stable 125 10.152.183.216 no
keystone-mysql 8.0.34-
keystone-
neutron active 1 neutron-k8s 2023.1/stable 53 10.152.183.97 no
neutron-mysql 8.0.34-
neutron-
nova waiting 1 nova-k8s 2023.1/stable 48 10.152.183.240 no installing agent
nova-api-
nova-cell-
nova-mysql 8.0.34-
nova-mysql-router 8.0.34-
ovn-central active 1 ovn-central-k8s 23.03/stable 61 10.152.183.50 no
ovn-relay active 1 ovn-relay-k8s 23.03/stable 49 192.168.0.4 no
placement active 1 placement-k8s 2023.1/stable 43 10.152.183.48 no
placement-mysql 8.0.34-
placement-
rabbitmq 3.9.13 active 1 rabbitmq-k8s 3.9/stable 30 192.168.0.3 no
traefik 2.10.4 active 1 traefik-k8s 1.0/candidate 148 192.168.0.2 no
Unit Workload Agent Address Ports Message
certificate-
cinder-
cinder-ceph/0* blocked idle 10.1.94.163 (workload) Error in charm (see logs): cannot perform the following tasks:
- Start service "cinder-volume" (cannot sta...
cinder-
cinder-mysql/0* active idle 10.1.94.154 Primary
cinder/0* blocked idle 10.1.94.174 (workload) Error in charm (see logs): cannot perform the following tasks:
- Start service "cinder-scheduler" (cannot ...
glance-
glance-mysql/0* active idle 10.1.94.140 Primary
glance/0* active idle 10.1.94.173
horizon-
horizon-mysql/0* active idle 10.1.94.151 Primary
horizon/0* active idle 10.1.94.161
keystone-
keystone-mysql/0* active idle 10.1.94.150 Primary
keystone/0* active idle 10.1.94.157
neutron-
neutron-mysql/0* active idle 10.1.94.148 Primary
neutron/0* active idle 10.1.94.172
nova-api-
nova-cell-
nova-mysql-
nova-mysql/0* active idle 10.1.94.155 Primary
nova/0* error idle 10.1.94.176 hook failed: "amqp-relation-
ovn-central/0* active idle 10.1.94.177
ovn-relay/0* active idle 10.1.94.171
placement-
placement-mysql/0* active idle 10.1.94.142 Primary
placement/0* active idle 10.1.94.158
rabbitmq/0* active idle 10.1.94.162
traefik/0* active idle 10.1.94.144
Offer Application Charm Rev Connected Endpoint Interface Role
certificate-
keystone keystone keystone-k8s 125 0/0 identity-
ovn-relay ovn-relay ovn-relay-k8s 49 0/0 ovsdb-cms-relay ovsdb-cms provider
rabbitmq rabbitmq rabbitmq-k8s 30 0/0 amqp rabbitmq provider
ubuntu@opstk2464:~$
Failed on cinder-k8s: waiting / installing agent
Failed on cinder-ceph: waiting / installing agent
In the juju applications:
cinder-ceph/0* blocked idle 10.1.94.163 (workload) Error in charm (see logs): cannot perform the following tasks: - Start service "cinder-volume" (cannot sta...
cinder/0* blocked idle 10.1.94.174 (workload) Error in charm (see logs): cannot perform the following tasks: - Start service "cinder-scheduler" (cannot ...
nova/0* error idle 10.1.94.176 hook failed: "amqp-relation-
Error on cinder-ceph/0: cannot start cinder-volume
Error on cinder/0: cannot start cinder-scheduler
Error on nova/0: hook failed: "amqp-relation-
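For completeness, the underlying charm errors can usually be recovered with `juju debug-log`. A sketch of the commands I would use, with the unit names taken from the status above (the DRY_RUN guard is my addition so the sketch only prints the commands by default):

```shell
#!/bin/bash
# Sketch: pull charm logs for the failing units shown in 'juju status'.
# DRY_RUN=1 (default) only prints the juju commands it would run.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

for unit in cinder/0 cinder-ceph/0 nova/0; do
  run juju debug-log -m openstack --replay --no-tail --include "$unit"
done
run juju show-status-log -m openstack nova/0   # recent hook/status history
```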
But MicroCeph is running:
ubuntu@opstk2464:~$ sudo microceph status
MicroCeph deployment summary:
- opstk2464 (192.168.0.2)
Services: mds, mgr, mon, osd
Disks: 4
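Beyond `microceph status`, the Ceph cluster health itself can be checked; this sketch assumes the microceph snap exposes the bundled ceph client as `microceph.ceph` (the DRY_RUN guard is my addition so it only prints the commands by default):

```shell
#!/bin/bash
# Sketch: verify Ceph health behind the MicroCeph deployment summary.
# DRY_RUN=1 (default) only prints the commands.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run sudo microceph.ceph status      # overall health, mon/mgr/osd summary
run sudo microceph.ceph osd status  # per-OSD up/in state
```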
According to https:/
Openstack-
ubuntu@opstk2464:~$ juju status -m openstack-
ERROR model sunbeam-
ubuntu@opstk2464:~$
ubuntu@opstk2464:~$ sudo systemctl status snap.openstack-
ubuntu@opstk2464:~$
MicroK8s setup:
ubuntu@opstk2464:~$ sudo systemctl status snap.openstack-
ubuntu@opstk2464:~$ sudo microk8s status
microk8s is running
high-availability: no
datastore master nodes: 127.0.0.1:19001
datastore standby nodes: none
addons:
enabled:
dns # (core) CoreDNS
ha-cluster # (core) Configure high availability on the current node
helm # (core) Helm - the package manager for Kubernetes
helm3 # (core) Helm 3 - the package manager for Kubernetes
hostpath-
metallb # (core) Loadbalancer for your Kubernetes cluster
storage # (core) Alias to hostpath-storage add-on, deprecated
disabled:
cert-manager # (core) Cloud native certificate management
community # (core) The community addons repository
dashboard # (core) The Kubernetes dashboard
host-access # (core) Allow Pods connecting to Host services smoothly
ingress # (core) Ingress controller for external access
mayastor # (core) OpenEBS MayaStor
metrics-server # (core) K8s Metrics Server for API access to service metrics
minio # (core) MinIO object storage
observability # (core) A lightweight observability stack for logs, traces and metrics
prometheus # (core) Prometheus operator for monitoring and logging
rbac # (core) Role-Based Access Control for authorisation
registry # (core) Private image registry exposed on localhost:32000
sudo microk8s inspect
(report is attached)
ubuntu@opstk2464:~$ sudo microk8s.kubectl get pods --namespace openstack
NAME READY STATUS RESTARTS AGE
modeloperator-
certificate-
ovn-relay-0 2/2 Running 0 3h33m
keystone-0 2/2 Running 0 3h34m
horizon-
horizon-mysql-0 2/2 Running 0 3h34m
placement-mysql-0 2/2 Running 0 3h34m
cinder-
glance-
glance-mysql-0 2/2 Running 0 3h34m
neutron-
nova-cell-
nova-mysql-router-0 2/2 Running 0 3h33m
keystone-
nova-api-
nova-mysql-0 2/2 Running 0 3h34m
cinder-
placement-
ovn-central-0 4/4 Running 0 3h32m
rabbitmq-0 2/2 Running 0 3h33m
traefik-0 2/2 Running 0 3h34m
horizon-0 2/2 Running 0 3h33m
cinder-mysql-0 2/2 Running 0 3h34m
placement-0 2/2 Running 0 3h33m
keystone-mysql-0 2/2 Running 0 3h34m
glance-0 2/2 Running 0 3h33m
neutron-0 2/2 Running 0 3h33m
nova-0 4/4 Running 0 3h32m
cinder-ceph-0 2/2 Running 0 3h33m
cinder-0 3/3 Running 0 3h32m
neutron-mysql-0 2/2 Running 0 3h34m
ubuntu@opstk2464:~$
ubuntu@opstk2464:~$ sudo microk8s.kubectl get pod --namespace openstack -o jsonpath=
charm cinder-api cinder-scheduler
(cinder-
ubuntu@opstk2464:~$ sudo microk8s.kubectl get pod --namespace openstack -o jsonpath=
charm cinder-volume
(cinder-
ubuntu@opstk2464:~$ sudo microk8s.kubectl get pod --namespace openstack -o jsonpath=
charm nova-api nova-conductor nova-scheduler
(nova-api_
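Using the container names listed by the jsonpath queries above, the container logs behind the blocked units could be read like this (a sketch; the DRY_RUN guard is my addition so it only prints the commands by default):

```shell
#!/bin/bash
# Sketch: read container logs for the blocked units, using the container
# names reported by the jsonpath queries above.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run sudo microk8s.kubectl logs -n openstack cinder-0 -c cinder-scheduler --tail 100
run sudo microk8s.kubectl logs -n openstack cinder-ceph-0 -c cinder-volume --tail 100
run sudo microk8s.kubectl logs -n openstack nova-0 -c charm --tail 100
```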
No locked Terraform plans:
ubuntu@opstk2464:~$ sunbeam inspect plans
Plan | Locked
---|---
microceph-plan |
microk8s-plan |
openstack-plan |
sunbeam- |
[Note: I also tried again after performing 12. Teardown (https:/
ubuntu@opstk2464:~$ cat reset_disks.sh
#!/bin/bash
disks="sda sdb sdc sdd"
# wipefs clears filesystem/RAID signatures; fdisk 'g' then 'w' writes a fresh GPT label
for d in $disks; do echo "wipe disk /dev/$d"; sudo wipefs -af "/dev/$d"; printf 'g\nw\n' | sudo fdisk "/dev/$d"; done
### Deploy using the edge channel:
sudo snap install openstack --edge
Problems remained (after a few hours...)