[CI][C8][All releases][OVB] cloud-init update broke SSH to VMs

Bug #1971751 reported by Dariusz Smigiel
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

The overcloud-hardened-full image has been built from scratch.

Last known working build of overcloud-hardened: 2022-05-03 21:35:10 [1].
It was used for a successful deployment [2].
cloud-init version installed in that build [3]:
> [ 11.008802] cloud-init[1027]: Cloud-init v. 21.1-15.el8 running 'init-local' at Wed, 04 May 2022 00:20:46 +0000. Up 10.16 seconds.

Build of overcloud-hardened from 2022-05-04 06:11:22 [4], with a failing deployment [5]. It runs the updated cloud-init [6].

With cloud-init 22.1, SSH keys are not being delivered to the starting VM.

[1]: https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/50faeef35d14ba61a119dbbfe9f0680a3050d799/periodic-tripleo-centos-8-buildimage-overcloud-hardened-full-train/8c76f2f/overcloud-hardened-full.log
[2]: https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/50faeef35d14ba61a119dbbfe9f0680a3050d799/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/c6638f8/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
[3]: https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/50faeef35d14ba61a119dbbfe9f0680a3050d799/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/c6638f8/logs/baremetal_0-console.log

[4]: https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-hardened-full-train/f853c7f/overcloud-hardened-full.log
[5]: https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/b621f8c/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
[6]: https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/b621f8c/logs/baremetal_0-console.log
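
The cloud-init version of each run can be read from the console logs linked in [3] and [6]. A minimal sketch, assuming the boot banner keeps the `Cloud-init v. <version> running` format quoted above; the function name `cloud_init_version` is hypothetical and the log is assumed to have been downloaded to a local file first:

```shell
# Print the first cloud-init version found in a console log file.
# Assumes the banner format quoted above: "Cloud-init v. <version> running ...".
cloud_init_version() {
    sed -n 's/.*Cloud-init v\. \([^ ]*\) running.*/\1/p' "$1" | head -n 1
}
```

Running it against the log from the working build and the log from the failing build should show the version change directly.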

Dariusz Smigiel (smigiel-dariusz) wrote :

The same issue can be seen in all other releases that use OVB.

summary: - [CI][C8][Train] cloud-init update broke SSH to VMs
+ [CI][C8][All] cloud-init update broke SSH to VMs
summary: - [CI][C8][All] cloud-init update broke SSH to VMs
+ [CI][C8][All releases][OVB] cloud-init update broke SSH to VMs
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-quickstart/+/840715

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-quickstart/+/840755

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-ci (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ci/+/840766

Sandeep Yadav (sandeepyadav93) wrote :

ssh_genkeytypes in the cloud-init config of the CentOS 8 image we are using does not have proper values set.

With the latest changes, cloud-init itself is expected to create the SSH host keys.

https://src.fedoraproject.org/rpms/cloud-init/c/b954b98a1c25b8db753dcd4545e2a72bbd0a2790

~~~
(undercloud) [zuul@undercloud ~]$ wget https://images.rdoproject.org/CentOS-8-Stream-x86_64-GenericCloud.qcow2
--2022-05-06 03:09:09-- https://images.rdoproject.org/CentOS-8-Stream-x86_64-GenericCloud.qcow2
Resolving images.rdoproject.org (images.rdoproject.org)... 38.102.83.152
Connecting to images.rdoproject.org (images.rdoproject.org)|38.102.83.152|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1343368704 (1.3G)
Saving to: ‘CentOS-8-Stream-x86_64-GenericCloud.qcow2’

CentOS-8-Stream-x86_64-GenericCloud.qcow2 100%[========================================================================================================================================>] 1.25G 20.9MB/s in 37s

2022-05-06 03:09:46 (35.1 MB/s) - ‘CentOS-8-Stream-x86_64-GenericCloud.qcow2’ saved [1343368704/1343368704]

(undercloud) [zuul@undercloud ~]$ guestfish -a CentOS-8-Stream-x86_64-GenericCloud.qcow2

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: ‘help’ for help on commands
      ‘man’ to read the manual
      ‘quit’ to quit the shell

><fs> mount /dev/sda1 /
><fs> cat /etc/cloud/cloud.cfg
users:
 - default

disable_root: 1
ssh_pwauth: 0

mount_default_fields: [~, ~, 'auto', 'defaults,nofail,x-systemd.requires=cloud-init.service', '0', '2']
resize_rootfs_tmp: /dev
ssh_deletekeys: 1

~~~

vs. the new image:

~~~
(undercloud) [zuul@undercloud ~]$ wget https://cloud.centos.org/centos/8-stream/x86_64/images/CentOS-Stream-GenericCloud-8-20220125.1.x86_64.qcow2
--2022-05-06 03:15:22-- https://cloud.centos.org/centos/8-stream/x86_64/images/CentOS-Stream-GenericCloud-8-20220125.1.x86_64.qcow2
Resolving cloud.centos.org (cloud.centos.org)... 3.137.219.52, 2600:1f16:c1:5e02:ec1b:2c09:2525:64e0
Connecting to cloud.centos.org (cloud.centos.org)|3.137.219.52|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1490685440 (1.4G) [application/octet-stream]
Saving to: ‘CentOS-Stream-GenericCloud-8-20220125.1.x86_64.qcow2’

CentOS-Stream-GenericCloud-8-20220125.1.x86_64.qcow2 100%[========================================================================================================================================>] 1.39G 53.2MB/s in 18s

2022-05-06 03:15:40 (78.2 MB/s) - ‘CentOS-Stream-GenericCloud-8-20220125.1.x86_64.qcow2’ saved [1490685440/1490685440]

(undercloud) [zuul@undercloud ~]$ guestfish -a CentOS-Stream-GenericCloud-8-20220125.1.x86_64.qcow2

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: ‘help’ for help on commands
      ‘man’ to read the manual
      ‘quit’ to quit the shell

><fs> run
><fs> list-filesystems
/dev/sda1: xfs
><fs> mount /dev/sda1 /
><fs> cat /etc/cloud/cloud.cfg
users:
 - default

disable_root: 1
ssh_pwauth: 0

mount_default_fields: [~, ~, 'auto', 'defaults,nofail,x-syst...

~~~

Sandeep Yadav (sandeepyadav93) wrote :

The old C8 image (on rdoproject) has:
~~~
ssh_genkeytypes: ~
~~~

whereas the new image on https://cloud.centos.org/ has:

~~~
ssh_genkeytypes: ['rsa', 'ecdsa', 'ed25519']
~~~
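
The comparison above can be turned into a quick check on a copy of `/etc/cloud/cloud.cfg` extracted from an image (for example with a guestfish session like the ones earlier in this thread). This is a minimal sketch, not part of the merged fix: the function name `genkeytypes_status` is hypothetical, and it assumes the setting is written on a single line as in the two snippets above.

```shell
# Report whether ssh_genkeytypes in a cloud.cfg copy lists any key types.
# Assumes the single-line form shown above; a multi-line YAML list would be
# reported as "unset" by this simple text check.
genkeytypes_status() {
    value=$(sed -n '/^ssh_genkeytypes:/{
        s/^ssh_genkeytypes:[[:space:]]*//
        s/[[:space:]]*$//
        p
    }' "$1" | head -n 1)
    case "$value" in
        ''|'~'|null|'[]') echo "unset"; return 1 ;;
        *) echo "set: $value"; return 0 ;;
    esac
}
```

For the old image this prints `unset` and returns a non-zero status; for the new image it prints the configured list.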

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ci (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ci/+/840826

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ci (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ci/+/840826
Committed: https://opendev.org/openstack/tripleo-ci/commit/ccfe3f31e3d2104a34b6cf485e5b2235cab60338
Submitter: "Zuul (22348)"
Branch: master

commit ccfe3f31e3d2104a34b6cf485e5b2235cab60338
Author: Sandeep Yadav <email address hidden>
Date: Fri May 6 12:55:14 2022 +0530

    Update default centos-8 base image

    The Train OVB job is failing: overcloud nodes are unreachable during
    deployment because the sshd service fails to start with no host key available.

    ~~~
    Unable to load host key: /etc/ssh/ssh_host_rsa_key
    Unable to load host key: /etc/ssh/ssh_host_ecdsa_key
    Unable to load host key: /etc/ssh/ssh_host_ed25519_key
    ~~~

    This started with cloud-init 22.1; based on the changelog, cloud-init
    is now expected to create these keys.

    The ssh_genkeytypes variable in /etc/cloud/cloud.cfg should have the correct
    values by default, but in the C8 image we are using, ssh_genkeytypes
    is not set properly.

    The old C8 image (on rdoproject) has:
    ~~~
    ssh_genkeytypes: ~
    ~~~

    whereas the new image on https://cloud.centos.org/ has:

    ~~~
    ssh_genkeytypes: ['rsa', 'ecdsa', 'ed25519']
    ~~~

    Updating default image with this patch.

    [1] https://src.fedoraproject.org/rpms/cloud-init/c/b954b98a1c25b8db753dcd4545e2a72bbd0a2790
    [2] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-master/ec8fafc/logs/overcloud-controller-0/etc/cloud/cloud.cfg.txt.gz

    Closes-Bug: #1971751
    Change-Id: Idcfbdc632e35f2dda921b67b832bfe396bc248f3
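
The symptom described in the commit message can be confirmed on an affected node by checking for the three host key files named in the sshd errors. A minimal sketch, with a hypothetical function name `missing_host_keys`; the key types and paths are taken from the error output above, and the directory defaults to `/etc/ssh`:

```shell
# Print the path of each host key from the sshd errors above that is
# missing or empty. The directory argument defaults to /etc/ssh.
missing_host_keys() {
    dir="${1:-/etc/ssh}"
    for type in rsa ecdsa ed25519; do
        key="$dir/ssh_host_${type}_key"
        [ -s "$key" ] || echo "$key"
    done
}
```

Empty output means all three keys are present. As a manual workaround on a single node (not the fix merged here, which switches the base image), `ssh-keygen -A` generates any default host keys that do not yet exist.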

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-common/+/841067

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-quickstart (master)

Change abandoned by "dasm <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-quickstart/+/840715

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-ci (master)

Change abandoned by "dasm <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ci/+/840766

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-common (master)

Change abandoned by "Cedric Jeanneret <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-common/+/841067
