Booting the install system does not always succeed, hence a remote ssh login is not always possible
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Fix Released | High | Canonical Foundations Team |
subiquity | Invalid | Undecided | Unassigned |
livecd-rootfs (Ubuntu) | Fix Released | Undecided | Unassigned |
Focal | Fix Released | Undecided | Unassigned |
Bug Description
[impact]
When a serial console was configured, there was a unit cycle:
serial-
(or something like that)
Depending on which unit systemd kills to resolve the cycle, cloud-init may never complete. The subiquity server then waits for it forever, and nothing useful can be done other than restarting and hoping for better luck next time.
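To illustrate what a unit cycle means here, a minimal sketch that finds an ordering cycle in a dependency graph. The unit names (a.service and so on) are hypothetical placeholders, not the units involved in this bug, and this is not how systemd itself detects or breaks cycles.

```python
from typing import Dict, List, Optional


def find_cycle(after: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one ordering cycle as a list of unit names, or None.

    `after` maps each unit to the units it must start after.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {unit: WHITE for unit in after}
    stack: List[str] = []

    def visit(unit: str) -> Optional[List[str]]:
        state[unit] = GREY
        stack.append(unit)
        for dep in after.get(unit, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: that closes a cycle.
                return stack[stack.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[unit] = BLACK
        return None

    for unit in after:
        if state[unit] == WHITE:
            found = visit(unit)
            if found:
                return found
    return None


# Hypothetical units, not taken from the installer image.
units = {
    "a.service": ["b.service"],
    "b.service": ["c.service"],
    "c.service": ["a.service"],
    "d.service": [],
}
print(find_cycle(units))
```

No start order can satisfy all three constraints in such a graph, so one job has to be dropped. Which one gets dropped decides whether the boot still completes.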
Because subiquity itself waits for cloud-init (and this has been true for a long time now) there is no need for serial-
[regression potential]
This change results in shuffling the systemd units around a fair bit, but the new arrangement has been tested in devel for a few months now and works well there. It's also much more straightforward than the current setup.
[test case]
This is a bit tricky as it's an intermittent failure. Basically, boot the live installer with a serial console configured a bunch of times (10?) and check that the installer starts up properly each time.
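As a rough guide to how many boots "a bunch" needs to be, the sketch below estimates the chance of seeing the failure at least once in a given number of boots. It assumes each boot fails independently with a fixed probability, using the reporter's estimate of one or two failures in twenty attempts (0.05 to 0.10). Both the independence assumption and the 95% target are assumptions made for illustration.

```python
import math


def p_at_least_one_failure(p_fail: float, boots: int) -> float:
    """Chance that at least one of `boots` independent boots fails."""
    return 1.0 - (1.0 - p_fail) ** boots


def boots_needed(p_fail: float, confidence: float) -> int:
    """Smallest number of boots that shows a failure with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


for p_fail in (0.05, 0.10):
    seen_in_ten = p_at_least_one_failure(p_fail, 10)
    print(f"p_fail={p_fail:.2f}: seen within 10 boots {seen_in_ten:.2f}, "
          f"boots for 95% confidence {boots_needed(p_fail, 0.95)}")
```

Under these assumptions, ten boots of an unfixed image would show the failure only about 40% of the time at the lower rate, so ten clean boots alone are weak evidence that the fix works.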
[original description]
From time to time (sporadically and very rarely, maybe in one or two attempts out of twenty) I face a situation where the installer system (on s390x) does not boot up completely.
This happened to me in the past already, but since it happened only once or twice I thought it was due to resource constraints on the system or so.
But since I faced it now again on LPAR (before it was on z/VM), I'm opening this ticket now.
In the latest case I used the focal daily live image from the 20th of July with installer 20.06.1 (but that also happened with previous versions).
The situation is like this:
The boot of the installer ends here (LPAR):
...
"[ 128.200711] cloud-init[1375]: The key's randomart image is:"
"[ 128.200735] cloud-init[1375]: +--[ED25519 256]--+"
"[ 128.200758] cloud-init[1375]: |o .....ooo |"
"[ 128.200781] cloud-init[1375]: |.= . . +. o. .|"
"[ 128.200804] cloud-init[1375]: |+ * . . * o.o . |"
"[ 128.200826] cloud-init[1375]: |.= o . = = = + |"
"[ 128.200849] cloud-init[1375]: |o + . S o + |"
"[ 128.200876] cloud-init[1375]: | + o = . . |"
"[ 128.200900] cloud-init[1375]: | o + . |"
"[ 128.200925] cloud-init[1375]: | o.=. E |"
"[ 128.200947] cloud-init[1375]: | .+.o+o. |"
"[ 128.200977] cloud-init[1375]: +----[SHA256]
"[ 138.898906] cloud-init[2217]: Cloud-init v. 20.2-45-
"1 running 'modules:config' at Wed, 22 Jul 2020 11:27:39 +0000. Up 138.77 seconds"
.
"[ 138.898966] cloud-init[2217]: Set the following 'random' passwords"
"[ 138.899001] cloud-init[2217]: installer:
or another example (z/VM):
...
¬ 93.463680| cloud-init¬1282|: +--¬ED25519 256|--+
¬ 93.463713| cloud-init¬1282|: !Eo=o .... !
¬ 93.463749| cloud-init¬1282|: !.Bo.o ... o !
¬ 93.463782| cloud-init¬1282|: !**.*... o = !
¬ 93.463818| cloud-init¬1282|: !*=O o. o . . !
¬ 93.463849| cloud-init¬1282|: !**++ S !
¬ 93.463886| cloud-init¬1282|: !§o+.. !
¬ 93.463918| cloud-init¬1282|: !+*o. !
¬ 93.463954| cloud-init¬1282|: !.o. !
¬ 93.463988| cloud-init¬1282|: !. !
¬ 93.464028| cloud-init¬1282|: +----¬SHA256|-----+
¬ 104.841438| cloud-init¬2004|: Cloud-init v. 20.2-45-
1 running 'modules:config' at Mon, 20 Jul 2020 10:46:38 +0000. Up 104.63 seconds
.
¬ 104.841490| cloud-init¬2004|: Set the following 'random' passwords
¬ 104.841516| cloud-init¬2004|: installer:
But it is not complete at this point.
A completed boot of the installer system ends like this:
"It is possible to connect to the installer over the network, which"
"might allow the use of a more capable terminal and can offer more languages"
"than can be rendered in the Linux console."
"To connect, SSH to installer@<IP address>."
"The password you should use is "ydnjdnciu"
"The host key fingerprints are:"
"RSA SHA256:
"ECDSA SHA256:
"ED25519 SHA256:
"Ubuntu 20.04 LTS ubuntu-server sclp_line0"
In such a situation I also can't reach the subiquity UI:
fheimes@T570:~$ ssh-keygen -f "/home/
# Host s1lp14 found: line 165
/home/fheimes/
Original contents retained as /home/fheimes/
fheimes@T570:~$ ssh installer@s1lp14
The authenticity of host 's1lp14 (10.245.236.14)' can't be established.
ECDSA key fingerprint is SHA256:
Are you sure you want to continue connecting (yes/no/
Warning: Permanently added 's1lp14,
installer@s1lp14's password:
Welcome to Ubuntu 20.04 LTS (GNU/Linux 5.4.0-40-generic s390x)
* Documentation: https:/
* Management: https:/
* Support: https:/
System information as of Wed Jul 22 11:28:32 UTC 2020
System load: 0.44 Memory usage: 4% Processes: 180
Usage of /home: unknown Swap usage: 0% Users logged in: 0
0 updates can be installed immediately.
0 of these updates are security updates.
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/
Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
Hence, unfortunately, even gathering the logs is not easily possible.
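Since the logs are hard to collect from a stalled system, the captured console output is the main evidence of whether a boot completed. A minimal sketch of such a check follows. The marker string is taken from the completed-boot output quoted above; how the console output gets captured (for example, saved from the LPAR or z/VM console into a string) is an assumption, and the function names are hypothetical.

```python
from typing import Iterable

# Line printed only at the end of a completed installer boot (see above).
COMPLETION_MARKER = "To connect, SSH to installer@"


def boot_completed(console_log: str) -> bool:
    """Return True if the captured console output contains the final SSH hint."""
    return any(COMPLETION_MARKER in line for line in console_log.splitlines())


def count_incomplete(console_logs: Iterable[str]) -> int:
    """Count captured boots that never reached the final SSH hint."""
    return sum(1 for log in console_logs if not boot_completed(log))


# Shortened samples based on the console output in this report.
stalled = (
    "cloud-init[2217]: Set the following 'random' passwords\n"
    "cloud-init[2217]: installer:\n"
)
complete = (
    "It is possible to connect to the installer over the network, which\n"
    "To connect, SSH to installer@<IP address>.\n"
    "Ubuntu 20.04 LTS ubuntu-server sclp_line0\n"
)
print(count_incomplete([stalled, complete, complete]))
```

Running this over the console output of each attempt in the test case gives a simple count of stalled boots without needing a login on the affected system.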
Related branches
- Dan Bungert: Approve
- Ubuntu Core Development Team: Pending requested
- Diff: 88 lines (+22/-9), 6 files modified
debian/changelog (+10/-0)
live-build/ubuntu-server/includes.binary/overlay/etc/cloud/cloud.cfg (+1/-3)
live-build/ubuntu-server/includes.binary/overlay/usr/lib/systemd/system/getty@tty1.service (+1/-0)
live-build/ubuntu-server/includes.binary/overlay/usr/lib/systemd/system/serial-getty@.service.d/subiquity-serial.conf (+8/-1)
live-build/ubuntu-server/includes.binary/overlay/usr/lib/systemd/system/serial-getty@sclp_line0.service.d/subiquity-serial.conf (+2/-4)
live-build/ubuntu-server/includes.binary/overlay/usr/lib/systemd/system/snap.subiquity.subiquity-service.service.d/subiquity.conf (+0/-1)
- Ubuntu Core Development Team: Pending requested
- Diff: 100 lines (+17/-30), 6 files modified
debian/changelog (+7/-0)
dev/null (+0/-23)
live-build/ubuntu-server/includes.binary/overlay/usr/lib/systemd/system/getty@tty1.service (+1/-0)
live-build/ubuntu-server/includes.binary/overlay/usr/lib/systemd/system/serial-getty@.service.d/subiquity-serial.conf (+8/-1)
live-build/ubuntu-server/includes.binary/overlay/usr/lib/systemd/system/serial-getty@sclp_line0.service.d/subiquity-serial.conf (+1/-5)
live-build/ubuntu-server/includes.binary/overlay/usr/lib/systemd/system/snap.subiquity.subiquity-service.service.d/subiquity.conf (+0/-1)
Changed in ubuntu-z-systems:
assignee: nobody → Canonical Foundations Team (canonical-foundations)
summary:
- Sporadically the installer system does not boot-up completely
+ Sporadically the installer system does not boot-up completely (or
+ doesn't start all services)
Changed in ubuntu-z-systems:
importance: Undecided → High
Changed in ubuntu-z-systems:
status: New → In Progress
summary:
- Sporadically the installer system does not boot-up completely (or
- doesn't start all services)
+ Booting the install system does not always succeed, hence a remote ssh
+ login is not always possible
Changed in subiquity:
status: In Progress → Invalid
Changed in livecd-rootfs (Ubuntu):
status: New → Fix Released
description: updated
Changed in livecd-rootfs (Ubuntu Focal):
status: New → In Progress
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
tags: added: verification-done-focal
removed: verification-needed verification-needed-focal
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
It just happened again on z/VM, using the July 23rd image that the QA Tracker suggested for testing.
Starting the installation a second time worked, meaning the installer booted to the end.