Scenario tests fail because of long boot time of CirrOS VM

Bug #1998916 reported by Lukas Piwowarski
This bug affects 1 person
Affects  Status     Importance  Assigned to  Milestone
tempest  Confirmed  High        Unassigned

Bug Description

1) Describe the error and the expected behavior.
The test_minimum_basic_instance_hard_reboot_after_vol_snap_deletion scenario test fails at [1] because creation of the CirrOS VM gets stuck (see the attached log, job-output.txt). The issue occurs in Tempest's tempest-multinode-full-py3 job.

So far it is not certain whether the issue is caused by the test or by the job definition (e.g. using the wrong CirrOS image).

2) What OS did you run into the issue on?
Ubuntu 22.04

3) Specify the steps to reproduce the bug.
Execute the tempest-multinode-full-py3 job several times. The error occurs only intermittently.

[1] https://opendev.org/openstack/tempest/src/commit/96cd444cac4a0d2d1db619365f645a60c3de73a5/tempest/scenario/test_minimum_basic.py#L221

Lukas Piwowarski (lukas-piwowarski) wrote :
Martin Kopec (mkopec)
Changed in tempest:
importance: Undecided → High
status: New → Confirmed
Lukas Piwowarski (lukas-piwowarski) wrote :

The failure of the test_minimum_basic_instance_hard_reboot_after_vol_snap_deletion scenario test is probably caused by the long boot time of the CirrOS VM. The boot takes long because the VM repeatedly tries, and fails, to contact http://169.254.169.254/2009-04-04/instance-id [1] (see "failed 1/20: up 26.25. request failed" in the attached log file).

Also, this issue probably affects not only the test_minimum_basic_instance_hard_reboot_after_vol_snap_deletion test but also other tests (e.g. test_minimum_basic_scenario).

[1] https://github.com/cirros-dev/cirros/blob/7f5471e27244b5f63194f2565b49006203e40b6e/src/lib/cirros/ds/ec2#L41
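
For anyone triaging similar runs, here is a minimal sketch of how the retry messages could be pulled out of a console log to see how much boot time went into the failed metadata requests. The regular expression is based only on the single line quoted above ("failed 1/20: up 26.25. request failed"), so other CirrOS versions may print something different. The names metadata_retries and summarize are hypothetical helpers, not part of tempest or CirrOS.

```python
import re

# Assumption: retry lines look like the one quoted in this report, e.g.
# "failed 1/20: up 26.25. request failed"
RETRY_RE = re.compile(r"failed (\d+)/(\d+): up (\d+(?:\.\d+)?)\. request failed")


def metadata_retries(log_text):
    """Return (attempt, max_attempts, uptime_seconds) for each retry line."""
    return [
        (int(m.group(1)), int(m.group(2)), float(m.group(3)))
        for m in RETRY_RE.finditer(log_text)
    ]


def summarize(log_text):
    """Summarize the failed metadata requests found in a console log.

    Returns None when the log contains no retry lines.
    """
    retries = metadata_retries(log_text)
    if not retries:
        return None
    first_up = retries[0][2]
    last_up = retries[-1][2]
    return {
        "attempts": len(retries),
        "max_attempts": retries[0][1],
        "first_failure_uptime": first_up,
        "last_failure_uptime": last_up,
        # Guest uptime elapsed between the first and last failed request.
        "seconds_between_first_and_last": round(last_up - first_up, 2),
    }
```

Calling summarize() on the text of job-output.txt (or on a guest console log) gives a rough idea of whether the metadata retries alone account for the delay; a result of None means the pattern did not match and the slow boot needs another explanation.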

summary: - test_minimum_basic_instance_hard_reboot_after_vol_snap_deletion scenario
- test fails on ssh to CirrOS image
+ Scenario tests fail because of long boot time of CirrOS VM
Lukas Piwowarski (lukas-piwowarski) wrote :

I created a bug report in Neutron's launchpad: https://bugs.launchpad.net/neutron/+bug/1999400

yatin (yatinkarel) wrote :

I commented on the corresponding bug, https://bugs.launchpad.net/neutron/+bug/1999400/comments/3, about an issue in the OVN version included in Ubuntu Jammy.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tempest/+/873228

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tempest (master)

Reviewed: https://review.opendev.org/c/openstack/tempest/+/873228
Committed: https://opendev.org/openstack/tempest/commit/517563fde84fa84199f13fb984945b897bf50bee
Submitter: "Zuul (22348)"
Branch: master

commit 517563fde84fa84199f13fb984945b897bf50bee
Author: Martin Kopec <email address hidden>
Date: Thu Feb 9 09:39:44 2023 +0100

    Mark tempest-multinode-full-py3 as n-v

    The commit marks the job temporarily as non-voting to unblock the
    CI. There are multiple patches waiting to be merged which address
    other bugs (e.g. the timeout issues) and this job fails at ~50%
    rate which makes the merging of other patches comlicated.

    Related-Bug: #1998916
    Change-Id: I4ef3a6e5c4bbef93d355bfa42589fdb60db43663

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tempest/+/873704

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tempest (master)

Reviewed: https://review.opendev.org/c/openstack/tempest/+/873704
Committed: https://opendev.org/openstack/tempest/commit/2d2cfac5722fa5aa43998dea6b9d7fff97df368f
Submitter: "Zuul (22348)"
Branch: master

commit 2d2cfac5722fa5aa43998dea6b9d7fff97df368f
Author: yatinkarel <email address hidden>
Date: Tue Feb 14 16:29:42 2023 +0530

    Enable bridge flows and tcpdump in tempest multinode

    Enable br-int-flows and br-ex-tcpdump services in
    tempest-multinode-full-py3 job, these will help in
    debugging network issues.

    Related-Bug: #1998916
    Change-Id: I947a6e2a88d7ad38cc00aa694438cb3101030168
