Tasks handling nova-status upgrade check are too restrictive

Bug #1834647 reported by Mariusz Karpiarz
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| kolla-ansible | Fix Released | Medium | Mariusz Karpiarz | |
| Stein | Fix Released | Medium | Unassigned | |
| Train | Fix Released | Medium | Mariusz Karpiarz | |

Bug Description

When upgrading from Rocky to Stein with active Ironic nodes, `nova-status upgrade check` fails with this warning:

```
+---------------------------------------------------------------------+
| Check: Resource Providers                                           |
| Result: Warning                                                     |
| Details: There are 1 compute resource providers and 4 compute nodes |
| in the deployment. Ideally the number of compute resource           |
| providers should equal the number of enabled compute nodes          |
| otherwise the cloud may be underutilized. See                       |
| https://docs.openstack.org/nova/latest/user/placement.html          |
| for more details.                                                   |
+---------------------------------------------------------------------+
```
This command returns code 1, which causes the "Check nova upgrade status" task in `ansible/roles/nova/tasks/upgrade.yml` to fail with "non-zero return code".
Here is the relevant part of this file:

```
- name: Check nova upgrade status
  become: true
  command: docker exec -t nova_api nova-status upgrade check
  register: nova_upgrade_check_stdout
  when: inventory_hostname == groups['nova-api'][0]

- name: Upgrade status check result
  fail:
    msg:
      - "There was an upgrade status check warning or failure!"
      - "See the detail at https://docs.openstack.org/nova/latest/cli/nova-status.html#nova-status-checks"
  vars:
    first_nova_api_host: "{{ groups['nova-api'][0] }}"
  when: hostvars[first_nova_api_host]['nova_upgrade_check_stdout']['rc'] != 0
```

There are two problems here. First, according to https://docs.openstack.org/nova/stein/cli/nova-status.html#upgrade, only return codes 2 and 255 are fatal errors that should stop the upgrade; code 1 is just a warning. Second, running the next task ("Upgrade status check result") to check the code returned by the nova-status command makes no sense, because the first task already fails whenever the command returns anything other than 0.

I think we should tell Ansible to ignore errors from the first task and then add conditions to the second task that handle the returned code. It may also be worth adding a parameter that controls how strict the check is: if it is set to true, only code 0 is allowed; otherwise both 0 and 1 are acceptable.
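
A rough sketch of how this proposal could look, not necessarily the change that was eventually merged. It uses `failed_when: false` rather than `ignore_errors` so that the first task never fails and the return code is always registered; the second task then decides which codes are fatal. The variable `nova_upgrade_check_strict` is a hypothetical name for the strictness parameter and does not exist in kolla-ansible, and `allowed_return_codes` is likewise only an illustrative helper.

```yaml
- name: Check nova upgrade status
  become: true
  command: docker exec -t nova_api nova-status upgrade check
  register: nova_upgrade_check_stdout
  # Never fail here; the return code is evaluated in the next task.
  failed_when: false
  when: inventory_hostname == groups['nova-api'][0]

- name: Upgrade status check result
  fail:
    msg:
      - "There was an upgrade status check failure!"
      - "See the detail at https://docs.openstack.org/nova/latest/cli/nova-status.html#nova-status-checks"
  vars:
    first_nova_api_host: "{{ groups['nova-api'][0] }}"
    # Hypothetical parameter: strict mode accepts only rc 0,
    # otherwise rc 1 (warning) is accepted as well.
    allowed_return_codes: "{{ [0] if nova_upgrade_check_strict | default(false) | bool else [0, 1] }}"
  when: hostvars[first_nova_api_host]['nova_upgrade_check_stdout']['rc'] not in allowed_return_codes
```

With the default (non-strict) setting, the Resource Providers warning shown above would return code 1 and the upgrade would continue, while codes 2 and 255 would still stop it.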

Changed in kolla-ansible:
assignee: nobody → Mariusz Karpiarz (mkarpiarz)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/668177
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=c68ed4dd516e37844a3eb9cdd2089c985173571d
Submitter: Zuul
Branch: master

commit c68ed4dd516e37844a3eb9cdd2089c985173571d
Author: Mariusz <email address hidden>
Date: Fri Jun 28 13:33:39 2019 +0000

    Handle more return codes from nova-status upgrade check

    In a single controller scenario, the "Upgrade status check result"
    does nothing because the previous task can only succeed when
    `nova-status upgrade check` returns code 0. This change allows this
    command to fail, so that the value of returned code stored in
    `nova_upgrade_check_stdout` can then be analysed.

    This change also allows for warnings (rc 1) to pass.
    Closes-Bug: 1834647

    Change-Id: I6f5e37832f43f23604920b9d890cc505ca924ff9

Changed in kolla-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/676629

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/stein)

Reviewed: https://review.opendev.org/676629
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=dcd726a57d9602991ea166f24651e528f1677ac9
Submitter: Zuul
Branch: stable/stein

commit dcd726a57d9602991ea166f24651e528f1677ac9
Author: Mariusz <email address hidden>
Date: Fri Jun 28 13:33:39 2019 +0000

    Handle more return codes from nova-status upgrade check

    In a single controller scenario, the "Upgrade status check result"
    does nothing because the previous task can only succeed when
    `nova-status upgrade check` returns code 0. This change allows this
    command to fail, so that the value of returned code stored in
    `nova_upgrade_check_stdout` can then be analysed.

    This change also allows for warnings (rc 1) to pass.
    Closes-Bug: 1834647

    Change-Id: I6f5e37832f43f23604920b9d890cc505ca924ff9
    (cherry picked from commit c68ed4dd516e37844a3eb9cdd2089c985173571d)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 8.0.1

This issue was fixed in the openstack/kolla-ansible 8.0.1 release.

Mark Goddard (mgoddard)
Changed in kolla-ansible:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 9.0.0.0rc1

This issue was fixed in the openstack/kolla-ansible 9.0.0.0rc1 release candidate.
