cell host discovery does not run when using --skip-deploy-identifier flag in a scale out

Bug #1831711 reported by Martin Schuppert
Affects: tripleo
Status: Fix Released
Importance: Undecided
Assigned to: Martin Schuppert
Milestone: (none listed)

Bug Description

Recent changes for edge scenarios intentionally moved cell host discovery from the controller to the bootstrap compute node, and the task is now triggered by the deploy identifier [1]. As a result, when the --skip-deploy-identifier flag is used, discovery is not triggered at all, which causes failures in previously supported scenarios.

Instance create might fail with:
Error: Failed to perform requested operation on instance "vm", the instance has an error status: Please try again later [Error: Host '<hostname>' is not mapped to any cell].

Note: as a workaround, run the discovery manually, for example:

1) ssh to a node of the overcloud, e.g. one of the controllers running nova_api

2) enter the container and run the cell v2 discovery to map the scaled out compute to the cell
[root@overcloud-controller-0 /]# docker exec -it -u root nova_api sh
()[root@overcloud-controller-0 /]$ nova-manage cell_v2 discover_hosts --by-service --verbose
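The two workaround steps above could also be run in one non-interactive call from the controller host. The following is only a minimal sketch, not part of the fix: the function names, the `runtime` parameter, and the `dry_run` switch are hypothetical, and it drops the `-it` flags because no interactive shell is opened. The container name `nova_api` and the `nova-manage` arguments are taken from the workaround.

```python
import shlex
import subprocess
from typing import List


def build_discover_hosts_cmd(runtime: str = "docker",
                             container: str = "nova_api") -> List[str]:
    """Build the argv that runs cell v2 host discovery inside the container.

    `runtime` and `container` are illustrative parameters; the defaults
    mirror the commands shown in the workaround.
    """
    return [
        runtime, "exec", "-u", "root", container,
        "nova-manage", "cell_v2", "discover_hosts",
        "--by-service", "--verbose",
    ]


def run_discovery(runtime: str = "docker",
                  container: str = "nova_api",
                  dry_run: bool = True) -> str:
    """Return the command line (dry run) or execute it and return its stdout."""
    cmd = build_discover_hosts_cmd(runtime, container)
    if dry_run:
        # Only show what would be executed; nothing is run.
        return shlex.join(cmd)
    # check=True raises CalledProcessError if nova-manage exits non-zero.
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    print(run_discovery(dry_run=True))
```

With `dry_run=True` the script only prints the command, so it can be reviewed before being executed on a controller that runs the nova_api container.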

[1] - https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/nova/nova-compute-container-puppet.yaml#L667

Changed in tripleo:
assignee: nobody → Martin Schuppert (mschuppert)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/663273

Changed in tripleo:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/663273
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=f8779e5023c7b5dd54dd879d4f5431ed4742e9e1
Submitter: Zuul
Branch: master

commit f8779e5023c7b5dd54dd879d4f5431ed4742e9e1
Author: Martin Schuppert <email address hidden>
Date: Wed Jun 5 10:33:25 2019 +0200

    Move nova cell v2 discovery to deploy_steps_tasks

    Recent changes for e.g edge scenarios caused intended move of discovery
    from controller to bootstrap compute node. The task is triggered by
    deploy-identifier to make sure it gets run on any deploy,scale, ... run.
    If deploy run is triggered with --skip-deploy-identifier flag, discovery
    will not be triggered at and as result causing failures in previously
    supported scenarios.
    This change moves the host discovery task to be an ansible
    deploy_steps_tasks that it gets triggered even if --skip-deploy-identifier
    is used, or the compute bootstrap node is blacklisted.

    Closes-Bug: #1831711

    Change-Id: I4bd8489e4f79e3e1bfe9338ed3043241dd605dcb

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/669802

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/669826

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/669827

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/stein)

Reviewed: https://review.opendev.org/669802
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=b82417126b53d602d3df28ce0d7d2b31a9e5f1c4
Submitter: Zuul
Branch: stable/stein

commit b82417126b53d602d3df28ce0d7d2b31a9e5f1c4
Author: Martin Schuppert <email address hidden>
Date: Wed Jun 5 10:33:25 2019 +0200

    Move nova cell v2 discovery to deploy_steps_tasks

    Recent changes for e.g edge scenarios caused intended move of discovery
    from controller to bootstrap compute node. The task is triggered by
    deploy-identifier to make sure it gets run on any deploy,scale, ... run.
    If deploy run is triggered with --skip-deploy-identifier flag, discovery
    will not be triggered at and as result causing failures in previously
    supported scenarios.
    This change moves the host discovery task to be an ansible
    deploy_steps_tasks that it gets triggered even if --skip-deploy-identifier
    is used, or the compute bootstrap node is blacklisted.

    Closes-Bug: #1831711

    Change-Id: I4bd8489e4f79e3e1bfe9338ed3043241dd605dcb
    (cherry picked from commit f8779e5023c7b5dd54dd879d4f5431ed4742e9e1)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/rocky)

Reviewed: https://review.opendev.org/669826
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=58b65c1da8a23f3b210896ca09bbd438de9a3227
Submitter: Zuul
Branch: stable/rocky

commit 58b65c1da8a23f3b210896ca09bbd438de9a3227
Author: Martin Schuppert <email address hidden>
Date: Wed Jun 5 10:33:25 2019 +0200

    Move nova cell v2 discovery to deploy_steps_tasks

    Recent changes for e.g edge scenarios caused intended move of discovery
    from controller to bootstrap compute node. The task is triggered by
    deploy-identifier to make sure it gets run on any deploy,scale, ... run.
    If deploy run is triggered with --skip-deploy-identifier flag, discovery
    will not be triggered at and as result causing failures in previously
    supported scenarios.
    This change moves the host discovery task to be an ansible
    deploy_steps_tasks that it gets triggered even if --skip-deploy-identifier
    is used, or the compute bootstrap node is blacklisted.

    Closes-Bug: #1831711

     Conflicts:
     deployment/nova/nova-compute-container-puppet.yaml
     docker/services/nova-compute-common.yaml
     docker/services/nova-ironic.yaml
     docker_config_scripts/nova_cell_v2_discover_hosts.py

    Change-Id: I4bd8489e4f79e3e1bfe9338ed3043241dd605dcb
    (cherry picked from commit f8779e5023c7b5dd54dd879d4f5431ed4742e9e1)
    (cherry picked from commit b82417126b53d602d3df28ce0d7d2b31a9e5f1c4)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.1.0

This issue was fixed in the openstack/tripleo-heat-templates 11.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 10.6.1

This issue was fixed in the openstack/tripleo-heat-templates 10.6.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 9.4.1

This issue was fixed in the openstack/tripleo-heat-templates 9.4.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/queens)

Reviewed: https://review.opendev.org/669827
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=fe26cb764bdc64a5439abe66c2ce4f5cca4165ba
Submitter: Zuul
Branch: stable/queens

commit fe26cb764bdc64a5439abe66c2ce4f5cca4165ba
Author: Martin Schuppert <email address hidden>
Date: Wed Jun 5 10:33:25 2019 +0200

    Move nova cell v2 discovery to deploy_steps_tasks

    Recent changes for e.g edge scenarios caused intended move of discovery
    from controller to bootstrap compute node. The task is triggered by
    deploy-identifier to make sure it gets run on any deploy,scale, ... run.
    If deploy run is triggered with --skip-deploy-identifier flag, discovery
    will not be triggered at and as result causing failures in previously
    supported scenarios.
    This change moves the host discovery task to be an ansible
    deploy_steps_tasks that it gets triggered even if --skip-deploy-identifier
    is used, or the compute bootstrap node is blacklisted.

    Closes-Bug: #1831711

     Conflicts:
     deployment/nova/nova-compute-container-puppet.yaml
     docker/services/nova-compute-common.yaml
     docker/services/nova-ironic.yaml
     docker_config_scripts/nova_cell_v2_discover_hosts.py

    Change-Id: I4bd8489e4f79e3e1bfe9338ed3043241dd605dcb
    (cherry picked from commit f8779e5023c7b5dd54dd879d4f5431ed4742e9e1)
    (cherry picked from commit b82417126b53d602d3df28ce0d7d2b31a9e5f1c4)
    (cherry picked from commit 58b65c1da8a23f3b210896ca09bbd438de9a3227)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.opendev.org/683067

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/683069

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (stable/queens)

Reviewed: https://review.opendev.org/683067
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=8fbc9ec014530401f0e6a435be888a942be4563f
Submitter: Zuul
Branch: stable/queens

commit 8fbc9ec014530401f0e6a435be888a942be4563f
Author: Martin Schuppert <email address hidden>
Date: Thu Sep 19 10:54:19 2019 +0200

    [Queens] Add workflow to do cellv2 host discovery

    When the compute bootstrap node gets blacklisted or deployment is
    run with skip-deploy-identifier the discovery job in step_5
    won't run and requires a manual post deploy/scale step.

    This introduce a workflow to run the discovery via an ansible
    playbook.

    Change-Id: I54d42df162a6744806301d97bca5d94e5f380a2b
    Related-Bug: #1831711

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/queens)

Reviewed: https://review.opendev.org/683069
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=d8eefd0a756f52ace4b82a5124229a174914125d
Submitter: Zuul
Branch: stable/queens

commit d8eefd0a756f52ace4b82a5124229a174914125d
Author: Martin Schuppert <email address hidden>
Date: Thu Sep 19 11:03:45 2019 +0200

    [Queens] Run cellv2 host discovery via workflow or deploy_steps_tasks

    This is a queens only modified version of
    https://review.opendev.org/682644 due to the different deployment
    methods we have there.

    When the compute bootstrap node gets blacklisted or deployment is
    run with skip-deploy-identifier the discovery job we have in step_5
    won't run. This runs the the discovery via an ansible playbook to
    make sure it runs in this conditions.

    In case of default deploy method the discovery is run by the
    discovery workflow.

    In case of config-download the discovery is run via the
    deploy_steps_tasks.

    Change-Id: Icda4c1eb3e8c39a01547586dbc6f0407ce846c64
    Closes-Bug: #1831711
    Depends-On: I54d42df162a6744806301d97bca5d94e5f380a2b

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates queens-eol

This issue was fixed in the openstack/tripleo-heat-templates queens-eol release.
