'Could not find kernel image' on 450 nodes [9.2]

Bug #1656269 reported by Sergey Galkin
This bug affects 2 people
Affects             Status         Importance  Assigned to          Milestone
Fuel for OpenStack  Fix Committed  High        Vladimir Kozhukalov
Mitaka              Fix Committed  High        Vladimir Kozhukalov
Newton              Fix Committed  High        Vladimir Kozhukalov

Bug Description

Steps to reproduce:
1. Install Fuel 9.0
2. Upgrade to 9.2 with repo http://mirror.fuel-infra.org/mos-repos/centos/mos9.0-centos7/snapshots/proposed-2017-01-10-102420/x86_64
3. Discover 450 nodes and try to deploy a cluster with 350 nodes

During provisioning, 95 nodes went offline.

The consoles of the offline servers print the following in a loop:

Could not find kernel image: /images/ubuntu_1404_x86_64/linux
Could not find kernel image: /images/ubuntu_1404_x86_64/linux
Could not find kernel image: /images/ubuntu_1404_x86_64/linux

A diagnostic snapshot is available at http://mos-scale-share.mirantis.com/fuel-snapshot-2017-01-13_08-55-52.tar.gz

Vladimir Kozhukalov (kozhukalov) wrote :

The following lines appear in astute.log:

2017-01-12 14:34:07 INFO [14088] Starting OS provisioning for nodes: 776,777,778,779,780,781,783,784,785,786,787,788,789,790,791,792,793,794,795,796,797,798,800,801,802,803,804,805,806,808,809,810,813,814,816,817,818,819,820,821,822,823,824,825,826,828,831,832,833,834
...
2017-01-12 14:45:56 ERROR [14088] MCollective agents 'uploadfile' '788' didn't respond within the allotted time.

2017-01-12 14:45:56 ERROR [14088] fbe8c151-365e-4bc4-839c-12c4fd97b07b: file was not uploaded /tmp/provision.json on node 788: fbe8c151-365e-4bc4-839c-12c4fd97b07b: MCollective agents 'uploadfile' '788' didn't respond within the allotted time.
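
To see which nodes from a provisioning batch never answered the 'uploadfile' agent, the two message types quoted above can be matched with a short script. This is only a rough sketch: the regular expressions are derived from the lines shown in this comment, the sample text uses a shortened node list, and it assumes each log record sits on a single line. The names `summarize`, `STARTED_RE` and `TIMEOUT_RE` are made up for the example and are not part of Astute.

```python
import re

# Sample records in the format quoted above (node list shortened for the example).
SAMPLE_LOG = """\
2017-01-12 14:34:07 INFO [14088] Starting OS provisioning for nodes: 776,777,788
2017-01-12 14:45:56 ERROR [14088] MCollective agents 'uploadfile' '788' didn't respond within the allotted time.
"""

# Patterns derived from the quoted lines; other Astute versions may log differently.
STARTED_RE = re.compile(r"Starting OS provisioning for nodes: ([\d,]+)")
TIMEOUT_RE = re.compile(r"MCollective agents 'uploadfile' '(\d+)' didn't respond")


def summarize(log_text):
    """Return (node ids in provisioning batches, node ids with uploadfile timeouts)."""
    started, timed_out = set(), set()
    for line in log_text.splitlines():
        match = STARTED_RE.search(line)
        if match:
            started.update(int(n) for n in match.group(1).split(",") if n)
        # A single record may repeat the timeout message; the set removes duplicates.
        for node_id in TIMEOUT_RE.findall(line):
            timed_out.add(int(node_id))
    return started, timed_out


if __name__ == "__main__":
    started, timed_out = summarize(SAMPLE_LOG)
    print("started:", sorted(started))
    print("uploadfile timeouts:", sorted(timed_out))
```

Running `summarize` over the real astute.log from the snapshot (read with `open(...).read()`) should give the same two sets for the whole deployment.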

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: none → 9.2
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/420028

Changed in fuel:
importance: Undecided → High
status: New → Confirmed
assignee: Fuel Sustaining (fuel-sustaining-team) → Vladimir Kozhukalov (kozhukalov)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (stable/mitaka)

Reviewed: https://review.openstack.org/420028
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=570049ca1fde98ee09952075f72bc4f28b9d8b71
Submitter: Jenkins
Branch: stable/mitaka

commit 570049ca1fde98ee09952075f72bc4f28b9d8b71
Author: Vladimir Kozhukalov <email address hidden>
Date: Fri Jan 13 18:16:21 2017 +0300

    Move not provisioned nodes to error status

    When there are a lot of nodes to provision and we provision
    them in chunks, we can fail in the middle due to "Too many
    nodes failed to provision". If so, we need to append the
    nodes where we did not start provisioning at all to the list
    of failed nodes. Otherwise, those nodes will be reported
    as 'provisioned' with progress = 100 and rebooted.
    But for some reason we bind all nodes to the debian-installer
    profile in cobbler before starting provisioning, and once
    rebooted, these unprovisioned nodes will fail to boot, because
    since 7.0 we put empty files where cobbler expects the
    debian-installer kernel and initrd files. :-)

    Change-Id: I2a401b80614ee7dd5a10931b9b50bcff066f790f
    Closes-Bug: #1656269

tags: added: in-stable-mitaka
Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (master)

Fix proposed to branch: master
Review: https://review.openstack.org/420654

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/420660

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
status: Fix Committed → In Progress
milestone: 9.2 → 11.0
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/420654
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=64d62086e88460e29f5d9117a1f4c69d391d4bd0
Submitter: Jenkins
Branch: master

commit 64d62086e88460e29f5d9117a1f4c69d391d4bd0
Author: Vladimir Kozhukalov <email address hidden>
Date: Fri Jan 13 18:16:21 2017 +0300

    Move not provisioned nodes to error status

    When there are a lot of nodes to provision and we provision
    them in chunks, we can fail in the middle due to "Too many
    nodes failed to provision". If so, we need to append the
    nodes where we did not start provisioning at all to the list
    of failed nodes. Otherwise, those nodes will be reported
    as 'provisioned' with progress = 100 and rebooted.
    But for some reason we bind all nodes to the debian-installer
    profile in cobbler before starting provisioning, and once
    rebooted, these unprovisioned nodes will fail to boot, because
    since 7.0 we put empty files where cobbler expects the
    debian-installer kernel and initrd files. :-)

    Change-Id: I2a401b80614ee7dd5a10931b9b50bcff066f790f
    Closes-Bug: #1656269
    (cherry picked from commit 570049ca1fde98ee09952075f72bc4f28b9d8b71)

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (stable/newton)

Reviewed: https://review.openstack.org/420660
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=e05f66d12ef0fc9cdc71b5955a55978929a7cda8
Submitter: Jenkins
Branch: stable/newton

commit e05f66d12ef0fc9cdc71b5955a55978929a7cda8
Author: Vladimir Kozhukalov <email address hidden>
Date: Fri Jan 13 18:16:21 2017 +0300

    Move not provisioned nodes to error status

    When there are a lot of nodes to provision and we provision
    them in chunks, we can fail in the middle due to "Too many
    nodes failed to provision". If so, we need to append the
    nodes where we did not start provisioning at all to the list
    of failed nodes. Otherwise, those nodes will be reported
    as 'provisioned' with progress = 100 and rebooted.
    But for some reason we bind all nodes to the debian-installer
    profile in cobbler before starting provisioning, and once
    rebooted, these unprovisioned nodes will fail to boot, because
    since 7.0 we put empty files where cobbler expects the
    debian-installer kernel and initrd files. :-)

    Change-Id: I2a401b80614ee7dd5a10931b9b50bcff066f790f
    Closes-Bug: #1656269
    (cherry picked from commit 570049ca1fde98ee09952075f72bc4f28b9d8b71)

Michael Semenov (msemenov) wrote :

9.2 is not certified for more than 200 nodes, so this is set to Fix Released in 9.2.

Leontiy Istomin (listomin) wrote :

The issue also affects provisioning of 50 nodes.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/fuel-astute 11.0.0.0rc1

This issue was fixed in the openstack/fuel-astute 11.0.0.0rc1 release candidate.
