Failed to deploy node : Unknown error

Bug #1613396 reported by Sergey Galkin
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Fuel for OpenStack | Fix Committed | High | Vladimir Sharshov |
Mitaka | Fix Released | High | Vladimir Sharshov |

Bug Description

I tried to deploy 168 nodes. During deployment, 6 nodes went offline. I rebooted those 6 nodes, and once they were back online I clicked "Deploy Changes".
Deployment failed with:
12:01:29 Failed to deploy node 'Untitled (aa:f0)': Unknown error
12:01:28 Failed to deploy node 'Untitled (ac:cc)': Unknown error
12:01:22 Failed to deploy node 'Untitled (a7:80)': Unknown error
12:01:21 Failed to deploy node 'Untitled (a9:7c)': Unknown error
12:01:21 Failed to deploy node 'Untitled (b4:14)': Unknown error
12:01:08 Failed to deploy node 'Untitled (98:10)': Node is not ready for deployment: mcollective has not answered
12:00:19 Failed to deploy node 'Untitled (b5:58)': Unknown error
12:00:19 Failed to deploy node 'Untitled (b5:3c)': Unknown error
12:00:19 Failed to deploy node 'Untitled (b0:cc)': Unknown error
12:00:19 Failed to deploy node 'Untitled (ad:dc)': Unknown error
12:00:19 Failed to deploy node 'Untitled (99:dc)': Unknown error
12:00:19 Failed to deploy node 'Untitled (9a:44)': Unknown error
12:00:19 Failed to deploy node 'Untitled (b5:04)': Unknown error
12:00:19 Failed to deploy node 'Untitled (99:ec)': Unknown error
12:00:19 Failed to deploy node 'Untitled (ab:e0)': Unknown error
12:00:19 Failed to deploy node 'Untitled (a8:78)': Unknown error
12:00:19 Failed to deploy node 'offline (b3:5c)': Unknown error
12:00:19 Failed to deploy node 'Untitled (b5:7c)': Unknown error
12:00:19 Failed to deploy node 'Untitled (b0:dc)': Unknown error
12:00:19 Failed to deploy node 'Untitled (14:30)': Node is not ready for deployment: mcollective has not answered
12:00:19 Failed to deploy node 'Untitled (c0:30)': Unknown error
12:00:19 Failed to deploy node 'Untitled (72:20)': Node is not ready for deployment: mcollective has not answered
12:00:18 Failed to deploy node 'Untitled (aa:d0)': Node is not ready for deployment: mcollective has not answered
12:00:18 Failed to deploy node 'Untitled (9a:90)': Node is not ready for deployment: mcollective has not answered
12:00:18 Failed to deploy node 'Untitled (aa:50)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b5:64)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b4:f4)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b4:1c)': Unknown error
12:00:18 Failed to deploy node 'Untitled (af:54)': Unknown error
12:00:18 Failed to deploy node 'Untitled (ad:30)': Unknown error
12:00:18 Failed to deploy node 'Untitled (af:94)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b1:14)': Unknown error
12:00:18 Failed to deploy node 'Untitled (ad:44)': Unknown error
12:00:18 Failed to deploy node 'Untitled (a9:d8)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b5:a4)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b1:98)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b5:20)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b5:80)': Unknown error
12:00:18 Failed to deploy node 'offline-2- (3a:30)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b2:30)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b2:c8)': Unknown error
12:00:18 Failed to deploy node 'Untitled (80:60)': Unknown error
12:00:18 Failed to deploy node 'Untitled (73:20)': Unknown error
12:00:18 Failed to deploy node 'Untitled (b2:70)': Unknown error
12:00:18 Failed to deploy node 'Untitled (ce:f0)': Unknown error
12:00:18 Failed to deploy node 'Untitled (ba:b0)': Unknown error
12:00:13 Failed to deploy node 'Untitled (85:a0)': Unknown error
12:00:13 Failed to deploy node 'Untitled (73:e0)': Unknown error
12:00:13 Failed to deploy node 'Untitled (a7:b0)': Unknown error
12:00:13 Failed to deploy node 'Untitled (9a:74)': Unknown error
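
When triaging a list like the one above, it can help to group the failures by reason. The following is only a minimal sketch: the function name `tally_failures` and the regular expression are illustrative assumptions based on the line format shown in this report, not part of Fuel or Astute.

```python
import re
from collections import Counter

# Assumed line format, taken from the log lines in this report:
#   HH:MM:SS Failed to deploy node '<name>': <reason>
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}) Failed to deploy node '(.+?)': (.+)$"
)


def tally_failures(lines):
    """Count failed-node log lines per failure reason (hypothetical helper)."""
    reasons = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            # Group 3 is everything after "'<name>': ", i.e. the reason text.
            reasons[match.group(3)] += 1
    return reasons


if __name__ == "__main__":
    sample = [
        "12:01:29 Failed to deploy node 'Untitled (aa:f0)': Unknown error",
        "12:01:08 Failed to deploy node 'Untitled (98:10)': Node is not ready "
        "for deployment: mcollective has not answered",
        "12:00:19 Failed to deploy node 'offline (b3:5c)': Unknown error",
    ]
    for reason, count in tally_failures(sample).most_common():
        print(count, reason)
```

Lines that do not match the assumed format are skipped silently, so the helper can be fed a whole notification dump.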

Revision history for this message
Sergey Galkin (sgalkin) wrote :

Snapshot available at http://mos-scale-share.mirantis.com/fuel-snapshot-2016-08-15_12-20-55.tar.gz

[root@fuel ~]# shotgun2 short-report
cat /etc/fuel_build_id:
 598
cat /etc/fuel_build_number:
 598
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-9.0.0-1.mos6349.noarch
 fuelmenu-9.0.0-1.mos274.noarch
 fuel-notify-9.0.0-1.mos8460.noarch
 fuel-ostf-9.0.0-1.mos936.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8743.noarch
 fuel-mirror-9.0.0-1.mos141.noarch
 fuel-openstack-metadata-9.0.0-1.mos8743.noarch
 rubygem-astute-9.0.0-1.mos750.noarch
 fuel-misc-9.0.0-1.mos8460.noarch
 python-fuelclient-9.0.0-1.mos325.noarch
 fuel-9.0.0-1.mos6349.noarch
 fuel-utils-9.0.0-1.mos8460.noarch
 fuel-setup-9.0.0-1.mos6349.noarch
 nailgun-mcagents-9.0.0-1.mos750.noarch
 fuel-library9.0-9.0.0-1.mos8460.noarch
 network-checker-9.0.0-1.mos74.x86_64
 fuel-agent-9.0.0-1.mos285.noarch
 fuel-ui-9.0.0-1.mos2717.noarch
 fuel-migrate-9.0.0-1.mos8460.noarch
 python-packetary-9.0.0-1.mos141.noarch
 fuel-bootstrap-cli-9.0.0-1.mos285.noarch
 shotgun-9.0.0-1.mos90.noarch
 fuel-nailgun-9.0.0-1.mos8743.noarch

Changed in fuel:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
milestone: none → 10.0
tags: added: area-python
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Bulat Gaifullin (bgaifullin)
Changed in fuel:
assignee: Bulat Gaifullin (bgaifullin) → Vladimir Sharshov (vsharshov)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (stable/mitaka)

Reviewed: https://review.openstack.org/374729
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=4cc48e33cbacd00a8c14ea63672484934ef8e34b
Submitter: Jenkins
Branch: stable/mitaka

commit 4cc48e33cbacd00a8c14ea63672484934ef8e34b
Author: Vladimir Sharshov (warpc) <email address hidden>
Date: Tue Oct 11 21:16:57 2016 +0300

    New version of puppet task engine

    Changes:

    - remove report from task engine;
    - remove old logic for hangs and 'idling' statuses;
    - increase code readability;
    - add code docs;
    - support retries in case of MClient errors for status
      and run actions;
    - replace timeout raise on usual code;
    - decrease waiting time for puppet run (from 120 to 10) and
      time between tries (from 30 to 2);
    - decrease mcollective retries from 5 to 1. Now it will use
      puppet retries if it fails because of a network/mcollective
      problem after 1 try.

    Closes-Bug: #1613396
    Change-Id: I98fe3df65ef335b03eceb2c401eba12cf68ee1c8
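
The commit message above mentions retrying the status and run actions on MClient errors, with a short pause between tries. The general pattern can be sketched as follows. This is not the fuel-astute code (which is Ruby): `MClientError`, `call_with_retries`, and `make_flaky` are hypothetical names, the defaults merely echo the numbers in the commit message, and treating the pause as seconds is an assumption.

```python
import time


class MClientError(Exception):
    """Hypothetical stand-in for a transport-level (MClient) failure."""


def call_with_retries(action, retries=1, pause=2.0, sleep=time.sleep):
    """Call `action`; on MClientError retry up to `retries` more times.

    Returns the action's result, or re-raises the last MClientError
    once the retry budget is spent. `pause` is assumed to be seconds.
    """
    attempt = 0
    while True:
        try:
            return action()
        except MClientError:
            if attempt >= retries:
                raise
            attempt += 1
            sleep(pause)


def make_flaky(failures):
    """Build an action that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def action():
        if state["left"] > 0:
            state["left"] -= 1
            raise MClientError("no answer")
        return "running"

    return action


if __name__ == "__main__":
    # One transient failure is absorbed by the single retry.
    print(call_with_retries(make_flaky(1), sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the sketch testable without real delays. With a small retry budget at this level, a caller one layer up would be expected to own any longer-running retry policy.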

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (master)

Fix proposed to branch: master
Review: https://review.openstack.org/387327

Changed in fuel:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/387327
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=bca595a964e4f45798f393bb0bbc685605cae090
Submitter: Jenkins
Branch: master

commit bca595a964e4f45798f393bb0bbc685605cae090
Author: Vladimir Sharshov (warpc) <email address hidden>
Date: Tue Oct 11 21:16:57 2016 +0300

    New version of puppet task engine

    Changes:

    - remove report from task engine;
    - remove old logic for hangs and 'idling' statuses;
    - increase code readability;
    - add code docs;
    - support retries in case of MClient errors for status
      and run actions;
    - replace timeout raise on usual code;
    - decrease waiting time for puppet run (from 120 to 10) and
      time between tries (from 30 to 2);
    - decrease mcollective retries from 5 to 1. Now it will use
      puppet retries if it fails because of a network/mcollective
      problem after 1 try.

    Closes-Bug: #1613396
    Change-Id: I98fe3df65ef335b03eceb2c401eba12cf68ee1c8

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/400921

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (stable/newton)

Reviewed: https://review.openstack.org/400921
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=5be2a5f281b23b518f9ca3be9dd1e8102f9322fd
Submitter: Jenkins
Branch: stable/newton

commit 5be2a5f281b23b518f9ca3be9dd1e8102f9322fd
Author: Vladimir Sharshov (warpc) <email address hidden>
Date: Tue Oct 11 21:16:57 2016 +0300

    New version of puppet task engine

    Changes:

    - remove report from task engine;
    - remove old logic for hangs and 'idling' statuses;
    - increase code readability;
    - add code docs;
    - support retries in case of MClient errors for status
      and run actions;
    - replace timeout raise on usual code;
    - decrease waiting time for puppet run (from 120 to 10) and
      time between tries (from 30 to 2);
    - decrease mcollective retries from 5 to 1. Now it will use
      puppet retries if it fails because of a network/mcollective
      problem after 1 try.

    Closes-Bug: #1613396
    Change-Id: I98fe3df65ef335b03eceb2c401eba12cf68ee1c8
    (cherry picked from commit bca595a964e4f45798f393bb0bbc685605cae090)

tags: added: in-stable-newton
Revision history for this message
Michael Semenov (msemenov) wrote :

Verified on the last RC1 scale certification run.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/fuel-astute 11.0.0.0rc1

This issue was fixed in the openstack/fuel-astute 11.0.0.0rc1 release candidate.
