[network_templates] Deployment fails if 'swift/api' and 'swift/replication' roles are assigned to the same L3 network

Bug #1548275 reported by Artem Panchenko
This bug affects 6 people
Affects             Status         Importance  Assigned to
Fuel for OpenStack  Fix Committed  High        Alex Schultz
8.0.x               Fix Released   High        Alexey Stupnikov
Mitaka              Fix Released   High        Alex Schultz

Bug Description

Environment deployment fails if 'swift/api' and 'swift/replication' roles are assigned to the same L3 network (services are listening on the same IP):

2016-02-19 15:53:29 ERROR [838] Error running RPC method granular_deploy: Failed to execute hook 'upload_cirros' command: cd / && ruby /etc/puppet/modules/osnailyfacter/modular/astute/upload_cirros.rb
...
2016-02-19T15:51:19.098799+00:00 err: 2016-02-19 15:51:19.070 4808 ERROR swiftclient [req-0e3f8d69-8c21-4b52-8637-52e982cd14d3 50dc7573961a40b1be8ba3f30cf12f10 04ddb5d036744f29af5908ca40328b62 - - -] Container HEAD failed: http://192.168.0.2:8080/v1/AUTH_04ddb5d036744f29af5908ca40328b62/glance 503 Service Unavailable
2016-02-19 15:51:19.070 4808 ERROR swiftclient Traceback (most recent call last):
2016-02-19 15:51:19.070 4808 ERROR swiftclient File "/usr/lib/python2.7/dist-packages/swiftclient/client.py", line 1390, in _retry
2016-02-19 15:51:19.070 4808 ERROR swiftclient service_token=self.service_token, **kwargs)
2016-02-19 15:51:19.070 4808 ERROR swiftclient File "/usr/lib/python2.7/dist-packages/swiftclient/client.py", line 760, in head_container
2016-02-19 15:51:19.070 4808 ERROR swiftclient http_response_content=body)
2016-02-19 15:51:19.070 4808 ERROR swiftclient ClientException: Container HEAD failed: http://192.168.0.2:8080/v1/AUTH_04ddb5d036744f29af5908ca40328b62/glance 503 Service Unavailable
2016-02-19 15:51:19.070 4808 ERROR swiftclient
...
<129>Feb 19 14:45:31 node-1 haproxy[12097]: Server swift/node-1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
<129>Feb 19 14:45:41 node-1 haproxy[12097]: Server swift/node-3 is DOWN, reason: Layer4 timeout, check duration: 10001ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
<129>Feb 19 14:45:41 node-1 haproxy[12097]: Server swift/node-2 is DOWN, reason: Layer4 timeout, check duration: 10001ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
...
root@node-1:~# ip netns exec haproxy curl -v 192.168.1.3:49001
* Rebuilt URL to: 192.168.1.3:49001/
* Hostname was NOT found in DNS cache
* Trying 192.168.1.3...
* Connected to 192.168.1.3 (192.168.1.3) port 49001 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.35.0
> Host: 192.168.1.3:49001
> Accept: */*
>
nc: port number invalid: //192.168.0.2:5000
nc: port number invalid: //192.168.0.2:5000
nc: port number invalid: //192.168.0.2:5000
nc: port number invalid: //192.168.0.2:5000
HTTP/1.1 503 Service Unavailable

The swiftcheck service config (http://paste.openstack.org/show/487736/) contains an error: the second argument must be in "$ip:$port" format, but it currently carries an unexpected protocol prefix:

server_args = http://192.168.1.1:8080 http://192.168.0.2:5000 5
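The "nc: port number invalid: //192.168.0.2:5000" messages above are consistent with the target being split on its first ":" so that "http" is taken as the host and the rest as the port. A minimal Python sketch of that failure mode follows. It assumes, without confirmation from this report, that the check splits its second argument this way; `split_target` and `is_valid_port` are hypothetical helpers, not part of swiftcheck.

```python
def split_target(target):
    """Split a scan target on the first ':' into (host, port).

    Assumption: this mimics a naive "$ip:$port" parser; it is not taken
    from the real swiftcheck script.
    """
    host, _, port = target.partition(":")
    return host, port


def is_valid_port(port):
    """Return True if port is a decimal TCP port number in 1..65535."""
    return port.isdecimal() and 1 <= int(port) <= 65535


# Value from the broken config: the scheme ends up as the "host".
bad_host, bad_port = split_target("http://192.168.0.2:5000")
# Expected "$ip:$port" form.
good_host, good_port = split_target("192.168.0.2:5000")

print(bad_host, bad_port, is_valid_port(bad_port))
print(good_host, good_port, is_valid_port(good_port))
```

With the prefixed value the "port" is "//192.168.0.2:5000", which matches the string netcat complains about.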

Regression was added here: https://github.com/openstack/fuel-library/commit/9d11e9c6567638282813982888c070c39bde6e9e

Steps to reproduce:

1. Create new environment
2. Choose Neutron+VLAN, CinderLVM for volumes. Enable Sahara and Ceilometer.
3. Configure the public network (L3 config only; do not change the network-to-interface assignments) and floating ranges through the GUI.
4. Add 3 controller+mongo nodes
5. Add 2 compute + cinder nodes
6. Upload network template using CLI:

 ENV_ID=1
 curl -s https://raw.githubusercontent.com/stackforge/fuel-qa/stable/8.0/fuelweb_test/network_templates/hardware.yaml > /root/network_template_${ENV_ID}.yaml
 fuel --env ${ENV_ID} network-template -u

7. Create custom networks using CLI:

 GROUP_ID=1
 fuel network-group --create --node-group ${GROUP_ID} --cidr 10.44.0.0/24 --vlan 364 --name ceph
 fuel network-group --create --node-group ${GROUP_ID} --cidr 10.44.1.0/24 --vlan 367 --name database
 fuel network-group --create --node-group ${GROUP_ID} --cidr 10.44.2.0/24 --vlan 366 --name ha
 fuel network-group --create --node-group ${GROUP_ID} --cidr 10.44.3.0/24 --vlan 368 --name messaging
 fuel network-group --create --node-group ${GROUP_ID} --cidr 10.44.4.0/24 --vlan 369 --name mongo
 fuel network-group --create --node-group ${GROUP_ID} --cidr 10.44.5.0/24 --vlan 365 --name openstack
 fuel network-group --create --node-group ${GROUP_ID} --cidr 10.44.6.0/24 --vlan 363 --name services
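The seven commands above differ only in name, CIDR and VLAN, so they can be generated from a small table. The sketch below is a dry run that only prints the command lines; the table values are copied from the commands above, while `NETWORKS` and `network_group_commands` are hypothetical names introduced for illustration.

```python
# (name, CIDR, VLAN) triples copied from the commands above.
NETWORKS = [
    ("ceph", "10.44.0.0/24", 364),
    ("database", "10.44.1.0/24", 367),
    ("ha", "10.44.2.0/24", 366),
    ("messaging", "10.44.3.0/24", 368),
    ("mongo", "10.44.4.0/24", 369),
    ("openstack", "10.44.5.0/24", 365),
    ("services", "10.44.6.0/24", 363),
]


def network_group_commands(group_id, networks=NETWORKS):
    """Build one 'fuel network-group --create' argv list per network."""
    return [
        ["fuel", "network-group", "--create",
         "--node-group", str(group_id),
         "--cidr", cidr, "--vlan", str(vlan), "--name", name]
        for name, cidr, vlan in networks
    ]


for argv in network_group_commands(1):
    # Dry run: print the command instead of executing it.
    # None of the arguments contain spaces, so a plain join is enough.
    print(" ".join(argv))
```

To actually create the networks, each argv list could be passed to `subprocess.run` on the Fuel master node instead of being printed.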

8. Run network verification
9. Deploy environment

Expected result: cloud is deployed and passes OSTF

Actual result: deployment fails, swift backend is permanently marked as down in HAProxy

Diagnostic snapshot: https://drive.google.com/file/d/0BzaZINLQ8-xkR2p1RHU0dklfSGs/view?usp=sharing

Changed in fuel:
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/283041

Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Artem Panchenko (apanchenko-8)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-qa (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/283048

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-qa (master)

Reviewed: https://review.openstack.org/283048
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=dda39131b20dc970e9da4824a1e2491d3536a50b
Submitter: Jenkins
Branch: master

commit dda39131b20dc970e9da4824a1e2491d3536a50b
Author: Artem Panchenko <email address hidden>
Date: Mon Feb 22 13:42:44 2016 +0200

    Assign Swift network roles to the same endpoint

    There is an additional MOS service 'swiftcheck' which
    is set up on controllers only if 'swift/api' and
    'swift/replication' roles are assigned to the same L3
    network. Modified network template in order to cover
    such deployment case by automatic tests.

    Change-Id: I5f659d947212ca6bcf8c19580fe85dd279035f36
    Related-bug: #1548275

tags: added: swarm-blocker
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/295980

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (master)

Change abandoned by Artem Panchenko (<email address hidden>) on branch: master
Review: https://review.openstack.org/283041
Reason: mistakenly uploaded updated & rebased patch with different change-id: https://review.openstack.org/#/c/295980/

summary: - [network_temmplates] Deployment fails if 'swift/api' and
+ [network_templates] Deployment fails if 'swift/api' and
'swift/replication' roles are assigned to the same L3 network
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/301144

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/301144
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=99d9e97118943e0bb59ed33f9c99be47221fe281
Submitter: Jenkins
Branch: master

commit 99d9e97118943e0bb59ed33f9c99be47221fe281
Author: Artem Panchenko <email address hidden>
Date: Mon Apr 4 16:25:02 2016 +0300

    Fix scan target arg format for swiftcheck service

    As second argument swiftcheck utility takes IP and
    TCP port of Keystone service (joined by ":" symbol).
    Since it just checks TCP port connectivity using netcat
    it doesn't require a type of application layer protocol.

    Change-Id: Ie6fcbafde8a2458a2528ed67dc64492daac39bcf
    Closes-bug: #1548275
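The point of this commit message is that a transport-level probe needs only an address and a port. A minimal Python sketch of such a probe is shown below; it is an illustration of the idea, not the swiftcheck implementation (which, per the message above, uses netcat), and `tcp_port_open` is a hypothetical name.

```python
import socket


def tcp_port_open(target, timeout=5.0):
    """Return True if a TCP connection to 'ip:port' can be established.

    Only the transport layer is exercised, so no URL scheme is needed
    (or accepted) in the target string.
    """
    host, _, port = target.rpartition(":")
    if not host or "/" in host or not port.isdecimal():
        raise ValueError("expected 'ip:port', got %r" % (target,))
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `tcp_port_open("192.168.0.2:5000")` would probe the Keystone address from the report, while `tcp_port_open("http://192.168.0.2:5000")` raises `ValueError` instead of silently misparsing the value.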

Changed in fuel:
status: In Progress → Fix Committed
Changed in fuel:
milestone: 9.0 → 10.0
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/302763

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/295980
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=766335df60c98b610748072b1f1663d482574a72
Submitter: Jenkins
Branch: master

commit 766335df60c98b610748072b1f1663d482574a72
Author: Artem Panchenko <email address hidden>
Date: Mon Apr 4 16:47:03 2016 +0300

    Fix logic for enabling 'swiftcheck' service

    Currently used logic for enabling swiftcheck service is
    not correct. Additional checks for Keystone availability
    from Swift node should be added to HAProxy if management VIP
    and Swift proxy IP addresses are from different L3 networks.

    Change-Id: I9513fc9da02abdc24cc61a60c33181bb0fc9235b
    Related-bug: #1548275
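The condition described in this commit message can be sketched with Python's standard `ipaddress` module. This is an illustration under stated assumptions, not the Puppet logic from fuel-library: "different L3 networks" is approximated as "the Swift proxy address is outside the management subnet", the /24 prefix length is assumed, and treating 192.168.0.2 as the management VIP and 192.168.1.1 as the Swift proxy address is an inference from the server_args line in the description. `needs_keystone_check` is a hypothetical name.

```python
import ipaddress


def needs_keystone_check(management_vip, swift_proxy_ip, management_cidr):
    """Return True if the Swift proxy IP lies outside the management network.

    Assumption: "different L3 networks" is approximated by testing whether
    the Swift proxy address belongs to the management VIP's subnet.
    """
    network = ipaddress.ip_network(management_cidr)
    if ipaddress.ip_address(management_vip) not in network:
        raise ValueError("management VIP is not inside %s" % management_cidr)
    return ipaddress.ip_address(swift_proxy_ip) not in network


# Addresses from the report; the /24 prefix length is an assumption.
print(needs_keystone_check("192.168.0.2", "192.168.1.1", "192.168.0.0/24"))
# Hypothetical case where both addresses share one subnet.
print(needs_keystone_check("192.168.0.2", "192.168.0.3", "192.168.0.0/24"))
```

Under these assumptions the first call reports that the extra Keystone check is needed and the second that it is not.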

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (stable/mitaka)

Related fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/304199

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/302763
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=3cb46ce1f29f223cdb53fd8e1bee5ad192863d28
Submitter: Jenkins
Branch: stable/mitaka

commit 3cb46ce1f29f223cdb53fd8e1bee5ad192863d28
Author: Artem Panchenko <email address hidden>
Date: Mon Apr 4 16:25:02 2016 +0300

    Fix scan target arg format for swiftcheck service

    As second argument swiftcheck utility takes IP and
    TCP port of Keystone service (joined by ":" symbol).
    Since it just checks TCP port connectivity using netcat
    it doesn't require a type of application layer protocol.

    Change-Id: Ie6fcbafde8a2458a2528ed67dc64492daac39bcf
    Closes-bug: #1548275
    (cherry picked from commit 99d9e97118943e0bb59ed33f9c99be47221fe281)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/304199
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=5fe0ca3d3cd7385c5a7b61b84666e81847a08e35
Submitter: Jenkins
Branch: stable/mitaka

commit 5fe0ca3d3cd7385c5a7b61b84666e81847a08e35
Author: Artem Panchenko <email address hidden>
Date: Mon Apr 4 16:47:03 2016 +0300

    Fix logic for enabling 'swiftcheck' service

    Currently used logic for enabling swiftcheck service is
    not correct. Additional checks for Keystone availability
    from Swift node should be added to HAProxy if management VIP
    and Swift proxy IP addresses are from different L3 networks.

    Change-Id: I9513fc9da02abdc24cc61a60c33181bb0fc9235b
    Related-bug: #1548275
    (cherry picked from commit 766335df60c98b610748072b1f1663d482574a72)

tags: added: in-stable-mitaka
Artem Panchenko (apanchenko-8) wrote :

Verified on 10.0-85.

Alex Schultz (alex-schultz) wrote :

Looks like we also missed this in the swift proxy, which is causing other failures in network template related tests.

https://github.com/openstack/fuel-library/blob/master/deployment/puppet/openstack_tasks/manifests/swift/proxy.pp#L103

See Bug 1567870 and Bug 1569860

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/305424

Changed in fuel:
assignee: Artem Panchenko (apanchenko-8) → Alex Schultz (alex-schultz)
status: Confirmed → In Progress
Dmitry Pyzhov (dpyzhov)
no longer affects: fuel/newton
Changed in fuel:
milestone: 9.0 → 10.0
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/305424
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=1d5f133fb7162d10b42bfa305e6d1a96cc15ed1b
Submitter: Jenkins
Branch: master

commit 1d5f133fb7162d10b42bfa305e6d1a96cc15ed1b
Author: Alex Schultz <email address hidden>
Date: Wed Apr 13 11:51:52 2016 -0600

    Update swift proxy logic for healthcheck service

    With network template updates, we need to properly configure the service
    healthcheck to match the expected HAproxy configuration. This change
    ensures the healthcheck service is properly configured when it needs to
    be.

    Change-Id: I1647c2a6142ea2f7fbe7eb8a5eda0deb962cdd6c
    Closes-Bug: #1548275

Changed in fuel:
status: In Progress → Fix Committed
tags: removed: swarm-blocker
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/308338

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/308338
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=a8ad83a3a5adada61e147bc5bbedc2f906777981
Submitter: Jenkins
Branch: stable/mitaka

commit a8ad83a3a5adada61e147bc5bbedc2f906777981
Author: Alex Schultz <email address hidden>
Date: Wed Apr 13 11:51:52 2016 -0600

    Update swift proxy logic for healthcheck service

    With network template updates, we need to properly configure the service
    healthcheck to match the expected HAproxy configuration. This change
    ensures the healthcheck service is properly configured when it needs to
    be.

    Change-Id: I1647c2a6142ea2f7fbe7eb8a5eda0deb962cdd6c
    Closes-Bug: #1548275
    (cherry picked from commit 1d5f133fb7162d10b42bfa305e6d1a96cc15ed1b)

Aleksey Zvyagintsev (azvyagintsev) wrote :

Hello, are there any plans to backport this to 8.0?

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/312600

tags: added: customer-found
Tatyanka (tatyana-leontovich) wrote :

For Mitaka, verified on ISO 285.

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/8.0)

Reviewed: https://review.openstack.org/312600
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=74a367f8aff0d9b55be857e0ef9f8bec2c0ed884
Submitter: Jenkins
Branch: stable/8.0

commit 74a367f8aff0d9b55be857e0ef9f8bec2c0ed884
Author: Artem Panchenko <email address hidden>
Date: Mon Apr 4 16:47:03 2016 +0300

    Fix logic for enabling 'swiftcheck' service

    Currently used logic for enabling swiftcheck service is
    not correct. Additional checks for Keystone availability
    from Swift node should be added to HAProxy if management VIP
    and Swift proxy IP addresses are from different L3 networks.

    Change-Id: I9513fc9da02abdc24cc61a60c33181bb0fc9235b
    Closes-bug: #1548275
    (cherry picked from commit 766335df60c98b610748072b1f1663d482574a72)
    (cherry picked from commit 1d5f133fb7162d10b42bfa305e6d1a96cc15ed1b)

tags: added: on-verification
Dmitry Belyaninov (dbelyaninov) wrote :

The described cluster was successfully deployed on 8.0 MU2.

tags: removed: on-verification
Anatolii Neliubin (aneliubin) wrote :

Could you please take a look at https://bugs.launchpad.net/fuel/+bug/1630261? We have a request from a customer who is having trouble because the swiftcheck file is currently absent.
