Heat autoscaling does not work

Bug #1576520 reported by Kyrylo Galanov
This bug affects 4 people
Affects             Status        Importance  Assigned to       Milestone
Fuel for OpenStack  Fix Released  High        Peter Razumovsky
8.0.x               Fix Released  High        Alexey Stupnikov
Mitaka              Fix Released  High        Peter Razumovsky

Bug Description

Detailed bug description:
 Autoscaling of a Heat cluster does not work. The most likely root cause is that the ceilometer-->heat-cfn[public] alert rule is created with the HTTP protocol while HTTPS is required.

Details at http://paste.openstack.org/show/495714/

Steps to reproduce:
 Run fuel-qa "deploy_heat_ha" test group
Expected results:
 OSTF tests pass
Actual result:
 OSTF test "Check stack autoscaling" fails
Reproducibility:
 100%
Description of the environment:
 Operating system: fuel-9.0-257-2016-04-28

Revision history for this message
Kyrylo Galanov (kgalanov) wrote :
Changed in fuel:
assignee: Fuel Library (Deprecated) (fuel-library) → Fuel Sustaining (fuel-sustaining-team)
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → MOS Heat (mos-heat)
tags: added: area-mos
removed: area-library
Revision history for this message
Alexander Nagovitsyn (gluk12189) wrote :

/var/log/aodh/aodh-notifier.log

http://paste.openstack.org/show/496077/

I think the cause of the bug is the ceilometer-aodh agent.

Revision history for this message
Alexander Nagovitsyn (gluk12189) wrote :
Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

Is this really critical? Does autoscaling work correctly when HTTP is used?

Changed in fuel:
milestone: 9.0 → 10.0
Revision history for this message
Matthew Mosesohn (raytrac3r) wrote :

Let's consult with Denis Egorenko on this, based on comments in #2

Revision history for this message
Alexander Nagovitsyn (gluk12189) wrote :

Yes, autoscaling works correctly when using HTTP.

Changed in fuel:
assignee: MOS Heat (mos-heat) → Peter Razumovsky (prazumovsky)
Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

I suggest we downgrade this to High, as there is a workaround - use HTTP.

Changed in fuel:
importance: Critical → High
Revision history for this message
magicboiz (magicboiz) wrote :

Hi Roman

Could you please give details on how to apply the workaround (use HTTP instead of HTTPS)? Is there a config file or an environment variable to change?

thx

Revision history for this message
Alexander Nagovitsyn (gluk12189) wrote :

The simple way is to deploy the stack without the TLS option.

tags: added: swarm-blocker
Revision history for this message
Peter Razumovsky (prazumovsky) wrote :

Investigating the bug gives the following results:

1. Heat creates URLs for alarm_url (and other similar attributes) with http, not https, so SSL cannot work correctly.

2. Heat creates URLs with the IP, not the hostname (public.fuel.local). This causes an issue because the certificate in some projects (in my case it was aodh) is signed for public.fuel.local, not for the IP. As a result, aodh alarm notification fails with "Hostname not matching".

Because Heat builds the URLs for alarms etc. from the config option heat_waitcondition_server_url (which currently equals http://<ip>:<port>/...), a possible solution is to change this option to the correct value (https://<hostname>:<port>/...).

I've successfully tested this solution on MOS 9.0 and master.

I've reported bug: https://bugs.launchpad.net/fuel/+bug/1582283
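
The two findings above can be illustrated with a minimal Python sketch. It is not aodh's or Heat's actual verification code: the function names, the example IP (203.0.113.10) and the URL path are assumptions made for illustration, and matching is simplified to exact DNS names (no wildcards, no IP entries in the certificate).

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(host):
    """Return True if host parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def check_alarm_url(url, cert_dns_names):
    """List the reasons a TLS client would reject this URL.

    cert_dns_names is the set of DNS names the certificate is signed
    for (hypothetical input; exact match only in this sketch).
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    problems = []
    if parts.scheme != "https":
        problems.append("scheme is not https")
    if host_is_ip(host):
        problems.append("host is an IP, certificate is signed for a hostname")
    elif host not in cert_dns_names:
        problems.append("hostname not in certificate")
    return problems


cert_names = {"public.fuel.local"}
# URL of the kind Heat generated before the fix: http scheme and an IP.
print(check_alarm_url("http://203.0.113.10:8000/v1/signal", cert_names))
# URL of the kind proposed as the fix: https scheme and the hostname.
print(check_alarm_url("https://public.fuel.local:8000/v1/signal", cert_names))
```

The first call reports both problems and the second returns an empty list, which mirrors the proposal to switch the option to https://<hostname>:<port>/...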

Revision history for this message
Peter Razumovsky (prazumovsky) wrote :

Further updates:

1. Currently aodh cannot handle the case where we give it a URL with an IP (the aodh certificate is signed for the domain).

2. Waitcondition cannot work when we give it a URL with the domain.

So currently there is only one solution: an extra URL option for the alarm URL. This URL will be built with the https scheme and the domain. The waitcondition URL will be built with the https scheme and the IP.

Revision history for this message
Peter Razumovsky (prazumovsky) wrote :

A possible solution at this moment is:

* define the default config of heat_waitcondition_server_url with the https scheme and the domain instead of the IP (only waitcondition); (this option will be used by aodh)

* define the default config in section [clients_heat] url = https://<ip>:8004/v1/%(tenant_id)s. (this option will be used by waitcondition)
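
As a rough illustration of the two settings proposed in this comment (not the fix that was eventually merged), the following Python sketch renders a heat.conf fragment with the standard configparser module. The function name, the example IP 203.0.113.10, port 8000 and the /v1/waitcondition path are assumptions; only the option names, the 8004 port and the %(tenant_id)s template come from the comment above.

```python
import configparser
import io


def render_heat_conf(public_hostname, public_ip, cfn_port=8000, api_port=8004):
    """Render the two proposed options as heat.conf text."""
    # interpolation=None keeps the literal %(tenant_id)s template intact;
    # the default interpolation would try to expand it.
    cfg = configparser.ConfigParser(interpolation=None)
    # https + hostname, so a certificate signed for the hostname matches.
    cfg["DEFAULT"]["heat_waitcondition_server_url"] = (
        f"https://{public_hostname}:{cfn_port}/v1/waitcondition"
    )
    # https + IP, as proposed for the [clients_heat] section.
    cfg["clients_heat"] = {
        "url": f"https://{public_ip}:{api_port}/v1/%(tenant_id)s"
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


print(render_heat_conf("public.fuel.local", "203.0.113.10"))
```

The printed text contains a [DEFAULT] section with heat_waitcondition_server_url and a [clients_heat] section with url, which can be compared against the deployed heat.conf.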

Revision history for this message
Peter Razumovsky (prazumovsky) wrote :

We have a solution for this issue and are waiting to check it on an environment. Patch: https://review.openstack.org/#/c/317031/

Revision history for this message
Dina Belova (dbelova) wrote :

Marking as in progress following the conversation with the mos-heat team. Peter is fully dedicated to solving the issue ASAP.

Changed in fuel:
status: Confirmed → In Progress
Revision history for this message
Denis Egorenko (degorenko) wrote :
Revision history for this message
Peter Razumovsky (prazumovsky) wrote :
Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
Peter Razumovsky (prazumovsky) wrote :

Now waiting for the backport to the Mitaka release.

Revision history for this message
Peter Razumovsky (prazumovsky) wrote :
Revision history for this message
Denis Egorenko (degorenko) wrote :
Revision history for this message
Peter Razumovsky (prazumovsky) wrote :

The backport for Mitaka is in the branch, so the fix is committed.

tags: added: on-verification
Revision history for this message
Dmitry Belyaninov (dbelyaninov) wrote :

https://product-ci.infra.mirantis.net/job/9.0.system_test.ubuntu.services_ha/140/
#140 Jun 15, 2016 2:14 AM
services_ha on 9.0-mos-485

test "deploy_heat_ha" passed

tags: removed: on-verification
Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

OK, it looks like we have to deploy MOS with TLS and the latest updates to reproduce this bug in the 8.0 branch.

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

Steps to reproduce for MOS8:
1. Deploy stack with ceilometer.
2. Run "Check stack autoscaling" test.

Expected result: green test

Actual result: "Time limit exceeded while waiting for keypair creation to finish."

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/353445

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/8.0)

Reviewed: https://review.openstack.org/353445
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=5449a63af121f89b0e38016f1c5b84562127a304
Submitter: Jenkins
Branch: stable/8.0

commit 5449a63af121f89b0e38016f1c5b84562127a304
Author: Denis Egorenko <email address hidden>
Date: Mon May 16 20:54:35 2016 +0300

    Use public_hash for determining current protocol and address for Heat

    Heat parameters heat_metadata_server_url, heat_waitcondition_server_url,
    heat_watch_server_url require to be set with proper protocol and address
    in case of usage SSL.

    Change-Id: I7baa7b44db4237347ddadccb4537e0080ef62322
    Closes-bug: #1576520
    Related-bug: #1582283
    (cherry picked from commit 610564638f35a6609d7d84e3f5a7c7a48ff2e06e)
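
The idea in the commit message can be paraphrased in Python as follows. This is only a sketch under stated assumptions, not the Puppet code from fuel-library: the function name, the "services" and "hostname" keys of the public SSL hash, and the ports and paths (8000, 8003, /v1/waitcondition) are assumptions for illustration.

```python
def heat_server_urls(public_ssl, public_vip):
    """Derive the three Heat server URLs from a public SSL settings hash.

    public_ssl: dict with hypothetical keys "services" (bool) and "hostname".
    public_vip: public IP address used when SSL is disabled.
    """
    use_ssl = bool(public_ssl.get("services"))
    protocol = "https" if use_ssl else "http"
    # With SSL the certificate's hostname is used; otherwise the plain IP.
    address = public_ssl.get("hostname", public_vip) if use_ssl else public_vip
    return {
        "heat_metadata_server_url": f"{protocol}://{address}:8000",
        "heat_waitcondition_server_url": f"{protocol}://{address}:8000/v1/waitcondition",
        "heat_watch_server_url": f"{protocol}://{address}:8003",
    }


# SSL enabled: URLs use https and the hostname the certificate is signed for.
print(heat_server_urls({"services": True, "hostname": "public.fuel.local"}, "203.0.113.10"))
# SSL disabled: URLs fall back to http and the public IP.
print(heat_server_urls({"services": False}, "203.0.113.10"))
```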

tags: added: on-verification
Revision history for this message
TatyanaGladysheva (tgladysheva) wrote :

Verified on MOS 8.0 + MU3 updates.

Steps to verify:
1. Deploy cluster with ceilometer and with enabled TLS option.
2. Run "Check stack autoscaling" test.

Actual results:
The "Check stack autoscaling" test passes sometimes, but from time to time it fails on the following steps:
5. Create a stack.
11. Wait for the 2nd instance to be terminated.
14. Wait for the stack to be deleted.
There is a separate bug, https://bugs.launchpad.net/fuel/+bug/1584190, which describes this issue.

Step "2. Create a keypair" passed every time, so the issue with SSL is fixed.

tags: removed: on-verification
Revision history for this message
Dmitry Belyaninov (dbelyaninov) wrote :
Changed in fuel:
status: Fix Committed → Fix Released