Galera goes down after powering off the primary controller
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Fuel for OpenStack | Invalid | High | Sergii Golovatiuk | |
Bug Description
"build_id": "2014-09-
"ostf_sha": "64cb59c681658a
"build_number": "8",
"auth_required": true,
"api": "1.0",
"nailgun_sha": "b8d8189cc37d6d
"production": "docker",
"fuelmain_sha": "d7ed7973034bde
"astute_sha": "f5fbd89d1e0e1f
"feature_groups": ["experimental"],
"release": "5.1",
"release_versions": {"2014.1.1-5.1": {"VERSION": {"build_id": "2014-09-
1. Create new environment (CentOS, HA mode)
2. Choose GRE segmentation
3. Choose Ceph for images and Ceph Rados
4. Choose Sahara, Murano, Ceilometer
5. Add 3 controllers, 1 compute, 1 cinder+mongo, 3 ceph, 2 mongo
6. Start deployment. It was successful
7. Start OSTF tests. It was successful
8. Power off second controller
9. Start OSTF tests. It was successful
10. Power on second controller
11. Power off primary controller
12. Start OSTF tests. They failed with the error: "Keystone client is not available. Please, refer to OpenStack logs to fix this problem"
13. Power on primary controller.
14. Start OSTF tests. They failed with the same error
[root@node-31 ~]# keystone tenant-list
Authorization Failed: An unexpected error prevented the server from fulfilling your request. (OperationalError) (2013, "Lost connection to MySQL server at 'reading initial communication packet', system error: 0") None None (HTTP 500)
[root@node-31 ~]# neutron net-list
{"error": {"message": "An unexpected error prevented the server from fulfilling your request. (OperationalError) (2013, \"Lost connection to MySQL server at 'reading initial communication packet', system error: 0\") None None", "code": 500, "title": "Internal Server Error"}}
[root@node-31 ~]# mysql -e 'show status like wsrep%'
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/
[root@node-32 ~]# mysql -e 'show status like wsrep%'
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/
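Once mysqld accepts connections again, the wsrep status variables show whether a node has rejoined the cluster. The following is a minimal sketch, not part of the original report, for interpreting that output. It assumes the status was collected with `mysql -N -B -e "SHOW STATUS LIKE 'wsrep%'"` (the LIKE pattern has to be a quoted string), which prints one tab-separated name/value pair per line. The helper names and the sample values are hypothetical; the variable names `wsrep_cluster_status`, `wsrep_ready` and `wsrep_local_state_comment` are standard Galera status variables.

```python
def parse_wsrep_status(text):
    """Turn tab-separated 'name<TAB>value' rows into a dict."""
    status = {}
    for line in text.splitlines():
        parts = line.strip().split("\t", 1)
        if len(parts) == 2:
            status[parts[0]] = parts[1]
    return status


def node_is_healthy(status):
    """True when the node is in the primary component, ready and synced."""
    return (
        status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_ready") == "ON"
        and status.get("wsrep_local_state_comment") == "Synced"
    )


# Hypothetical sample values for illustration, not taken from the failed nodes.
sample = (
    "wsrep_cluster_size\t3\n"
    "wsrep_cluster_status\tPrimary\n"
    "wsrep_local_state_comment\tSynced\n"
    "wsrep_ready\tON\n"
)

if __name__ == "__main__":
    print(node_is_healthy(parse_wsrep_status(sample)))
```

A node that reports `non-Primary`, or that returns no rows at all (as on node-31 and node-32, where the socket connection fails), is treated as unhealthy.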
[root@node-32 ~]# pcs status
Cluster name:
Last updated: Wed Sep 17 08:49:53 2014
Last change: Wed Sep 17 08:48:56 2014 via crm_attribute on node-33.domain.tld
Stack: classic openais (with plugin)
Current DC: node-33.domain.tld - partition with quorum
Version: 1.1.10-
3 Nodes configured, 3 expected votes
22 Resources configured
Online: [ node-31.domain.tld node-32.domain.tld node-33.domain.tld ]
Full list of resources:
vip__managemen
vip__public_old (ocf::mirantis:
p_openstack-
p_openstack-
Clone Set: clone_p_mysql [p_mysql]
Started: [ node-31.domain.tld node-32.domain.tld node-33.domain.tld ]
Master/Slave Set: master_
Masters: [ node-31.domain.tld ]
Slaves: [ node-32.domain.tld node-33.domain.tld ]
Clone Set: clone_p_haproxy [p_haproxy]
Started: [ node-31.domain.tld node-32.domain.tld node-33.domain.tld ]
p_openstack-
Clone Set: clone_p_
Started: [ node-31.domain.tld node-32.domain.tld node-33.domain.tld ]
Clone Set: clone_p_
Started: [ node-31.domain.tld node-32.domain.tld node-33.domain.tld ]
p_neutron-
p_neutron-l3-agent (ocf::mirantis:
Failed actions:
p_mysql_start_0 on node-31.domain.tld 'unknown error' (1): call=98, status=Timed Out, last-rc-change='Wed Sep 17 08:36:00 2014', queued=475002ms, exec=0ms
p_mysql_
p_mysql_start_0 on node-32.domain.tld 'unknown error' (1): call=131, status=Timed Out, last-rc-change='Wed Sep 17 08:36:00 2014', queued=475180ms, exec=2ms
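The "Failed actions" entries above can be extracted mechanically when comparing several runs. This is a rough sketch, not part of the original report. The regular expression is an assumption based only on the two complete lines shown here; other Pacemaker versions may format these lines differently. Truncated entries such as `p_mysql_` do not match and are skipped. The names `FAILED_ACTION` and `parse_failed_actions` are made up for illustration.

```python
import re

# Pattern inferred from the two complete "Failed actions" lines in this report.
FAILED_ACTION = re.compile(
    r"^\s*(?P<operation>\S+) on (?P<node>\S+) '(?P<reason>[^']*)' \((?P<rc>\d+)\): "
    r"call=(?P<call>\d+), status=(?P<status>[^,]+), "
    r"last-rc-change='(?P<changed>[^']*)', "
    r"queued=(?P<queued_ms>\d+)ms, exec=(?P<exec_ms>\d+)ms\s*$"
)


def parse_failed_actions(text):
    """Return one dict per line that matches the failed-action format."""
    actions = []
    for line in text.splitlines():
        match = FAILED_ACTION.match(line)
        if not match:
            continue  # truncated or unrelated line
        entry = match.groupdict()
        for key in ("rc", "call", "queued_ms", "exec_ms"):
            entry[key] = int(entry[key])
        actions.append(entry)
    return actions


# The lines below are copied from the pcs status output in this report.
sample = """\
p_mysql_start_0 on node-31.domain.tld 'unknown error' (1): call=98, status=Timed Out, last-rc-change='Wed Sep 17 08:36:00 2014', queued=475002ms, exec=0ms
p_mysql_
p_mysql_start_0 on node-32.domain.tld 'unknown error' (1): call=131, status=Timed Out, last-rc-change='Wed Sep 17 08:36:00 2014', queued=475180ms, exec=2ms
"""

if __name__ == "__main__":
    for action in parse_failed_actions(sample):
        print(action["node"], action["status"], action["queued_ms"] / 1000.0)
```

For both nodes the p_mysql start operation timed out after roughly 475 seconds in the queue.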
Changed in fuel:
status: Incomplete → Invalid
Logs are here: https://drive.google.com/a/mirantis.com/file/d/0B6SjzarTGFxaSERZaHVKdW5kMUk/edit?usp=sharing