commit c6f0b71a8b5960a625f193f64fd9e313371864ab
Author: Vladimir Kuklin <email address hidden>
Date: Fri Jan 6 19:11:09 2017 +0300
Avoid false-positive split-brain detection for mysql-wss
With change Iaa4855d769fe1e0203fcfb9981413273e0e4dda2
we detect whether the node is running as a primary component
while it is not master. While it is a good solution, sometimes
we face a race condition when the node which is a 'master' gets a lower
sequence number due to other nodes updating their GTID at the same
time. Although it happens rarely and mostly on slow or overloaded
environments, it leads to redundant mysql restarts and service
downtime for OpenStack APIs.
The proper fix would be to use a master-slave resource and the corresponding
script, but this is far too big a change for the bug in question.
The proposed solution checks whether the node is a primary component during
start and monitor operations and also checks the number of currently
running primary components by setting and querying an additional
attribute `is_pc`. It triggers a monitor failure only when the node
is not running with the 'master' GTID, is a primary component,
and there is more than one primary component.
Misc: fix function return codes to reflect shell 'true'
and 'false' numeric values.
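The monitor condition described above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the function name `monitor_should_fail` and its three arguments are hypothetical, and the way the count of primary components is obtained from the `is_pc` attribute is left out and assumed to be supplied by the caller. The return values follow the shell convention mentioned in the Misc note (0 for 'true', 1 for 'false').

```shell
#!/bin/sh
# Hypothetical helper; names and argument layout are assumptions,
# not taken from the actual mysql-wss script.
#   $1  "yes" if this node runs with the 'master' GTID
#   $2  "yes" if this node is a primary component (is_pc)
#   $3  number of nodes currently reporting themselves as primary components
# Returns 0 (shell 'true') when monitor should fail, 1 (shell 'false') otherwise.
monitor_should_fail() {
    has_master_gtid=$1
    is_pc=$2
    pc_count=$3

    # A node running with the 'master' GTID never triggers the failure.
    [ "$has_master_gtid" != "yes" ] || return 1
    # A node that is not a primary component cannot be part of a split brain.
    [ "$is_pc" = "yes" ] || return 1
    # A single primary component is the normal case, not a split brain.
    [ "$pc_count" -gt 1 ] || return 1

    return 0
}
```

Because the function returns 0 for 'true', it can be used directly in a condition, for example `if monitor_should_fail no yes 2; then ...; fi`.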
Reviewed: https://review.openstack.org/418893
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=c6f0b71a8b5960a625f193f64fd9e313371864ab
Submitter: Jenkins
Branch: stable/newton
Change-Id: Id3ea32347ed37a6efffd3ee85dfb3110b2e8c8ca
Closes-bug: #1651982