[2023.1] deployment fail at mariadb : Restart MariaDB container
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
kolla-ansible | In Progress | Undecided | Unassigned |
Antelope | In Progress | Undecided | Unassigned |
Bug Description
A 2023.1 kolla-ansible deployment fails at the MariaDB cluster deployment step.
TASK [mariadb : Ensure MariaDB is running normally on bootstrap host] *******
changed: [r9-control3]
included: /usr/local/
RUNNING HANDLER [mariadb : Restart MariaDB container] *******
changed: [r9-control1]
FAILED - RETRYING: [r9-control2]: Wait for MariaDB service port liveness (10 retries left).
FAILED - RETRYING: [r9-control3]: Wait for MariaDB service port liveness (10 retries left).
FAILED - RETRYING: [r9-control2]: Wait for MariaDB service port liveness (9 retries left).
FAILED - RETRYING: [r9-control3]: Wait for MariaDB service port liveness (9 retries left).
FAILED - RETRYING: [r9-control2]: Wait for MariaDB service port liveness (8 retries left).
FAILED - RETRYING: [r9-control3]: Wait for MariaDB service port liveness (8 retries left).
FAILED - RETRYING: [r9-control2]: Wait for MariaDB service port liveness (7 retries left).
FAILED - RETRYING: [r9-control3]: Wait for MariaDB service port liveness (7 retries left).
FAILED - RETRYING: [r9-control3]: Wait for MariaDB service port liveness (6 retries left).
FAILED - RETRYING: [r9-control2]: Wait for MariaDB service port liveness (6 retries left).
FAILED - RETRYING: [r9-control3]: Wait for MariaDB service port liveness (5 retries left).
FAILED - RETRYING: [r9-control2]: Wait for MariaDB service port liveness (5 retries left).
FAILED - RETRYING: [r9-control3]: Wait for MariaDB service port liveness (4 retries left).
FAILED - RETRYING: [r9-control2]: Wait for MariaDB service port liveness (4 retries left).
FAILED - RETRYING: [r9-control3]: Wait for MariaDB service port liveness (3 retries left).
FAILED - RETRYING: [r9-control2]: Wait for MariaDB service port liveness (3 retries left).
FAILED - RETRYING: [r9-control3]: Wait for MariaDB service port liveness (2 retries left).
FAILED - RETRYING: [r9-control2]: Wait for MariaDB service port liveness (2 retries left).
FAILED - RETRYING: [r9-control3]: Wait for MariaDB service port liveness (1 retries left).
FAILED - RETRYING: [r9-control2]: Wait for MariaDB service port liveness (1 retries left).
TASK [mariadb : Restart MariaDB container] *******
fatal: [r9-control2]: FAILED! => {"attempts": 10, "changed": false, "elapsed": 60, "msg": "Timeout when waiting for search string MariaDB in 10.33.0.12:3306"}
TASK [mariadb : Restart MariaDB container] *******
fatal: [r9-control3]: FAILED! => {"attempts": 10, "changed": false, "elapsed": 60, "msg": "Timeout when waiting for search string MariaDB in 10.33.0.13:3306"}
RUNNING HANDLER [mariadb : Wait for MariaDB service port liveness] *******
FAILED - RETRYING: [r9-control1]: Wait for MariaDB service port liveness (10 retries left).
FAILED - RETRYING: [r9-control1]: Wait for MariaDB service port liveness (9 retries left).
FAILED - RETRYING: [r9-control1]: Wait for MariaDB service port liveness (8 retries left).
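Judging by the timeout message ("Timeout when waiting for search string MariaDB in 10.33.0.12:3306"), the liveness handler opens the MySQL port and looks for the string "MariaDB" in the server greeting. A minimal shell sketch of that probe, shown against a canned handshake banner (the version string is illustrative, not from the report):

```shell
# Hedged sketch of the liveness probe: succeed only if data read from
# the MySQL port contains the string "MariaDB".
check_greeting() {
  grep -aq 'MariaDB'    # -a: treat the binary handshake bytes as text
}

# Against a live node this would read the real handshake, e.g.:
#   timeout 5 bash -c 'cat </dev/tcp/10.33.0.12/3306' | check_greeting
# Here a canned MariaDB banner stands in for the socket read.
if printf '5.5.5-10.11.4-MariaDB' | check_greeting; then
  result=alive
else
  result=timeout
fi
echo "$result"
```

On control2 and control3 this probe never succeeds because mariadbd is stuck waiting for SST and never opens 3306 (see the `ss` output below, which shows only port 4567 listening).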
----------- control3 mariadb.log
2023-06-15 8:31:46 0 [Note] WSREP: save pc into disk
2023-06-15 8:31:46 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 3
2023-06-15 8:31:46 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2023-06-15 8:31:46 0 [Note] WSREP: STATE EXCHANGE: sent state msg: a0eae2d3-
2023-06-15 8:31:46 0 [Note] WSREP: STATE EXCHANGE: got state msg: a0eae2d3-
2023-06-15 8:31:46 0 [Note] WSREP: STATE EXCHANGE: got state msg: a0eae2d3-
2023-06-15 8:31:46 0 [Note] WSREP: STATE EXCHANGE: got state msg: a0eae2d3-
2023-06-15 8:31:46 0 [Warning] WSREP: Quorum: No node with complete state:
Version : 6
Flags : 0x1
Protocols : 2 / 10 / 4
State : NON-PRIMARY
Desync count : 0
Prim state : NON-PRIMARY
Prim UUID : 00000000-
Prim seqno : -1
First seqno : 1
Last seqno : 2
Commit cut : 2
Last vote : -1.0
Vote policy : 0
Prim JOINED : 0
State UUID : a0eae2d3-
Group UUID : 93636c74-
Name : 'r9-control1'
Incoming addr: '10.33.0.11:3306'
Version : 6
Flags : 00
Protocols : 2 / 10 / 4
State : NON-PRIMARY
Desync count : 0
Prim state : PRIMARY
Prim UUID : 9977cd66-
Prim seqno : 2
First seqno : -1
Last seqno : 2
Commit cut : 0
Last vote : -1.0
Vote policy : 0
Prim JOINED : 1
State UUID : a0eae2d3-
Group UUID : 93636c74-
Name : 'r9-control3'
Incoming addr: '10.33.0.13:3306'
Version : 6
Flags : 00
Protocols : 2 / 10 / 4
State : NON-PRIMARY
Desync count : 0
Prim state : JOINER
Prim UUID : 9977cd66-
Prim seqno : 2
First seqno : -1
Last seqno : 2
Commit cut : 0
Last vote : -1.0
Vote policy : 0
Prim JOINED : 1
State UUID : a0eae2d3-
Group UUID : 93636c74-
Name : 'r9-control3'
Incoming addr: '10.33.0.13:3306'
Version : 6
Flags : 00
Protocols : 2 / 10 / 4
State : NON-PRIMARY
Desync count : 0
Prim state : JOINER
Prim UUID : 9977cd66-
Prim seqno : 2
First seqno : -1
Last seqno : 2
Commit cut : 0
Last vote : -1.0
Vote policy : 0
Prim JOINED : 1
State UUID : a0eae2d3-
Group UUID : 93636c74-
Name : 'r9-control2'
Incoming addr: '10.33.0.12:3306'
2023-06-15 8:31:46 0 [Warning] WSREP: No re-merged primary component found.
2023-06-15 8:31:46 0 [Warning] WSREP: No bootstrapped primary component found.
2023-06-15 8:31:46 0 [ERROR] WSREP: /home/buildbot/
2023-06-15 8:31:46 0 [Note] WSREP: Quorum results:
version = 6,
component = NON-PRIMARY,
conf_id = -1,
members = 1/3 (joined/total),
act_id = -1,
last_appl. = 0,
protocols = -1/-1/-1 (gcs/repl/appl),
vote policy= 1,
group UUID = 00000000-
2023-06-15 8:31:46 0 [Note] WSREP: Flow-control interval: [28, 28]
2023-06-15 8:31:46 0 [Note] WSREP: Received NON-PRIMARY.
2023-06-15 8:31:47 2 [Warning] WSREP: Send action {0x7face001f6b0, 151, STATE_REQUEST} returned -107 (Transport endpoint is not connected)
2023-06-15 8:31:48 2 [Warning] WSREP: Send action {0x7face001f6b0, 151, STATE_REQUEST} returned -107 (Transport endpoint is not connected)
2023-06-15 8:31:48 0 [Note] WSREP: (99242c70-bced, 'tcp://
2023-06-15 8:31:49 2 [Warning] WSREP: Send action {0x7face001f6b0, 151, STATE_REQUEST} returned -107 (Transport endpoint is not connected)
2023-06-15 8:31:50 2 [Warning] WSREP: Send action {0x7face001f6b0, 151, STATE_REQUEST} returned -107 (Transport endpoint is not connected)
2023-06-15 8:31:51 2 [Warning] WSREP: Send action {0x7face001f6b0, 151, STATE_REQUEST} returned -107 (Transport endpoint is not connected)
...the same warning repeated many times...
WSREP_SST: [ERROR] Possible timeout in receiving first data from donor in gtid stage: exit codes: 124 0 (20230615 08:36:34.835)
WSREP_SST: [ERROR] Cleanup after exit with status: 32 (20230615 08:36:34.836)
WSREP_SST: [ERROR] Removing /tmp/tmp.
WSREP_SST: [INFO] Removing the sst_in_progress file (20230615 08:36:34.840)
WSREP_SST: [INFO] Cleaning up temporary directories (20230615 08:36:34.842)
2023-06-15 8:36:34 0 [ERROR] WSREP: Process completed with error: wsrep_sst_
2023-06-15 8:36:34 0 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_
2023-06-15 8:36:34 3 [Note] WSREP: SST received
2023-06-15 8:36:34 3 [Note] WSREP: SST received: 00000000-
2023-06-15 8:36:35 2 [Warning] WSREP: Send action {0x7face001f6b0, 151, STATE_REQUEST} returned -107 (Transport endpoint is not connected)
-------------
----- control1
[root@r9-control1 _data]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2a08f29494da quay.io/
683a09ae90e2 quay.io/
343e1fcd8d32 quay.io/
50636a88ef13 quay.io/
6c0b0066481e quay.io/
f7af9e31e885 quay.io/
[root@r9-control1 _data]# ll /var/lib/
total 242188
-rw-rw---- 1 42434 42434 417792 Jun 15 08:31 aria_log.00000001
-rw-rw---- 1 42434 42434 52 Jun 15 08:31 aria_log_control
-rw-rw---- 1 42434 42434 16384 Jun 15 08:31 ddl_recovery-
-rw-rw---- 1 42434 42434 134219032 Jun 15 08:31 galera.cache
-rw-rw---- 1 42434 42434 113 Jun 15 08:31 grastate.dat
-rw-rw---- 1 42434 42434 264 Jun 15 08:31 gvwstate.dat
-rw-rw---- 1 42434 42434 946 Jun 15 08:31 ib_buffer_pool
-rw-rw---- 1 42434 42434 12582912 Jun 15 08:31 ibdata1
-rw-rw---- 1 42434 42434 100663296 Jun 15 08:31 ib_logfile0
-rw-rw---- 1 42434 42434 0 Jun 15 08:31 multi-master.info
drwx------ 2 42434 42434 4096 Jun 15 08:31 mysql
-rw-rw---- 1 42434 42434 27596 Jun 15 08:31 mysql-bin.000001
-rw-rw---- 1 42434 42434 2732 Jun 15 08:31 mysql-bin.000002
-rw-rw---- 1 42434 42434 365 Jun 15 08:31 mysql-bin.000003
-rw-rw---- 1 42434 42434 342 Jun 15 08:31 mysql-bin.000004
-rw-rw---- 1 42434 42434 365 Jun 15 08:31 mysql-bin.000005
-rw-rw---- 1 42434 42434 95 Jun 15 08:31 mysql-bin.index
-rw-rw---- 1 42434 42434 7 Jun 15 08:31 mysql-bin.state
srwxrwxrwx 1 42434 42434 0 Jun 15 08:31 mysql.sock
-rw-r--r-- 1 42434 42434 15 Jun 15 08:31 mysql_upgrade_info
drwx------ 2 42434 42434 20 Jun 15 08:31 performance_schema
drwx------ 2 42434 42434 8192 Jun 15 08:31 sys
-rw------- 1 42434 42434 3461 Jun 15 08:31 wsrep_recovery.ok
-rw-rw---- 1 42434 42434 41 Jun 15 08:31 xtrabackup_
[root@r9-control1 _data]# cat /var/lib/
# GALERA saved state
version: 2.1
uuid: 93636c74-
seqno: -1
safe_to_bootstrap: 0
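The `grastate.dat` on control1 shows `seqno: -1` (position unknown after an unclean shutdown) together with `safe_to_bootstrap: 0`, so Galera will not let this node start a new cluster on its own. A minimal shell sketch of reading those two fields, using the values above as sample data (the real file lives in the mariadb container volume; the uuid is truncated as in the original):

```shell
# Hedged sketch: parse a grastate.dat like the one shown above and
# report whether Galera will let this node bootstrap.
cat > /tmp/grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid: 93636c74-
seqno: -1
safe_to_bootstrap: 0
EOF

seqno=$(awk '$1 == "seqno:" {print $2}' /tmp/grastate.dat)
stb=$(awk '$1 == "safe_to_bootstrap:" {print $2}' /tmp/grastate.dat)
echo "seqno=$seqno safe_to_bootstrap=$stb"

# seqno -1 means the commit position is unknown; safe_to_bootstrap 0
# means this node refuses to bootstrap a new primary component.
if [ "$stb" = "1" ]; then
  verdict="may bootstrap"
else
  verdict="refuses to bootstrap"
fi
echo "verdict: $verdict"
```

In kolla-ansible, recovering a cluster stuck in this state is normally done via `kolla-ansible mariadb_recovery` rather than by editing the file by hand.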
[root@r9-control1 _data]# ss -nlp |grep maria
tcp LISTEN 0 4096 10.33.0.11:4567 0.0.0.0:* users:(
----- control2
[root@r9-control2 _data]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7bb5fb0bbdd4 quay.io/
481bb2854925 quay.io/
a1ae6e6c677c quay.io/
d978190ecfbd quay.io/
a8df55b63f33 quay.io/
232aa3d34809 quay.io/
[root@r9-control2 _data]# ll /var/lib/
total 131080
-rw-rw---- 1 42434 42434 134219032 Jun 15 08:31 galera.cache
-rw-rw---- 1 42434 42434 0 Jun 15 08:31 grastate.dat
-rw-rw---- 1 42434 42434 264 Jun 15 08:31 gvwstate.dat
-rw-rw---- 1 42434 42434 0 Jun 15 08:31 sst_in_progress
[root@r9-control2 _data]# ss -nlp |grep maria
tcp LISTEN 0 4096 10.33.0.12:4567 0.0.0.0:* users:(
----- control3
[root@r9-control3 _data]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
27da8eff47a7 quay.io/
fdc36b92a1f3 quay.io/
f065352bd406 quay.io/
1120e6b860bb quay.io/
0519f81452bd quay.io/
ae43216f7c43 quay.io/
[root@r9-control3 _data]# ll /var/lib/
total 241704
-rw-rw---- 1 42434 42434 16384 Jun 15 08:31 aria_log.00000001
-rw-rw---- 1 42434 42434 52 Jun 15 08:31 aria_log_control
-rw-rw---- 1 42434 42434 134219032 Jun 15 08:31 galera.cache
-rw-rw---- 1 42434 42434 0 Jun 15 08:31 grastate.dat
-rw-rw---- 1 42434 42434 264 Jun 15 08:31 gvwstate.dat
-rw-rw---- 1 42434 42434 12582912 Jun 15 08:31 ibdata1
-rw-rw---- 1 42434 42434 100663296 Jun 15 08:31 ib_logfile0
-rw-rw---- 1 42434 42434 351 Jun 15 08:31 mysql-bin.000001
-rw-rw---- 1 42434 42434 19 Jun 15 08:31 mysql-bin.index
-rw-rw---- 1 42434 42434 0 Jun 15 08:31 mysql-bin.state
srwxrwxrwx 1 42434 42434 0 Jun 15 08:31 mysql.sock
-rw------- 1 42434 42434 2621 Jun 15 08:31 wsrep_recovery.ok
[root@r9-control3 _data]# ss -nlp |grep maria
tcp LISTEN 0 4096 10.33.0.13:4568 0.0.0.0:* users:(
tcp LISTEN 0 4096 10.33.0.13:4567 0.0.0.0:* users:(
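The `ss` output on the three controllers shows only the group-communication port 4567 listening (plus 4568 on control3), which is consistent with SST stalling on control2/control3 and mariadbd never opening 3306. A small sketch that compares a node's listening ports against the usual Galera port set (3306 client, 4444 SST, 4567 group communication, 4568 IST; default port numbers assumed), here with control2's observed set as sample input:

```shell
# Ports a Galera node typically needs (assumption: default ports).
required="3306 4444 4567 4568"
# Sample 'open' set taken from the control2 ss output above; on a live
# node this could come from: ss -nlt | awk '{print $4}' | grep -o '[0-9]*$'
open="4567"

missing=""
for p in $required; do
  case " $open " in
    *" $p "*) echo "port $p: listening" ;;
    *)        echo "port $p: MISSING"; missing="$missing $p" ;;
  esac
done
missing="${missing# }"
echo "missing ports: ${missing:-none}"
```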
I can confirm that I hit the same issue when deploying 2023.1 on the Ubuntu distro.