percona-cluster crashes during cluster formation

Bug #1663245 reported by Narinder Gupta
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
OPNFV | Fix Released | Critical | Unassigned |
OpenStack Percona Cluster Charm | Fix Released | Medium | James Page |
percona-cluster (Juju Charms Collection) | Invalid | Undecided | Unassigned |

Bug Description

When I ran the bundle below to deploy the percona cluster, the cluster crashed and did not recover.
http://paste.ubuntu.com/23943677/

Please find the logs from all three cluster units attached as well.

Revision history for this message
Narinder Gupta (narindergupta) wrote :
Changed in opnfv:
importance: Undecided → Critical
Revision history for this message
Narinder Gupta (narindergupta) wrote :

I can reproduce this issue in multiple labs.

James Page (james-page)
affects: charms.openstack → percona-cluster (Juju Charms Collection)
Revision history for this message
Prakash Ramchandran (pramchan-5) wrote :

I tried downloading the log mysql1.tar to review it, but was unable to untar it to trace the logs. When I tried again, I was not able to download it, and I had already deleted the old mysql1.tar. Anyway, keep me posted, as this is on the critical path for the OPNFV JOID project.

Revision history for this message
Mario Splivalo (mariosplivalo) wrote :

From the logs, it looks like PXC (Percona XtraDB Cluster Server) started OK on mysql0 but crashed on both mysql1 and mysql2.

More information is required from the hosts running the LXC containers for mysql1 and mysql2. It is not clear from the logs, but memory may be constrained in a way that crashes mysql.

Have you tried redeploying with innodb-buffer-pool-size=16M and max-connections=200?

Also, inside the mysql data directory (/var/lib/percona-xtradb-cluster/) there is innobackup.prepare.log. Can we also get that?

Revision history for this message
Narinder Gupta (narindergupta) wrote :

Those are hosted in LXD containers, and most of the host systems have 128 GB of RAM. Deploying the bundle with innodb-buffer-pool-size=16M and max-connections=200 might not meet the current requirements, but let me give it a try and see.

I have not found innobackup.backup.log. Please find the log requested: pastebinit innobackup.backup.log
http://paste.ubuntu.com/23994180/

Revision history for this message
Narinder Gupta (narindergupta) wrote :

With the small values innodb-buffer-pool-size=16M and max-connections=200, it seems the HA bundle got deployed successfully. Do I need to experiment to see what the ideal values should be? What kind of issues might we face?

Revision history for this message
Narinder Gupta (narindergupta) wrote :

Also, I ran the deployment in two different labs. One failed and the other passed. This definitely looks like a percona-cluster bug.

170214 17:27:35 innobackupex: Executing LOCK BINLOG FOR BACKUP...
DBD::mysql::db do failed: Deadlock found when trying to get lock; try restarting transaction at /usr//bin/innobackupex line 3059.
innobackupex: Error:
Error executing 'LOCK BINLOG FOR BACKUP': DBD::mysql::db do failed: Deadlock found when trying to get lock; try restarting transaction at /usr//bin/innobackupex line 3059.
170214 17:27:35 innobackupex: Waiting for ibbackup (pid=28146) to finish

pastebinit < /var/lib/percona-xtradb-cluster//innobackup.backup.log
http://paste.ubuntu.com/23996333/

Revision history for this message
Nobuto Murata (nobuto) wrote :

I also saw some crashes even with performance_schema=off, so LP: #1646795 and LP: #1401133 might be related.

Revision history for this message
Nobuto Murata (nobuto) wrote :

My quick workaround was to set sst-method=rsync in charm config to avoid innobackupex route.
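
For anyone applying this workaround by hand, here is a minimal sketch of the command involved. It only builds and prints a `juju config` command line rather than running it. The application name `mysql` is an assumption based on the unit names (mysql0, mysql1, mysql2) seen in this report, and the helper `juju_config_command` is hypothetical, not part of Juju or the charm.

```python
import shlex


def juju_config_command(application, options):
    """Build a `juju config` command line that sets charm options.

    `application` is the deployed application name (assumed here, check
    your bundle); `options` maps charm option names to values.
    """
    pairs = [f"{key}={value}" for key, value in options.items()]
    # shlex.join quotes each argument only if the shell would need it.
    return shlex.join(["juju", "config", application, *pairs])


# "mysql" is an assumed application name; substitute the one in your bundle.
print(juju_config_command("mysql", {"sst-method": "rsync"}))
```

The same helper can be used for the memory settings tried earlier in this report, for example `{"innodb-buffer-pool-size": "16M", "max-connections": "200"}`.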

Revision history for this message
Narinder Gupta (narindergupta) wrote : Re: [Bug 1663245] Re: percona-cluster crashes during cluster formation

I can also confirm that with the rsync route I am not seeing this issue.

James Page (james-page)
Changed in percona-cluster (Juju Charms Collection):
status: New → Invalid
James Page (james-page)
Changed in charm-percona-cluster:
importance: Undecided → Medium
Revision history for this message
James Page (james-page) wrote :

I recently landed a change in the percona-cluster charm to cap the amount of RAM allocated to the InnoDB buffer pool at 512M (unless the end user provides an explicit configuration value for innodb-buffer-pool-size).

I think this bug is related to the charm consuming 50% of RAM by default prior to this change. If the system is doing anything else, there is no guarantee that such an allocation is even possible.
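
The sizing rule described above can be sketched roughly as follows. This is an illustration only, not the charm's actual code: the function name, the use of bytes, and the assumption that the capped default is still derived from 50% of RAM are all hypothetical.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

# Cap on the default buffer pool size, as described in the comment above.
DEFAULT_CAP_BYTES = 512 * MIB


def default_buffer_pool_bytes(total_ram_bytes, explicit_bytes=None):
    """Pick an InnoDB buffer pool size (hypothetical helper).

    An explicit user-provided value always wins. Otherwise take half of
    RAM (the old default, assumed to still be the starting point) and
    cap it at 512 MiB.
    """
    if explicit_bytes is not None:
        return explicit_bytes
    return min(total_ram_bytes // 2, DEFAULT_CAP_BYTES)


# A 128 GiB host, like the ones mentioned in this report:
print(default_buffer_pool_bytes(128 * GIB) // MIB, "MiB")
```

Under the old behaviour the same 128 GiB host would have asked for 64 GiB per unit, which is hard to guarantee when several containers share the machine.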

Changed in charm-percona-cluster:
milestone: none → 17.05
status: New → Triaged
Revision history for this message
James Page (james-page) wrote :

Narinder - I'm going to mark this as Fix Committed on the assumption that this is resolved by the improved default memory management introduced in the current development charm.

If you still have access to the systems it would be nice to re-try with cs:~openstack-charmers-next/percona-cluster to see if this improves things.

Changed in charm-percona-cluster:
status: Triaged → Fix Committed
assignee: nobody → James Page (james-page)
Revision history for this message
Narinder Gupta (narindergupta) wrote :

James, I can confirm that after that fix I am no longer able to reproduce this issue. MySQL does not crash and all services come up.

James Page (james-page)
Changed in charm-percona-cluster:
milestone: 17.05 → 17.08
Changed in opnfv:
status: New → Fix Released
James Page (james-page)
Changed in charm-percona-cluster:
status: Fix Committed → Fix Released