pt-stalk fails when there are too many connections

Bug #1046966 reported by Jose
This bug affects 2 people
Affects: Percona Toolkit (moved to https://jira.percona.com/projects/PT)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Is it possible to have a persistent connection to MySQL? Sometimes pt-stalk fails to connect and collect data when an event is triggered.

 pt-stalk --threshold 900 --variable Threads_running
2012_09_06_17_41_41 Check results: Threads_running=584, matched=no, cycles_true=0
2012_09_06_17_41_42 Check results: Threads_running=780, matched=no, cycles_true=0
mysqladmin: connect to server at 'localhost' failed
error: 'Too many connections'
2012_09_06_17_41_57 Detected value is empty; something failed? Trigger exit status: 0
2012_09_06_17_41_57 Check results: Threads_running=, matched=no, cycles_true=0
mysqladmin: connect to server at 'localhost' failed
error: 'Too many connections'

Brian Fraser (fraserbn)
Changed in percona-toolkit:
importance: Undecided → Wishlist
status: New → Confirmed
Daniel Nichter (daniel-nichter) wrote :

This general problem, which I would call "connection resiliency", is appearing more often and being mentioned by more people. For example: bug 1046483. It is, however, a big engineering task because the major tools can have many connections, some "primary" and others "auxiliary". Each possible point of failure needs to be considered carefully: is the original operation idempotent, i.e. can we reconnect, re-execute, and carry on? Are half failures/successes possible? And so on. The general motivation is good, though: don't let connection "hiccups" kill the whole tool, especially a long-running one.
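To make the "reconnect, re-execute, and carry on" idea concrete, here is a minimal sketch of a retry wrapper for idempotent commands. The function name retry_idempotent and its arguments are hypothetical and are not part of Percona Toolkit. It is only safe for operations that can be repeated without side effects, such as reading a status variable.

```shell
# Hypothetical helper, not part of pt-stalk.
# Usage: retry_idempotent ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or ATTEMPTS runs have failed.
# Only use it for idempotent commands (for example, read-only status checks).
retry_idempotent() {
   attempts="$1"
   delay="$2"
   shift 2
   try=1
   while :; do
      # Success on any attempt ends the loop immediately.
      "$@" && return 0
      # Give up once the allowed number of attempts is used.
      [ "$try" -ge "$attempts" ] && return 1
      try=$((try + 1))
      sleep "$delay"
   done
}
```

A call such as `retry_idempotent 3 1 mysqladmin extended-status` would, under these assumptions, retry a read-only status query up to three times with a one-second pause between attempts. It does not address half-completed operations, which still need case-by-case handling.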

tags: added: error-recovery pt-stalk
Changed in percona-toolkit:
importance: Wishlist → Undecided
Fernando Ipar (fipar) wrote :

Daniel:

I don't think it's the same error. Bug 1046483 is about a connection being killed, while this one is about mysql and mysqladmin failing with 'Too many connections' errors, which can happen when Threads_connected is too close to (or over) max_connections.

I've linked a POC branch I've just made in which collect checks Threads_connected and max_connections and, if fewer than 10 connections are available, does not send captures to the background. It worked in my simple manual tests (just pt-stalk --no-stalk --iterations 1, both with enough connections and with just 1 available), but again, it's just a POC :)
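The headroom check described in this comment could look roughly like the sketch below. This is not the code from the POC branch: the function names available_connections and can_background_collect are hypothetical, and the two values are passed in as arguments so that the arithmetic can be shown without a running server.

```shell
# Hypothetical helpers, not taken from the POC branch.

# Usage: available_connections MAX_CONNECTIONS THREADS_CONNECTED
# Prints how many connection slots are still free.
available_connections() {
   max_connections="$1"
   threads_connected="$2"
   echo $((max_connections - threads_connected))
}

# Usage: can_background_collect MAX_CONNECTIONS THREADS_CONNECTED [MIN_FREE]
# Succeeds (exit status 0) when at least MIN_FREE slots are free.
# MIN_FREE defaults to 10, the threshold mentioned in the comment above.
can_background_collect() {
   min_free="${3:-10}"
   [ "$(available_connections "$1" "$2")" -ge "$min_free" ]
}
```

In practice the two inputs would probably come from `SHOW GLOBAL VARIABLES LIKE 'max_connections'` and `SHOW GLOBAL STATUS LIKE 'Threads_connected'`. Fetching them needs a connection of its own, so the check can still fail once the server is already full.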

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PT-1020
