retracers reach respawn limit

Bug #1620823 reported by Brian Murray
This bug affects 1 person
Affects  Status  Importance  Assigned to  Milestone
Daisy    New     Undecided   Unassigned

Bug Description

I've heard that retracers sometimes die and reach their respawn limit, e.g.:

kernel: [1248124.346493] init: retracer-amd64 main process (15176) terminated with status 1
kernel: [1248124.346549] init: retracer-amd64 main process ended, respawning
kernel: [1248160.938522] init: retracer-amd64 main process (15522) terminated with status 1
kernel: [1248160.938830] init: retracer-amd64 main process ended, respawning
kernel: [1248162.532329] init: retracer-amd64 main process (15534) terminated with status 1
kernel: [1248162.532785] init: retracer-amd64 main process ended, respawning
kernel: [1248164.044925] init: retracer-amd64 main process (15537) terminated with status 1
kernel: [1248164.045121] init: retracer-amd64 main process ended, respawning
kernel: [1248165.639615] init: retracer-amd64 main process (15538) terminated with status 1
kernel: [1248165.639794] init: retracer-amd64 main process ended, respawning
kernel: [1248167.109279] init: retracer-amd64 main process (15539) terminated with status 1
kernel: [1248167.109555] init: retracer-amd64 respawning too fast, stopped

It's been suggested that we increase the respawn limit count.
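To judge whether a higher respawn limit would have made any difference, it helps to see how quickly the job was cycling. The following is a minimal sketch, not part of the retracer code: it parses the init messages from the excerpt above and reports the gaps between terminations. The function names and the regular expression are illustrative assumptions based only on the log format shown here.

```python
import re

# Assumed message format, taken from the log excerpt in this report.
TERMINATED_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] init: (?P<job>\S+) main process "
    r"\((?P<pid>\d+)\) terminated with status (?P<status>\d+)"
)

SAMPLE_LOG = """\
kernel: [1248124.346493] init: retracer-amd64 main process (15176) terminated with status 1
kernel: [1248124.346549] init: retracer-amd64 main process ended, respawning
kernel: [1248160.938522] init: retracer-amd64 main process (15522) terminated with status 1
kernel: [1248160.938830] init: retracer-amd64 main process ended, respawning
kernel: [1248162.532329] init: retracer-amd64 main process (15534) terminated with status 1
kernel: [1248162.532785] init: retracer-amd64 main process ended, respawning
kernel: [1248164.044925] init: retracer-amd64 main process (15537) terminated with status 1
kernel: [1248164.045121] init: retracer-amd64 main process ended, respawning
kernel: [1248165.639615] init: retracer-amd64 main process (15538) terminated with status 1
kernel: [1248165.639794] init: retracer-amd64 main process ended, respawning
kernel: [1248167.109279] init: retracer-amd64 main process (15539) terminated with status 1
kernel: [1248167.109555] init: retracer-amd64 respawning too fast, stopped
"""


def termination_times(log_text, job):
    """Return the kernel timestamps (in seconds) at which `job` terminated."""
    times = []
    for line in log_text.splitlines():
        match = TERMINATED_RE.search(line)
        if match and match.group("job") == job:
            times.append(float(match.group("ts")))
    return times


def gaps_between(times):
    """Return the seconds elapsed between consecutive terminations."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def stopped_for_respawning(log_text, job):
    """True if init reported that `job` was stopped for respawning too fast."""
    return f"init: {job} respawning too fast, stopped" in log_text


if __name__ == "__main__":
    times = termination_times(SAMPLE_LOG, "retracer-amd64")
    print("terminations:", len(times))
    print("gaps (s):", [round(gap, 3) for gap in gaps_between(times)])
    print("stopped:", stopped_for_respawning(SAMPLE_LOG, "retracer-amd64"))
```

In this excerpt the last five terminations are each under two seconds apart, so the process was failing almost immediately after every start.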

Brian Murray (brian-murray) wrote :

According to the log files for the retracer in the snippet above, it was failing with "OSError: [Errno 12] Cannot allocate memory". I don't think increasing the respawn limit count will help in this particular case.

Brian Murray (brian-murray) wrote :

The same "Cannot allocate memory" error was also the cause for another retracer-app that was down today.

Brian Murray (brian-murray) wrote :

Digging into it more, it looks like the retracer processes are leaking memory.

  PID USER  PR NI   VIRT   RES  SHR S %CPU %MEM   TIME+ COMMAND
10838 root  20  0 154740 33280 6100 S  0.0  0.8 0:44.51 retracer.py
 9771 root  20  0 154812 32960 6108 S  0.0  0.8 0:18.53 retracer.py
 8355 root  20  0  92584 21316 4532 S  0.0  0.5 0:00.20 retracer.py
10413 root  20  0  92580 21316 4532 S  0.0  0.5 0:00.20 retracer.py

10838 and 9771 are the i386 and amd64 retracers, which have been retracing crashes, while the other two processes have not.
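The comparison can be made explicit with a small script. This is only a sketch under stated assumptions: it expects `top` rows in the 12-column layout shown above with VIRT and RES as plain KiB integers (no `m` or `g` suffixes), and it treats the smallest resident size among the retracers as the idle baseline. The function names are hypothetical.

```python
TOP_SAMPLE = """\
  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10838 root 20 0 154740 33280 6100 S 0.0 0.8 0:44.51 retracer.py
 9771 root 20 0 154812 32960 6108 S 0.0 0.8 0:18.53 retracer.py
 8355 root 20 0 92584 21316 4532 S 0.0 0.5 0:00.20 retracer.py
10413 root 20 0 92580 21316 4532 S 0.0 0.5 0:00.20 retracer.py
"""


def parse_top_rows(text):
    """Parse `top` process rows into dicts; the header line is skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 12 columns and start with a numeric PID.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        rows.append({
            "pid": int(fields[0]),
            "virt_kib": int(fields[4]),  # assumes plain KiB values
            "res_kib": int(fields[5]),   # assumes plain KiB values
            "cpu_time": fields[10],
            "command": fields[11],
        })
    return rows


def growth_over_baseline(rows):
    """Map each PID to its resident size above the smallest one seen, in KiB."""
    if not rows:
        return {}
    baseline = min(row["res_kib"] for row in rows)
    return {row["pid"]: row["res_kib"] - baseline for row in rows}


if __name__ == "__main__":
    for pid, extra in growth_over_baseline(parse_top_rows(TOP_SAMPLE)).items():
        print(f"{pid}: {extra} KiB above baseline")
```

A single snapshot like this only shows that the busy retracers are larger than the idle ones. Confirming a leak would need repeated samples over time showing that the resident size keeps growing.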

Brian Murray (brian-murray) wrote :

So in the examples, while we may have been trying to restart the amd64 retracer, it's possible that one of the other architecture retracers had consumed a large amount of memory, thereby preventing the amd64 one from restarting.
