Robustify against transient worker failures

Bug #1474734 reported by Martin Pitt
This bug affects 1 person
Affects: Auto Package Testing
Status: Fix Released
Importance: Medium
Assigned to: Martin Pitt
Milestone: none

Bug Description

Workers can die on transient issues like bug 1474729. They should be auto-restarted when they haven't been running for an hour, using cron jobs and lock files (until we get systemd, which makes this simpler). The restart should also send a notification mail to the admins (me, Adam, Iain).

This requires setting up mail on the worker boxes.
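
A minimal sketch of the cron-plus-lock-file approach described above, not the fix that was actually released. It assumes the worker holds an exclusive `flock` on a lock file for as long as it runs, so a watchdog started from cron can tell a dead worker by the lock being free. The lock path, restart command, admin address, and local SMTP relay are all hypothetical placeholders.

```python
import fcntl
import os
import smtplib
import socket
import subprocess
from email.message import EmailMessage

# Hypothetical values, for illustration only; not taken from the bug report.
LOCK_FILE = "/run/lock/worker.lock"
RESTART_CMD = ["/usr/local/bin/start-worker"]
ADMINS = ["admin@example.com"]


def worker_is_running(lock_path):
    """Return True if some other open file holds an exclusive flock on lock_path."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Non-blocking attempt: fails immediately if the worker holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True
    else:
        # We only probed the lock; release it so the worker can take it.
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)


def build_notification(hostname, recipients):
    """Build the mail that tells the admins a worker was restarted."""
    msg = EmailMessage()
    msg["Subject"] = f"worker on {hostname} was restarted"
    msg["From"] = f"watchdog@{hostname}"
    msg["To"] = ", ".join(recipients)
    msg.set_content(f"The watchdog found no running worker on {hostname} "
                    "and started a new one.")
    return msg


def send_mail(msg):
    """Hand the message to a local mail relay (assumes one listens on localhost)."""
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)


def check_and_restart(lock_path, restart_cmd, notify):
    """Restart the worker if its lock is free; return True if a restart happened."""
    if worker_is_running(lock_path):
        return False
    # Start the worker detached from the short-lived cron job.
    subprocess.Popen(restart_cmd, start_new_session=True)
    notify()
    return True


def main():
    """Entry point intended to be called from an hourly cron job."""
    msg = build_notification(socket.gethostname(), ADMINS)
    check_and_restart(LOCK_FILE, RESTART_CMD, lambda: send_mail(msg))
```

Running `main()` once an hour from cron would bound the downtime of a dead worker to roughly an hour. There is a small race between probing the lock and starting the new worker, which a sketch like this ignores; the worker itself taking the lock at startup would keep two instances from running at once.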

Martin Pitt (pitti)
Changed in auto-package-testing:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Martin Pitt (pitti)
Martin Pitt (pitti)
description: updated
Martin Pitt (pitti) wrote :
Changed in auto-package-testing:
status: Triaged → In Progress
Martin Pitt (pitti) wrote :
Changed in auto-package-testing:
status: In Progress → Fix Released