OOPSes do not gather environmental data (load, thread-cpu-time, ...)

Bug #243554 reported by Diogo Matsubara on 2008-06-27

Affects	Status	Importance	Assigned to
Launchpad itself	Triaged	High	Unassigned
OOPS model	Triaged	High	Unassigned
python-oops-tools	Triaged	High	Unassigned

Bug Description

When timeouts occur, they can be caused by a) inefficient code or b) external influences.

We should gather enough data that we don't spend time debugging the wrong things.

Specifically we should gather:
- system load average
- number of CPU cores (to normalise the load average)
- process memory & physical memory (to guesstimate whether we're hitting swap)
- *process* time since the request started. As each request is in a separate thread, the OS's system accounting can tell us whether 5 seconds of wall clock time was 5 seconds of CPU time, or 1 second of CPU time.

The canonical.mem.resident() and canonical.mem.memory() functions will help in implementing this. os.getloadavg() will give us load averages. We can grep /proc/cpuinfo, as bzr does, for the CPU count, and time.clock() will give us CPU usage ('per process', which may be equivalent to per-thread when we call it from a non-main thread; testing will be needed). If time.clock() does not suffice, a small extension could call clock_gettime(CLOCK_THREAD_CPUTIME_ID, ...).
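A minimal sketch of gathering the host-level items listed above, using only the Python 3 standard library on Linux. The function name environment_snapshot and the dictionary keys are hypothetical; resource.getrusage() and os.sysconf() are used here as stand-ins for canonical.mem.resident() and canonical.mem.memory(), whose signatures are not shown in this report, and os.cpu_count() replaces parsing /proc/cpuinfo.

```python
import os
import resource


def environment_snapshot():
    """Collect host-level data that could be attached to an OOPS report.

    Hypothetical helper; assumes Linux and Python 3.4+.
    """
    load_1m, load_5m, load_15m = os.getloadavg()
    cpu_count = os.cpu_count() or 1
    # Peak resident set size of this process (kilobytes on Linux).
    peak_rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Total physical memory, derived from page count and page size.
    physical_kb = (
        os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") // 1024
    )
    return {
        "load_1m": load_1m,
        "load_5m": load_5m,
        "load_15m": load_15m,
        "cpu_count": cpu_count,
        # Load normalised by core count: values well above 1.0 suggest
        # the machine itself was saturated.
        "load_per_core": load_1m / cpu_count,
        "peak_rss_kb": peak_rss_kb,
        "physical_memory_kb": physical_kb,
    }


if __name__ == "__main__":
    for key, value in sorted(environment_snapshot().items()):
        print(key, value)
```

Comparing peak_rss_kb with physical_memory_kb only gives a rough hint about swap pressure, in the spirit of the "guesstimate" above.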

We are hitting many questions we cannot answer today as a result of not knowing these things.

Alternatively:
RUSAGE_THREAD = 1  # on my Linux system; we'd want a C extension to get the right constant
resource.getrusage(RUSAGE_THREAD).ru_utime
should give us what we need.
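As a rough illustration of the getrusage approach, the sketch below compares wall-clock time with per-thread CPU time for a single request. It assumes Linux; Python 3.2 and later expose the constant as resource.RUSAGE_THREAD, and the fallback value 1 is the one quoted above. The names thread_cpu_seconds, timed_request and sleepy_handler are hypothetical, and system time (ru_stime) is added to ru_utime as a design choice rather than something the report asks for.

```python
import resource
import threading
import time

# Available on Linux with Python 3.2+; fall back to the value quoted in
# the report if the constant is missing (an assumption, Linux only).
RUSAGE_THREAD = getattr(resource, "RUSAGE_THREAD", 1)


def thread_cpu_seconds():
    """CPU time (user + system) consumed so far by the calling thread."""
    usage = resource.getrusage(RUSAGE_THREAD)
    return usage.ru_utime + usage.ru_stime


def timed_request(handler):
    """Run handler() and report wall-clock and thread CPU time for it."""
    wall_start = time.monotonic()
    cpu_start = thread_cpu_seconds()
    handler()
    return {
        "wall": time.monotonic() - wall_start,
        "cpu": thread_cpu_seconds() - cpu_start,
    }


def sleepy_handler():
    # Stands in for a request that waits on something external.
    time.sleep(0.2)


if __name__ == "__main__":
    result = {}
    worker = threading.Thread(
        target=lambda: result.update(timed_request(sleepy_handler))
    )
    worker.start()
    worker.join()
    print(result)
```

A large gap between the two numbers points towards external influences rather than inefficient code in the request itself.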

Joey Stanford (joey) on 2008-08-05

Changed in launchpad:
importance:	Undecided → High

Christian Reis (kiko) wrote on 2008-08-16:

AIUI Francis' team is in the best position to actually store this information, and he has already put work into capturing this data in data structures we can output in the OOPS dump.

Changed in oops-tools:
status:	New → Triaged

Francis J. Lacoste (flacoste) on 2008-08-20

Changed in launchpad:
importance:	Undecided → High
status:	New → Triaged

Robert Collins (lifeless) on 2010-09-09

summary:

- oops report should record information about the running process
+ oops report should record information about the running environment

Robert Collins (lifeless) on 2010-09-09

description:

updated

Robert Collins (lifeless) on 2010-09-09

description:

updated

Gary Poster (gary) wrote on 2010-09-09: Re: oops report should record information about the running environment

Adding to the Foundations kanban backlog.

Gary Poster (gary) on 2010-10-05

tags:

added: oops-infrastructure
removed: infrastructure oops-tools

Robert Collins (lifeless) on 2010-11-16

description:

updated

Robert Collins (lifeless) wrote on 2010-11-16:

This now looks doable without needing a C module at all (short term). Gary, could we look at slotting this in in the near future? I think it would pay for itself pretty quickly.

Robert Collins (lifeless) wrote on 2010-11-17:

See http://bugs.python.org/issue10440 for a request for the constant.

Robert Collins (lifeless) on 2011-03-10

summary:

- oops report should record information about the running environment
+ oopses do not gather environmental data (load, thread-cpu-time, ...)

Robert Collins (lifeless) on 2011-10-03

Changed in python-oops:
status:	New → Triaged
importance:	Undecided → High

Robert Collins (lifeless) on 2011-10-13

affects:

oops-tools → python-oops-tools

Remote bug watches

python-roundup #10440