Regiond memory leak when Prometheus metrics are enabled
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Fix Released | Medium | Victor Tapia |
2.6 | Won't Fix | Undecided | Unassigned |
2.7 | Won't Fix | Undecided | Unassigned |
2.8 | Fix Committed | Undecided | Unassigned |
2.9 | Fix Released | Medium | Victor Tapia |
3.0 | New | Undecided | Unassigned |
Bug Description
After enabling the Prometheus metrics endpoint with:
maas admin maas set-config name=prometheus
every request to the metrics endpoint allocates memory that is never released. The simplest reproducer is to run curl in an endless loop in the background. For instance, run this script with nohup:
#!/bin/bash
while true; do
curl http://
done
With this script running for a while, the RSS of the regiond processes grows steadily, and the memory is released only when regiond is restarted (this is easier to see and track when there's only one worker).
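One way to track the RSS growth while the reproducer runs is to sample `VmRSS` from `/proc/<pid>/status`. The following is a minimal sketch, assuming a Linux host. The helper names `parse_vm_rss_kib` and `read_rss_kib` are hypothetical, not part of MAAS, and the PID of the regiond worker has to be looked up separately.

```python
import re
from pathlib import Path
from typing import Optional


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value in KiB from the text of /proc/<pid>/status.

    Returns None when the field is missing (kernel threads have no VmRSS).
    """
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid: int) -> Optional[int]:
    """Read the current resident set size of a process (Linux only)."""
    return parse_vm_rss_kib(Path(f"/proc/{pid}/status").read_text())
```

Calling `read_rss_kib(pid)` at a fixed interval for the regiond worker and logging the result should show whether the value keeps climbing between requests.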
Using objgraph I could see that a dict type and MmapedValue (from prometheus-
2021-04-29 15:45:27 regiond: [info] 127.0.0.1 GET /MAAS/metrics HTTP/1.1 --> 200 OK (referrer: -; agent: curl/7.68.0)
2021-04-29 15:45:33 stdout: [info] dict 224666
2021-04-29 15:45:33 stdout: [info] MmapedValue 193362
2021-04-29 15:45:33 stdout: [info] function 39308
2021-04-29 15:45:33 stdout: [info] tuple 32844
2021-04-29 15:45:53 regiond: [info] 127.0.0.1 GET /MAAS/metrics HTTP/1.1 --> 200 OK (referrer: -; agent: curl/7.68.0)
2021-04-29 15:45:59 stdout: [info] dict 224707
2021-04-29 15:45:59 stdout: [info] MmapedValue 193403
2021-04-29 15:45:59 stdout: [info] function 39308
2021-04-29 15:45:59 stdout: [info] tuple 32844
2021-04-29 15:46:45 regiond: [info] 127.0.0.1 GET /MAAS/metrics HTTP/1.1 --> 200 OK (referrer: -; agent: curl/7.68.0)
2021-04-29 15:46:50 stdout: [info] dict 224748
2021-04-29 15:46:50 stdout: [info] MmapedValue 193444
2021-04-29 15:46:50 stdout: [info] function 39308
2021-04-29 15:46:50 stdout: [info] tuple 32844
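Counts like the ones in the log above can be approximated with the standard library alone. This is a sketch and not the change the reporter actually made to regiond: it counts the objects tracked by the garbage collector by type name, which is roughly what objgraph's `show_most_common_types()` reports. The function names and the default `limit` are assumptions chosen to mirror the four lines per sample in the log.

```python
import gc
from collections import Counter


def most_common_types(limit=4):
    """Return the `limit` most frequent type names among GC-tracked objects."""
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(limit)


def log_most_common_types(limit=4):
    """Print one 'type count' line per entry, similar to the log above."""
    for name, count in most_common_types(limit):
        print(f"{name} {count}")
```

Calling `log_most_common_types()` after each request to the metrics endpoint and comparing successive outputs would show which types keep growing. In the log above, those are `dict` and `MmapedValue`.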
Related branches
- Adam Collard (community): Approve
- MAAS Lander: Approve
- Diff: 36 lines (+12/-7), 1 file modified: src/maasserver/prometheus/stats.py (+12/-7)

- Adam Collard (community): Approve
- MAAS Lander: Approve
- Diff: 35 lines (+10/-6), 1 file modified: src/maasserver/prometheus/stats.py (+10/-6)

- MAAS Lander: Approve
- MAAS Maintainers: Pending requested
- Diff: 36 lines (+12/-7), 1 file modified: src/maasserver/prometheus/stats.py (+12/-7)

- MAAS Lander: Approve
- Adam Collard (community): Approve
- Diff: 36 lines (+12/-7), 1 file modified: src/maasserver/prometheus/stats.py (+12/-7)

- MAAS Lander: Approve
- MAAS Maintainers: Pending requested
- Diff: 36 lines (+12/-7), 1 file modified: src/maasserver/prometheus/stats.py (+12/-7)

- MAAS Lander: Needs Fixing
- Alberto Donato: Approve
- Diff: 36 lines (+12/-7), 1 file modified: src/maasserver/prometheus/stats.py (+12/-7)
Changed in maas:
status: New → Triaged
importance: Undecided → Medium
Changed in maas:
assignee: nobody → Victor Tapia (vtapia)
status: Triaged → In Progress
Changed in maas:
milestone: none → next
status: In Progress → Fix Committed
Changed in maas:
milestone: next → 3.0.1
Changed in maas:
milestone: 3.0.1 → 3.2.0-beta1
Changed in maas:
status: Fix Committed → Fix Released
Hi Victor, could you please provide the changes you made to have objgraph print out that log?