2022-11-09 04:28:52 |
dongdong tao |
bug |
|
|
added bug |
2022-11-11 08:03:18 |
dongdong tao |
description |
BlueStore's onode cache might be completely disabled because of an entry leak in the bluestore_cache_other mempool
upstream bug :
[1]https://tracker.ceph.com/issues/56424 |
[Impact]
This issue has been observed on Ceph Octopus 15.2.16.
BlueStore's onode cache might be completely disabled because of an entry leak in the bluestore_cache_other mempool.
The log line below shows that the cache's maximum size had dropped to 0:
------
2022-10-25T00:47:26.562+0000 7f424f78e700 30 bluestore.MempoolThread(0x564a9dae2a68) _resize_shards max_shard_onodes: 0 max_shard_buffer: 8388608
-------
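The check on that log line can be automated. The following is a minimal sketch, assuming the line format is exactly the one quoted above; the function names are hypothetical and are not part of Ceph.

```python
import re

# Pattern for the "_resize_shards" debug line quoted above.
# Assumption: the field names and their order match that sample line.
RESIZE_RE = re.compile(
    r"_resize_shards max_shard_onodes: (\d+) max_shard_buffer: (\d+)"
)


def parse_resize_shards(line):
    """Return (max_shard_onodes, max_shard_buffer), or None if no match."""
    match = RESIZE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def onode_cache_disabled(line):
    """True when the line reports a zero-sized onode cache."""
    parsed = parse_resize_shards(line)
    return parsed is not None and parsed[0] == 0


sample = (
    "2022-10-25T00:47:26.562+0000 7f424f78e700 30 "
    "bluestore.MempoolThread(0x564a9dae2a68) _resize_shards "
    "max_shard_onodes: 0 max_shard_buffer: 8388608"
)
print(parse_resize_shards(sample))
print(onode_cache_disabled(sample))
```

Feeding each line of an OSD log through onode_cache_disabled would flag the moments at which the onode cache was sized to zero.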
The dump_mempools output shows that bluestore_cache_other had consumed the vast majority of the cache because of the leak, while only 3 onodes (2 of them pinned) remained in the cache:
---------------
"bluestore_cache_onode": {
"items": 3,
"bytes": 1848
},
"bluestore_cache_meta": {
"items": 13973,
"bytes": 111338
},
"bluestore_cache_other": {
"items": 5601156,
"bytes": 224152996
},
"bluestore_Buffer": {
"items": 1,
"bytes": 96
},
"bluestore_Extent": {
"items": 20,
"bytes": 960
},
"bluestore_Blob": {
"items": 8,
"bytes": 832
},
"bluestore_SharedBlob": {
"items": 8,
"bytes": 896
},
--------------
This can cause I/O to experience high latency, because a zero-sized cache significantly increases the need to fetch metadata from RocksDB or even from disk.
Another impact is that this significantly increases the likelihood of hitting the race condition in Onode::put [2], which crashes the OSDs, especially in large clusters.
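To quantify the imbalance in the dump_mempools excerpt above, a rough sketch follows. It assumes the per-pool entries have already been extracted into a dict shaped like that excerpt; which pools to sum is an illustrative choice here, not BlueStore's exact accounting, and the function name is hypothetical.

```python
import json

# Per-pool entries copied from the dump_mempools excerpt above.
EXCERPT = """
{
  "bluestore_cache_onode": {"items": 3, "bytes": 1848},
  "bluestore_cache_meta": {"items": 13973, "bytes": 111338},
  "bluestore_cache_other": {"items": 5601156, "bytes": 224152996}
}
"""

# Assumption: these three pools are compared for illustration only;
# BlueStore's real cache accounting may include other pools.
CACHE_POOLS = (
    "bluestore_cache_onode",
    "bluestore_cache_meta",
    "bluestore_cache_other",
)


def cache_other_share(pools):
    """Fraction of the summed cache-pool bytes held by bluestore_cache_other."""
    total = sum(pools[name]["bytes"] for name in CACHE_POOLS if name in pools)
    if total == 0:
        return 0.0
    other = pools.get("bluestore_cache_other", {"bytes": 0})["bytes"]
    return other / total


pools = json.loads(EXCERPT)
share = cache_other_share(pools)
print(f"bluestore_cache_other share: {share:.2%}")
```

A share close to 100% combined with only a handful of onode items matches the leak signature described in this section.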
[Test Case]
1. Deploy a Ceph 15.2.16 cluster
2. Create enough RBD images to spread across all the OSDs
3. Stress them with an fio 4k randwrite workload in parallel until each OSD has enough onodes in its cache (more than 60k onodes, at which point bluestore_cache_other is over 1 GB):
fio --name=randwrite --rw=randwrite --ioengine=rbd --bs=4k --direct=1 --numjobs=1 --size=100G --iodepth=16 --clientname=admin --pool=bench --rbdname=test
4. Shrink pg_num to a very low number so that there is around 1 PG per OSD, and wait for the shrink to finish.
5. Enable debug_bluestore=20/20. A zero-sized onode cache can then be observed by grepping the OSD log for max_shard_onodes. The leaked bluestore_cache_other mempool can also be observed via "ceph daemon osd.id dump_mempools".
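For step 4, the target pg_num can be estimated from the OSD count and the pool's replica size, since PGs per OSD for a single replicated pool is roughly pg_num * size / num_osds. The sketch below is only an illustration under that assumption; the helper name is hypothetical, and restricting the result to a power of two follows common Ceph guidance rather than anything stated in this report.

```python
def pg_num_for_target(num_osds, replica_size, target_pgs_per_osd=1.0):
    """Largest power-of-two pg_num whose PGs-per-OSD stays at or below target.

    Assumes a single replicated pool, where
    PGs per OSD ~= pg_num * replica_size / num_osds.
    Always returns at least 1.
    """
    if num_osds <= 0 or replica_size <= 0:
        raise ValueError("num_osds and replica_size must be positive")
    pg_num = 1
    # Keep doubling while the next power of two still fits the target.
    while (pg_num * 2) * replica_size / num_osds <= target_pgs_per_osd:
        pg_num *= 2
    return pg_num


# Hypothetical cluster: 12 OSDs, 3 replicas.
print(pg_num_for_target(12, 3))
```

With 12 OSDs and 3 replicas, a pg_num of 4 gives 4 * 3 / 12 = 1 PG per OSD, which is the condition the test case aims for.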
[Potential Regression]
The patch corrects the evidently wrong AU calculation for the bluestore_cache_other pool; it is not expected to introduce any regression.
[Other Info]
The patch [1] has been backported to upstream Pacific and Quincy, but not Octopus.
Pacific will get it in 16.2.11, which is still pending.
Quincy already has it in 17.2.4.
We need to backport this fix to Octopus.
[1]https://github.com/ceph/ceph/pull/46911
[2]https://tracker.ceph.com/issues/56382 |
|
2022-11-11 08:03:57 |
dongdong tao |
summary |
the leak in bluestore_cache_other mempool |
[SRU] the leak in bluestore_cache_other mempool |
|
2022-11-11 08:04:31 |
dongdong tao |
tags |
|
sts-sru-needed |
|
2022-11-11 08:04:36 |
dongdong tao |
tags |
sts-sru-needed |
seg sts-sru-needed |
|
2022-11-15 16:08:24 |
Chris MacNaughton |
nominated for series |
|
Ubuntu Focal |
|
2022-11-15 16:08:24 |
Chris MacNaughton |
bug task added |
|
ceph (Ubuntu Focal) |
|
2022-11-15 16:08:57 |
Chris MacNaughton |
bug task added |
|
cloud-archive |
|
2022-11-15 16:09:15 |
Chris MacNaughton |
nominated for series |
|
cloud-archive/victoria |
|
2022-11-15 16:09:15 |
Chris MacNaughton |
bug task added |
|
cloud-archive/victoria |
|
2022-11-15 16:09:15 |
Chris MacNaughton |
nominated for series |
|
cloud-archive/ussuri |
|
2022-11-15 16:09:15 |
Chris MacNaughton |
bug task added |
|
cloud-archive/ussuri |
|
2022-11-16 07:40:50 |
dongdong tao |
attachment added |
|
focal-15.2.17-debdiff https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1996010/+attachment/5630952/+files/focal-15.2.17-debdiff |
|
2022-11-16 08:01:55 |
dongdong tao |
attachment added |
|
15.2.17-os-bluestore-fix-AU-accounting-in-bluestore_cache_ot.patch https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1996010/+attachment/5630955/+files/15.2.17-os-bluestore-fix-AU-accounting-in-bluestore_cache_ot.patch |
|
2022-11-16 08:24:49 |
Ubuntu Foundations Team Bug Bot |
tags |
seg sts-sru-needed |
patch seg sts-sru-needed |
|
2022-11-16 08:24:55 |
Ubuntu Foundations Team Bug Bot |
bug |
|
|
added subscriber Ubuntu Sponsors Team |
2022-12-16 03:02:46 |
dongdong tao |
attachment added |
|
pacific-debdiff https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1996010/+attachment/5635806/+files/pacific-debdiff |
|
2022-12-16 03:03:12 |
dongdong tao |
affects |
cloud-archive |
xena |
|
2022-12-16 03:11:51 |
dongdong tao |
affects |
xena |
cloud-archive |
|
2022-12-16 03:13:24 |
dongdong tao |
nominated for series |
|
cloud-archive/xena |
|
2022-12-16 03:13:24 |
dongdong tao |
bug task added |
|
cloud-archive/xena |
|
2022-12-16 03:13:24 |
dongdong tao |
nominated for series |
|
cloud-archive/wallaby |
|
2022-12-16 03:13:24 |
dongdong tao |
bug task added |
|
cloud-archive/wallaby |
|
2022-12-16 03:13:42 |
dongdong tao |
bug task deleted |
cloud-archive/victoria |
|
|
2023-01-06 10:06:02 |
Launchpad Janitor |
ceph (Ubuntu): status |
New |
Confirmed |
|
2023-01-06 10:06:02 |
Launchpad Janitor |
ceph (Ubuntu Focal): status |
New |
Confirmed |
|
2023-01-19 20:05:18 |
Chris MacNaughton |
nominated for series |
|
Ubuntu Kinetic |
|
2023-01-19 20:05:18 |
Chris MacNaughton |
bug task added |
|
ceph (Ubuntu Kinetic) |
|
2023-01-19 20:05:18 |
Chris MacNaughton |
nominated for series |
|
Ubuntu Jammy |
|
2023-01-19 20:05:18 |
Chris MacNaughton |
bug task added |
|
ceph (Ubuntu Jammy) |
|
2023-01-19 20:05:18 |
Chris MacNaughton |
nominated for series |
|
Ubuntu Lunar |
|
2023-01-19 20:05:18 |
Chris MacNaughton |
bug task added |
|
ceph (Ubuntu Lunar) |
|
2023-06-09 12:36:26 |
Robie Basak |
bug |
|
|
added subscriber Corey Bryant |
2023-06-13 13:55:19 |
Ponnuvel Palaniyappan |
ceph (Ubuntu Lunar): status |
Confirmed |
Fix Released |
|
2023-06-13 13:55:25 |
Ponnuvel Palaniyappan |
ceph (Ubuntu Kinetic): status |
New |
Fix Released |
|
2023-06-13 13:55:28 |
Ponnuvel Palaniyappan |
ceph (Ubuntu Jammy): status |
New |
Fix Released |
|
2023-06-13 13:55:42 |
Ponnuvel Palaniyappan |
cloud-archive/xena: status |
New |
Fix Released |
|
2023-06-13 13:55:49 |
Ponnuvel Palaniyappan |
cloud-archive/wallaby: status |
New |
Fix Released |
|
2023-06-13 13:57:53 |
Ponnuvel Palaniyappan |
nominated for series |
|
cloud-archive/yoga |
|
2023-06-13 13:57:53 |
Ponnuvel Palaniyappan |
bug task added |
|
cloud-archive/yoga |
|
2023-06-13 13:58:29 |
Ponnuvel Palaniyappan |
cloud-archive/yoga: status |
New |
Fix Released |
|
2023-06-20 11:06:58 |
dongdong tao |
attachment added |
|
ceph_15.2.17-0ubuntu0.20.04.5.debdiff https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1996010/+attachment/5680964/+files/ceph_15.2.17-0ubuntu0.20.04.5.debdiff |
|
2023-06-20 11:15:36 |
dongdong tao |
tags |
patch seg sts-sru-needed |
seg sts-sru-needed |
|
2023-06-23 00:16:25 |
Michael Hudson-Doyle |
removed subscriber Ubuntu Sponsors |
|
|
|
2023-09-22 08:28:16 |
James Page |
bug |
|
|
added subscriber Ubuntu Stable Release Updates Team |
2023-09-22 09:44:19 |
Ubuntu Archive Robot |
bug |
|
|
added subscriber James Page |
2023-10-19 18:53:38 |
Andreas Hasenack |
ceph (Ubuntu): status |
Confirmed |
Fix Released |
|
2023-10-19 19:10:09 |
Andreas Hasenack |
ceph (Ubuntu Focal): status |
Confirmed |
Fix Committed |
|
2023-10-19 19:10:12 |
Andreas Hasenack |
bug |
|
|
added subscriber SRU Verification |
2023-10-19 19:10:17 |
Andreas Hasenack |
tags |
seg sts-sru-needed |
seg sts-sru-needed verification-needed verification-needed-focal |
|