Replicator/reconstructor can't rehash partitions on full drives
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Confirmed | Medium | Unassigned |
Bug Description
If a disk is completely full but is missing a tombstone, the reconstructor will try to push one over using some combination of REPLICATE and DELETE (both of which fail):
Sep 3 00:46:43 localhost object-server: ERROR __call__ error with DELETE /d4/484/
Sep 3 00:46:43 localhost object-server: ERROR __call__ error with REPLICATE /d4/484/
This may not be a bug per se, but if it's doing two operations here it should be smarter: when the first operation fails with a "no space left on device" error, it should skip the second rather than attempt it and fail again.
Note that nothing exists for AUTH_user1/
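A minimal sketch of the ordering suggested above, not swift's actual reconstructor code: `send_replicate` and `send_delete` are hypothetical callables standing in for whatever issues the two requests, each assumed to return an HTTP status code, with 507 (Insufficient Storage) standing in for the "disk full" response.

```python
HTTP_INSUFFICIENT_STORAGE = 507  # assumed status for a full/unwritable remote disk


def push_tombstone(send_replicate, send_delete):
    """Issue the REPLICATE first; skip the DELETE entirely if the remote
    already reported that it is out of space, instead of sending both
    requests and logging two failures."""
    status = send_replicate()
    if status == HTTP_INSUFFICIENT_STORAGE:
        # The remote drive is full; the DELETE would fail the same way,
        # so don't send it.
        return False
    send_delete()
    return True
```

If the remote side only signals the problem as a generic error, treating any REPLICATE failure as a reason to skip the DELETE would have the same effect.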
Changed in swift:
status: New → Incomplete
summary: Reconstructor has some troubles with tombstones on full drives → Replicator/reconstructor can't rehash partitions on full drives
A side effect of this bug is that full EC drives never become unfull.
I have a node with 8 drives, and I took 5 of them out of the ring in order to force more data onto the others. This filled up the other 3. Then I put the 5 back into the ring. 12 hours later the data distribution looks like:
ubuntu@ip-172-30-3-43:~$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/xvda1 8115168 1524844 6155048 20% /
none 4 0 4 0% /sys/fs/cgroup
udev 2018460 12 2018448 1% /dev
tmpfs 404684 500 404184 1% /run
none 5120 0 5120 0% /run/lock
none 2023420 60 2023360 1% /run/shm
none 102400 0 102400 0% /run/user
/dev/xvdb 8378368 6138984 2239384 74% /srv/node/d0
/dev/xvdf 8378368 8378316 52 100% /srv/node/d4
/dev/xvdd 8378368 5780160 2598208 69% /srv/node/d2
/dev/xvdc 8378368 5820984 2557384 70% /srv/node/d1
/dev/xvdg 8378368 8378348 20 100% /srv/node/d5
/dev/xvde 8378368 5988340 2390028 72% /srv/node/d3
/dev/xvdh 8378368 6922720 1455648 83% /srv/node/d6
/dev/xvdi 8378368 8378328 40 100% /srv/node/d7
The 3 "full" drives are still full, due to the errors noted above in this ticket. The ring has rebalanced so the # of partitions assigned to each drive are approximately equal.