On 04/15/2010 03:55 AM, ceg wrote:
> Upon disappearance, a real failure, mdadm --fail or running an array
> degraded: mdadm -E shows *missing* disks marked as "removed". (What
> you probably referred to all the time.) Even though nobody actually
> issued "mdadm --removed" on them. (What I referred to.)
Exactly: when assembling and running an array in degraded mode with
missing disks, mdadm marks the missing disks as removed. It should
probably mark them as faulty, or something else less severe than removed.
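For anyone who wants to see this, a minimal sequence (the device names
/dev/md0, /dev/sda1 and /dev/sdb1 are placeholders, and this should only
be run against scratch devices):

    # Start a two-disk RAID1 with one member absent, then inspect the metadata.
    mdadm --assemble --run /dev/md0 /dev/sda1   # degraded start; sdb1 is missing
    mdadm --examine /dev/sda1                   # the absent member is listed as "removed"
    mdadm --detail /dev/md0                     # the running array reports the same
    # Nobody has issued "mdadm --remove", yet the slot is already marked removed.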
> All would be clearer if * mdadm -E would report "missing" instead of
> removed (which sounds like it really got "mdadm --removed")
There already exists a faulty state. It might be appropriate to use that.
> That is a good point! If conflicting changes can be detected by
> this, why does mdadm not use this conflicting information (when parts
> of an array are claiming each other to be failed) to just report
> "conflicting changes" and refuse to --add without --force? (You see I
> am back asking to report and require --force to make it clear to
> users/admin that it is not only some bug/hiccup in the hot-plug
> mechanism that made it fail, but --add is a manual operation that
> implies real data-loss in this case, not as in others when it will
> only sync an older copy instead of a diverged one.)
That seems to be the heart of the bug. If BOTH disks show the second
disk as removed, then mdadm will not use the second disk; but when the
metadata on the second disk says disk 2 is fine and it is disk 1 that
has been removed, mdadm happily adds the disk. It should not trust the
wrong metadata on the second disk, and should refuse to use it unless it
can safely coerce it into agreement with the active metadata in the
array, taken from the first disk.
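Just to make the detection idea concrete, a rough way to see the
conflict from userspace before any --add (device names are placeholders
again):

    mdadm --examine /dev/sda1 | grep -iE 'state|events'
    mdadm --examine /dev/sdb1 | grep -iE 'state|events'
    # Each superblock claims its own disk is active and reports the other as
    # failed/removed, and the event counters disagree; that is exactly the
    # "conflicting changes" signal --add could check before touching anything.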
If the second disk says both disks are fine, then the array state of
disk 2 can be changed to active/needs-sync, the metadata on both disks
can be updated to match, and the resync started.
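Mapped onto the existing tools, that branch is roughly the ordinary,
non-destructive path (placeholders again):

    mdadm /dev/md0 --add /dev/sdb1   # disk 2 rejoins and a normal resync follows
    cat /proc/mdstat                 # watch the recovery progress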
If the second disk says that the first disk has been
removed/failed/missing, then you cannot reconcile them: failing the
first disk would fail the array, and activating the second disk could
destroy data. In this case the second disk should be marked as removed
and its metadata updated. That makes sure that if you reboot and the
second disk is detected first, it will not be activated. In other
words, as soon as you have a boot that sees both disks after they have
been independently degraded and modified, ONE of them is chosen as the
victor and used from then on, and the other stays removed until the
admin has a chance to investigate and decides to manually add it back,
destroying any changes made to that disk during the boot when only it
was available.
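A sketch of how the divergence arises in the first place, and where the
proposed --force requirement would bite (device names are illustrative,
and --force on --add is the proposal above, not current behaviour):

    mdadm --assemble --run /dev/md0 /dev/sda1   # boot 1: only sda1 present, writes go to sda1
    mdadm --stop /dev/md0
    mdadm --assemble --run /dev/md0 /dev/sdb1   # boot 2: only sdb1 present, writes go to sdb1
    mdadm --stop /dev/md0
    mdadm --assemble --run /dev/md0 /dev/sda1 /dev/sdb1   # boot 3: both visible, one wins
    mdadm /dev/md0 --add /dev/sdb1   # re-adding the loser discards its divergent data;
                                     # the proposal is to refuse this unless --force is given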