On 4/22/2010 5:08 AM, ceg wrote:
> Phillip, before suggesting something I try to think through the issue,
> and the same I try with feedback.
>
> But after several attempts to explain that changing metadata and
> removing the "failed" status (of already running parts) in the
> superblocks of the conflicting parts that are plugged in (but not to be
> added to the running array) breaks hot-plugging, I sadly still can't
> recognize any consideration of the bad effects your approach would have
> for many users.

That's because it DOESN'T break hot-plugging. I have explained why.

> And if I think about it, your metadata updates may not have the overall
> effect you may expect. When the modified part is plugged in
> during future boots, it can get run degraded again, the metadata
> is then back to what it was before, and it can again be used normally.
> So the metadata update just breaks hot-plugging, and you could not
> explain a case where continuous unintentional flip-flopping would occur
> and updating metadata would help.

No, the second disk will not be run degraded again; that is the whole
point of correcting the wrong metadata. If the second disk is the only
one there on the next boot, its superblock will show that disk2 is
failed, so it can't be used, and mdadm can't find disk1, so the array
cannot be started.

> Correct, that is unrelated to the metadata problem, I commented on it
> because setting this up has its pitfalls (like UUID dupes and this bug
> requiring --zero-superblock to prevent it from biting) and it would much
> facilitate comparing, copying etc. in a hot-plug environment.

As I said before, it does not require --zero-superblock. Once disk2 is
failed and removed from the array, you can create a new array using that
disk. mdadm will warn you that the disk appears to already be part of an
array, but you can tell it to continue and it will put disk2 in a new
array, with a new UUID, and you can mount it and inspect it. Once you
are done with it you can move it back to the original array and a full
resync will be done. (A rough command sequence is sketched at the end of
this message.)

> It's even simpler once you can see that fixing metadata creates more
> issues than are actually there and updating metadata would really be
> able to solve.

I have shown why this is wrong.

> If it happens that both segments become available with
> conflicting changes, one needs to be chosen (the first one is already
> there). But if you update the metadata on this occasion (disabling one
> segment), from this moment on the RAID system will not keep the
> system running as designed, and like it did before both segments came up
> together once. (You would change/break behavior.)

Yes, and this change is entirely intentional, because if you don't make
it, you can unintentionally keep diverging the two disks further without
noticing, causing more damage.

Imagine a server that boots, decides it can't find disk2, and so runs
degraded on disk1. It has a cron job that fetches email from a POP
server and deletes the messages once they have been downloaded. The
server reboots and this time can only find disk2, so it runs degraded on
that disk instead. The cron job again fetches and deletes some mail. Now
some of your mail is on disk1, some is on disk2, and you are running
without redundancy. You reboot and both disks are found. You can't use
both, because they have diverged, so you have to choose one. If you
don't update the metadata on the second disk, then whenever the disks
happen to be detected in the reverse order on a later boot, you
flip-flop back and forth, and your mail gets split further and further.
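(As an aside, when you do end up with two divergent halves and have to
pick one, comparing the superblocks shows which member is stale. Device
and array names below are only examples for a two-disk RAID1:)

    # Compare what each superblock recorded; the member with the lower
    # event count / older update time is the stale one.
    mdadm --examine /dev/sda1 | grep -E 'Update Time|Events'
    mdadm --examine /dev/sdb1 | grep -E 'Update Time|Events'
    # Assemble degraded from the newer member only; --run starts the
    # array even though a device is missing.
    mdadm --assemble --run /dev/md0 /dev/sda1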
Now let's say that you have a spare disk. When disk2 can't be found, the spare kicks in and the array is rebuilt onto the spare. Now if you reboot, notice that disk2 is wrong, but don't update its metadata, you can run for a while and continue downloading mail to a properly functioning redundant array. Then you reboot, and this time disk2 happens to be detected first. You end up activating the degraded, more out-of-date array; you are running without redundancy again, and even more of your mail appears to be missing. The flip-flopping MUST be avoided if possible.
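For completeness, this is roughly the sequence I mean for inspecting the
removed disk as its own array, without --zero-superblock. The device
names (/dev/md0, /dev/md1, /dev/sdb1) and mount point are only examples,
and this assumes a superblock format that keeps the data at the start of
the partition (e.g. 0.90 or 1.0), so the old filesystem still lines up:

    # Fail and remove disk2 from the running array (if it isn't already out).
    mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
    # Build a one-disk "mirror" on it; mdadm warns that /dev/sdb1 appears
    # to be part of an existing array, answer 'y' to continue.
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb1 missing
    # Mount it and compare/copy whatever you need.
    mkdir -p /mnt/inspect
    mount /dev/md1 /mnt/inspect
    umount /mnt/inspect
    mdadm --stop /dev/md1
    # Put the disk back into the original array; it gets a full resync
    # from disk1.
    mdadm /dev/md0 --add /dev/sdb1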