Comment 16 for bug 925280

Christian Heimes (heimes) wrote:

I *did* unplug the cable while the system was running, and the RAID arrays didn't recover automatically.

But I'll better start from the beginning:
I'm in the process of preparing replacement disks and a new OS installation for a production server. The current disks are several years old and are reaching the end of their lifetime. The new disks are three SSDs with a SATA interface. I only need small disks for the OS, temporary files and caches, because the majority of the data is stored on several fibre channel enclosures.

I've created four equal partitions (not counting the container for logical partitions) on each of the three SSDs (a rough sketch of the corresponding mdadm commands follows the list):

* /boot with RAID 1
* / with RAID 5
* swap with RAID 5
* data partition for caches with RAID 5
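
Roughly, the arrays were created along these lines. The device names and partition numbers below are illustrative, not necessarily the ones on the real system:

  # /dev/sda, /dev/sdb, /dev/sdc are the three SSDs (assumed names)
  mdadm --create /dev/md0 --level=1 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1   # /boot
  mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sda2 /dev/sdb2 /dev/sdc2   # /
  mdadm --create /dev/md2 --level=5 --raid-devices=3 /dev/sda3 /dev/sdb3 /dev/sdc3   # swap
  mdadm --create /dev/md3 --level=5 --raid-devices=3 /dev/sda5 /dev/sdb5 /dev/sdc5   # cache data (logical partition)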

During my tests I pulled the SATA cable from one of the SSDs to test the behaviour of the RAID sets and the SMTP server. The system noticed the missing disk within seconds and sent out notification emails as expected. Then I tested how the system handles a reboot with a missing disk: GRUB loaded successfully, but the system did not come up (which is an entirely different issue I need to investigate later). I plugged the disk back in and the OS booted fine, but it did not re-add the formerly disconnected disk to the RAID sets.
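
For what it's worth, the degraded state was easy to see with the usual checks (generic commands, not a transcript of my exact session):

  cat /proc/mdstat          # arrays listed as degraded, e.g. [U_U] instead of [UUU]
  mdadm --detail /dev/md1   # the pulled member shows up as removed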

I tried mdadm --add and mdadm --re-add without success. mdadm --detail clearly showed that the system was aware that the disconnected partitions used to belong to the RAID sets, because they had the same UUID. On Ubuntu 10.04 LTS I never had to zero out the superblock to rejoin a disconnected disk.
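
For the record, the sequence looked roughly like this (md1 and sdb2 are placeholders for one of the RAID 5 sets and the returning partition):

  mdadm /dev/md1 --re-add /dev/sdb2    # refused to take the member back
  mdadm /dev/md1 --add /dev/sdb2       # no luck either
  mdadm --detail /dev/md1              # array UUID matches the one on the pulled partition
  mdadm --examine /dev/sdb2            # old superblock is still there, same UUID

  # The only way to get the partition back in seems to be treating it as new:
  mdadm --zero-superblock /dev/sdb2
  mdadm /dev/md1 --add /dev/sdb2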

IMHO a user should be able to expect that mdadm rejoins a formerly disconnected RAID member as soon as possible, without any user interaction.