21 Aug 2014, 20:14
Synology NAS (DS1813+) degraded array for md0 and md1 after rebuild

I recently purchased a Synology DS1813+ to replace my troubled Drobo-FS. The migration process was long and arduous, consisting of a handful of rebuilds on both the DS and Drobo sides as I shuffled data and moved disks.
During my final rebuild on the DS side (which is an SHR-2 array), I experienced a drive failure in Bay 3 (a Seagate 3TB Barracuda), which resulted in a hard lock of the device requiring a reboot. When the DS came back up, the drive was visible to DiskStation Manager (DSM), but it was no longer part of the array, and no amount of mdadm fiddling would re-add it, so through DSM I requested a rebuild of the array onto that disk.
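For the curious, the fiddling amounted to variations on re-adding the dropped partition to its array. The commands below are an illustrative sketch for this kind of layout; the device and partition names are examples, not a transcript of exactly what I ran:

# Inspect the superblock on the dropped disk's data partition (names illustrative)
mdadm --examine /dev/sdc5

# Ask the array to take the member back; --re-add only works while the
# device's event counts still line up with the rest of the array
mdadm --manage /dev/md2 --re-add /dev/sdc5

# Otherwise, --add treats it as a brand-new member and triggers a full resync
mdadm --manage /dev/md2 --add /dev/sdc5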
Unfortunately, partway through the rebuild, that disk failed again and dropped out of both the array and the OS; it wasn't visible anywhere. Moments later, the disk in Bay 2 (another Seagate Barracuda, 2TB) also dropped from the OS.
At this point I initiated a rebuild onto a 4TB Western Digital Red drive that I had configured as a hot spare.
26 hours later the rebuild finished, though the array was still degraded because it was missing two drives. I rebooted the DS, it picked up the Bay 2 disk again, and everything was happy. Almost.
DSM reported that the DS was in good condition; however, cat /proc/mdstat had something else to say:
~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid6 sdh7[4] sdd7[0] sdg7[3] sdf7[2]
      1953485568 blocks super 1.2 level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
md3 : active raid6 sdb6[7] sdh6[6] sdc6[1] sdg6[5] sdf6[4] sdd6[2]
      3906971136 blocks super 1.2 level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
md2 : active raid6 sdb5[8] sdh5[7] sda5[0] sdg5[6] sdf5[5] sdd5[3] sdc5[2]
      4860138240 blocks super 1.2 level 6, 64k chunk, algorithm 2 [7/7] [UUUUUUU]
md1 : active raid1 sdh2[4] sdg2[6] sdf2[5] sdd2[3] sdc2[2] sdb2[1] sda2[0]
      2097088 blocks [8/7] [UUUUUUU_]
md0 : active raid1 sdb1[2] sdh1[1] sda1[0] sdc1[4] sdd1[3] sdf1[5] sdg1[6]
      2490176 blocks [8/7] [UUUUUUU_]
unused devices: <none>
~ #
Yes, it would seem as though my rebuild missed md0 and md1; the [8/7] [UUUUUUU_] entries show both arrays still expecting eight members with only seven active. I found that very curious, because they were part of the rebuild process when I was nervously querying cat /proc/mdstat.
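mdadm --detail spells the discrepancy out more verbosely than /proc/mdstat does; a quick sanity check looks something like this (assuming watch is available on the box):

# A degraded mirror will report "Raid Devices : 8" against "Active Devices : 7"
# and show a "removed" slot in the device table at the bottom
mdadm --detail /dev/md0
mdadm --detail /dev/md1

# Keep an eye on a running rebuild without re-running cat by hand
watch -n 60 cat /proc/mdstat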
After a day and a half of nervously inspecting partitions, configurations, and mdadm's output, I discovered that md0 and md1 aren't really my devices, in the sense that they don't hold any of my data. When I queried pvdisplay, they weren't listed among LVM's physical volumes, and when I mounted them, they appeared to contain replicas of the OS (which, I suppose, makes sense).
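In concrete terms, that inspection boils down to something like the following; the mount point is an arbitrary example, and the note about md1 reflects the usual DSM layout rather than anything I verified exhaustively:

# The big data arrays (md2, md3, md4) show up as LVM physical volumes;
# md0 and md1 are conspicuously absent
pvdisplay

# Peek inside the small mirror read-only: it holds a copy of the DSM system
# (md1 is normally the mirrored swap area on these units)
mkdir -p /tmp/md0
mount -o ro /dev/md0 /tmp/md0
ls /tmp/md0
umount /tmp/md0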
I was able to address the issue by issuing mdadm --grow -n 7 /dev/md[01], which caused those two arrays to "grow" (in this case, shrink) by one device, so they now expect only the seven disks actually present. That happened immediately, and a subsequent cat /proc/mdstat showed full happiness across the board:
~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid6 sdd7[0] sdg7[3] sdf7[2] sdh7[4]
      1953485568 blocks super 1.2 level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
md3 : active raid6 sdh6[6] sdg6[5] sdf6[4] sdb6[7] sdd6[2] sdc6[1]
      3906971136 blocks super 1.2 level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
md2 : active raid6 sda5[0] sdg5[6] sdf5[5] sdb5[8] sdd5[3] sdc5[2] sdh5[7]
      4860138240 blocks super 1.2 level 6, 64k chunk, algorithm 2 [7/7] [UUUUUUU]
md1 : active raid1 sda2[0] sdb2[1] sdc2[2] sdd2[3] sdf2[5] sdg2[6] sdh2[4]
      2097088 blocks [7/7] [UUUUUUU]
md0 : active raid1 sda1[0] sdb1[2] sdc1[4] sdd1[3] sdf1[5] sdg1[6] sdh1[1]
      2490176 blocks [7/7] [UUUUUUU]
unused devices: <none>
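The same change can also be made one array at a time, which sidesteps any question of how mdadm handles the shell glob; a per-array sketch of the shrink and the follow-up check:

# Tell each RAID1 mirror to expect seven members instead of eight;
# for RAID1 this is a metadata change, so it completes immediately
mdadm --grow --raid-devices=7 /dev/md0
mdadm --grow --raid-devices=7 /dev/md1

# Both should now report [7/7] with no trailing underscore
cat /proc/mdstat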
Now, with one bay empty, I just have to wait for my last 4TB WD Red to arrive so it can be configured as a replacement hot spare, and I'll be back in business!