I recently replaced a 1TB drive (Seagate Barracuda) in my Synology DS-1813+’s SHR-2 array with a 4TB drive (Western Digital Red). During that process, another drive that was on its last legs (another Seagate Barracuda, 4TB this time) died, so I replaced that one with a 6TB Western Digital Red. After everything had finished rebuilding and expanding, I was left with a surprisingly small increase in the capacity of the volume: for having added 5TB of raw disk to the array, I was seeing only about a 1TB change in capacity. That didn’t seem right to me.
So I asked Reddit. The answer there was “Well, SHR hides the complexity of RAID, bla bla bla.” So I asked Synology support. The answer from them was “Well, the calculator on the site is only for new arrays, what you’ll actually see when expanding is bla bla bla.” Neither of those answers was reasonable to me, so I started digging.
As it turned out, when looking at cat /proc/mdstat, I saw something similar to this (recreated from memory):
~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md5 : active raid6 sdf8 sda8 sdd8 sde8 sdh8 sdg8
      3906585344 blocks super 1.2 level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md4 : active raid6 sde7 sdd7(S) sda7(S) sdg7 sdf7 sdh7
      1953485568 blocks super 1.2 level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md3 : active raid6 sdh6 sdd6(S) sda6(S) sdg6 sdf6 sdb6 sde6 sdc6
      5860456704 blocks super 1.2 level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md2 : active raid6 sde5 sda5(S) sdg5 sdf5 sdb5 sdd5 sdc5 sdh5
      5832165888 blocks super 1.2 level 6, 64k chunk, algorithm 2 [7/7] [UUUUUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md1 : active raid1 sda2 sdb2 sdc2 sdd2 sde2 sdf2 sdg2 sdh2
      2097088 blocks [8/8] [UUUUUUUU]

md0 : active raid1 sda1 sdb1 sdc1 sdd1 sde1 sdf1 sdg1 sdh1
      2490176 blocks [8/8] [UUUUUUUU]

unused devices: <none>
At first glance, everything looked fine. I ran lsblk, and everything seemed fine there too. I checked mdadm --detail /dev/md[0,1,2,3,4,5], and all of that seemed reasonable. Except, not quite.
The results from mdadm --detail /dev/md[2,3,4] showed that several of the partitions had been added to the arrays as spares. If you look closely at the cat /proc/mdstat output above, that’s confirmed by the device lists – some of the devices have an (S) after them, also indicating a spare. You’ll also notice from that output that bitmaps were enabled, which I had done during a previous rebuild operation.
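Those (S) markers are easy to miss in a wall of mdstat output. A quick way to tally them per array is a one-line awk filter (a throwaway helper of my own, not anything DSM provides):

```shell
# Tally "(S)" spare markers on each "mdX : ..." line of mdstat-style input.
count_spares() {
  awk '/^md/ { n = gsub(/\(S\)/, ""); print $1 ": " n " spare(s)" }'
}

# Usage: count_spares < /proc/mdstat
```

Any array reporting a non-zero spare count while DSM claims the expansion is finished is a candidate for the fix below.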
I believe what happened was that, because I had left bitmaps on, the Synology (actually, mdadm) wasn’t able to successfully execute mdadm --grow /dev/md[2,3,4] --raid-devices=N (where N is the new number of devices) after it had successfully performed the (for example) mdadm --add /dev/md2 /dev/sda5. Because of that, the devices were only added as spares and never integrated into the arrays, and the subsequent resize2fs command had no additional capacity to resize into.
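In other words, the expansion DSM attempts behind the scenes is roughly this three-step sequence (my reconstruction, with illustrative device names – do not run these against arrays you care about without understanding them):

```shell
# Sketch of the per-array expansion flow; /dev/md2, /dev/sda5 and
# --raid-devices=8 are examples, not values to copy verbatim.
mdadm --add /dev/md2 /dev/sda5           # new partition joins the array as a spare
mdadm --grow /dev/md2 --raid-devices=8   # promote spares to active members (the step that failed for me)
resize2fs /dev/md2                       # grow the filesystem over the new space
```

When the middle step silently fails, the first step has still succeeded, which is exactly the "everything looks fine except the spares" state shown in the mdstat output above.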
What I ended up doing was running mdadm --grow /dev/md[2,3,4] --bitmap=none to drop the bitmaps, and then, for each array, mdadm --grow /dev/mdX --raid-devices=N – X being the md device, and N being the number of devices currently active in the array plus the number marked as spare.
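Working out N by hand is error-prone. Since every device listed on an array’s mdstat line is either active or a spare, you can simply count them – a small sketch (my own helper, not part of DSM or mdadm):

```shell
# Given an "mdX : active raidN dev1 dev2(S) ..." line from /proc/mdstat,
# print the total member count (active + spare) – the value to pass
# to mdadm --grow --raid-devices=N.
new_raid_devices() {
  set -- $1   # split the line into words (unquoted on purpose)
  shift 4     # drop "mdX : active raid6"
  echo $#     # what remains are the member devices
}

new_raid_devices "md2 : active raid6 sde5 sda5(S) sdg5 sdf5 sdb5 sdd5 sdc5 sdh5"
# prints 8
```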
After each of those commands completed, DSM happily reported that I could expand the space. I wanted to get all of the devices into their arrays before expanding anything, so I finished all of the grow operations first. Then, through DSM, I expanded the space. Doing this, I was able to recover nearly 5TB of “lost” capacity on the volume.