Is it possible to shrink an ext4 filesystem and the underlying RAID 5 array in a safe way?
I would like to shrink my 15 TB / 6 drive RAID 5 array that contains an ext4 filesystem.
Before I do that on the live system, I decided to try it in a test environment first. I wrote a script that simulates the raid+filesystem lifecycle (assemble, mkfs, resize2fs, shrink, ...), but in some cases it corrupts the filesystem. I ran the script on two different distros (one of them was CentOS 8).
I tried to understand the failures and, unless I am missing something, mdadm knows nothing about the ext4 filesystem during the RAID shrink process (`mdadm --grow`), and it seems there is no way to make the tool behave properly.
In my scenario, the script simulates the following process:

- selects a random number `num_devices` (between 5 and 10) - this determines the number of devices in the test array
- selects a random number `device_size` (between 300 and 350) - the size (in MiB) of a single device
- creates and assembles /dev/md0 - a RAID 5 array (in my case with 0.90 metadata) - the size of the array is `array_size=($num_devices-1)*$device_size`
- creates ext4 filesystem on /dev/md0 and mounts it to /mnt
- copies a reference file (in my case one of the kernel images from /boot) `$num_devices` times to /mnt (to have some data for validating filesystem integrity) - the filesystem has about 80% free space available
- the filesystem is unmounted, fscked (`e2fsck -f`), then shrunk (either `resize2fs -M` for minimum size, or `resize2fs /dev/md0 {calculated_size}`), and fscked again
- the script waits for the mdadm rebuild process to finish (by watching /proc/mdstat)
- the new array size is calculated: `new_array_size=($num_devices-2)*$device_size`
- a hard disk failure is simulated by `mdadm --manage /dev/md0 --fail /dev/loop3` followed by `mdadm --manage /dev/md0 --remove /dev/loop3`
- the script waits for the reshape process to finish
Once the reshape process is finished, /dev/loop3 is marked as removed and another loop device (e.g. /dev/loop2) is marked as spare.
- the script determines the spare and re-adds it to the array (`mdadm --manage /dev/md0 --remove /dev/loop2` followed by `mdadm --manage /dev/md0 --add /dev/loop2`)
- the script waits for the RAID rebuild to finish (watching /proc/mdstat)
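For reference, the bookkeeping in the steps above can be sketched like this (the concrete numbers and the loop-device setup are my assumptions here, not taken verbatim from the actual script):

```shell
# Sketch of the test setup; in the real script num_devices and device_size
# are chosen at random, here they are fixed for illustration.
num_devices=6
device_size=320                                        # MiB per member device
array_size=$(( (num_devices - 1) * device_size ))      # RAID 5 capacity: (n-1) * size
new_array_size=$(( (num_devices - 2) * device_size ))  # target capacity after shrink
echo "before: ${array_size} MiB, after: ${new_array_size} MiB"

# The device/array setup itself needs root, so it is only illustrated here:
# for i in $(seq 0 $(( num_devices - 1 ))); do
#     truncate -s "${device_size}M" "disk${i}.img"
#     losetup "/dev/loop${i}" "disk${i}.img"
# done
# mdadm --create /dev/md0 --level=5 --metadata=0.90 \
#       --raid-devices="$num_devices" /dev/loop{0..5}
# mkfs.ext4 /dev/md0 && mount /dev/md0 /mnt
```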
At this point the corruption shows up:
- filesystem is mounted again at /mnt
- the md5 checksum comparison between the reference file and the copies on the shrunk filesystem either succeeds, or fails for 1-2 files
- the filesystem is unmounted, fscked (`e2fsck -f`), grown back to maximum (`resize2fs`), and fscked again - the corruption is still present
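For completeness, the kind of integrity check that detects the corruption can be sketched as follows (the paths are assumptions; in the real test the copies live on the shrunken filesystem mounted at /mnt):

```shell
# Build a reference file and some copies, then verify every copy's md5sum
# against the reference, the same way the script validates /mnt.
workdir=$(mktemp -d)
mkdir "$workdir/mnt"
head -c 4096 /dev/urandom > "$workdir/reference"
for i in 1 2 3; do
    cp "$workdir/reference" "$workdir/mnt/copy$i"
done

ref_sum=$(md5sum "$workdir/reference" | cut -d' ' -f1)
status=ok
for f in "$workdir"/mnt/copy*; do
    sum=$(md5sum "$f" | cut -d' ' -f1)
    if [ "$sum" != "$ref_sum" ]; then
        echo "checksum mismatch: $f"
        status=corrupt
    fi
done
echo "result: $status"
rm -rf "$workdir"
```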
Am I doing something wrong, or is the RAID 5 shrink process really unsupported? Or is the 0.90 metadata format the reason?