Is it possible to shrink an ext4 filesystem and the underlying RAID 5 array in a safe way?

I would like to shrink my 15 TB, 6-drive RAID 5 array that contains an ext4 filesystem.

Before I do that on the live system, I decided to give it a try in a test environment. I wrote a script that simulates the raid+filesystem lifecycle (assemble, mkfs, resize2fs, shrink, ...), but in some cases it corrupts the filesystem. The script was run on two different distros (one of them was CentOS 8).

I tried to understand the failures, and unless I am missing something, mdadm knows nothing about the ext4 filesystem during the RAID shrink process (mdadm --grow), and there seems to be no way to make the tool take the filesystem into account.

In my scenario, the script simulates the following process (a condensed sketch of the commands follows the list):

  1. selects a random number num_devices (between 5 and 10) - the number of devices in the test array
  2. selects a random number device_size (between 300 and 350) - the size (in MiB) of a single device
  3. creates and assembles /dev/md0 - a RAID 5 array (in my case with 0.90 metadata) - the array size is array_size=($num_devices-1)*$device_size
  4. creates an ext4 filesystem on /dev/md0 and mounts it at /mnt
  5. copies a reference file (in my case one of the kernel images from /boot) $num_devices times to /mnt (to have some data to validate filesystem integrity) - the filesystem still has about 80% free space
  6. unmounts the filesystem, checks it (e2fsck -f), shrinks it (either resize2fs -M for the minimum size or resize2fs /dev/md0 {calculated_size}), and checks it again
  7. waits for the mdadm rebuild process to finish (by watching /proc/mdstat)
  8. calculates the new array size: new_array_size=($num_devices-2)*$device_size
  9. simulates a hard disk failure with mdadm --manage /dev/md0 --fail /dev/loop3 followed by mdadm --manage /dev/md0 --remove /dev/loop3
  10. waits for the reshape process to finish

Once the reshape process has finished, /dev/loop3 is marked as removed and another loop device (e.g. /dev/loop2) is marked as a spare.

  11. determines the spare and re-adds it to the array (mdadm --manage /dev/md0 --remove /dev/loop2 followed by mdadm --manage /dev/md0 --add /dev/loop2)
  12. waits for the RAID rebuild to finish (watching /proc/mdstat)
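
For reference, this is a condensed sketch of the procedure; the backing-file paths, the spare detection and the exact mdadm --grow invocation below are simplified/illustrative and differ in detail from the full script linked in the comments:

    #!/bin/bash
    # condensed test procedure: build a loop-backed RAID 5, shrink the
    # filesystem, shrink the array by one device, re-add the spare
    set -e

    num_devices=$(( RANDOM % 6 + 5 ))      # step 1: 5..10 devices
    device_size=$(( RANDOM % 51 + 300 ))   # step 2: 300..350 MiB per device

    wait_idle() {   # block until no resync/recovery/reshape is running
        while grep -qE 'resync|recovery|reshape' /proc/mdstat; do sleep 5; done
    }

    # create backing files and attach loop devices
    devices=()
    for i in $(seq 0 $(( num_devices - 1 ))); do
        dd if=/dev/zero of="/tmp/disk$i.img" bs=1M count="$device_size"
        devices+=( "$(losetup --show -f "/tmp/disk$i.img")" )
    done

    # step 3: create the RAID 5 array with 0.90 metadata
    mdadm --create /dev/md0 --level=5 --metadata=0.90 \
          --raid-devices="$num_devices" "${devices[@]}"

    # steps 4-5: filesystem, mount, copy the reference data
    mkfs.ext4 /dev/md0
    mount /dev/md0 /mnt
    for i in $(seq 1 "$num_devices"); do
        cp "/boot/vmlinuz-$(uname -r)" "/mnt/copy$i"   # any reference file works
    done

    # step 6: unmount, fsck, shrink the filesystem, fsck again
    umount /mnt
    e2fsck -f /dev/md0
    resize2fs -M /dev/md0          # or: resize2fs /dev/md0 <calculated_size>
    e2fsck -f /dev/md0

    # step 7: wait for the initial rebuild to finish
    wait_idle

    # steps 8-10: shrink the array by one device; the two --grow calls are my
    # reconstruction - the description above only states the target size
    new_array_size=$(( (num_devices - 2) * device_size ))
    mdadm --grow /dev/md0 --array-size="${new_array_size}M"
    mdadm --grow /dev/md0 --raid-devices=$(( num_devices - 1 ))
    mdadm --manage /dev/md0 --fail   "${devices[3]}"   # i.e. /dev/loop3
    mdadm --manage /dev/md0 --remove "${devices[3]}"
    wait_idle

    # steps 11-12: re-add whichever device ended up as a spare
    spare="${devices[2]}"          # e.g. /dev/loop2; the real script detects it
    mdadm --manage /dev/md0 --remove "$spare"
    mdadm --manage /dev/md0 --add    "$spare"
    wait_idle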

At this point the corruption shows up (a simplified snippet of the check follows the list):

  1. the filesystem is mounted again at /mnt
  2. an md5 checksum comparison between the reference file and the copies on the shrunk filesystem either succeeds, or fails for 1-2 files
  3. the filesystem is unmounted, checked (e2fsck -f), grown back to the maximum size (resize2fs) and checked again
  4. the corruption is still present
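
The check itself boils down to roughly this (again simplified; the file names match the sketch above):

    # remount and compare every copy against the reference file
    mount /dev/md0 /mnt
    ref_md5=$(md5sum < "/boot/vmlinuz-$(uname -r)")
    for f in /mnt/copy*; do
        [ "$(md5sum < "$f")" = "$ref_md5" ] || echo "MISMATCH: $f"
    done

    # grow the filesystem back to the full device size, fsck before and after
    umount /mnt
    e2fsck -f /dev/md0
    resize2fs /dev/md0
    e2fsck -f /dev/md0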

Am I doing something wrong, or is the RAID 5 shrink process really unsupported? Or is the 0.90 metadata the reason?

  • You described the procedure. For users who want to replicate your results the script code would be useful (not instead of the description, but alongside it). Maybe the culprit is even some quirk in the script itself. Can we get the code? Commented Apr 6, 2020 at 5:31
  • What kind of storage is backing each of your loop devices? A disk partition? A file created with mkfile or fallocate? Commented Apr 6, 2020 at 6:14
  • @KamilMaciorowski: here's a script I am using to test for corruption: gist.github.com/adamgolebiowski/… please note it should be run on a test environment Commented Apr 6, 2020 at 9:32
  • @MarkPlotnick: I am using loop devices backed by files created either with dd or fallocate -z Commented Apr 6, 2020 at 9:33
