tests/10ddf-fail-create-race: test handling of fail/create race

If a disk fails and simulaneously a new array is created, a race
condition may arise because the meta data on disk doesn't reflect
the disk failure yet. This is a test for that case.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
This commit is contained in:
mwilck@arcor.de 2013-08-06 23:38:01 +02:00 committed by NeilBrown
parent 2bcf1873d0
commit 82c8e664cc
1 changed files with 66 additions and 0 deletions

View File

@ -0,0 +1,66 @@
# This test creates a RAID1, fails a disk, and immediately
# (simultaneously) creates a new array. This tests for a possible
# race where the meta data reflecting the disk failure may not
# be written when the 2nd array is created.
. tests/env-ddf-template
mdadm --zero-superblock $dev8 $dev9 $dev10 $dev11 $dev12 $dev13
mdadm -CR $container -e ddf -l container -n 2 $dev11 $dev12
#$dir/mdadm -CR $member0 -l raid1 -n 2 $container -z 10000 >/tmp/mdmon.txt 2>&1
mdadm -CR $member0 -l raid1 -n 2 $container -z 10000
check wait
fail0=$dev11
mdadm --fail $member0 $fail0 &
# The test can succeed two ways:
# 1) mdadm -C member1 fails - in this case the meta data
# was already on disk when the create attempt was made
# 2) mdadm -C succeeds in the first place (meta data not on disk yet),
# but mdmon detects the problem and sets the disk faulty.
if mdadm -CR $member1 -l raid1 -n 2 $container; then
echo create should have failed / race condition?
check wait
set -- $(get_raiddisks $member0)
d0=$1
ret=0
if [ $1 = $fail0 -o $2 = $fail0 ]; then
ret=1
else
set -- $(get_raiddisks $member1)
if [ $1 = $fail0 -o $2 = $fail0 ]; then
ret=1
fi
fi
if [ $ret -eq 1 ]; then
echo ERROR: failed disk $fail0 is still a RAID member
echo $member0: $(get_raiddisks $member0)
echo $member1: $(get_raiddisks $member1)
fi
tmp=$(mktemp /tmp/mdest-XXXXXX)
mdadm -E $d0 >$tmp
if [ x$(grep -c 'state\[[01]\] : Degraded' $tmp) != x2 ]; then
echo ERROR: non-degraded array found
mdadm -E $d0
ret=1
fi
if ! grep -q '^ *0 *[0-9a-f]\{8\} .*Offline, Failed' $tmp; then
echo ERROR: disk 0 not marked as failed in meta data
mdadm -E $d0
ret=1
fi
rm -f $tmp
else
ret=0
fi
[ -f /tmp/mdmon.txt ] && {
cat /tmp/mdmon.txt
rm -f /tmp/mdmon.txt
}
[ $ret -eq 0 ]