Commit Graph

686 Commits

Author SHA1 Message Date
Dan Williams 313a4a82f1 ping_manager() to prevent 'add' before 'remove' completes
It is currently possible to remove a device and re-add it without the
manager noticing, i.e. without detecting a mismatch between
mdstat->devcnt and container->devcnt.  Introduce ping_manager() to arrange for
mdmon to run manage_container() prior to mdadm dropping the exclusive
open() on the container.  Despite these precautions sysfs_read() may
still fail.  If this happens invalidate container->devcnt to ensure
manage_container() runs at the next event.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:43 -07:00
Dan Williams 4795982e68 sysfs: detect disks that are in the process of being removed
When removing a disk there is a window where the 'slot' attribute of
md/dev-$name will return -EBUSY to read attempts.  When this happens
look at the 'block' link; if it is removed then we can be sure the
device has been removed, versus some other error.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:43 -07:00
Dan Williams 4065aa816a monitor: clean up some debug messages
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:43 -07:00
Dan Williams 93f7cacab3 mdmon: resume rebuild
If we started a degraded array that was previously rebuilding we may
have enough information to resume the rebuild without a trip through the
monitor.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:43 -07:00
Dan Williams e553d2a458 imsm: allow a failed disk to be readded
Allow the following sequence to rebuild the array
mdadm --fail /dev/md/r1 /dev/disk
mdadm --remove /dev/imsm /dev/disk
mdadm --add /dev/imsm /dev/disk

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:42 -07:00
Dan Williams 1770662bca 'mdadm --wait-clean' wait for array to be marked clean
For use in distro shutdown scripts with a RAID root file system.
Returns immediately if the array is 'readonly', or not an externally
managed array.  It is up to the distro's scripts to make sure no new
writes hit the device after this returns 'true'.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:42 -07:00
Dan Williams c94709e83f Add ping_monitor() to mdadm --wait
The action we are waiting for may not be complete until the monitor has
had a chance to take action on the result.

The following script can now remove the device on the first attempt,
versus a few attempts with the original Wait():
#!/bin/bash
#export MDADM_NO_MDMON=1
export IMSM_DEVNAME_AS_SERIAL=1
./mdadm -Ss
./mdadm --zero-superblock /dev/loop[0-3]
echo 2 > /proc/sys/dev/raid/speed_limit_max
./mdadm --create /dev/imsm /dev/loop[0-3] -n 4 -e imsm -a md
./mdadm --create /dev/md/r1 /dev/loop[0-3] -n 4 -l 5 --force -a mdp
./mdadm --fail /dev/md/r1 /dev/loop3
./mdadm --wait /dev/md/r1
x=0
while  ! ./mdadm --remove /dev/imsm /dev/loop3 > /dev/null 2>&1
do
        x=$((x+1))
done
echo "removed after $x attempts"
./mdadm --add /dev/imsm /dev/loop3

Include 2 small cleanups:
* remove the almost open coded fd2devnum() in Wait() by introducing a
  new utility routine stat2devnum()
* teach connect_monitor() to parse the container device from a subarray
  string

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:42 -07:00
Dan Williams 0c0c44db5a monitor: don't mark dirty on resync complete
...instead look at array state to determine if the array is consistent

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:42 -07:00
Dan Williams d797a0621f monitor: mark clean on active-idle
This also handles the case where 'clean' is set directly.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:42 -07:00
Dan Williams 8ed3e5e1bf Honor safemode_delay at Create() and Incremental() time
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:42 -07:00
Dan Williams 301406c9fd imsm: use ->getinfo_super() in ->container_content()
* allows container_content() to pick up the safemode_delay
* removes some duplicate code
* fixes an endian bug setting info->array.chunk_size

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:42 -07:00
Dan Williams a67dd8cc58 Allow metadata handlers to communicate desired safemode delay via mdinfo
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:42 -07:00
Dan Williams d253482527 Makefile: Add mdmon header dependencies
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:41 -07:00
Dan Williams 1f24f03530 imsm: fix up serial handling
* Trim trailing and leading whitespace
* Allow unterminated serial numbers up to MAX_RAID_SERIAL_LEN

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:41 -07:00
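
A minimal sketch of the two serial fixes listed in 1f24f03530: trimming whitespace at both ends and tolerating a serial field that fills the whole buffer without a NUL terminator. The function name trim_serial() is hypothetical, and the value 16 for MAX_RAID_SERIAL_LEN is an assumption of this example, not something stated in the log.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_RAID_SERIAL_LEN 16	/* assumed value for this sketch */

/* Copy a raw serial field into dst, trimming leading and trailing
 * whitespace. raw holds at most MAX_RAID_SERIAL_LEN bytes and may lack a
 * NUL terminator; dst must hold MAX_RAID_SERIAL_LEN + 1 bytes.
 * Returns the trimmed length. */
size_t trim_serial(char *dst, const char *raw)
{
	size_t start = 0, end = 0;

	/* Find the end of the field without reading past its maximum size. */
	while (end < MAX_RAID_SERIAL_LEN && raw[end] != '\0')
		end++;
	while (start < end && isspace((unsigned char)raw[start]))
		start++;
	while (end > start && isspace((unsigned char)raw[end - 1]))
		end--;

	memcpy(dst, raw + start, end - start);
	dst[end - start] = '\0';
	return end - start;
}
```

The destination is one byte larger than the field so that a full-length, unterminated serial can still be stored as a C string.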
Dan Williams f9ba0ff124 imsm: only use the device name as a fallback when IMSM_DEVNAME_AS_SERIAL=1
Also ensure that the serial buffer is initialized.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:41 -07:00
Dan Williams 0c046afd06 imsm: rectify map handling
The secondary map is used to reflect the migration state of the array
i.e.  from dev->vol.map[1] to dev->vol.map[0].  Ensure a rebuilding /
initializing array is marked in the second map, while normal status is
reflected in the first map.  Also mark rebuilding drives with
IMSM_ORD_REBUILD.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:41 -07:00
Dan Williams 24565c9a99 imsm: fix imsm_delete()
* fix breakage from last merge (infinite loop in imsm_process_update())
* add ability to delete by index

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:41 -07:00
Dan Williams b10b37b839 imsm: use IMSM_ORD_REBUILD instead of USABLE flag
IMSM_ORD_REBUILD is the 'insync' flag in MD terms.  USABLE is a flag to
opt-in disks for use with the Windows driver.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:41 -07:00
Dan Williams be73972fac imsm: introduce set_imsm_ord_tbl_ent()
Collapse all the open coded occurrences.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:41 -07:00
Dan Williams fb49eef264 imsm: cleanup arguments to imsm_check_degraded
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:41 -07:00
Dan Williams ff077194a1 imsm: cleanup get_imsm_disk_idx(), unify with get_imsm_ord_tbl_ent()
Save some unnecessary calls to get_imsm_map() by teaching
get_imsm_disk_idx() to retrieve the map.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:41 -07:00
Dan Williams 3e372e5a72 imsm: fix up compare_super_imsm() to match family_num for populated mpb's
This allows spares to be associated with any family while not allowing
disks from different families to be assembled.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:40 -07:00
Dan Williams e0783b419d imsm: fix up spare handling holdover in update_create_array
We used to leave SPARE_DISK unset to indicate it was available to be
assimilated into other arrays.  Now we explicitly check the size.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:55:40 -07:00
Dan Williams 8796fdc4cd imsm: mark failures like the Matrix driver
* Truncate the first character of the serial number
* Set 'scsi_id' to all f's
* Expect to find disk entries with unmatchable serial numbers, i.e.
  expect get_imsm_disk() to return NULL in some situations
* Allow discrepancies between mpb->num_disks and len(super->disks)

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:55:34 -07:00
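
One possible reading of the first two bullets of 8796fdc4cd, as a sketch. The struct demo_disk, both function names, and the value of MAX_RAID_SERIAL_LEN are invented for the example; the real on-disk record has more fields. "Truncate the first character" is interpreted here as shifting the serial left by one byte, and clearing the final byte afterwards is a choice of this sketch.

```c
#include <stdint.h>
#include <string.h>

#define MAX_RAID_SERIAL_LEN 16	/* assumed value for this sketch */

/* Hypothetical, simplified disk record. */
struct demo_disk {
	unsigned char serial[MAX_RAID_SERIAL_LEN];
	uint32_t scsi_id;
};

/* Mark a disk as failed: drop the first serial character so the entry no
 * longer matches the physical disk, and set scsi_id to all f's. */
void demo_mark_failed(struct demo_disk *d)
{
	memmove(&d->serial[0], &d->serial[1], MAX_RAID_SERIAL_LEN - 1);
	d->serial[MAX_RAID_SERIAL_LEN - 1] = '\0';	/* sketch's choice */
	d->scsi_id = UINT32_MAX;	/* same bytes in either endianness */
}

/* Returns 1 if the record's serial equals the given full-length serial. */
int demo_serial_matches(const struct demo_disk *d, const unsigned char *serial)
{
	return memcmp(d->serial, serial, MAX_RAID_SERIAL_LEN) == 0;
}
```

After demo_mark_failed() a lookup by the disk's real serial no longer finds the record, which is consistent with the third bullet's expectation that some entries have unmatchable serial numbers.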
Dan Williams 4d7b1503a7 imsm: provide for a larger mpb buffer when necessary
Ensure that the mpb buffer is large enough to hold the extra imsm_map's
of migrating arrays and dynamically created raid devices.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:55:34 -07:00
Dan Williams fb9bf0d3e7 imsm: fix logic inversion in get_imsm_ord_tbl_ent()
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:55:30 -07:00
NeilBrown 94a20f0c80 Fix alignment for backup of reshape data.
Since we introduced O_DIRECT for device access we need
properly aligned buffers and IO requests.  The reshape code
missed out on the conversion.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 17:55:15 +10:00
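
The alignment requirement mentioned in 94a20f0c80 can be illustrated with a small allocation helper. This is a sketch, not the fix itself: the 4096-byte alignment, the helper names, and the use of C11 aligned_alloc() are assumptions of the example. The alignment that O_DIRECT actually needs depends on the device and filesystem.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define IO_ALIGN 4096	/* assumed alignment for this sketch */

/* Round a request length up to a multiple of IO_ALIGN. */
size_t round_up_io(size_t len)
{
	return (len + IO_ALIGN - 1) / IO_ALIGN * IO_ALIGN;
}

/* Returns 1 if the pointer is aligned to IO_ALIGN. */
int is_io_aligned(const void *p)
{
	return ((uintptr_t)p % IO_ALIGN) == 0;
}

/* Allocate a zeroed buffer whose address and length are both multiples of
 * IO_ALIGN. The rounded length is stored in *alloc_len when it is not
 * NULL. Returns NULL on failure; release the buffer with free(). */
void *alloc_io_buffer(size_t len, size_t *alloc_len)
{
	size_t n = round_up_io(len);
	void *buf;

	if (n == 0)
		n = IO_ALIGN;
	buf = aligned_alloc(IO_ALIGN, n);
	if (buf == NULL)
		return NULL;
	memset(buf, 0, n);
	if (alloc_len != NULL)
		*alloc_len = n;
	return buf;
}
```

Both the buffer address and the transfer length are rounded, since the commit message names buffers and IO requests as needing alignment.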
NeilBrown e9dd159873 Allow an externally managed array to be marked readonly
If the metadata_version is
    -mdXXX/whatever
rather than
    /mdXXX/whatever

then the array is readonly and should be left alone by mdmon.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 17:55:15 +10:00
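
A sketch of how the leading character described in e9dd159873 might be tested, together with extracting the container name that follows it. All three function names are hypothetical, and the strings are handled in the form shown in the commit message ("/mdXXX/whatever" or "-mdXXX/whatever"), without any prefix that the full metadata_version attribute may carry.

```c
#include <stddef.h>
#include <string.h>

/* A subarray version string starts with '/' (normal) or '-' (readonly). */
int version_is_subarray(const char *ver)
{
	return ver != NULL && (ver[0] == '/' || ver[0] == '-');
}

/* Readonly subarrays are the ones marked with a leading '-'. */
int version_is_readonly(const char *ver)
{
	return ver != NULL && ver[0] == '-';
}

/* Copy the container name (the "mdXXX" part) into out.
 * Returns 0 on success, -1 if ver is not a subarray string or the name
 * is empty or does not fit in outlen bytes. */
int version_container_name(const char *ver, char *out, size_t outlen)
{
	const char *name, *slash;
	size_t len;

	if (!version_is_subarray(ver))
		return -1;
	name = ver + 1;
	slash = strchr(name, '/');
	len = slash ? (size_t)(slash - name) : strlen(name);
	if (len == 0 || len >= outlen)
		return -1;
	memcpy(out, name, len);
	out[len] = '\0';
	return 0;
}
```

Keeping the marker test in one small function means a later change to the syntax only has to be made in one place.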
NeilBrown 3c558363a1 Factor out test for subarray version string.
We are about to change the syntax of the version string
for 'subarray's.  So factor out the test into a single function.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 17:55:15 +10:00
Dan Williams 6c386dd368 imsm: allow container assembly in the presence of failed disks
For example, this allows one to still say mdadm -A /dev/sd[b-e] even
though /dev/sde has replaced /dev/sdd.  Otherwise mdadm will say:

	mdadm: superblock on /dev/sdd doesn't match others - assembly aborted

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-19 17:55:15 +10:00
NeilBrown 567df5fd0a Fix bug with ddf if devices have different sizes.
We cannot use the header of the 'best' device to find the
sections on the other devices!!


Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 17:55:15 +10:00
NeilBrown 2cc2983d80 Provide ddf support for adding a device to an active container.
Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 17:55:15 +10:00
Dan Williams 43dad3d6fb mdadm: add device to a container
Adding a device updates the container and then mdmon takes action upon
noticing a change in devices.  This reuses the container version of
add_to_super to create a new record for the device.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 17:19:51 +10:00
Dan Williams 7bc1962f8c mdmon: remove devices from container
Once the monitor thread has kicked a drive from all managed arrays mdadm
-r is permitted.  We are guaranteed that the drive is marked failed at
this point, so allow the drive to be re-added as a spare.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-19 14:55:12 +10:00
Dan Williams ae6aad8239 imsm: delete kicked disks
When we have determined that a disk is no longer of any value, remove
it from the data structure.   This is now safe because the manager
will back off while any metadata update is pending in the monitor.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 14:55:10 +10:00
NeilBrown 0b5ec75e01 Fix mdstat_wait_fd
It didn't necessarily wait for the fd.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 14:55:07 +10:00
NeilBrown 3c00ffbe98 Make metadata updates from manage to monitor 'synchronous'
A metadata update may modify the data structure of the metadata
including freeing things, so it is not safe for the manager to touch
the metadata while an update is pending in the monitor.
So when an update has been submitted, don't do anything else in the
manager until it is complete.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 14:55:03 +10:00
NeilBrown 01f157d74a Extra option for set_array_state: you choose dirty or clean.
When we first start an array, it might be good to start recovery
straight away.  That requires setting the array to 'dirty', but
only the metadata handler can know if that is required or not.
So have a third possible 'consistent' option to set_array_state.
Either 'no' or 'yes' or 'you choose'.

Return value indicates what was chosen.

'1' (no) should be chosen unless there is a good reason.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 14:54:55 +10:00
Dan Williams 9296754385 mdmon: handle failures versus readauto arrays
Transition readauto arrays to active before failing drives.

Hmm... why do we keep reblocking / renotifying in the readonly case?
Need to bottom out on this, but not right now.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-15 10:58:43 -07:00
Dan Williams f1d267661d mdmon: allow degraded arrays to be monitored
manage_new is too strict in the face of failed devices.  Teach it to
monitor degraded arrays.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-15 10:58:43 -07:00
Dan Williams fcb844757f imsm: include not synced disks in imsm_count_failed
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-15 10:58:42 -07:00
Dan Williams 7eef045331 imsm: use disk_ord_tbl to identify rebuilding disks
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-15 10:57:19 -07:00
Dan Williams 9a1608e5d0 imsm: fix up assembly of disks that are not in-sync
1/ Do not assemble !in_sync or failed devices in container_content.
2/ Prevent activation of failed or configured devices in activate_spare.
3/ Be sure to avoid dirty degraded if the array was shut down cleanly.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:25:49 -07:00
Dan Williams 6a3e913ee9 imsm: fix create by mdmon-update
imsm_dev dynamically grows, so dev_idx needs to be moved up in the
definition to avoid getting clobbered.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:25:49 -07:00
Dan Williams e74255d907 imsm: write_super return 0 on success
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:25:49 -07:00
Dan Williams a48ac0a8d6 imsm: update mpb_size in write_super_imsm
With dev->vol.map and mpb->disk entries entering and leaving the parameter
block write_super_imsm needs to update the size before writeback.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:25:49 -07:00
Dan Williams 272906ef49 mdmon: use activate spare for re-add
Disks that are not in-sync or failed are not assembled into member
arrays by mdadm.  Teach mdmon to resolve this situation by checking for
spares at start.  imsm_activate_spare() is updated to prefer devices
that can be re-added versus new spares.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:25:46 -07:00
Dan Williams 3393c6af8b imsm: fix handling of the 'migr_state' and 'migr_type' bits
The option-rom and the Matrix driver mark resyncs/rebuilds with the
migrate state bits.  Update sizeof_imsm_dev to allow allocation of
imsm_dev entries large enough to grow if migr_state is later set.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:05:20 -07:00
Dan Williams a965f303c7 imsm: add get_imsm_map and sizeof_imsm_map
Retrieve map entries from an imsm_dev, and clean up imsm_copy_dev.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-11 01:16:24 -07:00
Dan Williams 828408ebef imsm: drop 'external' from imsm_examine_brief
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-11 01:16:24 -07:00