917 Commits

Author SHA1 Message Date
Dan Williams 7675959b0f mdmon: man page
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00
Dan Williams 140d3685fb mdmon: fix missed 'clean' event
mdmon may miss events because it re-reads state after read_and_act.  The
additional read is used to determine dirty status before allowing a
sigterm to proceed.  Since read_and_act is in the best position to
determine 'dirty' status and its return value is not used, modify it to
return true if the array is dirty.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00
Dan Williams efb30e7f1e imsm: auto layout
In support of auto-layout:

1/ collect and merge all extents to find the largest common-start free region
2/ verify that we meet the "all volumes must use the same set of disks"
   constraint
3/ mark the disks to be added in add_to_super_imsm_volume

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00
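Step 1/ of efb30e7f1e (imsm: auto layout) can be illustrated with a small sketch. This is not the imsm code: `struct extent`, `largest_free_region`, the `limit` parameter and the sector units are all assumptions made for the example. It pools the in-use extents collected from every disk, sorts them by start, and returns the largest gap that no disk is using.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical extent type: a used region [start, start + size) in sectors. */
struct extent {
	unsigned long long start;
	unsigned long long size;
};

static int cmp_extent(const void *a, const void *b)
{
	const struct extent *x = a, *y = b;

	if (x->start < y->start)
		return -1;
	return x->start > y->start;
}

/*
 * 'used' holds the in-use extents gathered from all disks, in any order.
 * Returns the size of the largest region below 'limit' that is free on
 * every disk and stores its start in *start_out. Returns 0 if none is free.
 */
static unsigned long long largest_free_region(struct extent *used, size_t n,
					      unsigned long long limit,
					      unsigned long long *start_out)
{
	unsigned long long pos = 0, best = 0;
	size_t i;

	assert(used != NULL && start_out != NULL);

	qsort(used, n, sizeof(*used), cmp_extent);
	*start_out = 0;
	for (i = 0; i < n && pos < limit; i++) {
		unsigned long long end = used[i].start + used[i].size;

		if (used[i].start > pos) {
			/* gap between the merged extents so far and this one */
			unsigned long long gap_end =
				used[i].start < limit ? used[i].start : limit;

			if (gap_end - pos > best) {
				best = gap_end - pos;
				*start_out = pos;
			}
		}
		if (end > pos)
			pos = end;	/* merge overlapping extents */
	}
	if (pos < limit && limit - pos > best) {
		/* trailing free space after the last used extent */
		best = limit - pos;
		*start_out = pos;
	}
	return best;
}
```

Sorting first means overlapping extents from different disks collapse into a single running `pos`, so only gaps that are free everywhere are considered.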
Dan Williams 18fde300fe Create: fixup 'insert_point', dependent on 'subdevs', for auto-layout
'subdevs' is read from the container in the auto-layout case, so reset the
default values that depend on it.  Without this change 'insert_point' is
always 2, blocking creation of arrays with > 2 raid disks.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00
Dan Williams a9b8734a23 Create: wait_for container creation
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00
Dan Williams 85f9b5f798 Manage: permit '--remove detached' for containers
Skip the unique holder check in the detached case... pretty sure no one is
holding on to it if open() returns ENXIO.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00
Dan Williams 04a8ac089c mdmon: record added disks
Prevent duplicate disks from being sent to the monitor thread.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00
Dan Williams 7da80e6faa mdmon: fix removed disk handling
Use SKIP_GONE_DEVS when reading the container, and correct some confused
logic in manage_new().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00
Dan Williams dab4a5134e sysfs: allow sysfs_read to detect and drop removed disks
All operations that rely on loading from an existing container (like
--add) will fail after a disk has been removed.  Provide an option to
skip missing / offline disks rather than abort.  We attempt to do this
in the load_super_{imsm,ddf}_all cases when mdmon is running i.e. we
already have a consistent version of the metadata running in the system.
Otherwise, we fail as normal and let the administrator fix up the
container.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:56 -07:00
Dan Williams db575f3b9e imsm: retry load_imsm_mpb if we suspect mdmon has made modifications
If the checksum verification fails and mdmon is running we retry the
load to get a consistent snapshot of the mpb.  Found by
tests/08imsm-overlap.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:56 -07:00
Dan Williams ecf45690f2 imsm: verify single sector mpb checksums
If the mpb is only one sector do not skip the checksum verification.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:56 -07:00
Dan Williams 0556e1a2b1 imsm: fix mark_failure / introduce mark_missing
Actually, rename mark_failure to mark_missing and then implement the
correct mark_failure which according to new documentation is to:

1/ Set the FAILED status bit
2/ Set IMSM_ORD_REBUILD to mark the disk out of sync
3/ Set map->failed_disk_num if this is the first detected failure
   (it is ~0 otherwise)

Previously the assumption was that IMSM_ORD_REBUILD only appeared in
map[1], so all routines that care about out-of-sync disks need to be
updated.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:56 -07:00
Dan Williams 620b171338 imsm: introduce get_imsm_disk_slot
Implement a common disk index to disk slot routine and replace open
coded versions.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:56 -07:00
Dan Williams df4746577e imsm: fix activate spare to ignore foreign disks
A foreign disk is one that all other drives believe is not-in-sync but
does not have the 'failed' status bit set.

This also reverts, because that commit is addressing the wrong problem.
Ideally mdmon would kick "non-fresh" drives like the kernel does at
native-md activation time, but that is too awkward to implement at the
moment because mdadm owns container manipulations.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-23 23:06:24 -07:00
Dan Williams 7a70e8aa8d imsm: fixup container spare uuids by default
Spares in the imsm case are marked with the "match-all" uuid of
ffffffff-ffffffff-ffffffff-ffffffff.  When performing incremental
assembly we need to associate such devices with a populated container
uuid.  Also when performing --detail on a container with only spares
present we can make an attempt to return a real uuid.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-23 23:06:24 -07:00
Dan Williams 689c9bf3c3 imsm: fix missing initializations of the per-disk extents pointer
Fixes a glibc assertion when trying to free a pointer that was not
malloc'd.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-23 23:06:24 -07:00
Dan Williams ddaf4ce2da test: fix a call to udevsettle
udevsettle is deprecated, use udevadm settle

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-23 23:06:24 -07:00
Dan Williams cceebc67f1 imsm: provide a simulated option-rom for regression tests
IMSM_NO_PLATFORM turns off checks that should be tested, so provide a
IMSM_TEST_OROM variable to allow testing the orom constraints in the
mdadm regression suite.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-23 14:26:10 -07:00
Dan Williams 5a03814040 imsm: block creation of devices with identical names
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-02 15:01:13 -07:00
Dan Williams 78757ce8a5 imsm: don't check raid1 chunk size
mdadm -C /dev/md/r1d2n1s0-5 -amd -l1  --size 5242880 -n 2 /dev/sdb /dev/sdc  -R -f -v -c 64
mdadm: chunk size ignored for this level
mdadm: super0.90 cannot open /dev/sdb: Device or resource busy
mdadm: super1.x cannot open /dev/sdb: Device or resource busy
mdadm: platform does not support a chunk size of: 0
mdadm: device /dev/sdb not suitable for any style of array

Reported-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Tested-by: Jacek Danecki <jacek.danecki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-02 10:55:31 -07:00
NeilBrown 6c40598f59 Merge branch 'master' into devel-3.0 2009-02-02 11:09:09 +11:00
NeilBrown b47dff6675 Fix possible crash if bitmap metadata is bad.
We really should never divide by 0.

Thanks to "Jon Nelson" <jnelson-linux-raid@jamponi.net>
for finding the problem.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-02-02 11:06:38 +11:00
NeilBrown 0083584d5e Document 'max' option to --grow --size in --help output.
Suggestion from Christian Hudon <chrish@debian.org>

Signed-off-by: NeilBrown <neilb@suse.de>
2009-02-02 10:58:08 +11:00
Dustin Kirkland 089485cbe4 Typo in earlier patch : asprintf -> vasprintf
Signed-off-by: NeilBrown <neilb@suse.de>
2009-02-02 10:54:23 +11:00
NeilBrown a123619211 Fix the used device size in mdadm -D output.
As get_component_size() returns the number of used sectors of a device
we need to halve it before printing as K, and shift the value by 9, not 10,
before passing to human_size.

Thanks to Andre Noll <maan@systemlinux.org> for identifying problem
(and a slightly different version of this patch)

Signed-off-by: NeilBrown <neilb@suse.de>
2009-02-02 10:03:20 +11:00
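The arithmetic behind a123619211 (Fix the used device size in mdadm -D output) can be checked with a short sketch. It assumes 512-byte sectors and that human_size expects a byte count; the helper names are made up for the example.

```c
#include <assert.h>
#include <limits.h>

#define EX_SECTOR_SHIFT 9	/* 512-byte sectors: bytes = sectors << 9 */

/* Two 512-byte sectors make one KiB, so halve the sector count. */
static unsigned long long ex_sectors_to_kib(unsigned long long sectors)
{
	return sectors / 2;
}

/* Shifting by 10 instead of 9 would treat sectors as KiB and double the size. */
static unsigned long long ex_sectors_to_bytes(unsigned long long sectors)
{
	assert(sectors <= (ULLONG_MAX >> EX_SECTOR_SHIFT));
	return sectors << EX_SECTOR_SHIFT;
}
```

For example, 2048 sectors is 1024 K and 1048576 bytes; both conversions describe the same 1 MiB.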
Bernhard Reutner-Fischer 2df1f26911 mdadm fix compilation for uClibc
2008-12-08  Bernhard Reutner-Fischer  <rep.dot.nop@gmail.com>

	* Makefile (dadm.uclibc): Remove misspelled and unneeded rule.
	* md5.h: Include stdint.h for uClibc.
	* mdadm.h: uClibc defines __UCLIBC__. If uClibc has LFS off
	then use lseek instead of lseek64.

Signed-off-by:  Bernhard Reutner-Fischer  <rep.dot.nop@gmail.com>
2009-02-02 09:53:51 +11:00
Dan Williams caf8d23175 imsm: fix failed disks are allowed back into the container
Failed disks do not have valid serial numbers which means we will not
pick up the 'failed' status bit from the metadata entry.  Check for
dl->index == -2 to prevent failed disks from being incorporated into the
container.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-23 15:45:34 -07:00
Dan Williams 5615172f1d Create: warn when a metadata format's platform components are missing
If the metadata handler can not find its platform support components
then there is no way for it to verify that the raid configuration will
be supported by the option-rom.  Provide a generic method for metadata
handlers to warn the user that the array they are about to create may
not work as intended with a given platform.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:51 -07:00
Dan Williams a20d2ba5f3 imsm: enforce "all member disks must be members of all arrays"
This is a key orom-compatibility constraint.  A nice side effect is that
it precludes the corner case of 'create' racing against 'spare activate'
since the create will fail to convert a spare into an array member.  At
create time we check if this is the first member array in the container.
If it is, then all disks are possible candidates; if it is not, then only
current members are permitted.

A bit hairier is spare-activation handling in the presence of this
constraint.  It is difficult because spare handling is per array.  The
approach taken is to:

1/ check that a new spare can cover all defined arrays in the container
2/ ensure that partially assimilated spares are the first candidates
   when looking for a spare region to activate.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:51 -07:00
Dan Williams 1c556e92ba imsm: enforce num_disks constraints
RAID1 == 2 disks
RAID5 >= 3 disks
RAID10 == 4 disks

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:50 -07:00
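The constraints listed in 1c556e92ba (imsm: enforce num_disks constraints) amount to a small table check. The sketch below is illustrative; `ex_num_disks_ok` is a hypothetical helper, and levels the commit message does not mention are simply left unconstrained here.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if 'raiddisks' is an acceptable member count for 'level'. */
static bool ex_num_disks_ok(int level, int raiddisks)
{
	assert(raiddisks >= 0);

	switch (level) {
	case 1:
		return raiddisks == 2;
	case 5:
		return raiddisks >= 3;
	case 10:
		return raiddisks == 4;
	default:
		/* Not covered by the commit message; unconstrained in this sketch. */
		return true;
	}
}
```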
Dan Williams 35f81cbbc5 imsm: rename vprintf macro to pr_vrb
Don't redefine standard library calls unnecessarily...

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:50 -07:00
Dan Williams a18a888ea7 Create: allow per-metadata default layouts
Let handlers specify their own defaults, specifically needed for the
imsm-raid5 case where mdadm defaults to 'ls' and imsm to 'la'.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:50 -07:00
Dan Williams 5746141e3f mdmon: make switchroot an undecorated option
Simplify the usage from:
	mdmon [--switch-root dir] /device/name/for/container
to...
	mdmon /device/name/for/container [target_dir]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:50 -07:00
Dan Williams 66afdfa977 Assemble: fix busy detection
Use mddev_busy() as GET_ARRAY_INFO can succeed on 'clear' arrays.

Ran into this after encountering a case where mdadm -Ss ended in
segfault (missing check for NULL return from map_by_devnum() in
sles11:Manage.c).  So, tried to stop the array by hand with echo clear >
md/array_state, after which I could not reassemble since GET_ARRAY_INFO
was succeeding.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:50 -07:00
Dan Williams 1ffd2840df mdmon: support scanning for containers
When the given container is '/proc/mdstat' then launch an mdmon instance
per container found in /proc/mdstat.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:50 -07:00
Dan Williams 6f4098a6fd mdmon: expand permissible container device names
Allow any path that dereferences to an md device to be used in addition
to the current symbolic md device names.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:50 -07:00
Dan Williams 03cd4cc810 imsm: imsm_read_serial check for zero-length response
VMware virtual disks successfully run the inquiry but return a zero-length response.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:33:56 -07:00
Dan Williams be2c0e387b imsm: fix dev_open return value handling
dev_open returns an fd

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 00:29:34 -07:00
Dan Williams c1363b408f mdmon: fix missing ->subarray initialization
This can cause mdmon to fail at startup.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-13 15:46:05 -07:00
NeilBrown 78fbcc1031 Merge branch 'master' into scratch-3.0
Conflicts:

	Assemble.c
	config.c
2009-01-08 09:31:28 +11:00
Dustin Kirkland 1a0ee0baf0 Fail overtly when asprintf fails to allocate memory
.. rather than causing a less-obvious violation of segments.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-08 09:25:33 +11:00
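The "fail overtly" pattern from 1a0ee0baf0 can be sketched as a checked formatting wrapper. To stay within standard C, this example swaps in vsnprintf plus malloc as a stand-in for asprintf/vasprintf; `ex_xasprintf` is a hypothetical name and is not the wrapper added by the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a freshly allocated string. On any failure, print a message
 * and exit rather than returning a pointer the caller might dereference.
 * The caller frees the result.
 */
static char *ex_xasprintf(const char *fmt, ...)
{
	va_list ap, ap2;
	char *out;
	int len;

	assert(fmt != NULL);

	va_start(ap, fmt);
	va_copy(ap2, ap);
	len = vsnprintf(NULL, 0, fmt, ap);	/* measure the output first */
	va_end(ap);

	out = len < 0 ? NULL : malloc((size_t)len + 1);
	if (!out) {
		va_end(ap2);
		fprintf(stderr, "ex_xasprintf: cannot allocate memory\n");
		exit(1);
	}
	vsnprintf(out, (size_t)len + 1, fmt, ap2);
	va_end(ap2);
	return out;
}
```

Because the wrapper never returns NULL, callers need no error check of their own, which is the point of failing at the allocation site.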
NeilBrown 89a10d84cb Free mdstat data structures properly.
In one case we called 'free' instead of 'mdstat_free'.
In others we didn't free at all.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-08 09:25:31 +11:00
NeilBrown 45b662b611 Merge branch 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/mdadm into devel-3.0 2008-12-18 16:58:25 +11:00
NeilBrown 8a659c3321 Merge branch 'master' into devel-3.0
Conflicts:

	Assemble.c
	Incremental.c
	Kill.c
	ReadMe.c
	inventory
	mapfile.c
	mdadm.8
	mdadm.spec
	mdassemble.8
2008-12-18 16:56:13 +11:00
NeilBrown 3a56f223e9 map: rebuild map if it doesn't exist.
It is possible for some arrays to be created e.g. by initrd, and so
not get mentioned in /var/run/mdadm/map.
As "-I" depends on things being listed in 'map', we create it by
scanning all devices if it doesn't exist.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-12-18 16:23:46 +11:00
NeilBrown acee8e8964 Assemble: set stripe_cache_size properly when restarting a reshape.
Reshape with large chunk size can require a large stripe_cache.
We make this work when starting the reshape but not when
restarting at assemble time.  So fix that.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-12-18 14:24:41 +11:00
NeilBrown 4e9a6ff778 Assemble: don't assume array is 'clean' unless all devices think it is.
This is only significant for --assemble --force where some old
devices might be included into the array.  If anything looks like
it isn't clean, the kernel will not allow a degraded array to be started.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-12-18 14:11:59 +11:00
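The rule in 4e9a6ff778 (Assemble: don't assume array is 'clean' unless all devices think it is) reduces to an all-of check across member devices. A minimal sketch, assuming each device's view has already been read into a hypothetical `dev_clean` array:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true only when every device's metadata reports a clean array. */
static bool ex_all_devices_clean(const bool *dev_clean, size_t n)
{
	size_t i;

	assert(dev_clean != NULL && n > 0);

	for (i = 0; i < n; i++)
		if (!dev_clean[i])
			return false;	/* one dirty device is enough */
	return true;
}
```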
NeilBrown 22eba51216 Kill: Don't use O_EXCL when --force is used.
We really want --zero-super --force to zero the superblock in
all situations.  So don't open with O_EXCL - trust the user.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-12-18 14:04:45 +11:00
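The flag choice described in 22eba51216 (Kill: Don't use O_EXCL when --force is used) can be sketched as below. On Linux, opening a block device with O_EXCL and without O_CREAT fails with EBUSY when the device is in use, which is the check that --force skips. The helper name is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Pick open(2) flags for zeroing a superblock. */
static int ex_zero_super_open_flags(bool force)
{
	int flags = O_RDWR;

	if (!force)
		flags |= O_EXCL;	/* refuse devices that are in use */

	assert((flags & O_ACCMODE) == O_RDWR);
	return flags;
}
```

A caller would pass the result straight to open(), for example `open(dev, ex_zero_super_open_flags(force))`.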
Dan Williams 0c5c7b470e imsm: test overlapping creates
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-12-08 16:59:19 -07:00
Dan Williams 2952742d32 workaround a hald interaction and quiet cleanup
The 'udevadm settle' call appears to resolve:

mdadm: failed to stop array /dev/md127: Device or resource busy
Perhaps a running process, mounted filesystem or active volume group?

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-12-08 16:59:18 -07:00