2605 Commits

Author SHA1 Message Date
NeilBrown b31df43682 intel,ddf: don't require partitions when ignore_hw_compat is set.
Partitions are a hw-compat issue.

This allows e.g. "--examine" to be used on image files.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-16 13:24:07 +10:00
NeilBrown a21e848a55 Create: over-ride "start_ro" setting when creating an array.
If module parameter start_ro is set, arrays start readonly.
This is OK when assembling, but is very surprising when creating
an array as the resync won't start.
So over-ride the setting (unless --read-only was given) to make
arrays RW when created.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-15 11:40:27 +10:00
NeilBrown 701d5b4ab5 Suppress error messages from systemctl.
We call systemctl to see if systemd will run mdmon for us.
If it cannot, we run mdmon directly, so we aren't interested
in the error message.
So redirect stderr to /dev/null.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-15 11:10:54 +10:00
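
The stderr redirection described in 701d5b4ab5 can be sketched as follows. This is a minimal Python illustration, not mdadm's C code: the helper name `run_quietly` is hypothetical, and the failing command is a stand-in for any probe whose error text is not interesting because a fallback exists.

```python
import subprocess
import sys


def run_quietly(cmd):
    """Run cmd with stderr sent to the null device; return the exit status."""
    result = subprocess.run(cmd, stderr=subprocess.DEVNULL)
    return result.returncode


# Stand-in for a probe that fails noisily. Only the exit status matters,
# because the caller would fall back to another path on failure.
failing = [sys.executable, "-c",
           "import sys; sys.stderr.write('boom'); sys.exit(3)"]
status = run_quietly(failing)
```

The caller still sees the non-zero status and can take the fallback path; only the message is dropped.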
NeilBrown 5b905a7ec5 man pages: remove references to raidtools.
raidtools is so ancient now that it is uninteresting.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-15 11:07:17 +10:00
NeilBrown eca944fa9c create_mddev: add support for /dev/md_XXX non-numeric names.
With the 'devnm' infrastructure fixed, it is quite easy to support
names like "md_home" for md arrays.
This currently defaults to "off" and can be enabled in mdadm.conf with
  CREATE names=yes
This is in case other tools get confused by the new names.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-15 11:03:25 +10:00
NeilBrown 83785d301f Incremental: remove partitions when assembling.
We remove partitions for --create and --assemble, but not for
--incremental.
So fix that omission.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-14 12:06:27 +10:00
NeilBrown 8baab049ce Create: fix bug with --data-offset.
Test for VARIABLE_OFFSET was wrong.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 17:26:37 +10:00
NeilBrown 16e7a4b9a2 Add some built files to .gitignore.
Now everything made by "make everything" is suitably ignored.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 17:11:42 +10:00
NeilBrown 0cf8322999 Always test return value of posix_memalign.
FORTIFY_SOURCE likes this, and it is good practice.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 17:09:55 +10:00
NeilBrown 5a23a06ea4 mdassemble - fix new compile-time problems.
Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 17:05:16 +10:00
NeilBrown e6fc80a895 Detail: report on inactive arrays.
Arrays can be inactive when e.g. -I is in the process of assembling them.
This change allows --detail to report limited information about
these arrays.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 16:57:10 +10:00
NeilBrown b3908491d6 Detail: fix --brief --verbose
This pair of options should give a --brief listing including devices=
information.  But recent changes to flag passing broke this.
So fix it.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 14:57:41 +10:00
NeilBrown 8adabef587 Remove open-coded use_udev().
Manage_runstop has an open-coded version of use_udev() which is no
longer correct.  So make it use use_udev() explicitly.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 13:03:25 +10:00
NeilBrown 743eaf8b70 misc_scan: don't trust the mapping file too much for device names.
misc_scan assumes that any device name found in the 'mapping' file
is usable.  Usually it is but sometimes not, such as for inactive
devices.
Depending on it isn't really robust, so when a name is found, check that
it exists. If not, fall back on map_dev.

This will allow "--detail --scan" to notice inactive devices.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 12:56:38 +10:00
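
The check-then-fall-back idea in 743eaf8b70 can be sketched like this. It is an illustrative Python sketch, not the mdadm code: `usable_device_name` is a hypothetical helper, and `fallback_lookup` stands in for whatever secondary lookup is available (map_dev in the commit message).

```python
import os


def usable_device_name(mapped_name, fallback_lookup):
    """Return mapped_name if it exists on disk, else the fallback result."""
    if mapped_name and os.path.exists(mapped_name):
        return mapped_name
    # The recorded name is missing or stale (e.g. an inactive device),
    # so ask the fallback for a name instead of trusting the record.
    return fallback_lookup()
```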
NeilBrown 6b63c1a457 Incremental: tell udisks to unmount when array looks to have disappeared.
If a device is removed which appears to be busy in an md array, then
it is very likely the array cannot be used.
We currently try to stop it, but that could fail if udisks had
automatically mounted it.
So tell udisks to unmount it, but ignore any error.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 12:07:40 +10:00
NeilBrown 7df8a7b971 mdadm.conf.5: document the use of quotation characters in mdadm.conf
Single or double quotes protect spaces, and each kind of quote protects the other.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 11:28:15 +10:00
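
The quoting rule documented in 7df8a7b971 can be illustrated with a small tokenizer. This is a sketch under assumptions, not mdadm's parser: the function name `split_conf_words` is hypothetical, quotes are assumed to be allowed anywhere inside a word, and an unterminated quote is assumed to run to the end of the line.

```python
def split_conf_words(line):
    """Split a line on whitespace, honouring single and double quotes."""
    words = []
    current = []
    in_word = False
    quote = None  # the quote character currently open, if any
    for ch in line:
        if quote:
            if ch == quote:
                quote = None  # closing quote is not part of the word
            else:
                current.append(ch)  # spaces and the other quote are literal
        elif ch in ("'", '"'):
            quote = ch
            in_word = True
        elif ch.isspace():
            if in_word:
                words.append("".join(current))
                current = []
                in_word = False
        else:
            current.append(ch)
            in_word = True
    if in_word:
        words.append("".join(current))
    return words
```

With this sketch, a double-quoted value may contain spaces and single quotes, and a single-quoted value may contain spaces and double quotes.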
NeilBrown 64a78416e3 Manage: support --fail set-X and --remove set-X
A RAID10 array can have 'sets' of devices which are reported by
--detail.
They can now be collectively failed or removed.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 11:17:50 +10:00
NeilBrown 276be5147e Wait: also wait if an action is about to start.
If a sync/recover action is about to start but hasn't actually begun
yet, /proc/mdstat won't show it, but md/sync_action will (it checks
MD_RECOVERY_NEEDED).
So when /proc/mdstat seems to say nothing is happening, double check
with md/sync_action.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-01 10:23:40 +10:00
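
The double check described in 276be5147e can be sketched as a small decision function. This is a Python illustration, not the mdadm code: `action_pending` is a hypothetical name, the caller is assumed to have already parsed /proc/mdstat into a boolean, and any sync_action value other than "idle" is treated as activity, which is a simplification.

```python
def action_pending(mdstat_shows_activity, sync_action):
    """Return True if a sync action is running or about to start."""
    if mdstat_shows_activity:
        return True
    # /proc/mdstat is quiet, so double-check the sysfs attribute, which
    # may already report an action that has not visibly begun yet.
    return sync_action.strip() != "idle"
```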
NeilBrown 79b2ed4f24 tests: zero devices before --adding them.
Linux 3.10 will allow more "--add" to be handled as "--re-add".
To be sure the tests work correctly we sometimes need to zero
the device to ensure it really is an --add that happens.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-01 09:24:11 +10:00
Jes Sorensen bf3a33b35c mdmon: Add missing option documentation to --help output
Document that -a is equivalent to --all, as well as --foreground / -F

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-26 08:45:05 +10:00
mwilck@arcor.de 3f188b1081 DDF: fix bug in compare_super_ddf
Fix bug in previous patch
"DDF: compare_super_ddf: merge local info of other superblock"

Just discovered this bug in my last patch set - unfortunately, just after
you committed it.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-24 16:33:46 +10:00
mwilck@arcor.de a6592497cd tests/10ddf-create: omit log output check
The test script was counting output lines - its expectations
don't match the current code any more. Remove this pointless
test.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-23 14:55:32 +10:00
mwilck@arcor.de 6b374ba368 monitor: treat unreadable array_state as clean
Failure to read array_state can only mean the array has been
deleted by the kernel; it is not an indication that the array
is dirty.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-23 14:55:32 +10:00
mwilck@arcor.de 40ae6f5f8e monitor: read_and_act: handle race conditions for resync_start
When arrays are stopped, sysfs attributes may be deleted by
the kernel, and attempts to read these attributes will fail.

Setting resync_start to 0 is wrong in this case, because it
may make is_resync_complete() erroneously return
FALSE for a clean array. It is better to leave resync_start
untouched (the previously read value for this array).

Otherwise set_array_state() will pass the wrong state information
to the metadata handler, which will write it to disk, and at
the next restart an unnecessary recovery is started for the
array.

It is also possible that resync_start is actually *not* deleted
yet when read_and_act is running, and an apparently valid
value of "0" is read from it, with the same effect as described
above. This happens if the kernel has already called md_clean()
on the array (setting recovery_cp = 0), but the delayed removal
of "resync_start" hasn't happened yet. Therefore, in "clear"
state, "resync_start" shouldn't be read at all.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-23 14:55:32 +10:00
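
The two rules in 40ae6f5f8e (keep the previous value when the read fails, and do not read at all in "clear" state) can be sketched as below. This is a Python illustration under assumptions, not the mdmon code: `next_resync_start` and `read_attr` are hypothetical names, and the attribute is assumed to parse as an integer.

```python
def next_resync_start(previous, array_state, read_attr):
    """Pick the resync_start value to keep after one monitoring pass.

    read_attr is a callable returning the attribute's integer value,
    or raising OSError if the sysfs file has gone away.
    """
    if array_state == "clear":
        # The attribute may still exist but hold a stale 0; don't read it.
        return previous
    try:
        return read_attr()
    except OSError:
        # Attribute deleted while the array is being stopped:
        # keep the previously read value rather than resetting to 0.
        return previous
```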
mwilck@arcor.de 2b60d2890f monitor: don't call pselect() on deleted sysfs files
It makes no sense to listen for events on files that have
been deleted. This happens when arrays are stopped and the
kernel removes the associated sysfs structures.

Calling pselect() on the deleted attributes may cause a storm
of wake events.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-23 14:55:32 +10:00
mwilck@arcor.de 7d5a7ff3da DDF: add code to debug state changes
The 10ddf-create test case fails sporadically because wrong meta
data is written, making the array appear inconsistent when it's
restarted. Added code to aid debugging this.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-23 14:55:32 +10:00
mwilck@arcor.de bedbf68a08 DDF: brief_detail_super_ddf: print correct UUID for subarrays
Commit c1ea5a98 caused brief_detail_super_ddf() to be called
for subarrays. But the UUID printed was always the one of the
container. This is wrong and actually worse than printing no UUID
at all, and causes the DDF test case (10ddf-create) to fail.

This patch adds code to determine the MD UUID of a subarray correctly.
The hard part is to figure out for which subarray the function is
called. Moved that to an extra function.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-23 14:55:31 +10:00
mwilck@arcor.de dc9e279c13 DDF: __write_init_super_ddf: just use seq number of active header
It's not necessary to check for 0xffffffff, which is a valid
sequential number.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-23 14:55:31 +10:00
mwilck@arcor.de dacf3dc5d4 DDF: __write_ddf_structure: Fix wrong reference to ddf->primary
Should reference "header" instead here.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-23 14:55:31 +10:00
NeilBrown 2fdf559d74 Manage_runstop: call flush_mdmon if O_EXCL fails on stopping mdmon array.
When stopping an mdmon array, a reshape might be being aborted,
which inhibits O_EXCL.  So if that is possible, call flush_mdmon
to make sure mdmon isn't still busy.

Reported-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-22 17:05:33 +10:00
Przemyslaw Czarnowski 79b68f1b48 imsm: monitor: do not finish migration if there are no failed disks
Transition from "degraded" to "recovery" made in OROM is slightly different
than the same transition in mdadm. A missing disk is not removed from the list of
raid devices, but just from the map. Therefore mdadm should not end migration
based on the existence of a list of missing disks but should rely on the count of
failed disks.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Tested-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-22 16:21:17 +10:00
Pawel Baldysiak 4edb8530e8 Add updating component_size to manager thread of mdmon
Mdmon does not currently update component_size. This is wrong because when
the size is expanded, component_size is changed by mdadm but mdmon does not
reread the new value and keeps using the stale one. As a result the metadata
is incorrect during the expansion: it records neither that a
resync is in progress nor a checkpoint, so it reads
as if the resync has already finished when it has not.

Component_size will be set to match information in sysfs. This value
will be updated by manager thread in manage_member() function.
Now mdmon uses the correct, current value of component_size and the
correct metadata (containing information about resync and checkpoint)
is written.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-08 11:32:53 +10:00
NeilBrown 5e73b02409 Ensure mddev_dev struct always zeroed on allocation.
There are a number of fields which should not
be left uninitialised.  e.g. attempt_re_add can get
confused if ->writemostly is not set correctly.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-05 11:53:51 +11:00
NeilBrown 748952f73e Create: default to bitmap=internal for large arrays.
Here, "large" means components are 100G or more.  It is
usually beneficial to have write-intent bitmaps on such arrays.
They can be suppressed with --bitmap=none

Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-05 10:36:21 +11:00
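
The default described in 748952f73e can be sketched as a simple threshold rule. This is a Python illustration, not the mdadm code: `choose_bitmap` is a hypothetical name, "100G" is assumed here to mean 100 GiB, and an explicit request (such as "none" from --bitmap=none) is assumed to always win.

```python
GIB = 1024 ** 3
LARGE_COMPONENT_BYTES = 100 * GIB  # assumption: "100G" read as 100 GiB


def choose_bitmap(component_size_bytes, requested=None):
    """Return the bitmap setting to use when creating an array."""
    if requested is not None:
        return requested  # an explicit choice overrides the default
    if component_size_bytes >= LARGE_COMPONENT_BYTES:
        return "internal"  # large components: default to a bitmap
    return "none"
```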
NeilBrown 8af530b07f Enhance incremental removal.
When asked to incrementally-remove a device, try marking the array
read-auto first.  That will delay recording the failure in the
metadata until it is really relevant.
This way, if the devices are just unplugged when the array is not
really in use, the metadata will remain clean.

If marking the device as faulty fails because it is EBUSY, that
implies that the array would be failed without the device.  As the
device has (presumably) gone, that means the array is dead.  So try
to stop it.  If that fails because it is in use, send a uevent to
report that it is gone.  Hopefully whoever mounted it will now let go.

This means that if you plug in some devices and they are
auto-assembled, then unplugging them will auto-deassemble relatively
cleanly.

To be complete, we really need the kernel to disassemble the array
after the last close somehow.  Maybe if a REMOVE has failed and a STOP
has failed and nothing else much has happened, it could safely stop
the array on last close.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-05 09:46:34 +11:00
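
The removal sequence described in 8af530b07f can be sketched as a chain of fallbacks. This is a Python illustration under assumptions, not the mdadm code: all four callables are hypothetical stand-ins that raise OSError on failure, and treating the read-auto step as best effort is an assumption.

```python
import errno


def incremental_remove(set_read_auto, set_faulty, stop_array, send_uevent):
    """Run the removal steps in order and return the ones that happened."""
    steps = []
    try:
        set_read_auto()  # delays recording the failure in the metadata
        steps.append("read-auto")
    except OSError:
        pass  # assumption: best effort, carry on regardless
    try:
        set_faulty()
        steps.append("faulty")
        return steps
    except OSError as exc:
        if exc.errno != errno.EBUSY:
            raise
    # EBUSY: the array cannot survive without this device, so it is dead.
    try:
        stop_array()
        steps.append("stopped")
    except OSError:
        # Still in use (e.g. mounted): announce that it is gone.
        send_uevent()
        steps.append("uevent")
    return steps
```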
NeilBrown 401f095c39 mdadm.8: Detail use for IMSM_NO_PLATFORM environment variable.
Suggested-by: Marcin Tomczak <marcin.tomczak@intel.com>
2013-03-04 17:25:36 +11:00
mwilck@arcor.de c1ea5a9809 Detail.c: call load_container for container subarrays
Without calling load_container at this point, the
info structure may be missing some important information.
In particular, information about secondary DDF RAID levels
may be wrong if information is only read from a single disk.

If this fails, fall back to the previous code.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:15:51 +11:00
mwilck@arcor.de 4eefd651f0 DDF: compare_super_ddf: merge local info of other superblock
If a match is found in compare_super_ddf, check the other SB
for local DDF information (VD config records, physical disk data)
which is not available in the current superblock, and add it
if needed.

This is important for mdmon: when disks are added to an
auto read-only array, they must be present in the DDF structure
in order to guarantee consistent writeback of metadata to all
disks.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:15:06 +11:00
mwilck@arcor.de 2d21069764 DDF: add sanity checks in compare_super_ddf
Besides container GUID, also check seqnum, physical and virtual
disk numbers, and check match between local and global sections.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:14:17 +11:00
mwilck@arcor.de e3c2a365e9 DDF: __write_init_super_ddf: use correct VD conf
When writing back the DDF structure, make sure that on each disk
we write the configs that include this disk even if a secondary
RAID level is present. Otherwise the secondary RAID will not be
read correctly any more when we open the device next time.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:13:21 +11:00
mwilck@arcor.de 4e5870181a DDF: container_content_ddf: handle RAID layout for RAID10
This patch adds basic handling for the special case of RAID10.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:10:38 +11:00
mwilck@arcor.de a5c7adb310 DDF: container_content_ddf: check for secondary RAID
Check for supportable secondary RAID configurations.
There is currently only one: RAID 10, if the stripe
sizes and Basic volume sizes are all equal.

With this patch, mdadm will not try to start unsupported
secondary RAID level configurations any more.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:08:46 +11:00
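
The equality condition stated in a5c7adb310 can be sketched as follows. This is a Python illustration, not the DDF code: `secondary_raid_supported` is a hypothetical name, each basic volume is modelled as a (stripe_size, volume_size) tuple, and the configuration is assumed to have already been identified as a RAID 10 candidate.

```python
def secondary_raid_supported(bvds):
    """Return True if every basic volume has the same stripe and volume size.

    bvds is a list of (stripe_size, volume_size) tuples, one per basic volume.
    """
    # A set collapses identical tuples, so exactly one distinct entry
    # means all stripe sizes and all volume sizes agree.
    return len(set(bvds)) == 1
```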
mwilck@arcor.de 8a38db8674 DDF: container_content_ddf: change array disk search loop
When searching for container elements, loop over the known phys
disks rather than the elements of the current configuration.

This patch changes nothing in the logic or return value of the code.
It just prepares extended logic for handling RAID10.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:07:09 +11:00
mwilck@arcor.de 3dc821b091 DDF: load_ddf_local: store VD conf for other BVDs
Store VD config for other BVDs in the other_bvds array.
This allows handling secondary RAID levels in container_content_ddf.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:03:44 +11:00
mwilck@arcor.de 8ec5d68536 DDF: added other_bvd to struct vcl
The VD config structures of different BVDs in the same SVD may be
different. This pointer stores the other BVDs.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 15:59:38 +11:00
mwilck@arcor.de 0175cbf62c DDF: increase seq number when writing meta data
Cleanly increase the seq number when the DDF structures are
written, instead of always setting it back to 1.

Also, make sure that the sequential number of all headers and
VD conf records is the same.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 14:29:59 +11:00
mwilck@arcor.de 097bcf0057 DDF: use existing locations for primary and secondary DDF structure
Some RAID BIOSes apparently use hard-coded LBA offsets (presumably
from the end of the disk) for the primary and secondary DDF
structure, ignoring the values given in the DDF anchor. This is
broken BIOS behavior, but it will cause any changes made by MD
(e.g. setting the init_state flag after a full initialization)
to be "forgotten" after the next reboot.

This patch fixes this by using the existing LBA locations if
available. Verified that this fixes MD+LSI Mega Software RAID
BIOS.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 14:19:50 +11:00
mwilck@arcor.de 7f798aca5b DDF: cleanly save the secondary DDF structure
So far, mdadm only saved the header of the secondary structure.
With this patch, the full secondary DDF structure is saved
consistently, too. Some vendor DDF implementations need it.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 14:19:50 +11:00
NeilBrown 4dd2df0966 Discard devnum in favour of devnm
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.

So get rid of devnum (a number) and use devnm (a 32char string).
eg.
  md0
  md_d2
  md_home

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-21 17:05:23 +11:00
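
The relationship between the old number and the new name in 4dd2df0966 can be sketched as a conversion. This is a Python illustration, not the mdadm code: `devnum_to_devnm` is a hypothetical name, and the encoding of md_dN as the negative number -1 - N is an assumption, since the message only says such devices were negative.

```python
def devnum_to_devnm(devnum):
    """Convert an old-style device number to a device name string."""
    if devnum >= 0:
        return "md%d" % devnum
    # Assumption: partitionable device N was stored as -1 - N.
    return "md_d%d" % (-1 - devnum)
```

Names such as md_home have no numeric equivalent, which is why a string is needed in place of a number.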
NeilBrown fdcad551e9 Grow: fix problem with reshaping RAID4 to RAID0.
As 'layout' doesn't map neatly from RAID4 to RAID5, we need to
set it correctly for RAID4.
Also, when no reshape is needed we should set re->level to the final
desired level.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-21 17:02:21 +11:00