This allows the info for a single array to be extracted,
so we don't have to write it into st->subarray.
For consistency, implement container_content for super0 and super1,
to just return the mdinfo for the single array.
Signed-off-by: NeilBrown <neilb@suse.de>
Rather than hiding this in the 'st', return it explicitly.
In the one case we still need it, copy it into st where needed.
This will disappear in a future patch.
Signed-off-by: NeilBrown <neilb@suse.de>
Rather than hiding this arg in the 'st' structure, pass it explicitly.
This is a first step to getting rid of 'subarray' from 'supertype'.
The strcpy in open_subarray should have better error checking, but it
will disappear soon so there is little point.
Signed-off-by: NeilBrown <neilb@suse.de.
To accurately detect when an array has been split and is now being
recombined, we need to track which other devices each thinks is
working.
We should never include a device in an array if it thinks that the
primary device has failed.
This patch just allows get_info_super to return a list of devices
and whether they are thought to be working or not.
Signed-off-by: NeilBrown <neilb@suse.de>
Detail: report reshape and check as well as resync and recovery
Wait: if the resync is pending or delayed, wait for that too.
Signed-off-by: NeilBrown <neilb@suse.de>
If an --add is requested and a re-add looks promising but fails or
cannot possibly succeed, then don't try the add. This avoids
inadvertently turning devices into spares when an array is failed but
the devices seem to actually work.
Signed-off-by: NeilBrown <neilb@suse.de>
Allow disk-policy to be computed given the path and
disk type explicitly. This can be used when hunting through
/dev/disk/by-path for something interesting.
Signed-off-by: NeilBrown <neilb@suse.de>
To support incorpating a new bare device into a collection of arrays -
one partition each - mdadm needs a modest understanding of partition
tables.
The main needs to be able to recognise a partition table on one device
and copy it onto another.
This will be done using pseudo metadata types 'mbr' and 'gpt'.
Signed-off-by: NeilBrown <neilb@suse.de>
A device can be in a number of domains.
The domains of an array is the union of the domains of all devices.
A device is allowed to join an array when its set of domains is a
subset of the array's domains.
Signed-off-by: NeilBrown <neilb@suse.de>
Policy can be stated as lines in mdadm.conf like:
POLICY type=disk path=pci-0000:00:1f.2-* action=ignore domain=onboard
This defines two distinct policies which apply to any disk (but not
partition) device reached through the pci device 0000:00:1f.2.
The policies are "action=ignore" which means certain actions will
ignore the device, and "domain=onboard" which means all such devices
as treated as being united under the name 'onboard'.
This patch just adds data structures and code to read and
manipulate them. Future patches will actually use them.
Signed-off-by: NeilBrown <neilb@suse.de>
It turns out that /lib/init/rw doesn't exist in early boot
like I thought. So give up on that idea and just use
/dev/.mdadm/ for files that must persist from early-boot
to regular boot.
Signed-off-by: NeilBrown <neilb@suse.de>
We now have 3 directory definitions: mdmon directory for its pid and
sock files (compile time define, not changable at run time), mdmonitor
directory which is for the mdadm monitor mode pid file (can only be
passed in via command line at the time mdadm is invoked in monitor mode),
and the directory for the mdadm incremental assembly map file (compile
time define, not changable at run time). Only the mdadm map file still
hunts multiple locations, and the number of locations has been reduced
to /var/run and the compile time specified location. Re-use of similar
sounding defines that actually didn't denote their actual usage at
compile time made it more difficult for a person to know what affect
changing the compile time defines would have on the resulting programs.
This patch renames the various defines to clearly identify which item
the define affects. It also reduces the number of various directories
which will be searched for these files as this has lead to confusion
in mdadm and mdmon in terms of which files should take precedence when
files exist in multiple locations, etc. It's best if the person
compiling the program intentionally and with planning selects the
right directories to be used for the various purposes. Which directory
is right depends on which items you are talking about and what boot
loader your system uses and what initramfs generation program your
system uses. Because of the inter-dependency of all these items it
would typically be up to the distribution that mdadm is being integrated
into to select the correct values for these defines.
Signed-off-by: Doug Ledford <dledford@redhat.com>
GET_ARRAY_INFO always succeeds on an inactive container, so we need to
be a bit more diligent about adding a disk to an active container.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
--test can be given in Manage mode.
This can be used when there is an attempt to fail or remove 'faulty',
'failed' or 'detached' devices, or to re-add 'missing' devices.
If no devices were failed, removed, or re-added, then mdadm will
exit with status '2'.
Signed-off-by: NeilBrown <neilb@suse.de>
This can be used for hot-unplug. When a device has been remove,
udev can call
mdadm --incremental --fail sda
and mdadm will find the array holding sda and remove sda from
the array.
Based on code from Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This allows finding the array which contains a given component.
Components are named using the kernel-internal string name such
as "sda1" or "hdb".
Don't return member arrays, only the contain that contains them.
Also tidy up the parsing of 'inactive' arrays in /proc/mdstat.
If we see 'inactive' we need to set 'in_devs' immediately as there
is no level coming.
Signed-off-by: NeilBrown <neilb@suse.de>
Allow the name of the array stored in the metadata to be updated. In
some cases the metadata format may not be able to support this rename
without modifying the UUID. In these cases the request will be blocked.
Otherwise we allow the rename to take place, even for active arrays.
This assumes that the user understands the difference between the kernel
node name, the device node symlink name, and the metadata specific name.
Anticipating further need to modify subarrays in-place, introduce the
->update_subarray() superswitch method. A future potential use
case is setting storage pool (spare-group) identifiers.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
...i.e. GET_DEVS == (GET_DEVS|SKIP_GONE_DEVS)
A null pointer dereference in Incremental.c can be triggered by
replugging a disk while the old name is in use. When mdadm -I is called
on the new disk we fail the call to sysfs_read(). I audited all the
locations that use GET_DEVS and it appears they can tolerate missing a
drive. So just make SKIP_GONE_DEVS the default behaviour.
Also fix up remaining unchecked usages of the sysfs_read() return value.
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Make create check with the appropriate meta data handler and see what the
largest chunk size is supported. The current 512K default is not supported
by existing imsm OROM.
[dan.j.williams@intel.com: trim the upper limit to 512k for future oroms]
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Support for deleting a subarray out of a container. When all subarrays
are deleted the component devices are converted back into spares, a
--zero-superblock is still needed to kill the remaining metadata at this
point. This operation is blocked when the subarray is active and may
also be blocked by the metadata handler when deleting the subarray might
change the uuid of other active subarrays. For example, with imsm,
deleting subarray 'n' may change the uuid of subarrays with indexes > n.
Deleting a subarray needs to be a container wide event to ensure
disks that record the modified subarray list perceive other disks that
did not receive this change as out of date.
Notes:
The st->subarray parsing in super-intel.c and super-ddf.c is updated to
be more strict now that we are reading user supplied subarray values.
Offline container modification shares actions that mdmon typically
handles so promote is_container_member() and version_to_superswitch()
(formerly find_metadata_methods()) to generic utility functions for the
cases where mdadm performs the operation.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This is needed for imsm where:
1/ we want to report raid_disks as zero to allow mdadm -As to
incorporate all spares
2/ we can't determine stale disks by looking at the event counts.
3/ we can't see per-subarray expectations with the info returned from
the container level ->getinfo_super()
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
These metadata are not expected on partitions, and they have
no way of differentiation whether which is correct if they
are found both on the device and on the last partition.
So if the device is a partition, refuse to read the metadata.
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm prevents creation when device names are duplicated on the command
line, but leaves the partially created array intact. Detect this case
in the error code from add_to_super() and cleanup the partially created
array. The imsm handler is updated to report this conflict in
add_to_super_imsm_volume().
Note that since neither mdmon, nor userspace for that matter, ever saw an
active array we only need to perform a subset of the cleanup actions.
So call ioctl(STOP_ARRAY) directly and arrange for Create() to cleanup
the map file rather than calling Manage_runstop().
Reported-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Currently "mdadm -As" will process the entries in the config
file in order. If any array is a component or member of a preceding
array, that array will not be assembled.
So if there are any failures during assembly, retry those arrays,
and look until everything is assembled, or nothing more can
be assembled.
Signed-off-by: NeilBrown <neilb@suse.de>
Monitoring /proc/mounts and creating a .pid file as soon as /var/run
is writable is racy. Most distros clean all non-directories from
/var/run early in boot and if mdmon races with this it could
lose the files as soon as they are created.
Instead require that "mdmon --takeover" be run after /var is writable.
Signed-off-by: NeilBrown <neilb@suse.de>
/var/run probably doesn't persist from early boot.
So if necessary, store in in /lib/init/rw or somewhere else
that does persist.
Signed-off-by: NeilBrown <neilb@suse.de>
Unlike native md checkpointing some data about the geometry and type of
the migration process is coded into curr_migr_unit. Provide logic to
convert between md/{resync_start|recovery_start} and imsm/curr_migr_unit.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Minimal changes needed to permit reassembling partially recovered
external metadata arrays. The biggest logical change is that
->container_content() can now surface partially rebuilt members rather
than omitting them from the disk list.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Replace occurrences of ~0ULL to make it clear we are talking about maximal
resync/recovery position.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The patch increases the capacity of buffers used to store
sysfs path names. Originally the buffers were too small to
hold the canonical representation of sysfs path (in case
of a SAS device, especially a device installed behind an
expander).
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Reviewed-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When creating an array, check if the devices have partition
tables and print a warning if the table or the partitions might be
destroyed by array creation.
Signed-off-by: NeilBrown <neilb@suse.de>
- When --kill-superblock is used with --metadata, find every
different superblock if there are several and kill them all.
- When creating a new array, kill off any old metadata. The code
to do this was already present but has become broken over time.
Signed-off-by: NeilBrown <neilb@suse.de>