Commit Graph

161 Commits

Author SHA1 Message Date
NeilBrown 72ca9bcff3 Allow data-offset to be specified per-device for create
mdadm --create /dev/md0 .... /dev/sda1:1024 /dev/sdb1:2048 ...

The size is in K unless a suffix: K M G is given.
The suffix 's' means sectors.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown af4348ddd1 Add data_offset arg to ->validate_geometry.
This is needed to return correct available size.  It isn't
really used yet.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
NeilBrown 387fcd593c Add data_offset arg to ->avail_size
This is currently only useful for 1.x metadata and will allow an
explicit --data-offset request on command line.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
NeilBrown aab15415ed Manage: fix checks for removal from a container.
We must only remove from a container if the device isn't a
member of any member array.
To check we look at the 'holders' directory in sysfs.

We currently skip that check if ->devname is "detached", however
that can never be true since the change that introduced
add_detached().

Also sysfs_unique_holder returns status in 'errno' which isn't
entirely safe as e.g. closedir() is probably allowed to clear it.

So make sysfs_unique_holder return an unambigious value, and us
it to decide what to report.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-24 12:26:03 +10:00
NeilBrown 9cf9a1de36 Manage: zero metadata before adding to 'external' array.
'external' arrays don't support --re-add yet so old metadata is no
value, and 'ddf' gets confusing in mdmon if old metadata is found.
So for now, zero out any old metadata found before adding a spare to
an externally-managed array.

Reported-by: Albert Pauw <albert.pauw@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-15 09:51:20 +10:00
Lukasz Dorau 6d43efb59b Manage.c: fix make everything compilation error
This patch fixes the following make everything compilation error:
Manage.c: In function ‘Manage_add’:
Manage.c:538: error: ‘dev_st’ may be used uninitialized in this function
make: *** [mdadm.Os] Error 1

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-14 09:55:29 +10:00
NeilBrown d070235d3f Manage_subdevs: factor out Manage_delete
Now Manage_subdevs is now small enough to be manageable.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-13 08:00:21 +10:00
NeilBrown 38aeaf3af6 Manage_subdevs: split most of 'add' handling into Manage_add.
This makes Manage_subdevs smaller, and makes the error-path handling
for Manage_add much cleaner and probably less buggy.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-13 08:00:21 +10:00
NeilBrown abe94694da Manage: split out attempt_re_add.
The indent level is way too deep here, and this is a well defined
task, so split it out to a separate function.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-13 08:00:21 +10:00
NeilBrown 46d475beb4 Manage_subdev: give 'st' a better name and narrower focus.
'st' is use to examine the metadata on the device being added
to see if a 're-add' is possible.  However it is loaded long before
the 're-add' attempt is made.

So move the 'load_super' closer to were it is used - allowing us to
discard a number of 'free_super' call - and rename it to 'dev_st'
to emphasize that it related to the current device.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-13 08:00:20 +10:00
NeilBrown 7bd04da926 Manage: minor cosmetic fixes.
Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-13 08:00:20 +10:00
NeilBrown 1d9976430c Manage: simplify device searches in Manage_subdevs
We currently have rather hard-to-follow loop to iterate
through all the matches for 'missing' or 'faulty' or 'detached'.

Simplify it by creating a list of possible devices for each
of those and splicing the new list into the device list.

This removes the need for 'jnext' and 'next' and various other hacks.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:22:16 +10:00
NeilBrown ba728be72f Convert 'quiet' to 'not verbose' in various places.
If we change some functions to accept 'verbose', where <0 means to be
quiet, in place of 'quiet', then we will be able to merge
'quiet' and 'verbose' together for simplicity.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:18:09 +10:00
NeilBrown 503975b9d5 Remove scattered checks for malloc success.
malloc should never fail, and if it does it is unlikely
that anything else useful can be done.  Best approach is to
abort and let some super-daemon restart.

So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit.  Then use those
removing all the tests for failure.

Also replace all "malloc;memset" sequences with 'xcalloc'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown c8e1a230b7 Remove re_add flag in favour of new disposition.
Instead of
   disposition == 'a'  re_add == 1
use
   disposition == 'A'

to record that a re-add was requested.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown e7b84f9d50 Introduce pr_err for printing error messages.
'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": '
cont_err() is also available.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown 0a999759b5 Relax restrictions on when --add is permitted.
The restriction that --add was not allowed on a device which
looked like a recent member of an array was overly harsh.

The real requirement was to avoid using --add when the array had
failed, and the device being added might contain necessary
information which can only be incorporated by stopping and
re-assembling with --force.

So change the test to reflect the need.

Reported-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-18 14:19:49 +10:00
NeilBrown 480f356641 Raid limit of 1024 when scanning for devices.
When we can for devices using GET_DISK_INFO we currently
limit to 1024.  But some arrays can have more than this.
So raise it to 4096 and make the constant a #define.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-18 09:06:02 +10:00
NeilBrown 3556c2fafb Fix typo: wan -> want
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-04 14:02:00 +10:00
NeilBrown 9f58469128 Manage: freeze recovery while adding multiple devices.
If the kernel supports it, freeze recovery over multiple adds,
so that they can all be added to the array at the same time and
be recovered in parallel.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-22 16:15:03 +11:00
NeilBrown bcbb3112d2 Manage: replace 'return 1' with 'goto abort'.
This will allow exit processing in next patch

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-22 16:07:02 +11:00
NeilBrown c69ffac0d6 Manage: allow --re-add to failed array.
If both "legs" of a RAID1 (or equivalent in RAID10) fail, then one
of the becomes available again it maybe appropriate to re-add the
failed device(s).
So remove the restriction that an array must has 'enough' devices
before being re-added, and if there is no-where to read a superblock
from for matching, then assume the kernel will do necessary checks.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-07 15:25:57 +11:00
Jes Sorensen 1471b8b14b Manage_ro(): Check pointer rather than dereferencing it
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-03 08:09:41 +11:00
Jes Sorensen bccd8153fa Manage_runstop(): Avoid memory leak
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Jes Sorensen b73e45ae6a Managa_ro(): free() mdi before exiting
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Jes Sorensen 093d918759 Manage_subdevs(): avoid leaking super
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Jes Sorensen d9ca03e9c3 remove_devices(): readlink returns -1 on error
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Doug Ledford 16715c01f7 Fix readding of a readwrite drive into a writemostly array
If you create a two drive raid1 array with one device writemostly, then
fail the readwrite drive, when you add a new device, it will get the
writemostly bit copied out of the remaining device's superblock into
it's own.  You can then remove the new drive and readd it as readwrite,
which will work for the readd, but it leaves the stale WriteMostly1 bit
in devflags resulting in the device going back to writemostly on the
next assembly.

The fix is to make sure that A) when we readd a device and we might have
filled the st->sb info from a running device instead of the device being
readded, then clear/set the WriteMostly1 bit in the super1 struct in
addition to setting the disk state (ditto for super0, but slightly
different mechanism) and B) when adding a clean device to an array (when
we most certainly did copy the superblock info from an existing device),
then clear any writemostly bits.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-09-19 13:06:38 +10:00
NeilBrown 11b391ece9 Discourage large devices from being added to 0.90 arrays.
0.90 arrays can only use up to 4TB per device.  So when a larger
device is added, complain a bit.  Still allow it if --force is given
as there could be a valid use.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-09-08 13:05:31 +10:00
NeilBrown 9e6d929127 Check all member devices in enough_fd
The loop over all member devices in enough_fd could easily stop
before it had found all devices.  This would cause --re-add to
fail incorrectly.

So change the loop to be based on the reported number of devices
in the device - with a safe-guard limit of 1024.

Change some other loops to be more careful too.

Reported-by: "Schmidt, Annemarie" <Annemarie.Schmidt@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-05-23 17:21:35 +10:00
NeilBrown 873eec468c Manage: minor fix to add/re-add handling.
If using an old kernel we should still check if a re-add might be
intended, so we can refuse and require a '--zero' first if it is not
possible.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-05-10 16:20:25 +10:00
NeilBrown 51d9a2ce33 Merge branch 'master' into devel-3.2
Conflicts:
	Incremental.c
	Manage.c
	ReadMe.c
	inventory
	mdadm.8.in
	mdadm.spec
	mdassemble.8
	mdmon.8
2011-03-24 12:00:55 +11:00
NeilBrown fb0d4b9ca2 --stop: separate 'is busy' test for 'did it stop properly'.
Stopping an md array requires that there is no other user of it.
However with udev and udisks and such there can be transient other
users of md devices which can interfere with stopping the array.

If there is a transient users, we really want "mdadm --stop" to wait a
little while and retry.
However if the array is genuinely in-use (e.g. mounted), then we
don't want to wait at all - we want to fail immediately.

So before trying to stop, re-open device with O_EXCL.  If this fails
then the device is probably in use, so give up.

If it succeeds, but a subsequent STOP_ARRAY fails, then it is possibly
a transient failure, so try again for a few seconds.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-23 15:42:24 +11:00
Adam Kwolek c0f8269d57 FIX: Add spare throws exception (v2)
sync_metadata() requires st->sb to be loaded, otherwise exception is
generated.  This fails expansion, because spares cannot be added.

metadata update uses tst instead st pointer, it is better than
loading anchor for st as I proposed previously.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-20 15:47:17 +11:00
Krzysztof Wojcik 1ae42d9d99 Retry writing 'inactive' state during stopping array
Issue observed:
Sporadicaly stopping arrays using "mdadm -Ss" command does not succeded.
Cause:
Writting "inactive" to the array state not succeded- array is busy
(accessed by udev, blkid etc.)
Resolution:
If writing 'inactive' fails, wait and retry again (because it is possibly
a transient failure)

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-18 12:42:17 +11:00
Adam Kwolek 983fff45a1 FIX: ping_monitor() usage causes memory leaks
When for ping_monitor() input devnum2devname() is used,
received string pointer should be passed to free() for memory release.
It is not made in several places. This use case should have function
to avoid memory leak.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-18 12:32:16 +11:00
NeilBrown d6221e667f Manage: fix the mess I made in earlier patch.
When I separated the 'native metadata' case more cleanly from the
"external metadata" case for adding a drive, I left some 'external'
code in the 'native' case, and didn't copy it to the 'external' case.

When - in the external case - we add to super, we much check for
mdmon first, so we know whether to do the metadata update ourselves
or not, then afterwards call either flush_metadata_updates (to send
to mdmon) or sync_metadata (to do it directly).

Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-18 12:31:45 +11:00
NeilBrown eb0af52689 --stop: separate 'is busy' test for 'did it stop properly'.
Stopping an md array requires that there is no other user of it.
However with udev and udisks and such there can be transient other
users of md devices which can interfere with stopping the array.

If there is a transient users, we really want "mdadm --stop" to wait a
little while and retry.
However if the array is genuinely in-use (e.g. mounted), then we
don't want to wait at all - we want to fail immediately.

So before trying to stop, re-open device with O_EXCL.  If this fails
then the device is probably in use, so give up.

If it succeeds, but a subsequent STOP_ARRAY fails, then it is possibly
a transient failure, so try again for a few seconds.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-17 13:35:10 +11:00
NeilBrown 88b496c269 Merge branch 'master' into devel-3.2
Conflicts:
	Manage.c
	managemon.c
	super-ddf.c
	super-intel.c
2011-03-15 15:35:04 +11:00
NeilBrown 02c39ab1d5 Manage/external: for external metadata, add_to_super needs lock on container.
add_to_super could use information from the current superblock (ddf
does), so add_to_super for external metadata should be called with
the O_EXCL lock held on the container to ensure the update is complete
before any other process tries to make any changes (like adding
another device to array).

Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-15 14:48:20 +11:00
NeilBrown d6508f0cfb Manage: be more careful about --add attempts.
If an --add is requested and a re-add looks promising but fails or
cannot possibly succeed, then don't try the add.  This avoids
inadvertently turning devices into spares when an array is failed but
the devices seem to actually work.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-10 17:25:40 +11:00
Czarnowska, Anna 0081eb007c modified message on failure to read metadata in Manage
Loading container may fail if e.g. one of the disks in container
has been detached but udev has not realized the change.
Addition to such array will fail because reading superblock
from one of disks in array fails.
Current message is a bit confusing.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-21 16:43:41 +11:00
NeilBrown 47573b0015 Fix regression with removing 'failed' and 'detached' devices.
If a request to remove all 'failed' or 'detached' devices chooses to
remove the first device, it will not actually try the removal and will
skip any following devices.

This fixes it.

Reported-by:  Rémi Rérolle <rrerolle@lacie.com>
Tested-by:  Rémi Rérolle <rrerolle@lacie.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-15 10:45:42 +11:00
NeilBrown 5b660791b4 Fix regression with removing 'failed' and 'detached' devices.
If a request to remove all 'failed' or 'detached' devices chooses to
remove the first device, it will not actually try the removal and will
skip any following devices.

This fixes it.

Reported-by:  Rémi Rérolle <rrerolle@lacie.com>
Tested-by:  Rémi Rérolle <rrerolle@lacie.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-15 10:45:01 +11:00
NeilBrown 562e70e4c4 Call free_super before attempting to add a new device
Now that write_init_super doesn't close fds any more, we need
to call free_super before the ADD_NEW_DISK ioctl.
Also call free_super before some error returns, for cleanliness.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-31 13:53:35 +11:00
NeilBrown 1cc7f4feb9 Don't close fds in write_init_super
We previously closed all 'fds' associated with an array in
write_init_super .. sometimes, and sometimes at bad times.
This isn't neat and free_super is a better place to close them.

So make sure free_super always closes the fds that the metadata
manager kept hold of, and stop closing them in write_init_super.

Also add a few more calls to free_super to make sure they really do
get closed.

Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-25 07:56:53 +11:00
Adam Kwolek 73cb8d43f4 Add spares to raid0 in mdadm
When user wants to add spares to container with raid0 arrays only
it is not possible to update metadata due to lack of running mdmon.
To allow for this direct metadata update by mdadm is used in such case.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-06 18:42:53 +11:00
Anna Czarnowska d52bb542d4 move_spare function modified and moved to Manage.c
It will also be needed for Incremental.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-05 14:34:32 +11:00
NeilBrown 833bb0f8f6 Allow --update=devicesize with --re-add
This is useful with 1.1 and 1.2 metadata to update the metadata if
the device size has changed.
The same functionality can be achieved by writing to the device size
in sysfs after re-adding normally, but in some cases this might be
easier.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-09 13:06:29 +11:00
Dan Williams 9ea5a25217 Manage: allow manual control of external raid0 readonly flag
mdadm --readwrite <subarray> will clear the external readonly flag ('-'
to '/'), but only for redudant arrays.  Allow raid0 arrays as well so
the user has a simple helper to control this flag.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-23 15:08:19 +11:00