Commit Graph

233 Commits

Author SHA1 Message Date
NeilBrown 6c90491f44 Incremental: don't be distracted by partition table when calling try_spare.
Currently a partition table on a device makes "mdadm -I" think
the array has a particular metadata type and so will only
add it to an array of that (partition table) type .. which doesn't
make any sense.

So tell guess_super to only look for 'array' metadata.

Reported-by: Caspar Smit <c.smit@truebit.nl>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-05 16:21:42 +11:00
NeilBrown 8832342d3a Assemble/Incremental: don't hold O_EXCL on mddev after assembly.
As soon as the array is assembled, udev or systemd might run
fsck and mount it.  So we need to drop O_EXCL promptly.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-12-05 10:35:16 +11:00
NeilBrown b11fe74db0 Incremental: improve support for "DEVICE" based restriction in mdadm.conf
--incremental currently fails if the device name passed does not
textually match the names permitted by the DEVICE line in mdadm.conf.
This is problematic when "mdadm -I" is run by udev as the name given
can be a temp name.

This patch makes two improvements:
1/ We generate a list of all existing devices that match the names
  in mdadm.conf, and allow rdev based matching
2/ We allows extra aliases to be provided on the command line, and
  perform textual matching on those.  This is particularly suitable
  for udev usages as ${DEVLINKS} can be provided even though the links
  make not yet be created.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-12-03 14:01:24 +11:00
NeilBrown 9ca39acb3e Incremental: add --export handling.
If --export is given with --incremental, then
  MD_DEVNAME
is output which gives the name of the device (in /dev/md) that
is the array (or container) that the device would be added to.
Also
  MD_STARTED
is set to one of
  no
  unsafe
  yes
  nothing

to indicate if the array was started.  IF MD_STARTED=unsafe
then it may be appropriate to run
  mdadm -R /dev/md/$MD_DEVNAME
after a timeout to ensure newly degraded array are started.

If
  MD_FOREIGN=yes
it might be appropriate to suppress this as the array is
probably not critical.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-11-28 15:15:30 +11:00
NeilBrown eb8b951657 Incremental: don't abort container if one member explicitly disabled.
If a member of a container is explicitly disabled, others may not
be so we should continue.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-11-28 13:33:56 +11:00
NeilBrown 2e44767fc2 Incremental: remove test that can never succeed.
Incremental_container never returns 1, so this test is pointless.
It is a holdover from when we called "Incremental()" rather than
"Incremental_container()" at this point.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-11-28 13:30:23 +11:00
NeilBrown d5a4041647 Make -IRs and --run work properly for containers.
We really need to make sure assemble_container_content()
gets called to finished the assembly of these.

Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-09-13 10:51:20 +10:00
NeilBrown 6f02172d2e Release mdadm-3.3
(and  various cosmetic fixes)

Signed-off-by: NeilBrown <neilb@suse.de>
2013-09-03 14:47:47 +10:00
NeilBrown d3786cdcd0 Change "mdadm --run" to use the same code as "mdadm --IRs".
Current "mdadm --run /dev/mdX" will not handle external metadata
properly.  mdmon won't be started etc.

So use the code from "mdadm -IRs" instead - that already does all
the right things.

Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-08-26 15:24:53 +10:00
NeilBrown fe7e0e64b0 Manage: split Manage_runstop into Manage_run and Manage_stop
The two branches have virtually nothing in common, so it is simpler if
they are separate.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-19 11:23:44 +10:00
NeilBrown f80057aec5 Assemble/Incr: Don't include spares with too-high event count.
Some failure scenarios can leave a spare with a higher event count
than an in-sync device.  Assembling an array like this will confuse
the kernel.
So detect spares with event counts higher than the best non-spare
event count and exclude them from the array.

Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-17 16:55:31 +10:00
NeilBrown 041b815f17 Incremental: allow --quiet to silence from errors from "-If"
-q is currently ineffective on "mdadm -If".   Messages that are not
usage errors should be suppressed.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-29 09:13:25 +10:00
NeilBrown ed503f89e4 Change some "fprintf(stderr,"s to pr_err.
They just keep slipping in..

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-21 12:42:57 +10:00
NeilBrown 83785d301f Incremental: remove partitions when assembling.
We remove partitions for --create and --assemble, but not for
--incrmental.
So fix that ommision.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-14 12:06:27 +10:00
NeilBrown 6b63c1a457 Incrmental: tell udevs to unmount when array looks to have disappeared.
If a device is removed which appears to be busy in an md array, then
it is very like the array cannot be used.
We currently try to stop it, but that could fail if udisks had
automatically mounted it.
So tell udisks to unmount it, but ignore any error.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 12:07:40 +10:00
NeilBrown 8af530b07f Enhance incremental removal.
When asked to incrementally-remove a device, try marking the array
read-auto first.  That will delay recording the failure in the
metadata until it is really relevant.
This way, if the device are just unplugged when the array is not
really in use, the metadata will remain clean.

If marking the default as faulty fails because it is EBUSY, that
implies that the array would be failed without the device.  As the
device has (presumably gone) - that means the array is dead.  So try
to stop it.  If that fails because it is in use, send a uevent to
report that it is gone.  Hopefully whoever mounted it will now let go.

This means that if  you plug in some devices and they are
auto-assembled, then unplugging them will auto-deassemble relatively
cleanly.

To be complete, we really need the kernel to disassemble the array
after the last close somehow.  Maybe if a REMOVE has failed and a STOP
has failed and nothing else much has happened, it could safely stop
the array on last close.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-05 09:46:34 +11:00
NeilBrown 4dd2df0966 Discard devnum in favour of devnm
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.

So get rid of devnum (a number) and use devnm (a 32char string).
eg.
  md0
  md_d2
  md_home

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-21 17:05:23 +11:00
NeilBrown 75a410f622 Incremental: allow recently removed device to be added as a spare.
Currently, action=force-spare isn't effective at all as I'm not
sure what is really sensible.
This patch allows a device that was part of an array, but has been
removed, to be added as a spare of passed to --incremental while
force-spare is active.

If it is can be re-added, that done first.  If it fails, we add it as
a spare.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-20 12:08:16 +11:00
NeilBrown cb8f6859d1 IMSM - allow assembling any imsm array even without OROM.
It is important to check for compatibility with 'platform' or
Option ROM when creating or changing and array.  However there is no
real need when simply assembling the array.

On some systems there are situations where the platform information is
not available.  e.g. on some UEFI systems, UEFI is not available
during 'kdump' handling.  This makes it impossible to assemble
an IMSM array to receive the dump.

So remove the requirements that the platform be visible to assemble
an IMSM array.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-20 12:07:30 +11:00
NeilBrown 72e7fb13f0 Incremental: support replacement devices.
These need to be counted in the number of 'active' devices.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-24 12:06:51 +11:00
NeilBrown 0431869cec Fix up interactions between --assemble and --incremental
If --incremental has partly assembled an array and
--assemble is asked to assemble it, the just finds remaining
devices and makes a new array.  Not good.

So:
1/ modify locking policy so that assemble can be sure that
  no --incremental is running once it locks the map file
2/ Assemble() checks the map file for a duplicate and adds to
   that array instead of creating a new one.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-10 18:27:32 +11:00
NeilBrown 5e88ab2e2f New RESHAPE_NO_BACKUP flag to track when backup action is needed.
Some arrays (raid10) never need a backup file, so during assembly
we can avoid the whole Grow_continue check in that case.
Achieve this using a flag set by the metadata handler.

Also get "mdadm -I" to fail if a backup process would be
needed.  It currently does fail as the kernel rejects things,
but it is nicer to have this explicit.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 387fcd593c Add data_offset arg to ->avail_size
This is currently only useful for 1.x metadata and will allow an
explicit --data-offset request on command line.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
NeilBrown ca3b669603 Minor cosmetic fixes in various files.
Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-13 08:00:21 +10:00
NeilBrown 11b6d91dd0 Change Incremental and related functions to take struct context
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:20:22 +10:00
NeilBrown 0ea8f5b167 Assemble: allow arrays to be assembled read-only.
The option was there, but never used.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown 503975b9d5 Remove scattered checks for malloc success.
malloc should never fail, and if it does it is unlikely
that anything else useful can be done.  Best approach is to
abort and let some super-daemon restart.

So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit.  Then use those
removing all the tests for failure.

Also replace all "malloc;memset" sequences with 'xcalloc'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown c8e1a230b7 Remove re_add flag in favour of new disposition.
Instead of
   disposition == 'a'  re_add == 1
use
   disposition == 'A'

to record that a re-add was requested.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown e7b84f9d50 Introduce pr_err for printing error messages.
'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": '
cont_err() is also available.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown 69fe207ed6 Incremental: fix adding devices with --incremental
We should use 'info' here, not 'info2'.
info2 refers to some other device (There may not even be one).l
info is *this* disk.

This is particularly important for getting info.disk.state
correct, which the kernel depends on to get 're-add' functionality
correct.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-22 15:53:53 +11:00
Jim Meyering 9200d418d0 avoid double-free upon "old buggy kernel" sysfs_read failure
* Incremental.c (Incremental): On sysfs_read failure, don't call
sysfs_free(sra) just before "goto out_unlock", since that very
same "sra" is freed the same way by the clean-up code below.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-22 07:14:36 +11:00
NeilBrown de5a472ea3 Remove avail_disks arg from 'enough'.
It can easily be calculated from 'avail' and  'raid_disks', and we
will soon have a case where we don't have it easily available to pass
in.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-07 14:04:47 +11:00
Lukasz Dorau 0c4304ca8b fix: container creation with --incremental used.
If there is no name provided for a container by the metadata it is
always appropriate to use the metadata version name.  create_mddev
will still add a uniquifying digit to the end so there is little risk
of confusion.
This makes the --incremental code behave the same as the --assemble code.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-01-12 10:57:20 +11:00
NeilBrown 576d028002 Assemble: make some plurals conditional.
"1 devices" is ugly.  Fix it.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 10:49:07 +11:00
NeilBrown a408d66c4f Incremental: make sure container name appears in /dev
We need to send a "change" event just like we do when
creating an array.

This reverts commit 382afe49b1

The problem is that we need udev to create the file in /dev
for us.
It might be unnecessary for udev to consider assembling things
in this array, but it shouldn't cause a problem.  If it did that
would be a different bug which we probably need locking to fix.

Or maybe udev shouldn't trigger a "-I" for containers appearing.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 10:47:07 +11:00
Jes Sorensen f56128b9bc array_try_spare(): open_dev() returns -1 on error, not zero
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-03 08:08:00 +11:00
Jes Sorensen b1efa6c25c IncrementalScan(): Fix memory leak
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Jes Sorensen fb745c4bb4 Incremental(): Check return value of dev_open() before trying to use it
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
NeilBrown ad098cdd79 Incremental: Fix a merge error in recent patch
commit  81219e70f2 required
merging and I messed it up.
The locking shouldn't be there - the caller locks now.

Reported-by: "Labun, Marcin" <Marcin.Labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-01 15:17:03 +11:00
Jes Sorensen be5c60e3fb partition_try_spare() use closedir() to release DIR * returned by opendir()
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-01 14:54:27 +11:00
Jes Sorensen 1fdeb8a084 Fix memory leak of 'st3' in array_try_spare()
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-01 14:51:30 +11:00
NeilBrown 2244d1a987 Remove duplicated code: search_mdstat and conf_match
search_mdstat and conf_match are almost identical.

Put all the functionality in conf_match, and remove search_mdstat.

Reported-by: Jes.Sorensen@redhat.com
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-01 13:30:41 +11:00
Labun, Marcin 81219e70f2 kill-subarray: fix, IMSM cannot kill-subarray with unsupported metadata
container_content retrieves volume information from disks in the
container.  For unsupported volumes the function was not returning
mdinfo. When all volumes were unsupported the function was returning
NULL pointer to block actions on the volumes. Therefore, such volumes
were not activated in Incremental and Assembly. As side effect they
also could not be deleted using kill-subarray since "kill" function
requires to obtain a valid mdinfo from container_content.

This patch fixes the kill-subarray problem by allowing to obtain
mdinfo of all volumes types including unsupported and introducing new
array.status flags.

There are following changes:

1. Added MD_SB_BLOCK_VOLUME for blocking an array, other arrays in the
   container can be activated.

2. Added MD_SB_BLOCK_CONTAINER_RESHAPE block container wide reshapes
   (like changing disk numbers in arrays).

3. IMSM container_content handler is to load mdinfo for all volumes
   and set both blocking flags in array.state field in mdinfo of
   unsupported volumes.  In case of some errors, all volumes can be
   affected. Only blocked array is not activated (also reshaped as
   result). The container wide reshapes are also blocked since by
   metadata definition they require modifications of both arrays.

4. Incremental_container and Assemble functions check array.state and
   do not activate volumes with blocking bits set.

5. assemble_container_content is changed to check container wide reshapes
   before activating reshapes of assembled containers.

6. Grow_reshape and Grow_continue_command checks blocking bits
   before starting reshapes or continueing (-G --continue) reshapes.

7. kill-subarray ignores array.state info and can remove requested array.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-31 11:29:46 +11:00
Jes Sorensen 25824e2d07 Incremental() lock error handling
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-22 11:34:08 +11:00
Jes Sorensen 015da8f5a8 array_try_spare(): missing map_unlock()
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-22 11:32:19 +11:00
Jes Sorensen 382afe49b1 Don't tell sysfs to launch the container as we are doing it ourselves
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-22 11:30:02 +11:00
Jes Sorensen 5fc8cff3a4 Remove race for starting container devices.
This moves the lock handling out of Incremental_container() and relies
on the caller holding the lock. This prevents conflict with a
follow-on mdadm comment which may try and launch the device in
parallel.

This involves replacing a call to "Incremental" with an
unrolled version with just the case that calls Incremental_container
and so needs a call to ->load_container.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-22 11:29:47 +11:00
Adam Kwolek b76b30e0f9 Do not continue reshape during initrd phase
During initrd phase continuing reshape will cause file system context
lost. This blocks ability to control reshape using checkpoints.

To avoid this, during initrd phase assemble has to be executed with
'--freeze-reshape' option. This causes that mdadm restores reshape
critical section only.

Reshape can be continued later after system full boot.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-03 09:15:22 +11:00
Lukasz Dorau cc700db34f fix: correct unlocking of map file
1. Three missing map_unlock() calls were added.
2. Map file must be unlocked on fork, else child will hold lock.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-03 08:55:02 +11:00
NeilBrown 11b391ece9 Discourage large devices from being added to 0.90 arrays.
0.90 arrays can only use up to 4TB per device.  So when a larger
device is added, complain a bit.  Still allow it if --force is given
as there could be a valid use.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-09-08 13:05:31 +10:00
NeilBrown 75c2df6509 FIX: Prevent using null list pointer
When not all attributes are supported (attributes incompatibility)
function container_content_imsm returns NULL pointer.
We need to cope with a NULL list better.

Reported-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-09-07 12:45:24 +10:00
Luca Berra e4c72d1dc6 Fix some compiler warnings.
Original by Luca, with various changes by Neil

Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-17 14:35:06 +10:00
NeilBrown 47c7a4be14 Incr: fix breakage in count_active.
If the second device is much newer than the first, but has a lower
raid_disk number, we clear 'avail' badly and don't set up
'best' properly.

Fix these things.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-15 12:21:26 +10:00
NeilBrown 95eeceeb32 getinfo_super now clears the 'info' structure before filling it in.
Some code currently clears 'info' before calling getinfo_super,
some code doesn't.

To be consistent, change it so no caller ever clears 'info',
but ever getinfo_super function must clear it.

Note that ->raid_disk may be meaningful if that 'map' is passed
non-NULL.  In that case it is copied out before the structure
is zeroed.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-08 15:54:13 +10:00
NeilBrown 7b0bbd0f71 Release 3.2.1
Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-28 13:30:29 +11:00
Adam Kwolek 983fff45a1 FIX: ping_monitor() usage causes memory leaks
When for ping_monitor() input devnum2devname() is used,
received string pointer should be passed to free() for memory release.
It is not made in several places. This use case should have function
to avoid memory leak.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-18 12:32:16 +11:00
NeilBrown 4968025884 Run Grow_restart/Grow_continue when assembling the content of a container.
As containers can now grow, we need to use both Grow_restart (to
replay any backup-file) and Grow_continue when assembling the content
of a container.

Note that we don't pass a backup-file when doing incremental assembly.
If such is needed in that case, the assembly will fail.

To restart such arrays, explicit assembly is required.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-08 17:14:00 +11:00
Adam Kwolek 588bebfcc2 Continue reshape after assembling array
assemble_container_content() cannot close mdfd handle, as it could be
required by reshape continuation.
mdfd handle is closed outside this function, when it is not longer
necessary.
Call to Grow_continue is added for reshape continuation after
assembly.

In the nearest future, simple condition:
    if (content->reshape_active)
before Grow_continue() call will be replaced by check function
for support container operation /reshape/.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-02 12:28:15 +11:00
Krzysztof Wojcik 51cf74194f FIX: Seg Fault in incremental if BBM log detected
Bug  detected for imsm metadata.
Assembling of array using Incremental switch generate segmentation
fault if BBM log is detected.
Reason: missing return from Incremental_container if BBM is detected
and unnecessary list=NULL assignment.
This patch fix the problem and memory leak in this area.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-18 23:51:17 +11:00
NeilBrown 71204a5029 Various compile fixes.
Make "make everything" succeed.
This fixed some real bugs.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-01 15:48:03 +11:00
NeilBrown e5508b361d Allow domain_test to report that no domains were found.
Sometime we will need to know the difference between no domains found
and domains didn't match.
So allow domain_test to return different values and fix up all callers
to maintain current behaviour.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-01 14:44:02 +11:00
NeilBrown e5e5d7cea3 Incr: don't exclude 'active' devices from auto inclusion in a container.
For containers, it is always appropriate to include a device in the
container.
Whether it should then be included in an array is a separate question.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-01 13:07:36 +11:00
Anna Czarnowska 037e12d6da Incremental: move suitable spares to container when subarrays started.
By default Incremental places all imsm spares in separate container
with uuid=0:0:0:0. (patch giving spares uuid_zero needed)

When we find enough members to start an array
we are able to determine domain so we search spare container
for suitable spares and move them to the container that
is currently assembled.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-05 14:42:27 +11:00
Przemyslaw Czarnowski fee6a49ee8 Consider target only for spare-same-domain
otherwise, matching target will force spare-same-domain regardless of
action that comes in domain.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-26 22:38:44 +11:00
Przemyslaw Czarnowski 4886570497 Validate size of potential spare disk for external metadata (with containers)
mdinfo read with sysfs_read do not contain information about the space
needed to store data of all volumes created in that container, so that
spare can be used as replacement for existing subarrays in the future.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-26 22:38:42 +11:00
Przemyslaw Czarnowski b6b3f0f7eb Skip domain check for spare-same-slot
If lost disk was the only one that belonged to particular domain, array
won't match with that domain any longer. We can achieve this by moving
domain check below the 'target' test.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-26 22:36:46 +11:00
Przemyslaw Czarnowski b4924220f1 Added test for array degradation for spare-same-slot
spare-same-slot allows re-adding of missing array member with disk
re-inserted into the same slot where previous member was plugged in.
If in the meantime another spare has been used for recovery, same slot
cookie should be ignored.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-26 22:33:14 +11:00
Przemyslaw Czarnowski 5be68a0762 external: get number of failed disks for container
Container degradation here is defined as the number of failed disks in
mostly degraded sub-array. This number is used as value for
array.failed_disks and used in comparison to find best match.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-26 22:31:25 +11:00
Krzysztof Wojcik a06d022db4 FIX: Bad block verification during assembling array
We need to refuse to assemble an arrays with bad blocks.
Initially there was condition in container_content function
that returns error value in the case when metadata store information
about bad blocks.
When the container_content function is called from functions NOT connected
with assemble (Kill_subarray, Detail) we get faulty error return value.
Patch introduces new flag in array.status - MD_SB_BBM_ERRORS. It is set
in container_content when bad blocks are detected and can be checked by
container_content caller.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-26 21:41:57 +11:00
Przemyslaw Czarnowski 8659d08998 fix: incremental for bare disks returns invalid value
return value should remain the same as result of Manage_Subdevs (last
significant operation). Right now it is inverted what results in
error status for successful operation.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-21 09:11:34 +11:00
Przemyslaw Czarnowski ed690d36e2 fix: adding spare via incremental do not trigger recovery
After incremental has added spare, monitor should be woken up in order
to see if anything has changed. If mdmon is not waken up, recovery do not
start.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-21 09:10:32 +11:00
NeilBrown 833bb0f8f6 Allow --update=devicesize with --re-add
This is useful with 1.1 and 1.2 metadata to update the metadata if
the device size has changed.
The same functionality can be achieved by writing to the device size
in sysfs after re-adding normally, but in some cases this might be
easier.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-09 13:06:29 +11:00
Hawrylewicz Czarnowski, Przemyslaw a92b211229 fix: incremental on invalid container causes segfault
counterpart of 417f346ee0 for incremental.
If md device has metadata_version="none" super_by_fd() matches
supertype=super0.
Call of load_container() dereferences null, so we have to forbid it.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-07 21:01:54 +11:00
NeilBrown de6ae75015 Incremental - avoid including wayward devices.
If a devices - typically in a mirrored set - is assembled
independently of the other devices, and then attempted to be brought
back into the set, it could contain inconsistent data.  It should not
be included.

So detect this situation by ensuring that the 'most recent' device is
believed to be active by every other device.  If a device is wayward,
it will only consider fellow wayward devices to be active and will
think all others are failed or missing.

This patches fixes --incremental, --assemble was done in an earlier
patch.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-29 09:40:15 +11:00
NeilBrown 2bf740b394 Incr: reduce the number of times we load data from sysfs.
Rather than calling sysfs_read whenever we want data from sysfs, call
it once at the start will all the requests of interest, then just use
that,

Make sure we free it properly too.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-25 18:58:44 +11:00
NeilBrown d2db304558 Add action=spare-same-slot policy.
When "mdadm -I" is given a device with no metadata, mdadm tries to add
it as a 'spare' somewhere based on policy.

This patch changes the behaviour in two ways:

1/ If the device is at a 'path' where a previous device was removed
  from an array or container, then we preferentially add the spare to
  that array or container.

2/ Previously only 'bare' devices were considered for adding as
  spares.  Now if action=spare-same-slot is active, we will add
  non-bare devices, but *only* if the path was previously in use
  for some array, and the device will only be added to that array.

Based on code
  From: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown 625b25071b incr/spare: recheck allowed action for each metadata.
The current act_spare tests only test if it is allowed for some
metadata.
As we check each array or partitioning type, we need to double-check
that sparing is allowed for that array or partitioning type.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown 6e57f80a90 Incr/spare: make sure failure to identify metadata if handled gracefully.
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown aaccda4406 Incr: fix up return value in try_spare
We only want to try partition_try_spare if array_try_spare failed.
If it succeeded, there is nothing more to try.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown 52e965c296 Factor out is_bare test.
Instead of open coding (and using horrible gotos), make this
a separate function.

Also fix the check for end of device - SEEK_END doesn't work on
block devices.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
Przemyslaw Czarnowski 403410eb97 extension of IncrementalRemove to store location (path-id) of removed device
If the disk is taken out from its port this port information is
lost. Only udev rule can provide us with this information, and then we
have to store it somehow. This patch adds writing 'cookie' file in
/dev/.mdadm/failed-slots directory in form of file named with value of
f<path-id> containing the metadata type and uuid of the array (or
container) that the device was a member of.  The uuid is in exactly
the same format as in the mapfile.

FAILED_SLOTS_DIR constant has been added to hold the location of
cookie files.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown 08387a0473 Teach IncrementalRemove about containers.
When we -I -R a device in a container, we must first fail it
from each member array before we can remove it from the container.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
Przemyslaw Czarnowski 950bc34477 added --path <path_id> to give the information on the 'path-id' of removed device
<path-id> allows to identify the port to which given device is plugged
in.  In case of hot-removal, udev can pass this information for future
use (eg. write this name as 'cookie' allowing to detect the fact of
reinserting device to the same port).

--path <path-id> parameter has been added to device removal handle
(and char *path has been added to IncrementalRemove() to pass this
value) in order to pass path-id to this handler.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown 3a3716107b Add must_be_container helper.
This checks a block device to see if it could be a container, and
in particular cannot be a member device.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown a655e55064 Improve type names for mddev_dev
Remove the _t pointer typedef and remove the _s suffix for the
structure,

These things do not help readability.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown fa56eddbd1 Improve mddev_ident type definitions.
Remove the _t typedef and remove the _s suffix from the struct name.

These things do not help readability.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown 47c74f3f50 Use load_container in Incremental assembly.
We more clearly separate out -I on a container, and use
load_container in that case and load_super only for true members.

This removes another use of loaded_container.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:57:58 +11:00
NeilBrown 3a97f21010 Incremental: Factor out search of mdstat
As we will soon use it in two places.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown 7d91c3f547 Make Incremental_container static
as it is only used in Incremental.c

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown 00bbdbdac6 Add subarray arg to container_content.
This allows the info for a single array to be extracted,
so we don't have to write it into st->subarray.

For consistency, implement container_content for super0 and super1,
to just return the mdinfo for the single array.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:26 +11:00
NeilBrown 1d1a9f87a4 Incremental - fix small bug in count_active.
If the first device found has a much smaller event count than a
subsequent device, that device will not be entered in the 'avail'
array properly.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:25 +11:00
NeilBrown a5d85af748 get_info_super: report which other devices are thought to be working/failed.
To accurately detect when an array has been split and is now being
recombined, we need to track which other devices each thinks is
working.

We should never include a device in an array if it thinks that the
primary device has failed.

This patch just allows get_info_super to return a list of devices
and whether they are thought to be working or not.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:25 +11:00
NeilBrown 4e8d9f0a16 Convert 'auto' config line to policy statements 2010-09-06 11:26:28 +10:00
NeilBrown 61018da020 Add support for auto-partitioning base devices.
If a device is bare and policy suggests that it can be used as a spare
for virtual 'partitions' array, find an appropriate partition table
and write it to the device.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-09-06 11:26:28 +10:00
NeilBrown 56e8be854a First steps to supporting auto-spare-add to groups of partitioned devices.
Adding a spare to a group of partitioned devices is quite different
from adding one to an array.  So detect which option is worth trying
based on policy and then try one or the other - or possibly both - as
appropriate.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-09-06 11:26:28 +10:00
NeilBrown 0f22b998fb Add mbr pseudo metadata handler.
To support incorpating a new bare device into a collection of arrays -
one partition each - mdadm needs a modest understanding of partition
tables.
The main needs to be able to recognise a partition table on one device
and copy it onto another.

This will be done using pseudo metadata types 'mbr' and 'gpt'.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-09-06 11:26:28 +10:00
NeilBrown f08605b3ad Allow --incremental to add a device as a spare if policy allows.
If policy allows act_spare or act_force_spare, -I will add a
bare device as a spare to an appropriate array.

We don't support adding non-bare devices as spares yet.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-09-06 11:26:27 +10:00
NeilBrown 7e83544bc4 Use action policy to keep recently-disconnected devices in the array.
When we find a device that was recently part of the array but is now
out of date (based on the event count) we might want to add it back in
(like --re-add) if the likely cause was a connection problem or we
might not if the likely cause was device failure.

So make this a policy issue: if action=re-add or better, try to re-add
any device that looks like it might be part of the array.

This applies:
  when we assemble the array:  old devices will be evicted by the
     kernel and need to be re-added.
  when we assemble the array during --incr for the same reason.
  when we find a device that could be added to a running array.

This doesn't affect arrays with external metadata at all.
For such arrays:
 When the container is assembled, the most recent instance of each
 device is included without reference to whether it is too old or not.
 Then the metadata handler must which slices of which devices to
 include in which array and with what state.  So the
 ->container_content should probably check the policy and compare the
 sequence numbers/event counts.
 When a device is added (--add) to a container with active arrays
 we only add as a 'spare'. --re-add doesn't seem to be an option.
 When a device is added with -I ->container_content gets another
 chance to assess things again.  So again it should check the policy.


Signed-off-by: NeilBrown <neilb@suse.de>
2010-09-06 11:26:27 +10:00
NeilBrown 15d4a7e447 Introduce single-exit pattern for Incremental
All exits should goto the end for clean-up.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-09-06 11:26:27 +10:00
NeilBrown ef83fe7cba Allow --incremental to add spares to an array.
Commit 3a6ec29ad5 stopped us from adding apparently-working devices
to an active array with --incremental as there is a good chance that they
are actually old/failed devices.

Unfortunately it also stopped spares from being added to an active
array, which is wrong.  This patch refines the test to be more
careful.

Reported-by: <fibreraid@gmail.com>
Analysed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-12 11:41:41 +10:00