Commit Graph

198 Commits

Author SHA1 Message Date
NeilBrown 71204a5029 Various compile fixes.
Make "make everything" succeed.
This fixed some real bugs.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-01 15:48:03 +11:00
NeilBrown a5d10dcec8 Allow explicitly listed spared to be included by default.
When the metadata doesn't identify which array a spare belongs to
we normally require an explicit domain match to connect a spare
with an array.
However when the spare is explicitly listed in argv, it should be
safe to include as long as there is no domain conflict.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-01 14:44:02 +11:00
NeilBrown e5508b361d Allow domain_test to report that no domains were found.
Sometime we will need to know the difference between no domains found
and domains didn't match.
So allow domain_test to return different values and fix up all callers
to maintain current behaviour.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-01 14:44:02 +11:00
NeilBrown ac597b1c21 free_super after assembling a container
Else the devices are held open.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-01 13:07:24 +11:00
NeilBrown d438679977 Assemble: ignore unknown devices not listed on command line.
If we find a device that has not superblock, we currently fail
unless in auto_assem mode.
However we really should only fail if the device was explicitly listed
in the arg list.  So add a test for that.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-01 13:07:07 +11:00
Czarnowska, Anna 3c7b4a2595 Assemble: allow to assemble container with uuid=0:0:0:0
When there are any arrays in config file the spares with
domain not matching any array are not assembled because
auto assembly is not attempted.
Addition of ARRAY line with uuid=0:0:0:0 in config will work
with modified condition for gathering spares.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-01 10:40:56 +11:00
Anna Czarnowska ed7fc6b4d9 Assemble: allow to assemble spares on their own
If we find spares but no members of given array
we create container with just spares.

This allows auto assemble to pick up all lose imsm spares when there
is no config file.
When there is a valid config file and any array is assembled from it
we don't try auto assembly so we will not assemble spares that don't
match any array.
To remedy this we must add
ARRAY metadata=imsm UUID=00000000:00000000:00000000:00000000
to config file.
This container will include all remaining spares.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-05 13:54:18 +11:00
Anna Czarnowska 26b05aeaed Assemble: we need to read policy to know array domains
Policy must be read on all disks identified as array members
to get array's domains list.
Currently it is only read on first array member in auto assembly mode.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-05 13:42:59 +11:00
Anna Czarnowska cbeeb0e5f0 Assemble imsm spares in matching domain only
Imsm spare will only be taken if it matches domain of
identified members of currently assembled array.

This implies that:
- spare with null domain will match first array assembled.
- if array has null domain then no spare will match

If we allow spares to set st they may block assembly of subarrays.
This is because in auto-assembly tmpdev->used=0 for a spare not matching
any array. If we find such spare before container and set st, the content
will not get assembled.

We allow uuid_zero match any uuid in assembly as unsuitable spares will
be rejected on domain check.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-26 22:08:51 +11:00
Krzysztof Wojcik a06d022db4 FIX: Bad block verification during assembling array
We need to refuse to assemble an arrays with bad blocks.
Initially there was condition in container_content function
that returns error value in the case when metadata store information
about bad blocks.
When the container_content function is called from functions NOT connected
with assemble (Kill_subarray, Detail) we get faulty error return value.
Patch introduces new flag in array.status - MD_SB_BBM_ERRORS. It is set
in container_content when bad blocks are detected and can be checked by
container_content caller.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-26 21:41:57 +11:00
NeilBrown 87f26d14f7 Assemble: allow an array undergoing reshape to be started without backup file
Though not having the proper backup file can cause data corruption, it
is not enough to justify not being able to start the array at all.
So allow "--invalid-backup" to be specified which says "just continue
even if a backup cannot be restored".

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-01 11:47:32 +11:00
Hawrylewicz Czarnowski, Przemyslaw 417f346ee0 fix: assemble for external metadata generates segfault if invalid device found
An attempt to invoke super_by_fd() on device that has
metadata_version="none" always matches super0 (as test_version is "").
In Assemble() it results in segfault when load_container is invoked
(=null for super0).
As of now load_container is only started if it points to valid pointer.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-01 11:06:09 +11:00
NeilBrown 484ae54d16 Assemble: call remove_partitions later.
We shouldn't call remove_partitions until we have made a really firm
decision to include the device into the array.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-30 16:56:01 +11:00
Dan Williams dcc4210f58 Assemble: fix assembly in the delta_disks > max_degraded case
Incremental assembly works on such an array because the kernel sees the
disk as in-sync and that the array is reshaping.  Teach Assemble() the
same assumptions.

This is only needed on kernels that do not initialize ->recovery_offset
when activating spares for reshape.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-23 15:10:01 +11:00
NeilBrown 87477e6d5e Assemble: get content before testing it.
When checking that a container matches the required uuid,
we need to call 'getinfo_super' before we have a 'content'
to test.

Reported-by: "Czarnowska, Anna" <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-23 11:34:36 +11:00
NeilBrown 5083d66b9c Assemble: use load_container
Separate the load_container call from the load_super call,
and use different validity tests as appropriate.

Add some general code tidying and a bit of indent change to make
structure a little clearer.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown 88cef9b3e6 Assemble: turn next_member goto loop into a for loop.
It becomes much clearer what is happening now.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown 02c2c47487 Assemble: simplify the handling of is_member_busy.
This is somewhat inconsistent with the last member of a
container getting special handling.
Just simplify it so the code seems to make sense and important
is easy to follow.

Signed-of-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown d76c4d8894 Assemble: remove the skip variable.
it seems we don't need it any more

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown 805d30b288 Assemble: merge 'member' test into ident_matches.
This is a more sensible place for it, gathering all the tests
together.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown fa0312397e Assemble: change 'skip' label to a variable.
This gets rid of some gotos which makes the code flow a bit
more clear.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown 2b594614a1 Remove content from mddev_dev
Now that the next_member loop is much smaller it is easy to
just use 'content' rather than stashing it in 'tmpdev->content'.
So we can remove the 'content' field from 'struct mddev_dev'.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown 1415fe4b6c Assemble: contract next_member loop.
We have a 'goto next_member' loop which is rather spread-out and
confusing.
Recent refactoring make it possible to contract that loop
significantly.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown bac0d92e93 Assemble: merge to large 'if' statements.
In assemble, we see (inside a 'for' loop):

 if (condition) {
    lots of stuff
 } else
    something

 small thing

 if (same condition) {
     lots more stuff
     break;
 }

where 'condition' cannot be changed in the middle.

So simplify this to

 if (condition) {
    lots of stuff
    small thing
    lots more stuff
    break;
 }

 something
 small thing

which duplicates the small thing, but provides much
conceptual simplicity.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown a655e55064 Improve type names for mddev_dev
Remove the _t pointer typedef and remove the _s suffix for the
structure,

These things do not help readability.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown fa56eddbd1 Improve mddev_ident type definitions.
Remove the _t typedef and remove the _s suffix from the struct name.

These things do not help readability.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown 08fb91a363 Assemble: factor out ident_matches
This will help future patch, and we need to make "Assemble()" smaller
anyway.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown d68ea4d775 Assemble: small cleanup of error checking.
If we get an early error (e.g. not a block device) we need to
not continue through and check e.g. uuid.

Also make sure we set used=2 whenever we find an error, and don't
bother with ->free_super as 'goto loop' does that.

Now that we abort earlier, we can remove lots of tests on
  tst && tst->sb


Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown 00bbdbdac6 Add subarray arg to container_content.
This allows the info for a single array to be extracted,
so we don't have to write it into st->subarray.

For consistency, implement container_content for super0 and super1,
to just return the mdinfo for the single array.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:26 +11:00
NeilBrown 02e7c5b75c Assemble - avoid including wayward devices.
If a device - typically in a mirrored set - is assembled independently
of the other devices, and then attempted to be brought back into the
set it could contain inconsistent data.  It should not be included.

So detect this situation by ensuring that the 'most recent' device is
believed to be active by every other device.  If a device is wayward,
it will only consider fellow wayward devices to be active and will
think all others are failed or missing.

This patch only fixes --assemble, not --incremental

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:25 +11:00
NeilBrown d7f7ebb73d Assemble: handle devices array better.
Only allocate when it is about to be used, and free it when finished.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:25 +11:00
NeilBrown a5d85af748 get_info_super: report which other devices are thought to be working/failed.
To accurately detect when an array has been split and is now being
recombined, we need to track which other devices each thinks is
working.

We should never include a device in an array if it thinks that the
primary device has failed.

This patch just allows get_info_super to return a list of devices
and whether they are thought to be working or not.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:25 +11:00
NeilBrown 1e2b276535 Report error in --update string is not recognised.
If an --update is requested by the relevant metadata doesn't
understand it, print a useful message rather than silently ignoring
the issue.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:24 +11:00
NeilBrown 4e8d9f0a16 Convert 'auto' config line to policy statements 2010-09-06 11:26:28 +10:00
NeilBrown 0f22b998fb Add mbr pseudo metadata handler.
To support incorpating a new bare device into a collection of arrays -
one partition each - mdadm needs a modest understanding of partition
tables.
The main needs to be able to recognise a partition table on one device
and copy it onto another.

This will be done using pseudo metadata types 'mbr' and 'gpt'.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-09-06 11:26:28 +10:00
NeilBrown 7e83544bc4 Use action policy to keep recently-disconnected devices in the array.
When we find a device that was recently part of the array but is now
out of date (based on the event count) we might want to add it back in
(like --re-add) if the likely cause was a connection problem or we
might not if the likely cause was device failure.

So make this a policy issue: if action=re-add or better, try to re-add
any device that looks like it might be part of the array.

This applies:
  when we assemble the array:  old devices will be evicted by the
     kernel and need to be re-added.
  when we assemble the array during --incr for the same reason.
  when we find a device that could be added to a running array.

This doesn't affect arrays with external metadata at all.
For such arrays:
 When the container is assembled, the most recent instance of each
 device is included without reference to whether it is too old or not.
 Then the metadata handler must which slices of which devices to
 include in which array and with what state.  So the
 ->container_content should probably check the policy and compare the
 sequence numbers/event counts.
 When a device is added (--add) to a container with active arrays
 we only add as a 'spare'. --re-add doesn't seem to be an option.
 When a device is added with -I ->container_content gets another
 chance to assess things again.  So again it should check the policy.


Signed-off-by: NeilBrown <neilb@suse.de>
2010-09-06 11:26:27 +10:00
NeilBrown f21e18ca89 Compile with -Wextra by default
This produced lots of warning, some of which pointed to actual bugs.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-05 13:13:02 +10:00
NeilBrown e5c99c0811 Assemble: Fix honouring of 'auto' config line
commit 1ff9833928
broke the checking of metadata types via the 'auto' line.

Be moving 'load_super" before "conf_test_metadata" we left
tst->sb set even if conf_test_metadata fails, so the device will
actually be accepted and used.

So if we decide to reject the device, free the superblock so it is
clear that it is rejected.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-07-06 11:57:09 +10:00
NeilBrown 1ff9833928 Assemble: fix some recently introduced bugs.
Found during testing:
 - cannot check metadata for homehost before loading metadata.
 - As 1.x metadata can has a state 'rebuilding' between
   'spare' and 'ok', we need to include that in our calculations.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-03-10 11:21:40 +11:00
NeilBrown d1d3482b56 config: add 'homehost' option to 'AUTO' line.
This allows basing auto-assembly decisions on whether
the array is recorded as belonging to this host or not.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-03-03 14:33:55 +11:00
NeilBrown 24af7a8744 Assemble: clean up properly if we cannot add the bitmap file.
If we find we cannot add the requested bitmap file when
assembling the array, then make sure to clean up properly
and don't leave a half-configured array.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-24 12:17:17 +11:00
NeilBrown 4c1c3ad8cf Assemble: check inargv before complaining about stray arguments.
If --assemble is given a container and some other devices to assemble
an array from, it complains with an error because that doesn't make
sense.
However it currently also complains if the list of devices was extract
from the config file rather than being given on the command line.
That is not appropriate.

So add an '&& inargv' test to ensure that we are really complaining
about the right thing.

Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
2010-02-24 11:43:59 +11:00
NeilBrown 921d9e164f Assemble: fix --force assembly of v1.x arrays which are recovering.
1.x metadata allows a device to be a member of the array while it
is still recoverying.  So it is a working member, but is not
completely in-sync.

mdadm/assemble does not understand this distinction and assumes that a
work member is fully in-sync for the purpose of determining if there
are enough in-sync devices for the array to be functional.

So collect the 'recovery_start' value from the metadata and use it in
assemble when determining how useful a given device is.

Reported-by: Mikael Abrahamsson <swmike@swm.pp.se>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-04 12:02:09 +11:00
NeilBrown 9f22b13fe1 Assemble: error-check ->load_super
Once load_super has succeeded, it should continue to succeed.  However
devices can disappear etc so it is prudent to always check the return
status of load_super.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-01-28 09:02:21 +11:00
NeilBrown cd77ac4eaf Assemble: fix testing of 'verbose' flag.
The 'verbose' flag can be negative, meaning 'quiet'.
So never check for != 0.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-19 15:55:59 +11:00
NeilBrown df0d4ea04e Replace all relevant occurrences of -4 with LEVEL_MULTIPATH
Also -1 -> LEVEL_LINEAR.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-17 12:31:12 +11:00
NeilBrown f22385f982 Assemble: include ACTIVE but not in-sync devices as non-spares.
Previously such things did not exist: ACTIVE and SYNC were either both
set or both clear.   Recent changes with reshape means that a device
can be ACTIVE but not yet fully in-sync, so they need to be handled
and included in the array as active devices.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-17 12:30:54 +11:00
NeilBrown 4a997737a1 Merge branch 'master' into devel-3.1 2009-10-22 11:13:13 +11:00
NeilBrown eb3929a47f Compile fixes for mdassemble
Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-20 16:53:43 +11:00
NeilBrown ea0ebe9685 Assemble: print more verbose messages about restarting a reshape
Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-20 16:23:45 +11:00