Commit Graph

352 Commits

Author SHA1 Message Date
NeilBrown d80f7aa9a1 Assemble: correctly capture error from ->write_bitmap
else 'err' might be undefined.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-05 14:55:31 +10:00
NeilBrown 5997585200 Merge branch 'mdadm-3.3.x' 2015-08-03 16:21:37 +10:00
NeilBrown 8360760457 Assemble: really don't assemble IMSM array without OROM.
Previous patch missed one case.

Also print more useful information when rejecting
a device with IMSM metadata.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-03 16:06:51 +10:00
NeilBrown 7eee461e91 Assemble: don't assemble IMSM array without OROM.
If someone has an IMSM array, and disables RAID in the BIOS
and uses the devices for some other purpose, then they really don't
want mdadm to start syncing the array.

So don't assemble if OROM doesn't confirm it is OK.

There can still be problems for crash-dump not being able to find
the OROM.   Some explicit work-around might be needed for that
rather than a more general workaround that can corrupt data.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-03 15:42:16 +10:00
NeilBrown 9f2e55a421 Assemble: don't assemble IMSM array without OROM.
If someone has an IMSM array, and disables RAID in the BIOS
and uses the devices for some other purpose, then they really don't
want mdadm to start syncing the array.

So don't assemble if OROM doesn't confirm it is OK.

There can still be problems for crash-dump not being able to find
the OROM.   Some explicit work-around might be needed for that
rather than a more general workaround that can corrupt data.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-29 14:38:37 +10:00
NeilBrown 653299b699 Merge branch 'cluster'
Now that 3.3.3 is out, it is time to include the cluster-support code.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-27 11:01:08 +10:00
NeilBrown 86b77ddf87 Assemble: extend --homehost='<ignore>' to allow --name= to ignore homehost
Also make --homehost='<ignore>' work properly.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-24 12:50:54 +10:00
NeilBrown 00f23a8861 Assemble: improve tests for matching --name= request.
If the name in the array has a home-host, then
require that it matches, or is "any", or requested
homehost is "any".

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-22 09:24:36 +10:00
NeilBrown 29a312f2f3 Assemble: really ensure stripe_cache is big enough to handle new chunk size
Earlier patch:
  56fcbcbb6f
calculated the proper chunk size - but didn't use it..

Let's actually use it this time.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-17 13:10:25 +10:00
NeilBrown 56fcbcbb6f Assemble: ensure stripe_cache is big enough to handle new chunk size
If you reshape to a larger chunk size, and need to restart,
it can have problems.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-18 15:49:52 +10:00
Guoqing Jiang 7e6e839a26 mdadm: change the num of cluster node
This extends the nodes option to assemble mode, so that the number of
cluster nodes can be changed by the user.

Before that, it is necessary to ensure there is enough space
for those nodes, so calc_bitmap_size is introduced to calculate
the bitmap size of each node.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-17 09:43:31 +10:00
Guoqing Jiang 0aa2f15b20 mdadm: add the ability to change cluster name
To support changing the cluster name, this commit does the following:

1. extend the original write_bitmap function for the new scenario.
2. add the scenario to handle the modification of the cluster's name
   in write_bitmap1.
3. let the cluster name also show in examine_super1 and detail_super1

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-17 09:33:39 +10:00
NeilBrown ec6db5ba71 Assemble: don't check for pre-existing array when updating uuid.
This is a very corner-case, but the self-tests tripped on it,
and it makes sense not to trust the uuid when it is being changed.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-13 12:41:48 +10:00
NeilBrown 87af7267bd Assemble/force: make it possible to "force" a new device in a reshape.
Normally we do not "force"-assemble devices which are in the
middle of recovery, as they are unlikely to have useful data.

However, when a reshape increases the number of devices,
the newly added devices appear to be recovering because they
do not have complete data on them yet, but then they aren't expected
to until the reshape completes.
So in this case, it can be appropriate to force-assemble them.

Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 12:04:00 +10:00
NeilBrown c34fef774a Assemble: remove stray ':' from error message.
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 11:27:34 +10:00
NeilBrown d316dba7c9 Revert "Assemble: support assembling of a RAID0 being reshaped."
This reverts commit b720636a58.

As it said, this was a hack.  It causes problems when trying to
--force assemble a RAID4.  There is a better way.

Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 09:31:32 +10:00
NeilBrown ee466574f2 Assemble: fix "no uptodate device" message.
Since we introduced replacement devices, the 'i' used in
start_array() is twice the slot number.

So we need to adjust when printing.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 09:20:26 +10:00
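The index adjustment described in ee466574f2 can be sketched as follows. This is only an illustration, assuming that each slot owns two consecutive entries (the original at the even index and its replacement at the odd one); the helper names are hypothetical and are not taken from Assemble.c.

```c
/* Assumed layout: entry 2*slot is the original device for a slot,
 * entry 2*slot + 1 is its replacement. */

/* Convert a doubled index back to the slot number shown to the user. */
int index_to_slot(int i)
{
	return i / 2;
}

/* Non-zero when the index refers to the replacement half of a slot. */
int index_is_replacement(int i)
{
	return i % 2;
}
```

A message that prints index_to_slot(i) rather than i then reports the slot the user expects.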
NeilBrown e9e6894d4b Assemble: don't ignore the return value from stat.
static checkers complain about that.
So change the code to use 'fstat', as we really don't want
to see an error here..

Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-03-04 15:56:52 +11:00
NeilBrown 7a862a020f Don't break long strings onto multiple lines.
It is best to keep strings all together so that they
are easier to search for in the source code.
If a string is so long that it looks ugly on one line,
then maybe it should be broken into multiple lines
for display too.

Only strings which contain a newline can be broken
into multiple lines:

 "It is OK to\n"
 "break this string\n"


Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 13:46:53 +11:00
NeilBrown 5141638c54 Assemble: Only fail auto-assemble in face of mdadm.conf conflicts.
We should never auto-assemble things that conflict with mdadm.conf
However explicit assembly requests should be allowed.

Reported-by: olovopb
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1070245
Signed-off-by: NeilBrown <neilb@suse.de>
2014-07-29 13:48:23 +10:00
Pawel Baldysiak 0c21b485e4 IMSM: Add warning message when assemble spanned container
Due to several changes in the code, an assembly with disks
spanned between different controllers can be obtained
in some cases. After an IMSM container is assembled, check the HBA of
the disks, and print a proper warning if a mismatch is detected.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-07-08 11:39:23 +10:00
NeilBrown 02b70e83e6 Incremental: remove old devices when assembling in container.
When assembling a native array we just give all devices to the kernel
and leave it to discard the 'old' ones (based on sequence/event
number).

For external/container arrays, mdadm needs to do that.

So in assemble_container_content, get list of current devices in
array and discard any that aren't in the 'content' given.
They must have been rejected by metadata manager.

If we cannot discard old devices the array must already be active, so
just leave it alone, but with a message.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-06-05 15:58:31 +10:00
NeilBrown 06e293d097 Grow: fix recent grow_continue breakage.
Commit 5e76dce1ac changed
Grow_continue to assume a fork had already happened, so that
   mdadm --grow --continue

didn't fork.  This is good, but it means that if Grow_continue
is run from Assemble, then
  mdadm --assemble ....

can misbehave if the array was in the middle of a reshape.

So introduce finer control.  Grow_continue only assumes it has
already forked if run from "mdadm --grow --continue".

Signed-off-by: NeilBrown <neilb@suse.de>
2014-05-22 14:22:58 +10:00
NeilBrown 54ded86fbd Grow: store a link to current backup file in /run/mdadm or similar.
Subsequent patch will allow the background part of "mdadm --grow" to
be run from systemd.  This can require the passing of a backup file
name.
To do this, store that name as a symlink in /run/mdadm (or MAP_DIR)
and look for it when appropriate.

It might be useful to also store the name across reboot, but that
would be a different patch.  We would need to use the uuid to identify
it, and store it in stable storage.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-05-15 14:23:16 +10:00
NeilBrown 56bbc588f7 Assemble: change load_devices to return most_recent 'st' value.
This means that

	st->ss->getinfo_super(st, content, NULL);
	clean = content->array.state & 1;

will get an up-to-date value for 'clean'.  This fix allows
  tests/03r5assem-failed
to work.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-02-25 15:04:16 +11:00
NeilBrown 9ee314dab9 Assemble: re-arrange freeing of 'tst' in load_devices().
When we return in error, we need to free(tst), and ->free_super(tst);
Sometimes we didn't.

Also the final ->free_super(tst) should be followed by free(tst)
but wasn't.

Move that final free forward in the code a bit as we will want to use
the tst there in the next patch.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-02-25 14:59:12 +11:00
NeilBrown df842e69a3 Assemble: allow load_devices to change the 'st' which is passed in.
The given 'st' might not be best.  Making this interface change
will allow load_devices to return a better 'st'.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-02-25 14:54:34 +11:00
NeilBrown 284546ef89 Assemble: avoid infinite loop when auto-assembling partial container.
When auto-assembling we loop until we get no successes.

If a device is found that looks like it is part of an already-existing
container, but we subsequently fail to add that device, then the fact
that the container is running looks like a success.  This can result
in infinite looping.
So if a container was already partially assembled, and is still only
partially assembled after we try to add devices, then don't treat that
as success.

Signed-off-by: NeilBrown  <neilb@suse.de>
2014-01-20 15:23:31 +11:00
NeilBrown 8832342d3a Assemble/Incremental: don't hold O_EXCL on mddev after assembly.
As soon as the array is assembled, udev or systemd might run
fsck and mount it.  So we need to drop O_EXCL promptly.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-12-05 10:35:16 +11:00
NeilBrown 9ca39acb3e Incremental: add --export handling.
If --export is given with --incremental, then
  MD_DEVNAME
is output which gives the name of the device (in /dev/md) that
is the array (or container) that the device would be added to.
Also
  MD_STARTED
is set to one of
  no
  unsafe
  yes
  nothing

to indicate if the array was started.  If MD_STARTED=unsafe
then it may be appropriate to run
  mdadm -R /dev/md/$MD_DEVNAME
after a timeout to ensure newly degraded arrays are started.

If
  MD_FOREIGN=yes
it might be appropriate to suppress this as the array is
probably not critical.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-11-28 15:15:30 +11:00
NeilBrown c1736844ba Restructure assemble_container_content and improve messages.
We lose one level of indent, and now get told the difference between
'not assembled because not safe' and 'not assembled because not enough
devices'.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-11-28 14:47:41 +11:00
NeilBrown f81a2b56c4 Assemble: fix bug in force_array - it wasn't forcing properly.
Since 'best' was expanded to hold replacement devices, we might
need to go up to raid_disks*2 to find devices to force.

Also fix another place where considering replacement drives would
be wrong (the 'chosen' device should never be a replacement).

Reported-by: John Yates <jyates65@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-10-22 09:55:04 +11:00
NeilBrown d5a4041647 Make -IRs and --run work properly for containers.
We really need to make sure assemble_container_content()
gets called to finish the assembly of these.

Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-09-13 10:51:20 +10:00
NeilBrown 6f02172d2e Release mdadm-3.3
(and  various cosmetic fixes)

Signed-off-by: NeilBrown <neilb@suse.de>
2013-09-03 14:47:47 +10:00
NeilBrown a792ece676 Assemble: don't ever consider a 'spare' to be the 'most recent'.
If all devices have the same event count and the first one is a spare,
then that spare will be the 'most_recent'.
However then other devices will think the 'most_recent' has failed
(for v0.90 metadata) and will be rejected from the array.

So never consider a 'spare' to be 'most recent'.

Reported-by: Andreas Baer <synthetic.gods@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-09-02 11:48:06 +10:00
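The rule from a792ece676 can be sketched as a selection loop that skips spares. The struct and function below are hypothetical stand-ins, not mdadm's real data structures; they only assume that each device carries an event count and a spare flag.

```c
#include <stddef.h>

/* Hypothetical per-device summary; not mdadm's real structures. */
struct dev_summary {
	unsigned long long events;	/* event count from the metadata */
	int is_spare;			/* non-zero if the device is a spare */
};

/* Return the index of the non-spare device with the highest event
 * count, or -1 if every device is a spare.  On a tie the earliest
 * non-spare wins. */
int pick_most_recent(const struct dev_summary *devs, size_t n)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (devs[i].is_spare)
			continue;	/* a spare is never 'most recent' */
		if (best < 0 || devs[i].events > devs[best].events)
			best = (int)i;
	}
	return best;
}
```

With equal event counts and a spare listed first, the sketch picks the first non-spare instead of the spare, which is the case the commit message describes.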
NeilBrown b4924f46c0 Don't set 'hold' option for mdstat_read if not needed.
We only need 'hold' if we want to mdstat_wait for a change.
These two callers don't care about a change, so they shouldn't
use the 'hold' flag.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-10 11:02:10 +10:00
NeilBrown eb20ecf101 Assemble: avoid a consistency check when --force is given.
mdadm will normally not include a device into an array if that device
reports that the "best" device has failed, as this normally implies
some sort of inconsistency.
However when --force is given it means that the given drives really
should be assembled if at all possible so in that case the test should
be avoided.

The particular case where this was a problem was a RAID5 where all
devices had the same event count but three of them reported that the
first two had failed.
As they all had the same event count the first was taken as the 'best'
and that caused the later ones to be excluded.  Listing one of the
later ones first allowed the array to be assembled.  So in this case
the test clearly just got in the way and did nothing useful.

Reported-by: "Marek Jaros" <mjaros1@nbox.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-08 12:02:23 +10:00
NeilBrown be7c26b48c Assemble: improve messages when restarting a reshape.
If the restarted reshape needs a backup file and we don't have one,
that should be reported before we try to start the array.
Also we shouldn't say "Cannot grow" but "cannot complete".

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-02 13:09:07 +10:00
NeilBrown c39b2e633f Assemble: ignore devices= if container= is present.
If "container=" is present, then we are going to assemble from the
given container whether that container is made of those devices or not.
So in this case the "devices=" is purely documentation and is best
ignored.

As part of this, move the test on the "container=" value when that
starts with "/" up before the device is opened.  The sooner we test
things, the better.

Reported-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-02 11:14:09 +10:00
NeilBrown babb8dd427 Assemble: wrong raid-disks should be less fatal.
If the container metadata doesn't know how many devices to expect (as
is the case with IMSM), don't fail an --assemble which over-specifies
the number of devices.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-02 10:33:35 +10:00
NeilBrown 9b6bf8aa54 Assemble: remove some stray tracing.
Was introduced in:
  Assemble: when forcing a single-degraded RAID6 array, trigger a 'repair'.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-27 14:07:38 +10:00
NeilBrown 71417de6fe Add test for interaction of --assemble with --incr
and fix the bug that it found.  The refactor of start_array()
missed a test.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-19 16:34:47 +10:00
NeilBrown 1011e8344a Remove lots of unnecessary white space.
Now that I am using white-space mode in Emacs I can see all of this,
and I don't like it :-)

Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-19 12:31:45 +10:00
NeilBrown 8cde842b18 Assemble: when forcing a single-degraded RAID6 array, trigger a 'repair'.
When an active/degraded RAID6 array is force-started we clear the
'active' flag, but it is still possible that some parity is
not in sync.  This is because there are two parity blocks.
It would be nice to be able to tell the kernel "P is OK, Q maybe not".
But that is not possible.

So when we force-assemble such an array, trigger a 'repair' to fix up
any errant Q blocks.

This is not ideal as a restart during the repair will not be continued
after the restart, but it is the best we can do without kernel help.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-19 11:09:33 +10:00
NeilBrown f80057aec5 Assemble/Incr: Don't include spares with too-high event count.
Some failure scenarios can leave a spare with a higher event count
than an in-sync device.  Assembling an array like this will confuse
the kernel.
So detect spares with event counts higher than the best non-spare
event count and exclude them from the array.

Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-17 16:55:31 +10:00
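The check described in f80057aec5 can be sketched in two passes: find the best non-spare event count, then mark any spare that exceeds it. The struct and function are hypothetical illustrations under that reading of the commit message, not code from mdadm.

```c
#include <stddef.h>

/* Hypothetical per-device summary; not mdadm's real structures. */
struct spare_check {
	unsigned long long events;	/* event count from the metadata */
	int is_spare;			/* non-zero if the device is a spare */
	int excluded;			/* set by exclude_future_spares() */
};

/* Mark spares whose event count is higher than the best non-spare
 * event count.  Returns the number of devices excluded. */
size_t exclude_future_spares(struct spare_check *devs, size_t n)
{
	unsigned long long best = 0;
	size_t i, dropped = 0;

	/* Pass 1: best event count among non-spare devices. */
	for (i = 0; i < n; i++)
		if (!devs[i].is_spare && devs[i].events > best)
			best = devs[i].events;

	/* Pass 2: a spare "from the future" would confuse the kernel. */
	for (i = 0; i < n; i++)
		if (devs[i].is_spare && devs[i].events > best) {
			devs[i].excluded = 1;
			dropped++;
		}
	return dropped;
}
```

Note that in this sketch an array with no non-spare devices leaves best at 0, so every spare with a non-zero event count is excluded; whether mdadm behaves that way is not stated in the commit message.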
NeilBrown a7dec3fd92 Make sure NOFILE resource limit is big enough.
Some people want to create truly enormous arrays.
As we sometimes need to hold one file descriptor for each
device, this can hit  the NOFILE limit.

So raise the limit if it ever looks like it might be a problem.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-30 14:31:09 +10:00
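Raising the limit as a7dec3fd92 describes can be done with the POSIX getrlimit() and setrlimit() calls. The function below is a minimal sketch with a hypothetical name; how mdadm decides the required number of descriptors is not shown in the commit message.

```c
#include <sys/resource.h>

/* Make sure at least 'needed' file descriptors may be open at once.
 * Returns 0 on success, -1 on failure. */
int ensure_nofile(rlim_t needed)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
		return -1;
	if (lim.rlim_cur == RLIM_INFINITY || lim.rlim_cur >= needed)
		return 0;	/* soft limit is already big enough */

	/* Raising the hard limit only succeeds with sufficient privilege. */
	if (lim.rlim_max != RLIM_INFINITY && lim.rlim_max < needed)
		lim.rlim_max = needed;
	lim.rlim_cur = needed;
	return setrlimit(RLIMIT_NOFILE, &lim);
}
```

A caller would pass something like the number of member devices plus a small margin for other open files.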
NeilBrown afa368f49a Assemble: --update=metadata converts v0.90 to v1.0
This allows the smooth conversion of legacy 0.90 arrays
to 1.0 metadata.
Old metadata is likely to remain but will be ignored.
It can be removed with
  mdadm --zero-superblock --metadata=0.90 /dev/whatever

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-28 16:44:22 +10:00
NeilBrown 4dd2df0966 Discard devnum in favour of devnm
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.

So get rid of devnum (a number) and use devnm (a 32char string).
eg.
  md0
  md_d2
  md_home

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-21 17:05:23 +11:00
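The old numeric scheme from 4dd2df0966 maps onto the new string names roughly as below. The exact encoding of negative numbers is an assumption here (devnum -1 taken to mean md_d0, -2 to mean md_d1, and so on); the commit message only says negative values denote md_d%d devices. The function name is hypothetical.

```c
#include <stdio.h>

/* Convert an old-style devnum into a devnm string (at most 32 chars
 * including the terminator).
 * Assumption: a negative devnum n encodes md_d<-1-n>. */
void devnum_to_devnm(int devnum, char devnm[32])
{
	if (devnum >= 0)
		snprintf(devnm, 32, "md%d", devnum);
	else
		snprintf(devnm, 32, "md_d%d", -1 - devnum);
}
```

Names such as md_home have no numeric equivalent, which is the motivation given for switching to strings.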
NeilBrown 8cf2eb96b2 Assemble: fix spelling: report_missmatch -> report_mismatch
Signed-off-by: NeilBrown <neilb@suse.de>
2012-12-05 11:40:28 +11:00
NeilBrown 1d04e27570 Assemble: Don't auto-assemble arrays which conflict with mdadm.conf
When auto-assembling we might find an array which appears in
mdadm.conf.
This can happen if the array (based on UUID) doesn't match what is
in mdadm.conf.
For consistency we should avoid auto-assembling such an array just as
we avoid regular-assembling of the array.


Reported-by: Ross Boylan <ross@biostat.ucsf.edu>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-12-05 11:06:55 +11:00
NeilBrown 66eb2c93a6 Assemble: ensure that <ignore>d arrays are not auto-assembled.
It isn't enough to simply not assemble arrays found to be called
<ignore>, as the final stage of auto-assemble doesn't check for names
in mdadm.conf.

So add a check to Assemble, similar to the check in Incremental()

Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-22 17:04:20 +11:00
NeilBrown b20c8a502d Assemble: fix call to wait_for
Recent patch closed 'mdfd' before calling wait_for, which means
it doesn't work.

Put the close back in the right place.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-20 12:08:56 +11:00
NeilBrown 5e9fd96f21 Assemble: Fix critical-section-recovery when assembling a growing array.
commit aacb2f816a
    Assemble: add support for replacement devices.

broke the restoring of the 'critical section' because it messed up the
list of file descriptors passed to Grow_restart.  Put it back the way
it should be.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-20 12:08:36 +11:00
NeilBrown cb8f6859d1 IMSM - allow assembling any imsm array even without OROM.
It is important to check for compatibility with 'platform' or
Option ROM when creating or changing an array.  However there is no
real need when simply assembling the array.

On some systems there are situations where the platform information is
not available.  e.g. on some UEFI systems, UEFI is not available
during 'kdump' handling.  This makes it impossible to assemble
an IMSM array to receive the dump.

So remove the requirements that the platform be visible to assemble
an IMSM array.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-20 12:07:30 +11:00
NeilBrown aacb2f816a Assemble: add support for replacement devices.
Need to possibly collect 2 devices for each slot: an
original and a replacement.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-24 09:48:18 +11:00
NeilBrown 79f9f56da6 Assemble.c - re-indent file.
Make sure spaces and indents are consistent.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-22 17:25:19 +11:00
NeilBrown 6f4dbdc4e8 Assemble: remove support for assembling arrays with ancient kernel.
Using "START_ARRAY" ioctl never really worked reliably,
was removed a decade ago, and just clutters the code.
So remove it.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-22 17:23:25 +11:00
NeilBrown ddc1b11fb5 Assemble: split out "start_array()" function.
Apart from code movement, there is a small functional change here.
If the array is not successfully started, it is stopped.
Previously we would sometimes leave the array in a partially-assembled
but inactive state.
This just causes confusion.
"--incremental" can be used to partially assemble arrays.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-22 17:23:11 +11:00
NeilBrown 9f5470ce8d Assemble: split out force_array()
force_array() is called if --force was specified to update
any metadata necessary to make the array assemble.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-18 17:30:51 +11:00
NeilBrown 2c355c225e Assemble: split out load_devices() functionality.
Once we have found the devices we want, we need to load the
metadata from them and store it.  This new function extracts that
functionality out of Assemble()

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-18 16:39:49 +11:00
NeilBrown 95425a89fc Assemble: split out select_devices function.
Assemble() is way too big.
This patch starts cleaning it up by pulling the 'select_devices()'
function.  This examines the devices to make sure they all belong to
one array, or selects those that do (depending on exact use case).

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-18 15:31:20 +11:00
NeilBrown 0431869cec Fix up interactions between --assemble and --incremental
If --incremental has partly assembled an array and
--assemble is asked to assemble it, it just finds remaining
devices and makes a new array.  Not good.

So:
1/ modify locking policy so that assemble can be sure that
  no --incremental is running once it locks the map file
2/ Assemble() checks the map file for a duplicate and adds to
   that array instead of creating a new one.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-10 18:27:32 +11:00
NeilBrown 5e88ab2e2f New RESHAPE_NO_BACKUP flag to track when backup action is needed.
Some arrays (raid10) never need a backup file, so during assembly
we can avoid the whole Grow_continue check in that case.
Achieve this using a flag set by the metadata handler.

Also get "mdadm -I" to fail if a backup process would be
needed.  It currently does fail as the kernel rejects things,
but it is nicer to have this explicit.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 56dcaa6ba0 Assemble: don't leak memory with fdlist.
We should free fdlist when finished with it.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:20:25 +10:00
NeilBrown 11b6d91dd0 Change Incremental and related functions to take struct context
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:20:22 +10:00
NeilBrown 4977146a84 Convert Assemble() to take a context rather than a list of options.
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:19:07 +10:00
NeilBrown 0ea8f5b167 Assemble: allow arrays to be assembled read-only.
The option was there, but never used.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown 503975b9d5 Remove scattered checks for malloc success.
malloc should never fail, and if it does it is unlikely
that anything else useful can be done.  Best approach is to
abort and let some super-daemon restart.

So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit.  Then use those
removing all the tests for failure.

Also replace all "malloc;memset" sequences with 'xcalloc'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
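The wrappers introduced in 503975b9d5 follow a common pattern: allocate, and exit with a message if the allocation fails. The sketch below shows that pattern for xmalloc and xcalloc only; the message text and exit status are assumptions, not copied from mdadm.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocate 'len' bytes or exit; callers never see NULL for len > 0. */
void *xmalloc(size_t len)
{
	void *p = malloc(len);

	if (p == NULL && len != 0) {
		fprintf(stderr, "memory allocation failure - aborting\n");
		exit(EXIT_FAILURE);
	}
	return p;
}

/* Zeroed allocation; replaces a "malloc; memset" pair. */
void *xcalloc(size_t num, size_t size)
{
	void *p = calloc(num, size);

	if (p == NULL && num != 0 && size != 0) {
		fprintf(stderr, "memory allocation failure - aborting\n");
		exit(EXIT_FAILURE);
	}
	return p;
}
```

Call sites then drop their own failure tests and simply write, for example, `int *fds = xcalloc(count, sizeof(*fds));`.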
NeilBrown e7b84f9d50 Introduce pr_err for printing error messages.
'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": '
cont_err() is also available.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
Alexander Lyakas 135a31f5ed Don't consider disks with a valid recovery offset as candidates for bumping up event count
When we are looking for a candidate disk to bump up the event count,
we consider only disks that have recovery_start==MaxSector.
However, after we find one such disk, we agree to accept more disks
having same event count, regardless of their recovery_start.
Be consistent and don't accept disks with a valid recovery_start at all.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-05-15 14:20:42 +10:00
Adam Kwolek 4aecb54a21 FIX: Assembled second array is in read only state during reshape
When arrays using external metadata are assembled, and one of the arrays
in the container is under reshape, the second array will remain in read only
state (not auto read only). This is caused by the fact that the array
is frozen and mdmon doesn't have the opportunity to switch the array to r/w mode.

Freezing not reshaped array just after it is being assembled allows mdmon
to enable it for writing.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-17 12:33:38 +10:00
NeilBrown e62b778573 Assemble: improve verbose logging when including old devices.
Reporting:

mdadm: added /dev/loop1 to /dev/md0 as 1
mdadm: added /dev/loop2 to /dev/md0 as 2
mdadm: added /dev/loop0 to /dev/md0 as 0
mdadm: /dev/md0 has been started with 2 drives (out of 3).


is confusing - why only 2?  Code now reports:

mdadm: added /dev/loop1 to /dev/md0 as 1
mdadm: added /dev/loop2 to /dev/md0 as 2 (possibly out of date)
mdadm: added /dev/loop0 to /dev/md0 as 0
mdadm: /dev/md0 has been started with 2 drives (out of 3).

which is somewhat clearer.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-22 14:52:21 +11:00
NeilBrown b720636a58 Assemble: support assembling of a RAID0 being reshaped.
This is a bit of a hack and the code needs to be made more
general.  But this adds the special case of a RAID0 being
reshaped which looks like a RAID4 but doesn't need as many
devices.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-07 10:47:34 +11:00
NeilBrown 56d1885944 Assemble: don't use O_EXCL until we have checked device content.
If we open with O_EXCL before checking that the device is one that
we really want, then that could cause some other process to think
the device is busy when it isn't really.

This particularly affects running "mdadm -A devname" in parallel for
different arrays.  One might be looking at a device that it won't
end up using while another tries and fails to look at a device that
it needs.

So delay the O_EXCL until after all identity checks.

Multiple "mdadm -As" will still have races, but that is fundamentally
racy anyway.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-07 10:41:24 +11:00
Adam Kwolek 111e9fdaa8 FIX: Array is not run when expansion disks are added
When the added disk was added by expansion and it is the last disk added
to the array, assemble_container_content() will not even try to run such an array.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-09 12:20:51 +11:00
NeilBrown da8fe5aa9b Assemble: fix --force assemble during reshape.
If we have to --force assembly during reshape, we need to
check both the 'before' and 'after' cases to make sure there
are enough devices.

Reported-by: Richard Herd <2001oddity@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-07 14:06:44 +11:00
NeilBrown de5a472ea3 Remove avail_disks arg from 'enough'.
It can easily be calculated from 'avail' and  'raid_disks', and we
will soon have a case where we don't have it easily available to pass
in.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-07 14:04:47 +11:00
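The calculation that de5a472ea3 refers to can be sketched as a simple count. This assumes 'avail' holds one byte per slot that is non-zero when a usable device is present; the helper name is hypothetical and the real 'enough' function takes further arguments not shown here.

```c
/* Count the slots that have a usable device.
 * Assumption: avail[i] is non-zero when slot i has a device. */
int count_avail(const char *avail, int raid_disks)
{
	int i, avail_disks = 0;

	for (i = 0; i < raid_disks; i++)
		if (avail[i])
			avail_disks++;
	return avail_disks;
}
```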
NeilBrown 887162637f Assemble: fix count in "assembled with .. but not started".
We need to include the count of pre-existing devices here.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 10:49:07 +11:00
NeilBrown 576d028002 Assemble: make some plurals conditional.
"1 devices" is ugly.  Fix it.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 10:49:07 +11:00
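A conditional plural of the kind 576d028002 describes is usually written with a "%s" that receives either "s" or an empty string. The helper below is a hypothetical illustration of that idiom, not the message text used in Assemble.c.

```c
#include <stdio.h>

/* Format "<n> device" or "<n> devices" into buf. */
int format_device_count(char *buf, size_t len, int n)
{
	return snprintf(buf, len, "%d device%s", n, n == 1 ? "" : "s");
}
```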
NeilBrown 81a5b4f52f Remove update_private
This field doesn't work any more as ->getinfo_super clears the info
structure at an awkward time.  So get rid of it and do it differently.

The issue is that the metadata handler cannot tell if the uuid it has
was randomly generated or explicitly requested, except on the first
call.
And we don't want to accept explicit requests for IMSM.
So when it was auto-generated, make it look distinctive by having the
same int copied in all 4 positions.  If someone requests a uuid like
that, I guess they get away with it.

Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-20 10:30:34 +11:00
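The marker scheme described in 81a5b4f52f (an auto-generated uuid has the same int in all 4 positions) can be sketched as below. Both helper names are hypothetical, and where the random value comes from is left to the caller.

```c
/* Fill a 4-int uuid so that it is recognisable as auto-generated:
 * the same value 'r' is copied into every position. */
void fill_auto_uuid(int uuid[4], int r)
{
	int i;

	for (i = 0; i < 4; i++)
		uuid[i] = r;
}

/* Non-zero when all four ints match, i.e. the uuid looks
 * auto-generated rather than explicitly requested. */
int uuid_looks_autogenerated(const int uuid[4])
{
	return uuid[0] == uuid[1] && uuid[1] == uuid[2] &&
	       uuid[2] == uuid[3];
}
```

As the commit message admits, a user who explicitly requests a uuid of this shape is indistinguishable from the auto-generated case.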
NeilBrown a648241517 Resolve some more warnings
unused variables when MDASSEMBLE is defined, and a typo in mdadm.8

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-13 13:24:52 +11:00
Lukasz Dorau 7728e1c635 fix: correct metadata's update communication
The problem occurs when array under migration is assembled incrementally.
st->update_tail is not initialized in function
assemble_container_content() and during reshape
the checkpoint information in metadata is not being updated.

The value of st->update_tail is now initialized in function
assemble_container_content() and during reshape the checkpoint
information in metadata is being updated correctly on all disks.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-21 16:17:56 +11:00
Jes Sorensen 518a60f385 Assemble(): don't dup_super() before we need it.
Avoid resource leak in case we bail loop early

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Jes Sorensen 22472ee1d2 assemble_container_content(): fix memory leak
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Jes Sorensen 83366b3352 Fix memory leak
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-01 14:50:44 +11:00
Labun, Marcin 81219e70f2 kill-subarray: fix, IMSM cannot kill-subarray with unsupported metadata
container_content retrieves volume information from disks in the
container.  For unsupported volumes the function was not returning
mdinfo. When all volumes were unsupported the function was returning
NULL pointer to block actions on the volumes. Therefore, such volumes
were not activated in Incremental and Assembly. As side effect they
also could not be deleted using kill-subarray since "kill" function
requires to obtain a valid mdinfo from container_content.

This patch fixes the kill-subarray problem by allowing mdinfo to be
obtained for all volume types, including unsupported ones, and by
introducing new array.status flags.

The changes are as follows:

1. Added MD_SB_BLOCK_VOLUME for blocking an array, other arrays in the
   container can be activated.

2. Added MD_SB_BLOCK_CONTAINER_RESHAPE block container wide reshapes
   (like changing disk numbers in arrays).

3. The IMSM container_content handler now loads mdinfo for all volumes
   and sets both blocking flags in the array.state field in mdinfo of
   unsupported volumes.  In case of some errors, all volumes can be
   affected. Only a blocked array is left unactivated (and, as a
   result, is not reshaped). Container-wide reshapes are also blocked
   since, by metadata definition, they require modification of both arrays.

4. Incremental_container and Assemble functions check array.state and
   do not activate volumes with blocking bits set.

5. assemble_container_content is changed to check for blocked
   container-wide reshapes before activating reshapes of assembled containers.

6. Grow_reshape and Grow_continue_command check blocking bits
   before starting reshapes or continuing (-G --continue) reshapes.

7. kill-subarray ignores array.state info and can remove requested array.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-31 11:29:46 +11:00
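The blocking-bit checks described in 81219e70f2 might look roughly like the sketch below. Only the flag names MD_SB_BLOCK_VOLUME and MD_SB_BLOCK_CONTAINER_RESHAPE come from the commit message; the bit positions, the struct vol_info type and both helper functions are assumptions made for illustration and are not mdadm's actual code.

```c
#include <stdbool.h>

/* Hypothetical bit positions; the real values live in mdadm's headers. */
#define MD_SB_BLOCK_VOLUME            8
#define MD_SB_BLOCK_CONTAINER_RESHAPE 9

/* Hypothetical stand-in for the array.state field carried in mdinfo. */
struct vol_info {
    unsigned int state;
};

/* Assembly and Incremental skip a volume whose blocking bit is set. */
bool volume_may_activate(const struct vol_info *v)
{
    return !(v->state & (1u << MD_SB_BLOCK_VOLUME));
}

/* A container-wide reshape is refused if any volume carries the block bit. */
bool container_may_reshape(const struct vol_info *vols, int n)
{
    for (int i = 0; i < n; i++)
        if (vols[i].state & (1u << MD_SB_BLOCK_CONTAINER_RESHAPE))
            return false;
    return true;
}
```

Note that kill-subarray, per item 7, would simply not call either check.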
Adam Kwolek 3bd58dc65f Always run Grow_continue() for started array.
So far there were 2 reshape continuation cases:
 1. array is started /e.g. reshape was already invoked during initrd
                      start-up stage using "--freeze-reshape" option/
 2. array is not started yet /"normal" assembling array under reshape case/

This patch narrows the continuation cases to a single one. To do this,
the array should be started /"readonly" written to array_state/ before
calling the Grow_continue() function.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-07 09:46:07 +11:00
Adam Kwolek a93ada3b7d Monitor reshaped array
Reshape can be run for monitored arrays only /external metadata case/.
Before reshape can be executed, make sure that the just started array/container
is monitored. If not, run mdmon for it.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-05 13:59:28 +11:00
Adam Kwolek 6e75048bc5 Add recovery blocked field to mdinfo
When a container is assembled while reshape is active on one of its members,
the whole container may need to be blocked from monitoring.
For this purpose the recovery_blocked field is added to the mdinfo structure.

When the metadata handler finds an active reshape in a container it should set
the recovery_blocked field to disable monitoring of the whole container during
reshape.

For arrays that don't use containers, e.g. super0/1, the recovery_blocked field
has the same value as the reshape_active field.
In fact, recovery is blocked during reshape for such arrays.
For ddf, the metadata handler doesn't set the reshape_active field,
so recovery_blocked is not set either.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-05 13:30:50 +11:00
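The relationship between reshape_active and recovery_blocked described in 6e75048bc5 can be sketched as follows. The two field names are taken from the commit message; the cut-down struct and both functions are hypothetical and only illustrate the rule, not how mdadm's metadata handlers are written.

```c
/* Hypothetical, trimmed-down mdinfo holding only the two fields discussed. */
struct mdinfo_sketch {
    int reshape_active;
    int recovery_blocked;
};

/* Native (super0/1-style) case: recovery is blocked exactly while reshaping. */
void fill_native_info(struct mdinfo_sketch *info, int reshape_active)
{
    info->reshape_active = reshape_active;
    info->recovery_blocked = info->reshape_active;
}

/* Container case: one reshaping member blocks recovery for every member. */
void mark_container_blocked(struct mdinfo_sketch *members, int n)
{
    int any_reshape = 0;

    for (int i = 0; i < n; i++)
        if (members[i].reshape_active)
            any_reshape = 1;

    for (int i = 0; i < n; i++)
        members[i].recovery_blocked = any_reshape;
}
```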
Adam Kwolek b76b30e0f9 Do not continue reshape during initrd phase
During the initrd phase, continuing a reshape will cause the file system
context to be lost. This removes the ability to control the reshape using
checkpoints.

To avoid this, during the initrd phase assemble has to be executed with the
'--freeze-reshape' option. With it, mdadm restores only the reshape
critical section.

Reshape can be continued later, after the system has fully booted.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-03 09:15:22 +11:00
Adam Kwolek 3f54bd62dc Move restore backup code to function
It should also be possible to restore the reshape backup during reshape
continuation. To allow reuse, the existing code is moved into a function.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-09-21 12:17:30 +10:00
Adam Kwolek 910e9fa7f9 FIX: Memory leak during Assembly
free() is never called for the fdlist pointer allocated in the
assemble_container_content() function. This patch fixes this memory leak.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-09-21 11:55:15 +10:00
NeilBrown b787bec6bd Don't index past the end of 'best' array in Assemble.
The 'best' array only has 'bestcnt' entries allocated, so 'i' should
always be "< bestcnt", not "<= bestcnt".

Reported-by: "Lawrence, Joe" <Joe.Lawrence@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-17 14:48:33 +10:00
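The bound fixed in b787bec6bd can be illustrated with a minimal loop. The names 'best' and 'bestcnt' come from the commit message; the helper itself, and the convention that -1 marks an empty slot, are assumptions for the sake of the example.

```c
#include <stddef.h>

/*
 * Count the slots in 'best' that hold a device index (>= 0).
 * 'best' has exactly 'bestcnt' entries, so valid indices are
 * 0 .. bestcnt - 1 and the loop condition must be "i < bestcnt".
 */
int count_present(const int *best, size_t bestcnt)
{
    int present = 0;

    for (size_t i = 0; i < bestcnt; i++)   /* "i <= bestcnt" reads past the end */
        if (best[i] >= 0)
            present++;
    return present;
}
```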
Adam Kwolek ba53ea59ad Add reshape restart support for external metadata
This patch introduces support for restarting the reshape process for external
metadata using metadata-specific data handling methods.
It introduces the recover_backup() function, which restores the array to a
stable state. It is equivalent to the Grow_restart() functionality for native
metadata.

Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-08 17:11:11 +10:00
NeilBrown 95eeceeb32 getinfo_super now clears the 'info' structure before filling it in.
Some code currently clears 'info' before calling getinfo_super,
some code doesn't.

To be consistent, change it so no caller ever clears 'info',
but every getinfo_super function must clear it.

Note that ->raid_disk may be meaningful if that 'map' is passed
non-NULL.  In that case it is copied out before the structure
is zeroed.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-08 15:54:13 +10:00
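The "copy out, then clear" ordering described in 95eeceeb32 might be sketched like this. The struct, the function name and the placeholder values are all hypothetical, and treating the saved value as the number of 'map' entries to fill is an assumption made only to give the saved value a visible use.

```c
#include <string.h>

/* Hypothetical cut-down info structure. */
struct info_sketch {
    int raid_disk;   /* input when 'map' is non-NULL (assumed: bounds 'map') */
    int events;
};

/* Hypothetical handler following the convention: the callee clears 'info'. */
void getinfo_sketch(struct info_sketch *info, char *map)
{
    /* Copy the caller-supplied value out before the structure is wiped. */
    int map_len = map ? info->raid_disk : 0;

    memset(info, 0, sizeof(*info));
    info->events = 7;              /* placeholder for data read from metadata */

    for (int i = 0; i < map_len; i++)
        map[i] = 1;                /* placeholder: mark every slot as working */
}
```

With this convention a caller can pass an uninitialised or reused structure without clearing it first.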
Adam Kwolek 7af0334155 FIX: Count correctly added devices
When an array is in reshape state the raid_disks field contains the final
number of disks. To know how many disks were added, the disk.raid_disk index
has to be compared against the old disk count computed using delta_disks.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-04-18 10:31:43 +10:00
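The comparison described in 7af0334155 can be written out as a small helper. The field names raid_disks, delta_disks and raid_disk follow the commit message; the function and its flat array of slot numbers are a simplification assumed for illustration.

```c
/*
 * raid_disks is the post-reshape disk count and delta_disks the change in
 * disk count, so the pre-reshape count is raid_disks - delta_disks.
 * A device whose slot (raid_disk) is at or beyond the old count was added.
 */
int count_added(const int *slots, int n, int raid_disks, int delta_disks)
{
    int old_disks = raid_disks - delta_disks;
    int added = 0;

    for (int i = 0; i < n; i++)
        if (slots[i] >= old_disks)
            added++;
    return added;
}
```

For example, growing from 3 to 5 disks gives raid_disks = 5 and delta_disks = 2, so slots 3 and 4 count as added.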
NeilBrown a28232b83f Assemble: improve efficacy of -Af in assembling degraded dirty arrays.
If a degraded dirty array has some superblocks which are clean and
others that are dirty, and the dirty ones are newer by precisely '1'
in the event count, then the current code to force the array to be
clean will not work.
We need to make sure to find a superblock with most recent event count
and force that one to be 'clean'.

Reported-by: A J Wyborny <ajwyborny@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-23 12:10:31 +11:00
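The selection step described in a28232b83f, picking the superblock with the most recent event count and forcing that one clean, could look like the sketch below. The struct and function are hypothetical and leave out everything else that forced assembly does.

```c
#include <stdint.h>

/* Hypothetical summary of one device's superblock. */
struct sb_sketch {
    uint64_t events;
    int clean;
};

/*
 * Find the superblock with the highest event count and mark it clean.
 * Returns its index, or -1 if there are no superblocks.
 */
int force_newest_clean(struct sb_sketch *sbs, int n)
{
    int newest = 0;

    if (n <= 0)
        return -1;
    for (int i = 1; i < n; i++)
        if (sbs[i].events > sbs[newest].events)
            newest = i;
    sbs[newest].clean = 1;
    return newest;
}
```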
Adam Kwolek 983fff45a1 FIX: ping_monitor() usage causes memory leaks
When devnum2devname() is used to produce the input for ping_monitor(),
the returned string pointer should be passed to free() to release the memory.
This is not done in several places. This use case should have its own
function to avoid the memory leak.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-18 12:32:16 +11:00
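A wrapper of the kind 983fff45a1 calls for might be shaped as below. The two stand-in functions only mimic the ownership contract described in the commit message (a heap-allocated name that the caller must free); their names, bodies and the wrapper's name are hypothetical and are not mdadm's real implementations.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for devnum2devname(): returns a malloc'd name the caller owns. */
char *devname_from_devnum_sketch(int devnum)
{
    char *name = malloc(32);

    if (name)
        snprintf(name, 32, "md%d", devnum);
    return name;
}

/* Stand-in for ping_monitor(): returns 0 on success. */
int ping_monitor_sketch(const char *devname)
{
    return devname[0] ? 0 : -1;
}

/* Wrapper that owns the temporary string, so callers cannot leak it. */
int ping_monitor_by_devnum_sketch(int devnum)
{
    char *name = devname_from_devnum_sketch(devnum);
    int err;

    if (!name)
        return -1;
    err = ping_monitor_sketch(name);
    free(name);
    return err;
}
```

Keeping the allocation and the free() in one function means each call site is a single line with nothing left to release.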
NeilBrown b8b8eda804 Remove incorrect use of open_dev
open_dev can only be used for md arrays.  To open an
arbitrary device, dev_open must be used.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-10 11:36:47 +11:00
Adam Kwolek 1403201652 FIX: Make expansion counter usable
Currently the whole array geometry is set in sysfs_set_array(),
so none of the disks (even for expansion) should fail during sysfs_add_disk().
Because of this, the expansion counter should be used for a reshaped array
when the disk slot is bigger than the number of disks in the array.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-03-10 09:58:35 +11:00