If the name in the array has a home-host, then
require that it matches, or is "any", or requested
homehost is "any".
Signed-off-by: NeilBrown <neilb@suse.com>
Earlier patch:
56fcbcbb6f
calculated the proper chunk size - but didn't use it..
Let's actually use it this time.
Signed-off-by: NeilBrown <neilb@suse.com>
This extends nodes option for assemble mode, make the num of
cluster node could be change by user.
Before that, it is necessary to ensure there are enough space
for those nodes, calc_bitmap_size is introduced to calculate
the bitmap size of each node.
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
To support change the cluster name, the commit do the followings:
1. extend original write_bitmap function for new scenario.
2. add the scenarion to handle the modification of cluster's name
in write_bitmap1.
3. let the cluster name also show in examine_super1 and detail_super1
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This is a very corner-case, but the self-tests tripped on it,
and it makes sense not to trust the uuid when it is being changed.
Signed-off-by: NeilBrown <neilb@suse.de>
Normally we do not "force"-assemble devices which are in the
middle of recovery, as they are unlikely to have useful data.
However, when a reshape increases the number of devices,
the newly added devices appear to be recovering because they
do not have complete data on them yet, but then they aren't expected
to until the reshape completes.
So in this case, it can be appropriate to force-assemble them.
Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Signed-off-by: NeilBrown <neilb@suse.de>
This reverts commit b720636a58.
As it said, this was a hack. It causes problems when trying to
--force assemble a RAID4. There is a better way.
Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Signed-off-by: NeilBrown <neilb@suse.de>
Since we introduced replacement devices, the 'i' used in
start_array() is twice the slot number.
So we need to adjust when printing.
Signed-off-by: NeilBrown <neilb@suse.de>
static checkers complain about that.
So change the code to use 'fstat', as we really don't want
to see an error here..
Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
It is best to keep strings all together so that they
are easier to search for in the source code.
If a string is so long that it looks ugly one line,
them maybe it should be broken into multiple lines
for display too.
Only strings which contain a newline can be broken
into multiple lines:
"It is OK to\n"
"break this string\n"
Signed-off-by: NeilBrown <neilb@suse.de>
We should never auto-assemble things that conflict with mdadm.conf
However explicit assembly requests should be allowed.
Reported-by: olovopb
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1070245
Signed-off-by: NeilBrown <neilb@suse.de>
Due to several changes in code assemble with disks
spanned between different controllers can be obtained
in some cases. After IMSM container will be assembled, check HBA of
disks, and print proper warning if mismatch is detected.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When assembling a native array we just give all devices to the kernel
and leave it to discard the 'old' ones (based on sequence/event
number).
For external/container arrays, mdadm needs to do that.
So in assemble_container_content, get list of current devices in
array and discard any that aren't in the 'content' given.
They must have been rejected by metadata manager.
If we cannot discard old devices the array must already be active, so
just leave it alone, but with a message.
Signed-off-by: NeilBrown <neilb@suse.de>
Commit 5e76dce1ac changed
Grow_continue to assume a fork had already happened, so that
mdadm --grow --continue
didn't fork. This is good, but it means that if Grow_continue
is run from Assemble, then
mdadm --assemble ....
can misbehave if the array was in the middle of a reshape.
So introduce finer control. Grow_continue only assumes it has
already forked if run from "mdadm --grow --continue".
Signed-off-by: NeilBrown <neilb@suse.de>
Subsequent patch will allow the background part of "mdadm --grow" to
be run from systemd. This can require the passing of a backup file
name.
To do this, store that name as a symlink in /run/mdadm (or MAP_DIR)
and look for it when appropriate.
It might be useful to also store the name across reboot, but that
would be a different patch. We would need to use the uuid to identify
it, and store it in stable storage.
Signed-off-by: NeilBrown <neilb@suse.de>
This means that
st->ss->getinfo_super(st, content, NULL);
clean = content->array.state & 1;
will get an up-to-date value for 'clean'. This fix allows
tests/03r5assem-failed
to work.
Signed-off-by: NeilBrown <neilb@suse.de>
When we return in error, we need to free(tst), and ->free_super(tst);
Sometimes we didn't.
Also the final ->free_super(tst) should be followed by free(tst)
but wasn't.
Move that file free forward in the code a bit as we will want to use
the tst there in the next patch.
Signed-off-by: NeilBrown <neilb@suse.de>
The given 'st' might not be best. Making this interface change
will allow load_devices to return a better 'st'.
Signed-off-by: NeilBrown <neilb@suse.de>
When auto-assembling we loop until we get no successes.
If a device is found that look like it is part of an already-existing
container, but we subsequently fail to add that device, then the fact
that the container is running looks like a success. This can result
in infinite looping.
So if a container was already partially assemble, and is still only
partially assembled after we try to add devices, then don't treat that
as success.
Signed-off-by: NeilBrown <neilb@suse.de>
As soon as the array is assembled, udev or systemd might run
fsck and mount it. So we need to drop O_EXCL promptly.
Signed-off-by: NeilBrown <neilb@suse.de>
If --export is given with --incremental, then
MD_DEVNAME
is output which gives the name of the device (in /dev/md) that
is the array (or container) that the device would be added to.
Also
MD_STARTED
is set to one of
no
unsafe
yes
nothing
to indicate if the array was started. IF MD_STARTED=unsafe
then it may be appropriate to run
mdadm -R /dev/md/$MD_DEVNAME
after a timeout to ensure newly degraded array are started.
If
MD_FOREIGN=yes
it might be appropriate to suppress this as the array is
probably not critical.
Signed-off-by: NeilBrown <neilb@suse.de>
We lose one level of indent, and now get told the difference between
'not assemble because not safe' and 'not assembled because not enough
devices'.
Signed-off-by: NeilBrown <neilb@suse.de>
Since 'best' was expanded to hold replacement devices, we might
need to go up to raid_disks*2 to find devices to force.
Also fix another place when considering replacement drives would
be wrong (the 'chosen' device should never be a replacement).
Reported-by: John Yates <jyates65@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
We really need to make sure assemble_container_content()
gets called to finished the assembly of these.
Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If all devices have the same event count and the first one is a spare,
then that spare will be the 'most_recent'.
However then other devices will think the 'most_recent' has failed
(for v0.90 metadata) and will be rejected from the array.
So never consider a 'spare' to be 'most recent'.
Reported-by: Andreas Baer <synthetic.gods@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
We only need 'hold' if we want to mdstat_wait for a change.
These two callers don't care about a change, so they shouldn't
use the 'hold' flag.
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm will normally not include a device into an array if that device
reports that the "best" device has failed, as this normally implies
some sort of inconsistency.
However when --force is given it means that the given drives really
should be assembled if at all possible so in that case the test should
be avoided.
The particular case where this was a problem was a RAID5 were all
devices had the same event count but three of them reported that the
first two had failed.
As they all had the same event count the first was taken as the 'best'
and that caused the later ones to be excluded. Listing one of the
later ones first allowed the array to be assembled. So in this case
the test clearly just got in the way and did nothing useful.
Reported-by: "Marek Jaros" <mjaros1@nbox.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
If the restarted reshape needs a backup file and we don't have one,
that should be reported before we try to start the array.
Also we shouldn't say the "Cannot grow" but "cannot complete".
Signed-off-by: NeilBrown <neilb@suse.de>
If "container=" is present, then we are going to assemble from the
given container where that container is made of those devices or not.
So in this case the "devices=" is purely documentation and is best
ignored.
As part of this, move the test on the "container=" value when that
start with "/" up before the device is opened. There sooner we test
things, the better.
Reported-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
If the container metadata doesn't know how many device to expect (as
is the case with IMSM), don't fail an --assemble which over-specifies
the number of devices.
Signed-off-by: NeilBrown <neilb@suse.de>
When an active/degraded RAID6 array is force-started we clear the
'active' flag, but it is still possible that some parity is
no in sync. This is because there are two parity block.
It would be nice to be able to tell the kernel "P is OK, Q maybe not".
But that is not possible.
So when we force-assemble such an array, trigger a 'repair' to fix up
any errant Q blocks.
This is not ideal as a restart during the repair will not be continued
after the restart, but it is the best we can do without kernel help.
Signed-off-by: NeilBrown <neilb@suse.de>
Some failure scenarios can leave a spare with a higher event count
than an in-sync device. Assembling an array like this will confuse
the kernel.
So detect spares with event counts higher than the best non-spare
event count and exclude them from the array.
Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Some people want to create truely enormous arrays.
As we sometimes need to hold one file descriptor for each
device, this can hit the NOFILE limit.
So raise the limit if it ever looks like it might be a problem.
Signed-off-by: NeilBrown <neilb@suse.de>
This allows the smooth conversion of legacy 0.90 arrays
to 1.0 metadata.
Old metadata is likely to remain but will be ignored.
It can be removed with
mdadm --zero-superblock --metadata=0.90 /dev/whatever
Signed-off-by: NeilBrown <neilb@suse.de>
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.
So get rid of devnum (a number) and use devnm (a 32char string).
eg.
md0
md_d2
md_home
Signed-off-by: NeilBrown <neilb@suse.de>
When auto-assembling we might find an array which appear in
mdadm.conf.
This can happen if the array (based on UUID) doesn't match what is
in mdadm.conf.
For consistency we should avoid auto-assembling such an array just as
we avoid regular-assembling of the array.
Reported-by: Ross Boylan <ross@biostat.ucsf.edu>
Signed-off-by: NeilBrown <neilb@suse.de>
It isn't enough to simply not assemble arrays found to be called
<ignore>, as the final stage of auto-assemble doesn't check for names
in mdadm.conf.
So add a check to Assemble, similar to the check in Incremental()
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Recent patch closed 'mdfd' before calling wait_for, which means
it doesn't work.
Put the close back in the right place.
Signed-off-by: NeilBrown <neilb@suse.de>
commit aacb2f816a
Assemble: add support for replacement devices.
broke the restoring of the 'critical section' because it messed up the
list of file descriptors passed to Grow_restart. Put it back the way
it should be.
Signed-off-by: NeilBrown <neilb@suse.de>
It is important to check for compatibility with 'platform' or
Option ROM when creating or changing and array. However there is no
real need when simply assembling the array.
On some systems there are situations where the platform information is
not available. e.g. on some UEFI systems, UEFI is not available
during 'kdump' handling. This makes it impossible to assemble
an IMSM array to receive the dump.
So remove the requirements that the platform be visible to assemble
an IMSM array.
Signed-off-by: NeilBrown <neilb@suse.de>