Commit Graph

3000 Commits

Author SHA1 Message Date
NeilBrown 7d63efc8d8 tests: ignore failure status from mdadm -IRs
This can report non-zero if there was nothing to do,
and that isn't really an error.
If the array doesn't get started, something else
will complain.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-13 13:11:02 +10:00
NeilBrown ec6db5ba71 Assemble: don't check for pre-existing array when updating uuid.
This is very much a corner case, but the self-tests tripped on it,
and it makes sense not to trust the uuid when it is being changed.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-13 12:41:48 +10:00
Martin Wilck b87fdf4e89 DDF: _write_super_to_disk: fix anchor header type
Since commit 30bee0201, the anchor is updated from the active
DDF header. This requires fixing the header type before the
anchor is written.

The LSI Software RAID code will reject DDF meta data with wrong
anchor type and will erase all meta data when it encounters
such a broken anchor. Thus starting Linux md once on a system
with LSI RAID BIOS may cause the meta data to get destroyed.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-13 10:33:35 +10:00
NeilBrown 3c899cab4d tests: never fail if --wait fails.
"--wait" will return non-zero status if it didn't need to wait.
This is not a reason to fail a test.

So ignore the return status from those commands.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-07 17:00:57 +10:00
NeilBrown 42129b3f80 Add "Name" defines to some ancillary programs
All programs now need to declare their "Name".

Signed-off-by: NeilBrown <neilb@suse.de>
Fixes: d56dd607ba ("Change way of printing name of a process")
2015-05-07 14:46:05 +10:00
NeilBrown d180d2aa2a Manage: fix test for 'is array failed'.
'active_disks' does not count spares, so if the array is rebuilding,
this will not necessarily find all devices, so may report an array
as failed when it isn't.

Counting up to nr_disks is better.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-06 15:03:50 +10:00
Pawel Baldysiak 72a4577704 IMSM: Count arrays per orom
Active arrays with IMSM metadata are counted per hba so far.
This is bad due to new functionality of orom shared between multiple
controllers i.e. more arrays can be created than is supported by orom.
This patch changes the way of counting arrays, so the result will be
sum of arrays under every hba supported by specific orom.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-09 09:06:23 +10:00
NeilBrown 87af7267bd Assemble/force: make it possible to "force" a new device in a reshape.
Normally we do not "force"-assemble devices which are in the
middle of recovery, as they are unlikely to have useful data.

However, when a reshape increases the number of devices,
the newly added devices appear to be recovering because they
do not have complete data on them yet, but then they aren't expected
to until the reshape completes.
So in this case, it can be appropriate to force-assemble them.

Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 12:04:00 +10:00
NeilBrown c34fef774a Assemble: remove stray ':' from error message.
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 11:27:34 +10:00
NeilBrown 330d6900bb Assemble: allow a RAID4 to assemble easily when parity device is missing.
If the parity device of a RAID4 is missing, then there is no immediate
risk to data.  So it doesn't matter if the array is dirty or not.

This can be important when reshaping a RAID0, and is a much better
solution than the one in the recently reverted commit b720636a58.

Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 09:39:02 +10:00
NeilBrown d316dba7c9 Revert "Assemble: support assembling of a RAID0 being reshaped."
This reverts commit b720636a58.

As it said, this was a hack.  It causes problems when trying to
--force assemble a RAID4.  There is a better way.

Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 09:31:32 +10:00
NeilBrown ee466574f2 Assemble: fix "no uptodate device" message.
Since we introduced replacement devices, the 'i' used in
start_array() is twice the slot number.

So we need to adjust when printing.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 09:20:26 +10:00
NeilBrown 04e27c2084 Monitor: use the "space protocol" for "Wrong-Level".
"Wrong-Level" is a reason, not a component device, so it should
start with a space to indicate this to alert().

Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 09:18:55 +10:00
NeilBrown b033913a3c Monitor: Obey "space protocol" when writing to syslog.
"alert" treats the "disc" arg differently if it starts with a space.

At least it does for sending email.  It doesn't for writing to syslog.

Make this consistent and obey the 'space protocol' when writing to
syslog.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-08 09:17:17 +10:00
NeilBrown 783bbc2b13 reshape: support raid5 grow on certain older kernels.
Kernels between
  c6563a8c38fde3c1c7fc925a v3.5-rc1~110^2~53
and
  b5254dd5fdd9abcacadb5101 v3.5-rc1~110^2~51

allow new_offset to be set, but don't then allow a RAID5
to be reshaped to change that offset.
Due to selective backports, this includes the SLES11-SP3 kernel.

It is quite easy to handle this case in mdadm, so we do.
Specifically: if the reshape with data-offset fails with EINVAL,
abort the data-offset change and try the "old" way.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-03-26 10:06:26 +11:00
Pawel Baldysiak 4d149ab517 IncRemove: Set "auto-read" only after successful excl open.
"mdadm -If" - triggered from udev rules when disk is removed from OS -
tries to set array in auto-read-only mode. This can interrupt rebuild
process which is started automatically, e.g. if array is mounted and
spare disk is available (I/O error is detected faster than removing
failed disk by mdadm).
This patch prevents "mdadm -If" from setting array into "auto-read-only",
by requiring exclusive open to succeed.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-03-04 15:59:53 +11:00
Pawel Baldysiak f666bcc652 IMSM-orom: make sure, that device list is supported
The device list in the PCI Data Structure is supported only in
revision 3 and above. Make sure that this is checked.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-03-04 15:59:53 +11:00
Artur Paszkiewicz 5e1d612824 imsm: simplified multiple OROMs support
Replaced oroms array with list, add_orom() now only appends to this list
and add_orom_device_id() only appends devid_list node to an orom_entry.

Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-03-04 15:56:56 +11:00
NeilBrown e9e6894d4b Assemble: don't ignore the return value from stat.
static checkers complain about that.
So change the code to use 'fstat', as we really don't want
to see an error here.

Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-03-04 15:56:52 +11:00
Jes Sorensen 68641cdb64 write_super_imsm_spares(): C statements are terminated by ;
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-03-04 15:56:47 +11:00
Jes Sorensen 5d94384e93 IncrementalScan(): Make sure 'st' is valid before dereferencing it
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-03-04 15:56:46 +11:00
Jes Sorensen 9eb5ce5ae2 Grow.c: Fix classic readlink() buffer overflow
The buffer passed on to readlink() needs to contain space for the
terminating \0. See 'man 3 readlink' for details.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-25 08:06:45 +11:00
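
A minimal sketch of the readlink() pattern that 9eb5ce5ae2 refers to. The helper name read_link_string is hypothetical and this is not the code from Grow.c; it only shows reserving one byte for the terminator, since readlink() does not write one.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read the target of symlink 'path' into 'buf'
 * as a NUL-terminated string. Returns the number of bytes stored
 * (not counting the terminator), or -1 on error. */
ssize_t read_link_string(const char *path, char *buf, size_t buf_size)
{
    ssize_t n;

    if (buf_size == 0)
        return -1;

    /* readlink() never appends '\0', so offer it one byte less
     * than the buffer really has. */
    n = readlink(path, buf, buf_size - 1);
    if (n < 0)
        return -1;

    /* n is at most buf_size - 1, so this write stays in bounds. */
    buf[n] = '\0';
    return n;
}
```

Passing the full sizeof(buf) to readlink() and then writing buf[n] = '\0' is the overflow case: when the target fills the buffer, n equals the buffer size and the terminator lands one byte past the end.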
NeilBrown 7a862a020f Don't break long strings onto multiple lines.
It is best to keep strings all together so that they
are easier to search for in the source code.
If a string is so long that it looks ugly on one line,
then maybe it should be broken into multiple lines
for display too.

Only strings which contain a newline can be broken
into multiple lines:

 "It is OK to\n"
 "break this string\n"


Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 13:46:53 +11:00
NeilBrown 1ade5cc15a Consistently print program Name and __func__ in debug messages.
make dprintf() print program name and __func__, so that
this messaging is consistent.

Also remove all __func__ messages from pr_err(). We shouldn't
leak that internal data in error message.
If we really want the function name there, a new pr_XXX might
be wanted.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 13:21:17 +11:00
Pawel Baldysiak d56dd607ba Change way of printing name of a process
Sometimes mdadm prints messages with wrong name "mdmon",
and vice versa.
This patch solves this problem by changing method of determining
process name.
Now "Name" will be set in const at start of a program,
previously was hardcoded as #define.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 12:11:01 +11:00
Artur Paszkiewicz 19d3ea0f0b Monitor: fix for regression with container devices
This patch fixes 2 problems introduced by commit 9a518d8: not closing a
file descriptor and ignoring container devices. Array state is always
"inactive" for containers, so we make sure that the device is not a
container by reading also the "level" sysfs entry.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-11 15:27:57 +11:00
NeilBrown 979b1feb09 mdcheck: be careful when sourcing the output of "mdadm --detail --export"
The output of "mdadm --detail --export" isn't quoted properly so
fields that contain spaces can be a problem.
We only want the MD_UUID field, and it has a very well defined
format with no spaces.
So use 'grep' to limit the output to just that.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-04 09:06:47 +11:00
Pawel Baldysiak 71e5411eea IMSM: Clear migration record on disks more often
Migration record is not always cleared after successful migration. This can
block another reshape from being started. Migration will not be continued via
systemd service due to error in verifying reshape position. This patch added
clearing migration record when disk is added to container, and after successful
migration.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-01-29 14:09:36 +11:00
NeilBrown 93d3bd3b28 util: remove rounding error where reporting "human sizes".
The division
 1<<20 / 200
is not exact, so dividing by this to convert bytes into half-megs
is wrong and results in incorrect output.

As we are doing "long long" arithmetic, there is no risk of an overflow
until we reach 64 petabytes.
So change to
   * 200 / (1<<20).

Reported-by: Jan Echternach <jan@goneko.de>
Resolved-debian-bug: 763917
URL: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763917
Signed-off-by: NeilBrown <neilb@suse.de>
2014-12-18 16:58:44 +11:00
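
A minimal sketch of the arithmetic described in 93d3bd3b28. The function names are hypothetical and this is not the mdadm source; it only contrasts dividing by the truncated constant with multiplying first. Both functions return a size in hundredths of a MiB, rounded to nearest.

```c
/* Old approach: (1<<20)/200 is 5242.88, which integer division
 * truncates to 5242, so every result is scaled slightly too high. */
long long centi_mib_old(long long bytes)
{
    return (bytes / ((1LL << 20) / 200) + 1) / 2;
}

/* New approach: multiply first, then divide by the exact factor.
 * The intermediate product is larger, which is why the commit
 * message discusses the overflow limit of "long long". */
long long centi_mib_new(long long bytes)
{
    return (bytes * 200 / (1LL << 20) + 1) / 2;
}
```

For an input of exactly 1000 MiB (1048576000 bytes) the first variant returns 100017, i.e. 1000.17 MiB, while the second returns 100000.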
Pawel Baldysiak 16afb1a5ef Grow: Fix wrong 'goto' in set_new_data_offset
Commit a821c95f11
besides introducing additional message, also changed
direct return to "goto" instruction.
'goto release' will cause routine to return with '-1',
when previously '1' was returned.
Described behaviour breaks e.g. IMSM reshape process.
This patch fixes this issue by changing 'goto' to proper one -
the one that returns '1'.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-12-02 09:52:34 +11:00
NeilBrown 9a518d81fe Monitor: don't open md array that doesn't exist.
Opening a block-special-device for an array that doesn't
exist causes that array to be instantiated (as an empty array).
Races at array shutdown can cause the array to spontaneously
re-appear if some daemon notices a 'change' event and goes
to investigate.

Teach "mdadm --monitor" to avoid this race by checking the
"array_state" before opening the device.

Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-25 11:44:29 +11:00
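
A rough sketch of the idea in 9a518d81fe: look at the sysfs "array_state" attribute first, and only open the block device if the array really exists. The helper names, the path layout passed in, and the choice of which states count as "do not open" are assumptions for illustration, not the code in Monitor.c.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: "clear" and "inactive" are treated as "nothing to
 * monitor"; the real set of states checked by mdadm may differ. */
int state_allows_open(const char *state)
{
    return strncmp(state, "clear", 5) != 0 &&
           strncmp(state, "inactive", 8) != 0;
}

/* Hypothetical helper. 'sysfs_md_dir' is a directory such as
 * "/sys/block/md0/md". Returns 1 if array_state could be read and
 * names a state for which opening the device is considered safe. */
int array_exists(const char *sysfs_md_dir)
{
    char path[256];
    char state[32];
    FILE *f;
    int ok = 0;

    snprintf(path, sizeof(path), "%s/array_state", sysfs_md_dir);
    f = fopen(path, "r");
    if (!f)
        return 0; /* no sysfs entry: opening the device could create the array */
    if (fgets(state, sizeof(state), f))
        ok = state_allows_open(state);
    fclose(f);
    return ok;
}
```

A check based on array_state alone skips containers, because their state is always "inactive"; the follow-up fix 19d3ea0f0b addresses this by also reading the "level" attribute and by closing the file descriptor.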
NeilBrown 7ae0775871 Makefile: binaries shouldn't directly depend on check_rundir
check_rundir always needs to be "built", so making
mdadm and mdmon depend on it causes them to always be built.
i.e. running
   make ; make

will needlessly link the binaries a second time.

So change the makefile to use "order-only" pre-requisites.

Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-25 11:44:18 +11:00
Artur Paszkiewicz 88605db99c imsm: use efivarfs interface for reading UEFI variables
Read UEFI variables using the new efivarfs interface, fallback to
sysfs-efivars if that fails.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-25 11:37:38 +11:00
Artur Paszkiewicz 0858eccf86 imsm: detail-platform improvements
Print platform details per OROM, not per controller, differentiate
RST(e) platforms from legacy IMSM, print NVMe device paths, adjust port
printing to newer sysfs path.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-25 11:37:38 +11:00
Pawel Baldysiak 614902f64e imsm: add support for NVMe devices
Recognize Intel(R) NVMe devices as IMSM-capable.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-25 11:37:38 +11:00
Artur Paszkiewicz 81188ef870 imsm: support for second and combined AHCI controllers in UEFI mode
Grantly platform introduces a second AHCI controller (sSATA) and two new
UEFI variables for the RSTe firmware. This patch adds support for those
variables in order to correctly determine IMSM platform capabilities in
UEFI mode.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-25 11:37:38 +11:00
Artur Paszkiewicz 6b781d331b imsm: support for OROMs shared by multiple HBAs
The IMSM platform code was based on an assumption that the OROM or UEFI
capability structure (represented by struct imsm_orom) always belongs to
only one HBA. This assumption is no longer valid, because of newer
platforms with dual AHCI HBAs. Each HBA can have a separate OROM, but
some versions have a combined OROM for both HBAs.

This patch implements this HBA-OROM relationship in struct orom_entry,
which matches an OROM with a list of HBA PCI ids. All the detected
orom_entries are stored and retrieved using a global array and the
functions add_orom(), add_orom_device_id() and get_orom_by_device_id().
This replaces the arrays: imsm_orom, populated_orom, imsm_efi,
populated_efi.

The scan() function is extended to find all HBAs for an OROM. The list
of their device ids is retrieved from the PCI Expansion ROM Data
Structure, hence the additional field devListOffset in struct
pciExpDataStructFormat.

In UEFI mode we can't read the PCI Expansion ROM Data Structure and the
imsm_orom structures are stored in UEFI variables. They do not provide a
similar device id list, so we also check the HBA PCI class to make sure
that the HBA has RAID mode enabled.

In super-intel.c there are changes which allow spanning of IMSM
containers over HBAs of the same type, but only if the HBAs share the
same OROM.  This is done by comparing imsm_orom pointers, which (outside
of platform-intel.c) always point to the global array containing all the
detected oroms. Additional warnings are added to
validate_container_imsm() to warn about potentially dangerous operations
in all the possible cases, e.g. when an array is assembled using disks
attached to HBAs with separate OROMs.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-25 11:34:02 +11:00
NeilBrown 6c90491f44 Incremental: don't be distracted by partition table when calling try_spare.
Currently a partition table on a device makes "mdadm -I" think
the array has a particular metadata type and so will only
add it to an array of that (partition table) type .. which doesn't
make any sense.

So tell guess_super to only look for 'array' metadata.

Reported-by: Caspar Smit <c.smit@truebit.nl>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-05 16:21:42 +11:00
NeilBrown 8057db46a1 Detail: fix handling of 'disks' array.
Since the introduction of replacement devices, we reserve
two places in the "disks" array for each raid disk.
That means we should allocate twice "max_disk" as the array
could have that many raid_disks (though that would limit the
number of replacements).

A couple of other places need to use "max_disks*2" instead of
"max_disks" to co-ordinate with this.

Reported-by: Or Sagi <ors@reduxio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-04 09:35:20 +11:00
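
A small sketch of the two-entries-per-slot layout that 8057db46a1 and ee466574f2 both rely on. The struct, the helper names, and the choice of the odd entry for the replacement are assumptions for illustration; only the "two places per raid disk" rule and the "index is twice the slot number" rule come from the commit messages.

```c
#include <stdlib.h>

/* Hypothetical per-device record. */
struct disk_entry {
    int present;
};

/* Two entries per raid disk, so the table needs max_disks * 2 entries. */
struct disk_entry *alloc_disk_table(int max_disks)
{
    return calloc((size_t)max_disks * 2, sizeof(struct disk_entry));
}

/* Assumed layout: even entry for the original device, odd entry
 * for its replacement. */
int table_index(int raid_disk, int is_replacement)
{
    return raid_disk * 2 + (is_replacement ? 1 : 0);
}

/* When printing, convert a table index back to the slot number. */
int slot_of_index(int i)
{
    return i / 2;
}
```

Sizing the table with max_disks instead of max_disks * 2 would let table_index() run past the end for the upper half of the slots, which is the class of error the Detail fix describes.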
NeilBrown 21dc47172d super1: remove some debugging printfs in update_super1
These should never have been there.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-03 12:56:37 +11:00
NeilBrown 628cdf19ea Rebuildmap: strip local host name from device name.
When /run/mdadm/map is being rebuilt, e.g. by "mdadm -Ir",
if the device doesn't exist in /dev, we have to choose
a name.
Currently we don't strip the hostname which is wrong if
it is the local host.

Reported-by: Stephen Kent <smkent@smkent.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-03 12:49:05 +11:00
NeilBrown 36dab45b89 mdcheck: don't give error if no /dev/md?* devices exist.
If there are no such devices, the 'for' will set '$dev' to
'/dev/md?*', which should be ignored.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-11-03 11:58:06 +11:00
Justin Maggard 0448027b76 Grow: fix resize of array component size to > 32bits
If the requested --size to --grow an array to is larger
than 32 bits, then mdadm may make the wrong choice and
use ioctl instead of setting component_size via sysfs
and the change is ignored.

Instead of using casts to check for a 32-bit overflow,
just check for set bits outside of INT32_MAX.

Fixes: 4e9a3dd16d
Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-29 11:03:09 +11:00
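
A minimal sketch of the mask test described in 0448027b76. The function name is hypothetical and the surrounding decision logic in Grow.c is not reproduced; this only shows checking for set bits outside INT32_MAX rather than comparing against a cast value.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if 'size' has any bit set outside
 * the range of a non-negative 32-bit signed integer, meaning it
 * cannot be passed through a 32-bit field without loss. */
int needs_wide_interface(unsigned long long size)
{
    return (size & ~(unsigned long long)INT32_MAX) != 0;
}
```

With this test, 0x7fffffff still fits, while 0x80000000 and anything above it is reported as too large for the 32-bit path.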
NeilBrown da5a36fa1f mdmon: always read sysfs files once after opening.
seq_file in the kernel will allocate a read buffer on
first read.  We want this to happen under the managemon thread,
not the 'monitor' thread, as the latter is not allowed to allocate
memory (might deadlock).
So do a first read after opening.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-09-17 15:02:18 +10:00
Andy Smith a821c95f11 Grow: Report when grow needs metadata update
Report when the array's metadata needs updating instead of just
reporting the generic "kernel too old" message.

Signed-off-by: Andy Smith <andy@strugglers.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-09-03 13:26:31 +10:00
NeilBrown c60495c8b6 --update: add 'bbl' and 'no-bbl' to the list of known updates.
so "mdadm -A --update=?" mentions them.

Reported-by: Peter Hoeg <peter@hoeg.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-08-27 21:04:59 +10:00
NeilBrown fed12d436b Release mdadm-3.3.2
Minor bugfix/stability release.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-08-21 20:16:56 +10:00
Samuli Suominen 47c4331d1f Fix parallel make problem.
When make is called with, for example,
   "make -j9 install install-systemd"
i.e. both install and install-systemd targets on the same
line and with a high -j value,
then the same install.tmp file was used, and udev rules
end up in systemd service files, or the other way around.

For more information, see:
  http://www.spinics.net/lists/raid/msg46782.html
  http://bugs.gentoo.org/show_bug.cgi?id=517218

Signed-off-by: NeilBrown <neilb@suse.de>
2014-08-21 15:00:29 +10:00
NeilBrown 6ac17e734b super1: make sure 'room' includes 'bbl_size' when creating array.
Because we then go ahead and subtract bbl_size from room.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-08-21 10:57:55 +10:00
NeilBrown 268cccac2e super1: don't allow adding a bitmap if there is no space.
If the data is too close to the superblock there may be
no space for a bitmap.
If that happens, fail the adding of the bitmap rather than
corrupt data.

Reported-by:  Lars Wijtemans <rhelbugzilla@lars.wijtemans.nl>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=922944
2014-08-15 15:45:54 +10:00