Commit Graph

422 Commits

Author SHA1 Message Date
NeilBrown 2dd271fe70 Retry HOT_REMOVE_DISK a few times.
HOT_REMOVE_DISK can fail with EBUSY if there are outstanding
IO request that have not completed yet.  It can sometimes
be helpful to wait a little while for these to complete.

We already do this in impose_level() when reshaping a device,
but not in Manage.c in response to an explicit --remove request.

So create hot_remove_disk() to central this code, and call it
where-ever it makes sense to wait for a HOT_REMOVE_DISK to succeed.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-28 14:25:23 -04:00
Tomasz Majchrzak cf52eff58a Increase buffer for sysfs disk state
Bad block support has incremented sysfs disk state reported by kernel
("external_bbl") so it became longer than 20 bytes. It causes reshape to
fail as it reads truncated entry from sysfs.

Increase buffer so it can accommodate the string including all state
values currently implemented in kernel at the same time.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-17 09:46:42 -05:00
Mariusz Dabrowski ddab63c7de Allow level migration only for single-array container
IMSM doesn't allow to change RAID level of array in container with two
arrays but array count check is being done too late (after removing disks)
and in some cases (e. g. RAID 0 and RAID 1 migrated to RAID 0) both arrays
become degraded. This patch adds array count check before disks are being
removed.

Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-10-19 11:26:49 -04:00
Xiao Ni 8800f85381 MDADM:Check mdinfo->reshape_active more times before calling Grow_continue
When reshaping a 3 drives raid5 to 4 drives raid5, there is a chance that
it can't start the reshape. If the disks are not enough to have spaces for
relocating the data_offset, it needs to call start_reshape and then run
mdadm --grow --continue by systemd. But mdadm --grow --continue fails
because it checkes that info->reshape_active is 0.

The info->reshape_active is got from the superblock of underlying devices.
Function start_reshape write reshape to /sys/../sync_action. Before writing
latest superblock to underlying devices, mdadm --grow --continue is called.
There is a chance info->reshape_active is 0. We should wait for superblock
updating more time before calling Grow_continue.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-16 13:53:45 -04:00
Mike Lovell 13db17bd1f Use dev_t for devnm2devid and devid2devnm
Commit 4dd2df0966 added a trip through makedev(), major(), and minor() for
device major and minor numbers. This would cause mdadm to fail in operating
on a device with a minor number bigger than (2^19)-1 due to it changing
from dev_t to a signed int and back.

Where this was found as a problem was when a array was created with a device
specified as a name like /dev/md/raidname and there were already 128 arrays
on the system. In this case, mdadm would chose 1048575 ((2^20)-1) for the
array and minor number. This would cause the major and minor number to become
negative when generated from devnm2devid() and passed to major() and minor()
in open_dev_excl(). open_dev_excl() would then call dev_open() which would
detect the negative minor number and call open() on the *char containing the
major:minor pair which isn't a valid file.

Signed-off-by: Mike Lovell <mlovell@bluehost.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-03 15:35:26 -04:00
Jes Sorensen 6ac963cef0 Grow: Apply some more consistent formatting to Grow_addbitmap()
This should be purely cosmetic and cause no functional change
... famous last words!

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:27:24 -04:00
Jes Sorensen 4ed129aca7 Grow: Simplify error paths in Grow_addbitmap()
This gets rid of some repeated exit paths, making the code a little
cleaner.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:27:18 -04:00
Jes Sorensen 2ec2b7e9d5 mdadm: Make add_internal_bitmap() return 0 on success
add_internal_bitmap() returned 1 on success and 0 on error which is
inconsistent. This changes it to return 0 on success and use more
reasonable error codes on error.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:19:16 -04:00
Jes Sorensen c152f3610f Grow: Handle failure to load superblock in Grow_addbitmap()
Reported-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 14:30:10 -04:00
Jes Sorensen dac1b1115f Grow: Grow_addbitmap() reduce indentation
This makes the code a little more readable.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 14:27:11 -04:00
Guoqing Jiang 81306e021e Change the option from NoUpdate to NodeNumUpdate
Actually, we need to use NodeNumUpdate here to
ensure there are enough spaces for those nodes.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-24 12:33:27 -04:00
Guoqing Jiang 31dbeda730 Grow: goto release if Manage_subdevs failed
If failure happened when add disk to array
by grow mode, need to goto release instead
of continue the reshape.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 13:53:10 -04:00
Yi Zhang a58e0da443 Grow: analyse_change add notification about only 2-device can be convert from RAID1 to RAID5
Notify "Can only convert a 2-device array to RAID5" instead of
"Impossibly level change request for RAID1" when convert from
RAID1 to RAID5 if the disk num is not equal two like RAID4/5->RAID1
did.

Signed-off-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-11 12:40:47 -05:00
Pawel Baldysiak ad2f464602 Grow: close fd earlier to avoid "cannot get excl access" when stopping
If this file descriptor is not closed here, it remains open during
reshape process and stopping process will end up with
"cannot get exclusive access to container".
Once this file descriptor is no longer needed - it can be closed.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-11 12:32:31 -05:00
Jes Sorensen efdfcc9e95 Grow: Grow_addbitmap(): Add check to quiet down static code checkers
Grow_addbitmap() is only ever called with s->bitmap_file != NULL, but
not all static code checkers catch this. This adds a check to quiet
down the false positive warnings.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen 12add44564 Grow: Grow_continue_command() remove dead code
All cases where fd2 is used are completed with a close(fd2), so there
is no need to set fd2 = -1 or check for it before exiting.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen bf08f6b1ef Grow: Add documentation to abort_reshape() for suspend_{lo,hi} setting
Add documentation for quirky reset procedure for resetting suspended
region range.

Suggested-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-02-22 09:43:43 -05:00
Artur Paszkiewicz 10df72a080 Grow: close file descriptor earlier to avoid "still in use" when stopping
Close fd2 as soon as it is no longer needed, before calling
Grow_continue(). Otherwise, we won't be able to stop an array with
external metadata during reshape, because mdadm running in background
will be keeping it open.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-24 10:00:00 +11:00
Xiao Ni f7cf9699dc Check and remove bitmap first when reshape to raid0
If reshape one raid device with bitmap to raid0, the reshape progress will
start. But it'll fail and lose some components. So it should remove bitmap
first.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-22 15:16:08 +11:00
Guoqing Jiang 37d0ca9be6 mdadm: output info more precisely when change bitmap to none
WHen change bitmap to none, the infos could be more accurate
based on existed bitmap type.

And s->bitmap_file is passed from cmd "--bitmap=TYPE", so
remove s->bitmap_file from err info since it should means
change the bitmap to one type failed rather than the type is
already presented.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-16 13:23:58 +11:00
Deepa Dinamani 26714713cd mdadm: Change timestamps to unsigned data type.
32 bit signed timestamps will overflow in the year 2038.

Change the user interface mdu_array_info_s structure timestamps:
ctime and utime values used in ioctls GET_ARRAY_INFO and
SET_ARRAY_INFO to unsigned int. This will extend the field to last
until the year 2106.

Add time_after/time_before and supporting typecheck from
the kernel to take care of unsigned time wraparound.

The long term plan is to get rid of ctime and utime values in
this structure as this information can be read from the on-disk
meta data directly.

v0.90 on disk meta data uses u32 for maintaining time stamps.
So this will also last until year 2106.
Assumption is that the usage of v0.90 will be deprecated by
year 2106.

Timestamp fields in the on disk meta data for v1.0 version already
use 64 bit data types.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-16 12:43:25 +11:00
Goldwyn Rodrigues 6d9c7c2551 Increment version for clustered bitmaps
Add BITMAP_MAJOR_CLUSTERED as 5, in order to prevent older kernels
to assemble a clustered device.

In order to maximize compatibility, the major version is set to
BITMAP_MAJOR_CLUSTERED *only* if the bitmap is clustered.

Also, added MD_FEATURE_CLUSTERED in order to return error
for older kernels which would assemble MD in case bitmap is
corrupted.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-09-28 11:47:04 +10:00
NeilBrown 653299b699 Merge branch 'cluster'
Now that 3.3.3 is out, it is time to include the cluster-support code.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-27 11:01:08 +10:00
NeilBrown 62844a4da6 Grow: remove stray tracing message.
Signed-off-by: NeilBrow <neilb@suse.com>
2015-07-06 13:47:45 +10:00
NeilBrown caf9ac0ca4 Grow: fix typo in comment
Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-18 15:51:45 +10:00
Guoqing Jiang 0aa2f15b20 mdadm: add the ability to change cluster name
To support change the cluster name, the commit do the followings:

1. extend original write_bitmap function for new scenario.
2. add the scenarion to handle the modification of cluster's name
   in write_bitmap1.
3. let the cluster name also show in examine_super1 and detail_super1

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-17 09:33:39 +10:00
Guoqing Jiang 7c25f4d706 Convert a bitmap=none device to clustered
This adds the ability to convert a regular md without bitmap
(--bitmap=none) to a clustered device (--bitmap=clustered).

To convert a device with --bitmap=internal or --bitmap=external,
you have to convert to --bitmap=none and then re-execute the
command with --bitmap=clustered.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-17 09:24:41 +10:00
NeilBrown 2a6493cfe1 Grow: fix a couple of typos.
Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-28 17:21:06 +10:00
NeilBrown 8e7ddc5f50 Grow: fix problem with --grow --continue
If an array is being reshaped using backup space on a 'spare' device,
then
  mdadm --grow --continue
won't find it as by the time it runs, nothing looks like a spare are
more.  The spare has been added to the array, but has no data yet.

So allow reshape_prepare_fdlist to find a newly-incorporated spare and
report this so it can be used.

Reported-by: Xiao Ni <xni@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-28 16:43:15 +10:00
NeilBrown e0cc1c8d8b Grow: another attempt to fix stop-during-reshape race.
When the array is stopped during a critical section, we sometimes
erase the backup, which is bad.
This happens when 'completed' is zero.
This can happen easily when 'stop' freezes reshape.

So try to be more careful and check 'reshape_position'.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-25 16:33:45 +10:00
NeilBrown 3ee556f8b6 Grow: be even more careful about handing a '0' completed value.
Some old kernels set 'completed' to '0' too soon.
But modern kernels don't.
And when 'mdadm --stop' freezes and resume the grow,
'completed' goes back to zero briefly, which can confuse this
logic.
So only  think '0' might be wrong from an old kernel when
the reshape has gone idle.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-15 15:11:48 +10:00
NeilBrown ada38ebbcb Grow: retry when writing 'reshape' to 'sync_action' is EBUSY.
EBUSY can be returned if something has recently happened
to cause md to want to check if recovery is needed, but hasn't
had a chance yet.

This can easily happen in testing.

So retry a few times in that case.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-15 11:07:25 +10:00
NeilBrown e0184a0cd0 Grow: be more careful if array is stopped during critical section.
In that case, updating 'completed' to 'max_progress' is wrong.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-15 11:07:25 +10:00
NeilBrown a5a6a7d9fa Grow: add missing space in message.
Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-15 11:07:25 +10:00
NeilBrown dd243f561f Grow: only warn about incompatible metadata when no fallback available.
We might be trying to set_new_data_offset() for RAID10, when it is
a necessary requirement, or for RAID5 where it is optional.
In the latter case, a message about metadata versions is no helpful.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-14 11:17:39 +10:00
NeilBrown 783bbc2b13 reshape: support raid5 grow on certain older kernels.
Kernels between
  c6563a8c38fde3c1c7fc925a v3.5-rc1~110^2~53
and
  b5254dd5fdd9abcacadb5101 v3.5-rc1~110^2~51

allow new_offset to be set, but don't then allow a RAID5
to be reshaped to change that offset.
Due to selective backports, this includes the SLES11-SP3 kernel.

It is quite easy to handle this case in mdadm, so we do.
Specifically: if the reshape with data-offset fails with EINVAL,
abort the data-offset change and try the "old" way.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-03-26 10:06:26 +11:00
Jes Sorensen 9eb5ce5ae2 Grow.c: Fix classic readlink() buffer overflow
The buffer passed on to readlink() needs to contain space for the
terminating \0. See 'man 3 readlink' for details.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-25 08:06:45 +11:00
NeilBrown 7a862a020f Don't break long strings onto multiple lines.
It is best to keep strings all together so that they
are easier to search for in the source code.
If a string is so long that it looks ugly one line,
them maybe it should be broken into multiple lines
for display too.

Only strings which contain a newline can be broken
into multiple lines:

 "It is OK to\n"
 "break this string\n"


Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 13:46:53 +11:00
NeilBrown 1ade5cc15a Consistently print program Name and __func__ in debug messages.
make dprintf() print program name and __func__, so that
this messaging is consistent.

Also remove all __func__ messages from pr_err(). We shouldn't
leak that internal data in error message.
If we really want function name there, we new pr_XXX might
be wanted.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 13:21:17 +11:00
Pawel Baldysiak d56dd607ba Change way of printing name of a process
Sometimes mdadm prints messages with wrong name "mdmon",
and vice versa.
This patch solves this problem by changing method of determining
process name.
Now "Name" will be set in const at start of a program,
previously was hardcoded as #define.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 12:11:01 +11:00
Pawel Baldysiak 16afb1a5ef Grow: Fix wrong 'goto' in set_new_data_offset
Commit a821c95f11
besides introducing additional message, also changed
direct return to "goto" instruction.
'goto release' will cause routine to return with '-1',
when previously '1' was returned.
Described behaviour breaks e.g. IMSM reshape process.
This patch fixes this issue by changing 'goto' to proper one -
the one that returns '1'.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-12-02 09:52:34 +11:00
Justin Maggard 0448027b76 Grow: fix resize of array component size to > 32bits
If the request --size to --grow an array to is larger
than 32bits, then mdadm may make the wrong choice and
use ioctl instead of setting component_size via sysfs
and the change is ignored.

Instead of using casts to check for a 32-bit overflow,
just check for set bits outside of INT32_MAX.

Fixes: 4e9a3dd16d
Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-29 11:03:09 +11:00
Andy Smith a821c95f11 Grow: Report when grow needs metadata update
Report when the array's metadata needs updating instead of just
reporting the generic "kernel too old" message.

Signed-off-by: Andy Smith <andy@strugglers.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-09-03 13:26:31 +10:00
NeilBrown 46643e1ad5 Grow: improve error message is "--grow -n2" used on Linear arrays.
Linear arrays don't respond to setting raid-disks, only to
adding a device.

Reported-by: mulhern
Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1122146
Signed-off-by: NeilBrown <neilb@suse.de>
2014-07-29 13:37:42 +10:00
NeilBrown 4e9a3dd16d Grow: fix that preventing resize of array to 32bit size.
If the request --size to --grow an array to is 32bits
(i.e. msb in bit 32) then mdadm make wrong choice and
uses ioctl instead of setting component_size via sysfs
and the change is ignored.

This is fixed by using correct casts.

Reported-and-tested-by: Killian De Volder <killian.de.volder@megasoft.be>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-07-21 16:51:53 +10:00
Pawel Baldysiak 13ffbe89b6 Grow: Do not try to restart if reshape is running
Grow process did not check if reshape is already started
when deciding about restarting.
Sync_action should be checked in this case, and if
reshape is running - restart flag should not be set.
Otherwise, Grow process will fail to write data to
sysfs, and reshape will not be continued.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-07-17 14:08:24 +10:00
Pawel Baldysiak e339dba2a1 Grow: fix removal of line in wrong case
Commit 18d9bcfa33
removed wrong line (in case RAID0->RAID4).
This patch corrects this mistake
(line should be removed in case RAID4->RAID4).

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-06-12 17:32:21 +10:00
NeilBrown 1e60caebbc Make sure "make everything" builds again.
Signed-off-by: NeilBrown <neilb@suse.de>
2014-06-05 16:38:29 +10:00
Baldysiak, Pawel 40b941b813 Grow: Do not fork via systemd if freeze_reshape is set
Mdadm should not run 'grow-continue' unit file for container if
'--freeze-reshape' argument is passed. Otherwise it will be ignored,
and reshape will start anyway.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-06-02 12:42:01 +10:00
Baldysiak, Pawel 054cba7719 Grow: Use 'forked' also for reshape_container in Grow_continue
Similar to commit 06e293d097
same thing should be done for reshape_container in Grow_continue

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-06-02 12:39:14 +10:00