Commit Graph

237 Commits

Author SHA1 Message Date
Mariusz Tkaczyk 1f5d54a06d Manage: Call validate_geometry when adding drive to external container
When adding drive to container call validate_geometry to verify whether
drive is supported and can be addded to container.

Remove unused parameters from validate_geometry_imsm_container().
There is no need to pass them.
Don't calculate freesize if it is not mandatory. Make it configurable.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2021-05-26 07:26:34 -04:00
Tkaczyk Mariusz 12724c018c Manage, imsm: Write metadata before add
New drive in container always appears as spare. Manager is able to
handle that, and queues appropriative update to monitor.
No update from mdadm side has to be processed, just insert the drive and
ping the mdmon. Metadata has to be written if no mdmon is running (case
for Raid0 or container without arrays).

If bare drive is added very early on startup (by custom bare rule),
there is possiblity that mdmon was not restarted after switch root. Old
one is not able to handle new drive. New one fails because there is
drive without metadata in container and metadata cannot be loaded.

To prevent this, write spare metadata before adding device
to container. Mdmon will overwrite it (same case as spare migration,
if drive appears it writes the most recent metadata).
Metadata has to be written only on new drive before sysfs_add_disk(),
don't race with mdmon if running.

Signed-off-by: Tkaczyk Mariusz <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2020-04-27 10:36:58 -04:00
Jes Sorensen 9cf361f879 Fix up a few formatting issues
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-11-27 11:33:15 -05:00
Jes Sorensen 02af379337 Remove last traces of HOT_ADD_DISK
This ioctl is no longer used, so remove all references to it.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-11-27 10:19:54 -05:00
Xiao Yang 1cc3965d48 Manage: Remove the legacy code for md driver prior to 0.90.03
Previous re-add operation only calls ioctl(HOT_ADD_DISK) for array without
metadata(e.g. mdadm -B/--build) when md driver is less than 0.90.02, but
commit 091e8e6 breaks the logic and current re-add operation can call
ioctl(HOT_ADD_DISK) even if md driver is 0.90.03.

This issue is reproduced by 05r1-re-add-nosuper:
------------------------------------------------
++ die 'resync or recovery is happening!'
++ echo -e '\n\tERROR: resync or recovery is happening! \n'
ERROR: resync or recovery is happening!
------------------------------------------------

Fixes: 091e8e6("Manage: Remove all references to md_get_version()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Xiao Yang <ice_yangxiao@163.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-11-27 10:16:42 -05:00
Jes Sorensen ffaf1a7eef Manage_subdevs(): Use a dev_t
Use the correct type for rdev

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-09-29 18:08:01 -04:00
NeilBrown 8e5b52cdda Error messages should end with a newline character.
Add "\n" to the end of error messages which don't already
have one.  Also spell "opened" correctly.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-08-16 08:25:07 -04:00
Song Liu 3373d49f32 mdadm/r5cache: allow adding journal to array without journal
Currently, --add-journal can be only used to recreate broken journal
for arrays with journal since  creation. As the kernel code getting
more mature, this constraint is no longer necessary.

This patch allows --add-journal to add journal to array without
journal.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-08-02 09:22:51 -04:00
Jes Sorensen d7be7d8736 mdadm: Fixup more broken logical operator formatting
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-16 13:59:43 -04:00
Jes Sorensen fc54fe7a7e mdadm: Fixup a large number of bad formatting of logical operators
Logical oprators never belong at the beginning of a line.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-16 13:52:15 -04:00
Zhilong Liu e644902ddb retire the APIs that driver no longer supports
refer to commit: e6e5f8f126 ("Build: Stop
bothering about supporting md driver ...")
continue to retire the APIs that md driver
wasn't supported for very long period of time.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-11 11:45:44 -04:00
Zhilong Liu 9e04ac1c43 mdadm/util: unify stat checking blkdev into function
declare function stat_is_blkdev() to integrate repeated stat
checking blkdev operations, it returns 'true/1' when it is a
block device, and returns 'false/0' when it isn't.
The devname is necessary parameter, *rdev is optional, parse
the pointer of dev_t *rdev, if valid, assigned device number
to dev_t *rdev, if NULL, ignores.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-05 11:05:32 -04:00
Zhilong Liu 0a6bff09d4 mdadm/util: unify fstat checking blkdev into function
declare function fstat_is_blkdev() to integrate repeated fstat
checking block device operations, it returns true/1 when it is
a block device, and returns false/0 when it isn't.
The fd and devname are necessary parameters, *rdev is optional,
parse the pointer of dev_t *rdev, if valid, assigned the device
number to dev_t *rdev, if NULL, ignores.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-05 11:04:02 -04:00
Jes Sorensen 80223cb4db Manage: Manage_ro(): Use md_array_active()
One call less to md_get_array_info() for determining whether an array
is active or not.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-02 10:41:23 -04:00
Jes Sorensen 5e4ca8bb82 sysfs: Parse array_state in sysfs_read()
Rather than copying in the array_state string, parse it and use an
enum to indicate the state.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-04-20 00:12:27 -04:00
Jes Sorensen 32141c1765 Retire mdassemble
mdassemble doesn't handle container based arrays, no support for sysfs,
etc. It has not been actively maintained for years, so time to send it
off to retirement.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-04-11 12:54:26 -04:00
Jes Sorensen 091e8e6e06 Manage: Remove all references to md_get_version()
At this point, support for md driver prior to 0.90.03 is going to
disappear.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-04-05 15:34:44 -04:00
Jes Sorensen dae131379f sysfs: Make sysfs_init() return an error code
Rather than have the caller inspect the returned content, return an
error code from sysfs_init(). In addition make all callers actually
check it.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-30 16:52:37 -04:00
Jes Sorensen d97572f5a5 util: Introduce md_get_disk_info()
This removes all the inline ioctl calls for GET_DISK_INFO, allowing us
to switch to sysfs in one place, and improves type checking.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 15:23:50 -04:00
Jes Sorensen 9cd39f0155 util: Introduce md_get_array_info()
Remove most direct ioctl calls for GET_ARRAY_INFO, except for one,
which will be addressed in the next patch.

This is the start of the effort to clean up the use of ioctl calls and
introduce a more structured API, which will use sysfs and fall back to
ioctl for backup.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 14:35:41 -04:00
NeilBrown 1ab9ed2afb Add 'force' flag to *hot_remove_disk().
In rare circumstances, the short period that *hot_remove_disk()
waits isn't long enough to IO to complete.  This particularly happens
when a device is failing and many retries are still happening.

We don't want to increase the normal wait time for "mdadm --remove"
as that might be use just to test if a device is active or not, and a
delay would be problematic.
So allow "--force" to mean that mdadm should try extra hard for a
--remove to complete, waiting up to 5 seconds.

Note that this patch fixes a comment which claim the previous
wait time was half a second, where it was really 50msec.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-28 14:32:35 -04:00
NeilBrown fdd015696c Introduce sys_hot_remove_disk()
The new hot_remove_disk() will retry HOT_REMOVE_DISK
several times in the face of EBUSY.
However we sometimes remove a device by writing "remove" to the
"state" attributed.  This should be retried as well.
So introduce sys_hot_remove_disk() to repeat this action a few times.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-28 14:30:49 -04:00
NeilBrown 2dd271fe70 Retry HOT_REMOVE_DISK a few times.
HOT_REMOVE_DISK can fail with EBUSY if there are outstanding
IO request that have not completed yet.  It can sometimes
be helpful to wait a little while for these to complete.

We already do this in impose_level() when reshaping a device,
but not in Manage.c in response to an explicit --remove request.

So create hot_remove_disk() to central this code, and call it
where-ever it makes sense to wait for a HOT_REMOVE_DISK to succeed.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-28 14:25:23 -04:00
NeilBrown e22fe3ae15 Introduce enum flag_mode for setting and clearing flags.
We currently use '1' to indicate that a flag (writemostly or failfast)
needs to be set, and '2' to indicate that it needs to be cleared.

Using magic number like this is not a best-practice.

So replaced them with values from a enum.

No functional change.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-29 17:12:13 -05:00
NeilBrown 71574efb07 Add failfast support.
Allow per-device "failfast" flag to be set when creating an
array or adding devices to an array.

When re-adding a device which had the failfast flag, it can be removed
using --nofailfast.

failfast status is printed in --detail and --examine output.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-28 08:50:36 -05:00
Tomasz Majchrzak c922221e25 Remove: container should wait for an array to release a drive
A 'faulty' drive is being removed from a container after it has been
released by an array, however there is a race there. The drive is
released asynchronously by a monitor but sometimes it doesn't happen
before container checks it. It results in a container refusing to remove
a drive as it still seems to be a part of some array.

It seems 'ping_monitor' could be a solution here to assure monitor has
had a chance to process the events, however it doesn't resolve the
problem - sometimes an array has to request a release of the drive few
times (as the array is busy) and single 'ping_monitor' call is not
sufficient. As there is no way to query monitor progress, it forces us
to retry a check several times before an error is returned.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-07-21 11:25:16 -04:00
Jes Sorensen e9ddbb2be9 Manage: Manage_subdevs(): Remove unnecessary NULL initialization
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:06:18 -04:00
Jes Sorensen fbd3e15c0a Manage: Manage_add(): Avoid NULL initialization of dev_st
dev_st is only ever assigned if array->not_persistent == 0, so move
the second use of it into the same scope where the assignment is
made.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:06:07 -04:00
Jes Sorensen d209181d96 Manage: Manage_add(): Fix memory leak
sysfs_read() allocates and populates a struct mdinfo, however the code
forgot to free it again, before dropping the reference to the pointer.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:03:12 -04:00
Hannes Reinecke d31d0f5218 Fix regression during add devices
Commit d180d2aa2a ("Manage: fix test for 'is array failed'.")
introduced a regression which would not allow to re-add new
drivers to a failed array.

Fixes: d180d2aa2a ("Manage: fix test for 'is array failed'.")
Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: Coly Li <colyli@suse.de>
Cc: Neil Brown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-10 11:44:21 -05:00
Jes Sorensen cc5083d114 Manage: Manage_subdevs() fix file descriptor leak
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen 2a1990c0f4 Manage: Manage_add(): Fix potential NULL pointer dereference
sysfs_read() may return NULL, so we should check the validity of the
pointer before dereferencing it.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-08 12:22:26 -05:00
Jes Sorensen 6e8d27e77e Manage: Remove unnecessary NULL pointer checks
sysfs_free() handles NULL pointers, so remove superfluous NULL pointer
checks before calling it.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-08 12:19:03 -05:00
Jes Sorensen 229e66cb96 Manage.c: Only issue change events for kernels older than 2.6.28
2.6.28+ kernels handle this themselves and issuing the event here can
cause a race.

Reported-by: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Suggested-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-02-17 12:31:24 -05:00
Song Liu 38c2e05b6a in --add assign raid_disk of 0 to journal
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-22 07:50:05 +11:00
Song Liu 01290056d0 recreate journal in mdadm
This patch tries recreates missing/faulty journal in mdadm.

Example:

./mdadm --fail /dev/md1 /dev/sdb2
mdadm: set /dev/sdb2 faulty in /dev/md1

./mdadm --stop /dev/md1
mdadm: stopped /dev/md1

./mdadm -A --scan --force
mdadm: Journal is missing or stale, starting array read only.
mdadm: /dev/md/1 has been started with 15 drives.

./mdadm --add-journal /dev/md1 /dev/sdb2
mdadm: added /dev/sdb2

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-16 12:43:56 +11:00
Guoqing Jiang 9465f17058 re-add: make re-add try to write sysfs node first
If sysfs node existed, we should try to write "re-add" to it.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-08 11:08:40 +11:00
Guoqing Jiang bff96f7366 mdadm: make cluster raid also could support re-add
If it is a cluster raid, the disc.state need to be
changed accordingly when do re-add.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-09-28 14:55:02 +10:00
Guoqing Jiang d7a493695a mdadm: fix wrong condition for go to abort
When parse_cluster_confirm_arg return 0, it means the
arg are parsed successfully, so change !rv to rv.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-29 17:26:12 +10:00
NeilBrown 653299b699 Merge branch 'cluster'
Now that 3.3.3 is out, it is time to include the cluster-support code.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-27 11:01:08 +10:00
NeilBrown e3e0d0a843 Manage/stop: don't stop during initial critical section.
If the array is reshaping to more devices, then stopping
during that initial critical section is a bad idea.
So check for it and wait a bit.

Should probably handle final critical section of a reduction
too.
same-size reshape should be handled correctly already.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-07-06 13:45:39 +10:00
NeilBrown 932be6276e Manage/stop: improve some comments.
This code always confuses me - this might help a bit.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-06 13:37:19 +10:00
NeilBrown 30ddba7de5 Manage/stop: guard against 'completed' being too large.
A race can allow 'completed' to read as 2^63-1, which takes
a long time to count up to.
So guard against that possibility.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-06 13:33:20 +10:00
NeilBrown 52b6ccad34 Manage: fix no-op test in Manage_stop.
A 'devnm' never starts with '/', so this test is pointless.
The code should use the passed-in devname unless it is clearly
not usable.  So fix it to do that.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-07-02 08:16:59 +10:00
NeilBrown 9581efb1ae mdstat: discard 'dev' field, just use 'devnm'
These both have the same value, and have done since the
'devnm' concept was introduced.
So discard the pointless duplicate.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-07-02 08:15:10 +10:00
Guoqing Jiang 4de9091302 Add a new clustered disk
A clustered disk is added by the traditional --add sequence.
However, other nodes need to acknowledge that they can "see"
the device. This is done by --cluster-confirm:

--cluster-confirm SLOTNUM:/dev/whatever (if disk is found)
or
--cluster-confirm SLOTNUM:missing (if disk is not found)

The node initiating the --add, has the disk state tagged with
MD_DISK_CLUSTER_ADD and the one confirming tag the disk with
MD_DISK_CANDIDATE.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-17 09:21:29 +10:00
NeilBrown 2609f33902 Manage: when re-adding, do check avail size if ->sb cannot be found.
avail_size1 requires ->sb, so we must only call it if ->sb
was loaded.

If ->sb wasn't loaded, then we are only proceding on the basis that
the kernel might be able to work something out - we don't need to
do any tests on size.

Reported-by: Christoffer Hammarström <christoffer.hammarstrom@linuxgods.com>
Signed-off-by: NeilBrown <neilb@suse.de>
URL: https://bugs.debian.org/784874
2015-05-13 14:08:41 +10:00
NeilBrown d180d2aa2a Manage: fix test for 'is array failed'.
We 'active_disks' does not count spares, so if array is rebuilding,
this will not necessarily find all devices, so may report an array
as failed when it isn't.

Counting up to nr_disks is better.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-05-06 15:03:50 +10:00
NeilBrown 7a862a020f Don't break long strings onto multiple lines.
It is best to keep strings all together so that they
are easier to search for in the source code.
If a string is so long that it looks ugly one line,
them maybe it should be broken into multiple lines
for display too.

Only strings which contain a newline can be broken
into multiple lines:

 "It is OK to\n"
 "break this string\n"


Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 13:46:53 +11:00
NeilBrown b47024f1c5 Manage: fix removal of non-existent devices.
"--remove detached" and others stopped working a while
back when I refactored some code.
For 'remove' and  'fail', the device may not exist so
if it is  "MM:mm", (e.g. added by "detached"), just parse
out the numbers.

Reported-by: Killian De Volder <killian.de.volder@megasoft.be>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-08-11 10:34:41 +10:00