Commit Graph

540 Commits

Author SHA1 Message Date
Blazej Kucman 3364781b92 imsm: pass subarray id to kill_subarray function
After patch b6180160f ("imsm: save current_vol number")
current_vol for imsm is not set and kill_subarray()
cannot determine which volume has to be deleted.
Volume has to be passed as "subarray_id".
The parameter affects only IMSM metadata.

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2020-02-24 16:08:15 -05:00
NeilBrown 329dfc28de Create: add support for RAID0 layouts.
Since Linux 5.4 a layout is needed for RAID0 arrays with
varying device sizes.
This patch makes the layout of an array visible (via --examine)
and sets the layout on newly created arrays.
--layout=dangerous
can be used to avoid setting a layout so that they array
can be used on older kernels.

Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-12-02 16:14:49 -05:00
Blazej Kucman b771faef93 imsm: return correct uuid for volume in detail
Fixes the side effect of the patch b6180160f ("imsm: save current_vol number")
- wrong UUID is printed in detail for each volume.
New parameter "subarray" is added to determine what info should be extracted
from metadata (subarray or container).
The parameter affects only IMSM metadata.

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-12-02 16:01:16 -05:00
Guilherme G. Piccoli 43ebc9105e mdadm: Introduce new array state 'broken' for raid0/linear
Currently if a md raid0/linear array gets one or more members removed while
being mounted, kernel keeps showing state 'clean' in the 'array_state'
sysfs attribute. Despite udev signaling the member device is gone, 'mdadm'
cannot issue the STOP_ARRAY ioctl successfully, given the array is mounted.

Nothing else hints that something is wrong (except that the removed devices
don't show properly in the output of mdadm 'detail' command). There is no
other property to be checked, and if user is not performing reads/writes
to the array, even kernel log is quiet and doesn't give a clue about the
missing member.

This patch is the mdadm counterpart of kernel new array state 'broken'.
The 'broken' state mimics the state 'clean' in every aspect, being useful
only to distinguish if an array has some member missing. All necessary
paths in mdadm were changed to deal with 'broken' state, and in case the
tool runs in a kernel that is not updated, it'll work normally, i.e., it
doesn't require the 'broken' state in order to work.
Also, this patch changes the way the array state is showed in the 'detail'
command (for raid0/linear only) - now it takes the 'array_state' sysfs
attribute into account instead of only rely in the MD_SB_CLEAN flag.

Cc: Jes Sorensen <jes.sorensen@gmail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-09-30 15:08:09 -04:00
Coly Li d11abe4bd5 mdadm: add --no-devices to avoid component devices detail information
When people assemble a md raid device with a large number of
component deivces (e.g. 1500 DASD disks), the raid device detail
information generated by 'mdadm --detail --export $devnode' is very
large. It is because the detail information contains information of
all the component disks (even the missing/failed ones).

In such condition, when udev-md-raid-arrays.rules is triggered and
internally calls "mdadm --detail --no-devices --export $devnode",
user may observe systemd error message ""invalid message length". It
is because the following on-stack raw message buffer in systemd code
is not big enough,
        systemd/src/libudev/libudev-monitor.c
        _public_ struct udev_device *udev_monito ...
                struct ucred *cred;
                union {
                        struct udev_monitor_netlink_header nlh;
                        char raw[8192];
                } buf;
Even change size of raw[] from 8KB to larger size, it may still be not
enough for detail message of a md raid device with much larger number of
component devices.

To fix this problem, an extra option '--no-devices' is added (the
original idea is proposed by Neil Brown). When printing detailed
information of a md raid device, if '--no-devices' is specified, then
all component devices information will not be printed, then the output
message size can be restricted to a small number, even with the systemd
only has 8KB on-disk raw buffer, the md raid array udev rules can work
correctly without failure message.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-08-12 16:16:36 -04:00
Baruch Siach 452dc4d13a mdadm.h: include sysmacros.h unconditionally
musl libc now also requires sys/sysmacros.h for the major/minor macros.
All supported libc implementations carry sys/sysmacros.h, including
diet-libc, klibc, and uclibc-ng.

Cc: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-08-12 16:16:32 -04:00
Mariusz Dabrowski b068159891 mdadm: load default sysfs attributes after assemblation
Added new type of line to mdadm.conf which allows to specify values of
sysfs attributes for MD devices that should be loaded after the array is
assembled. Each line is interpreted as list of structures containing
sysname of MD device (md126 etc.) and list of sysfs attributes and their
values.

Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-07-10 16:12:09 -04:00
Jes Sorensen 7039d1f820 mdadm.h: Introduced unaligned {get,put}_unaligned{16,32}()
We need these to avoid gcc9 going all crazy on us.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-07-09 14:15:38 -04:00
Artur Paszkiewicz ae7d61e35e mdmon: fix wrong array state when disk fails during mdmon startup
If a member drive disappears and is set faulty by the kernel during
mdmon startup, after ss->load_container() but before manage_new(), mdmon
will try to readd the faulty drive to the array and start rebuilding.
Metadata on the active drive is updated, but the faulty drive is not
removed from the array and is left in a "blocked" state and any write
request to the array will block. If the faulty drive reappears in the
system e.g. after a reboot, the array will not assemble because metadata
on the drives will be incompatible (at least on imsm).

Fix this by adding a new option for sysfs_read(): "GET_DEVS_ALL". This
is an extension for the "GET_DEVS" option and causes all member devices
to be returned, even if the associated block device has been removed.
Use this option in manage_new() to include the faulty device on the
active_array's devices list. Mdmon will then properly remove the faulty
device from the array and update the metadata to reflect the degraded
state.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-05-20 14:14:36 -04:00
NeilBrown cd72f9d114 policy: support devices with multiple paths.
As new releases of Linux some time change the name of
a path, some distros keep "legacy" names as well.  This
is useful, but confuses mdadm which assumes each device has
precisely one path.

So change this assumption:  allow a disk to have several
paths, and allow any to match when looking for a policy
which matches a disk.

Reported-and-tested-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-12-06 07:43:19 -05:00
Michal Zylowski 5a5b3a6725 imsm: Do not require MDADM_EXPERIMENTAL flag anymore
Grow feature for IMSM metadata is currently fully supported and tested.
Reshape operation is not in experimental state anymore, so usage of this
flag is unnecessary.

Do not require MDADM_EXPERIMENTAL flag and remove obsolete information
from manual.

Signed-off-by: Michal Zylowski <michal.zylowski@intel.com>
Acked-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Acked-by: Roman Sobanski <roman.sobanski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-05-31 11:30:36 -04:00
Guoqing Jiang 1b7eb962db mdadm: improve the dlm locking mechanism for clustered raid
Previously, the dlm locking only protects several
functions which writes to superblock (update_super,
add_to_super and store_super), and we missed other
funcs such as add_internal_bitmap. We also need to
call the funcs which read superblock under the
locking protection to avoid consistent issue.

So let's remove the dlm stuffs from super1.c, and
provide the locking mechanism to the main() except
assemble mode which will be handled in next commit.
And since we can identify it is a clustered raid or
not based on check the different conditions of each
mode, so the change should not have effect on native
array.

And we improve the existed locking stuffs as follows:

1. replace ls_unlock with ls_unlock_wait since we
should return when unlock operation is complete.

2. inspired by lvm, let's also try to use the existed
lockspace first before creat a lockspace blindly if
the lockspace not released for some reason.

3. try more times before quit if EAGAIN happened for
locking.

Note: for MANAGE mode, we do not need to get lock if
node just want to confirm device change, otherwise we
can't add a disk to cluster since all nodes are compete
for the lock.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-03-08 14:16:42 -05:00
Mariusz Tkaczyk 1ea0462990 Monitor/msg: Don't print error message if mdmon doesn't run
Commit 4515fb28a5 ("Add detail information when can not connect
monitor") was added to warn about failed connection to monitor in
WaitClean function (see link below).

Mdmon runs for IMSM containers when they have array with redundancy so
if mdmon doesn't run, mdadm prints this error. This is misleading and
unnecessary. Just print it in WaitClean function.

The sock in WaitClean is deprecated so it is removed.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1375002
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-11-21 13:26:09 -05:00
Guoqing Jiang 5339f99606 To support clustered raid10
We are now considering to extend clustered raid to
support raid10. But only near layout is supported,
so make the check when create the array or switch
the bitmap from internal to clustered.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-11-09 11:56:10 -05:00
Pawel Baldysiak b251424242 Zeroout whole ppl space during creation/force assemble
PPL area should be cleared before creation/force assemble.
If the drive was used in other RAID array, it might contains PPL from it.
There is a risk that mdadm recognizes those PPLs and
refuses to assemble the RAID due to PPL conflict with created
array.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-10-02 16:11:42 -04:00
Jes Sorensen a37563c913 sysfs_init_dev - take a dev_t argument
Be consistent and use the correct type.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-09-29 18:04:06 -04:00
Jes Sorensen d3c40faba8 lib: devid2kname() should take a dev_t
Make devid2kname() and devid2devnm() consistent in their APIs

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-09-29 17:54:12 -04:00
Mariusz Tkaczyk a822017f30 Detail: correct output for active arrays
The check for inactive array is incorrect as it compares it against
active array. Introduce a new function md_is_array_active so the check
is consistent across the code.

As the output contains list of disks in the array include this
information in sysfs read.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-08-16 08:19:38 +00:00
Tomasz Majchrzak b13b52c80f Get failed disk count from array state
Recent commit has changed the way failed disks are counted. It breaks
recovery for external metadata arrays as failed disks are not part of
the array and have no corresponding entries is sysfs (they are only
reported for containers) so degraded arrays show no failed disks.

Recent commit overwrites GET_DEGRADED result prior to GET_STATE and it
is not set again if GET_STATE has not been requested. As GET_STATE
provides the same information as GET_DEGRADED, the latter is not needed
anymore. Remove GET_DEGRADED option and replace it with GET_STATE
option.

Don't count number of failed disks looking at sysfs entries but
calculate it at the end. Do it only for arrays as containers report
no disks, just spares.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-06-05 11:11:36 -04:00
Alexey Obitotskiy 4b57ecf6ce Add sector size as spare selection criterion
Add sector size as new spare selection criterion. Assume that 0 means
there is no requirement for the sector size in the array. Skip disks
with unsuitable sector size when looking for a spare to move across
containers.

Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com>
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-09 14:18:38 -04:00
Alexey Obitotskiy fbfdcb06dc Allow more spare selection criteria
Disks can be moved across containers in order to be used as a spare
drive for reubild. At the moment the only requirement checked for such
disk is its size (if it matches donor expectations). In order to
introduce more criteria rename corresponding superswitch method to more
generic name and move function parameter to a structure. This change is
a big edit but it doesn't introduce any changes in code logic, it just
updates function naming and parameters.

Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com>
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-09 14:18:36 -04:00
Zhilong Liu 9e04ac1c43 mdadm/util: unify stat checking blkdev into function
declare function stat_is_blkdev() to integrate repeated stat
checking blkdev operations, it returns 'true/1' when it is a
block device, and returns 'false/0' when it isn't.
The devname is necessary parameter, *rdev is optional, parse
the pointer of dev_t *rdev, if valid, assigned device number
to dev_t *rdev, if NULL, ignores.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-05 11:05:32 -04:00
Zhilong Liu 0a6bff09d4 mdadm/util: unify fstat checking blkdev into function
declare function fstat_is_blkdev() to integrate repeated fstat
checking block device operations, it returns true/1 when it is
a block device, and returns false/0 when it isn't.
The fd and devname are necessary parameters, *rdev is optional,
parse the pointer of dev_t *rdev, if valid, assigned the device
number to dev_t *rdev, if NULL, ignores.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-05 11:04:02 -04:00
Jes Sorensen 9db2ab4e9b util: md_array_valid(): Introduce md_array_valid() helper
Using md_get_array_info() to determine if an array is valid is broken
during creation, since the ioctl() returns -ENODEV if the device is
valid but not active.

Where did I leave my stash of brown paper bags?

Fixes: ("40b054e mdopen/open_mddev: Use md_get_array_info() to determine valid array")
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-03 16:15:16 -04:00
NeilBrown cd6cbb08c4 Create: tell udev md device is not ready when first created.
When an array is created the content is not initialized,
so it could have remnants of an old filesystem or md array
etc on it.
udev will see this and might try to activate it, which is almost
certainly not what is wanted.

So create a mechanism for mdadm to communicate with udev to tell
it that the device isn't ready.  This mechanism is the existance
of a file /run/mdadm/created-mdXXX where mdXXX is the md device name.

When creating an array, mdadm will create the file.
A new udev rule file, 01-md-raid-creating.rules, will detect the
precense of thst file and set ENV{SYSTEMD_READY}="0".
This is fairly uniformly used to suppress actions based on the
contents of the device.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-02 09:41:39 -04:00
Jes Sorensen 44356754ec util: Get rid of unused enough_fd()
enough_fd() is no longer used, so lets get rid of it.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-04-20 11:53:30 -04:00
Jes Sorensen 3ab8f4bf33 util: Introduce md_array_active() helper
Rather than querying md_get_array_info() to determine whether an array
is valid, do the work in md_array_active() using sysfs, and fall back
on md_get_array_info() if sysfs fails.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-04-20 00:12:34 -04:00
Jes Sorensen 5e4ca8bb82 sysfs: Parse array_state in sysfs_read()
Rather than copying in the array_state string, parse it and use an
enum to indicate the state.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-04-20 00:12:27 -04:00
Jes Sorensen 303949f6f0 util: Finally kill off md_get_version()
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-04-05 15:49:18 -04:00
Jes Sorensen dae131379f sysfs: Make sysfs_init() return an error code
Rather than have the caller inspect the returned content, return an
error code from sysfs_init(). In addition make all callers actually
check it.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-30 16:52:37 -04:00
Jes Sorensen 018a488238 util: Introduce md_set_array_info()
Switch from using ioctl(SET_ARRAY_INFO) to using md_set_array_info()

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 15:43:53 -04:00
Jes Sorensen d97572f5a5 util: Introduce md_get_disk_info()
This removes all the inline ioctl calls for GET_DISK_INFO, allowing us
to switch to sysfs in one place, and improves type checking.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 15:23:50 -04:00
Jes Sorensen 9cd39f0155 util: Introduce md_get_array_info()
Remove most direct ioctl calls for GET_ARRAY_INFO, except for one,
which will be addressed in the next patch.

This is the start of the effort to clean up the use of ioctl calls and
introduce a more structured API, which will use sysfs and fall back to
ioctl for backup.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 14:35:41 -04:00
Jes Sorensen a86b1c8d15 mdadm.h: struct mdinfo: reorganize ppl elements for better struct packing
Minor optimization putting ints next to ints for better data
alignment.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:37:27 -04:00
Artur Paszkiewicz 860f11ed4d Grow: support consistency policy change
Extend the --consistency-policy parameter to work also in Grow mode.
Using it changes the currently active consistency policy in the kernel
driver and updates the metadata to make this change permanent. Currently
this supports only changing between "ppl" and "resync" policies, that is
enabling or disabling PPL at runtime.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:35:16 -04:00
Artur Paszkiewicz e97a7cd011 super1: PPL support
Enable creating and assembling raid5 arrays with PPL for 1.x metadata.

When creating, reserve enough space for PPL and store its size and
location in the superblock and set MD_FEATURE_PPL bit. Write an initial
empty header in the PPL area on each device. PPL is stored in the
metadata region reserved for internal write-intent bitmap, so don't
allow using bitmap and PPL together.

While at it, fix two endianness issues in write_empty_r5l_meta_block()
and write_init_super1().

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:33:52 -04:00
Artur Paszkiewicz 2432ce9b32 imsm: PPL support
Enable creating and assembling IMSM raid5 arrays with PPL. Update the
IMSM metadata format to include new fields used for PPL.

Add structures for PPL metadata. They are used also by super1 and shared
with the kernel, so put them in md_p.h.

Write the initial empty PPL header when creating an array. When
assembling an array with PPL, validate the PPL header and in case it is
not correct allow to overwrite it if --force was provided.

Write the PPL location and size for a device to the new rdev sysfs
attributes 'ppl_sector' and 'ppl_size'. Enable PPL in the kernel by
writing to 'consistency_policy' before the array is activated.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:32:49 -04:00
Artur Paszkiewicz 5308f11727 Generic support for --consistency-policy and PPL
Add a new parameter to mdadm: --consistency-policy=. It determines how
the array maintains consistency in case of unexpected shutdown. This
maps to the md sysfs attribute 'consistency_policy'. It can be used to
create a raid5 array using PPL. Add the necessary plumbing to pass this
option to metadata handlers. The write journal and bitmap
functionalities are treated as different policies, which are implicitly
selected when using --write-journal or --bitmap options.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:32:15 -04:00
NeilBrown 1ab9ed2afb Add 'force' flag to *hot_remove_disk().
In rare circumstances, the short period that *hot_remove_disk()
waits isn't long enough to IO to complete.  This particularly happens
when a device is failing and many retries are still happening.

We don't want to increase the normal wait time for "mdadm --remove"
as that might be use just to test if a device is active or not, and a
delay would be problematic.
So allow "--force" to mean that mdadm should try extra hard for a
--remove to complete, waiting up to 5 seconds.

Note that this patch fixes a comment which claim the previous
wait time was half a second, where it was really 50msec.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-28 14:32:35 -04:00
NeilBrown fdd015696c Introduce sys_hot_remove_disk()
The new hot_remove_disk() will retry HOT_REMOVE_DISK
several times in the face of EBUSY.
However we sometimes remove a device by writing "remove" to the
"state" attributed.  This should be retried as well.
So introduce sys_hot_remove_disk() to repeat this action a few times.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-28 14:30:49 -04:00
NeilBrown 2dd271fe70 Retry HOT_REMOVE_DISK a few times.
HOT_REMOVE_DISK can fail with EBUSY if there are outstanding
IO request that have not completed yet.  It can sometimes
be helpful to wait a little while for these to complete.

We already do this in impose_level() when reshaping a device,
but not in Manage.c in response to an explicit --remove request.

So create hot_remove_disk() to central this code, and call it
where-ever it makes sense to wait for a HOT_REMOVE_DISK to succeed.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-28 14:25:23 -04:00
Zhilong Liu 7054da69c7 mdadm:fixed some trivial typos in comments of mdadm.h
mdadm.h: fixed some trivial typos in comments

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-27 18:00:25 -04:00
NeilBrown e22fe3ae15 Introduce enum flag_mode for setting and clearing flags.
We currently use '1' to indicate that a flag (writemostly or failfast)
needs to be set, and '2' to indicate that it needs to be cleared.

Using magic number like this is not a best-practice.

So replaced them with values from a enum.

No functional change.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-29 17:12:13 -05:00
Tomasz Majchrzak 42d902d9db mdmon: bad block support for external metadata - clear bad blocks
If an update of acknowledged bad blocks file is notified, read entire
bad block list from sysfs file and compare it against local list of bad
blocks. If any obsolete entries are found, remove them from metadata.

As mdmon cannot perform any memory allocation, new superswitch method
get_bad_blocks is expected to return a list of bad blocks in metadata
without allocating memory. It's up to metadata handler to allocate all
required memory in advance.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-28 17:49:42 -05:00
Tomasz Majchrzak 1ab97c976b mdmon: bad block support for external metadata - store bad blocks
If md has changed the state to 'blocked' and metadata handler supports
bad blocks, try process them first. If metadata handler has successfully
stored bad block, acknowledge it to md via 'badblocks' sysfs file. If
metadata handler has failed to store the new bad block (ie. lack of
space), remove bad block support for a disk by writing "-external_bbl"
to state sysfs file. If all bad blocks have been acknowledged, request
to unblock the array.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Acked-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-28 17:48:59 -05:00
Tomasz Majchrzak 6dc1785fdb mdmon: bad block support for external metadata - sysfs file open
Open 'badblocks' and 'unacknowledged_bad_blocks' sysfs files for each
disk in the array. Add them to the list of files observed by monitor.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-28 17:45:56 -05:00
Tomasz Majchrzak bb758ccad0 mdadm: bad block support for external metadata - initialization
If metadata handler provides support for bad blocks, tell md by writing
'external_bbl' to rdev state file (both on create and assemble),
followed by a list of known bad blocks written via sysfs 'bad_blocks'
file.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-28 17:44:45 -05:00
NeilBrown 71574efb07 Add failfast support.
Allow per-device "failfast" flag to be set when creating an
array or adding devices to an array.

When re-adding a device which had the failfast flag, it can be removed
using --nofailfast.

failfast status is printed in --detail and --examine output.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-28 08:50:36 -05:00
Pawel Baldysiak 329715091c Add function for getting member drive sector size
This patch introduces the function for getting sector size of
given device (fd).

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-17 09:24:18 -05:00
Artur Paszkiewicz 561ad5597b super1: make internal bitmap size calculations more consistent
Determining internal bitmap size is performed using two different
functions (bitmap_sectors() and calc_bitmap_size()) and in
getinfo_super1() it is calculated in yet another way. Each of these
methods give slightly different results. The most accurate is
calc_bitmap_size() but it also has a rounding issue. So:

- fix the rounding issue in calc_bitmap_size() using bitmap_bits()
- replace usages of bitmap_sectors() and open-coded calculations with
  calc_bitmap_size()
- remove bitmap_sectors()
- move bitmap_bits() to mdadm.h as inline - otherwise mdassemble won't
  compile (it does not use bitmap.c)

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-16 09:56:39 -05:00