Commit Graph

127 Commits

Author SHA1 Message Date
Mariusz Tkaczyk 5f21d67472 mdadm: add map_num_s()
map_num() returns NULL if key is not defined. This patch adds
alternative, non NULL version for cases where NULL is not expected.

There are many printf() calls where map_num() is called on variable
without NULL verification. It works, even if NULL is passed because
gcc is able to ignore NULL argument quietly but the behavior is
undefined. For safety reasons such usages will use map_num_s() now.
It is a potential point of regression.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-04-04 21:29:43 -04:00
Krzysztof Smolinski fd5b09c9a9 mdadm: check value returned by snprintf against errors
GCC 8 checks possible truncation during snprintf more strictly
than GCC 7 which result in compilation errors. To fix this
problem checking result of snprintf against errors has been added.

Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-08-16 10:06:50 -04:00
Mariusz Dabrowski b068159891 mdadm: load default sysfs attributes after assemblation
Added new type of line to mdadm.conf which allows to specify values of
sysfs attributes for MD devices that should be loaded after the array is
assembled. Each line is interpreted as list of structures containing
sysname of MD device (md126 etc.) and list of sysfs attributes and their
values.

Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-07-10 16:12:09 -04:00
Artur Paszkiewicz ae7d61e35e mdmon: fix wrong array state when disk fails during mdmon startup
If a member drive disappears and is set faulty by the kernel during
mdmon startup, after ss->load_container() but before manage_new(), mdmon
will try to readd the faulty drive to the array and start rebuilding.
Metadata on the active drive is updated, but the faulty drive is not
removed from the array and is left in a "blocked" state and any write
request to the array will block. If the faulty drive reappears in the
system e.g. after a reboot, the array will not assemble because metadata
on the drives will be incompatible (at least on imsm).

Fix this by adding a new option for sysfs_read(): "GET_DEVS_ALL". This
is an extension for the "GET_DEVS" option and causes all member devices
to be returned, even if the associated block device has been removed.
Use this option in manage_new() to include the faulty device on the
active_array's devices list. Mdmon will then properly remove the faulty
device from the array and update the metadata to reflect the degraded
state.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-05-20 14:14:36 -04:00
Mariusz Tkaczyk fe05dc43d8 sysfs: include faulty drive in disk count
When the disk fails, it goes into faulty state first and it is removed
from the array in a while. It gives mdadm monitor a chance to see the disk
has failed and notify an event (e.g. FailSpare). It doesn't work when
sysfs is used to get a number of disks in the array as it skips faulty
disk. ioctl implementation doesn't differentiate between active and
faulty disk. Do the same for sysfs then. It should not matter that number
of disks reported is greater than list of disk structures returned by the
call because the same approach already takes place for offline disks.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-11-09 15:45:52 -05:00
Artur Paszkiewicz 2c8890e926 Don't abort starting the array if kernel does not support ppl
Change the behavior of assemble and create for consistency-policy=ppl
for external metadata arrays. If the kernel does not support ppl, don't
abort but print a warning and start the array without ppl
(consistency-policy=resync). No change for native md arrays because the
kernel will not allow starting the array if it finds an unsupported
feature bit in the superblock.

In sysfs_add_disk() check consistency_policy in the mdinfo structure
that represents the array, not the disk and read the current consistency
policy from sysfs in mdmon's manage_member(). This is necessary to make
sysfs_add_disk() honor the actual consistency policy and not what is in
the metadata. Also remove all the places where consistency_policy is set
for a disk's mdinfo - it is a property of the array, not the disk.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-10-02 16:07:04 -04:00
Jes Sorensen a37563c913 sysfs_init_dev - take a dev_t argument
Be consistent and use the correct type.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-09-29 18:04:06 -04:00
Tomasz Majchrzak b13b52c80f Get failed disk count from array state
Recent commit has changed the way failed disks are counted. It breaks
recovery for external metadata arrays as failed disks are not part of
the array and have no corresponding entries is sysfs (they are only
reported for containers) so degraded arrays show no failed disks.

Recent commit overwrites GET_DEGRADED result prior to GET_STATE and it
is not set again if GET_STATE has not been requested. As GET_STATE
provides the same information as GET_DEGRADED, the latter is not needed
anymore. Remove GET_DEGRADED option and replace it with GET_STATE
option.

Don't count number of failed disks looking at sysfs entries but
calculate it at the end. Do it only for arrays as containers report
no disks, just spares.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-06-05 11:11:36 -04:00
Jes Sorensen 8b0ebd6452 sysfs/sysfs_read: Count working_disks
This counts working_disks the same way as get_array_info counts it in
the kernel.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-09 17:09:40 -04:00
Jes Sorensen 64ec81da7a sysfs/sysfs_read: Count active_disks and failed_disks
Cound active_disks as drives mark 'in_sync' and failed_disks as
disks marked 'faulty', in the same way ioctl(GET_ARRAY_INFO) does.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-05 11:51:43 -04:00
Artur Paszkiewicz b75805662e Don't use UnSet with consistency_policy
Use CONSISTENCY_POLICY_UNKNOWN instead. Simplify some checks because
since 5e8e35fb7e ("maps: Use keyvalue for null terminator to indicate
'unset' value") map_name() can return this default directly.

Suggested-by: Jes Sorensen <Jes.Sorensen@gmail.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2017-04-24 17:10:56 -04:00
Jes Sorensen 5e8e35fb7e maps: Use keyvalue for null terminator to indicate 'unset' value
This simplifies the code calling map_name() so it no longer has to
manually check for UnSet and convert the value manually.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-04-20 00:27:42 -04:00
Jes Sorensen 5e4ca8bb82 sysfs: Parse array_state in sysfs_read()
Rather than copying in the array_state string, parse it and use an
enum to indicate the state.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-04-20 00:12:27 -04:00
Jes Sorensen dae131379f sysfs: Make sysfs_init() return an error code
Rather than have the caller inspect the returned content, return an
error code from sysfs_init(). In addition make all callers actually
check it.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-30 16:52:37 -04:00
Jes Sorensen 67a02d5200 sysfs: Use the presence of /sys/block/<dev>/md as indicator of valid device
Rather than calling ioctl(RAID_VERSION), use the presence of
/sys/block/<dev>/md as indicator of the device being valid and sysfs
being active for it. The ioctl could return valid data, but sysfs
not mounted, which renders sysfs_init() useless anyway.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-30 16:02:36 -04:00
Artur Paszkiewicz 2432ce9b32 imsm: PPL support
Enable creating and assembling IMSM raid5 arrays with PPL. Update the
IMSM metadata format to include new fields used for PPL.

Add structures for PPL metadata. They are used also by super1 and shared
with the kernel, so put them in md_p.h.

Write the initial empty PPL header when creating an array. When
assembling an array with PPL, validate the PPL header and in case it is
not correct allow to overwrite it if --force was provided.

Write the PPL location and size for a device to the new rdev sysfs
attributes 'ppl_sector' and 'ppl_size'. Enable PPL in the kernel by
writing to 'consistency_policy' before the array is activated.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:32:49 -04:00
Artur Paszkiewicz 5308f11727 Generic support for --consistency-policy and PPL
Add a new parameter to mdadm: --consistency-policy=. It determines how
the array maintains consistency in case of unexpected shutdown. This
maps to the md sysfs attribute 'consistency_policy'. It can be used to
create a raid5 array using PPL. Add the necessary plumbing to pass this
option to metadata handlers. The write journal and bitmap
functionalities are treated as different policies, which are implicitly
selected when using --write-journal or --bitmap options.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:32:15 -04:00
NeilBrown c07566f14c Make get_component_size() work with named array.
get_component_size() still assumes that all array are
 /sys/block/md%d or /sys/block/md_d%d
and so doesn't work with e.g. /sys/block/md_foo.

This cause "mdadm --detail" to report
   Used Dev Size : unknown
and causes problems when added spares and in other circumstances.

So change it to use stat2devnm() which does the right thing with all
types of array names.

Reported-and-tested-by: Robert LeBlanc <robert@leblancnet.us>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-12-22 12:19:10 -05:00
Tomasz Majchrzak bb758ccad0 mdadm: bad block support for external metadata - initialization
If metadata handler provides support for bad blocks, tell md by writing
'external_bbl' to rdev state file (both on create and assemble),
followed by a list of known bad blocks written via sysfs 'bad_blocks'
file.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-28 17:44:45 -05:00
Tomasz Majchrzak bbb52f2b1d Increase buffer for sysfs path
'unacknowledged_bad_blocks' is a long name for sysfs property and it
makes sysfs path over 50 characters long. Increase buffer to the double
length of the longest path available in sysfs at the moment.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-17 09:43:44 -05:00
Jes Sorensen 36138e4e4b sysfs: Avoid if and return on the same line
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-11 15:52:48 -04:00
Jes Sorensen 193b6c0b26 load_sys(): Add a buffer size argument
This adds a buffer size argument to load_sys(), rather than relying on
a hard coded buffer size. The old behavior was safe because we knew
the kernel would never return strings overrunning the buffers, however
it was ugly, and would cause code checking tools to spit out warnings.

This caused a Coverity warning over the read into
sra->sysfs_array_state which is only 20 bytes.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Song Liu 5aa644c68a add sysfs_array_state to struct mdinfo
Add sysfs_array_state to struct mdinfo, and add GET_ARRAY_STATE to
options of sysfs_read.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-16 12:43:45 +11:00
Guoqing Jiang 9465f17058 re-add: make re-add try to write sysfs node first
If sysfs node existed, we should try to write "re-add" to it.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-08 11:08:40 +11:00
NeilBrown 5418499ae4 sysfs: reject reads that use the whole buffer.
If a read fills the whole buffer, then we possibly
missed something of the end, and we definitely shouldn't
put a '\0' beyond the end, so just return an error.
This should never happen anyway.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-06 13:21:33 +10:00
NeilBrown 7a862a020f Don't break long strings onto multiple lines.
It is best to keep strings all together so that they
are easier to search for in the source code.
If a string is so long that it looks ugly one line,
them maybe it should be broken into multiple lines
for display too.

Only strings which contain a newline can be broken
into multiple lines:

 "It is OK to\n"
 "break this string\n"


Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 13:46:53 +11:00
NeilBrown 1ade5cc15a Consistently print program Name and __func__ in debug messages.
make dprintf() print program name and __func__, so that
this messaging is consistent.

Also remove all __func__ messages from pr_err(). We shouldn't
leak that internal data in error message.
If we really want function name there, we new pr_XXX might
be wanted.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 13:21:17 +11:00
Pawel Baldysiak d56dd607ba Change way of printing name of a process
Sometimes mdadm prints messages with wrong name "mdmon",
and vice versa.
This patch solves this problem by changing method of determining
process name.
Now "Name" will be set in const at start of a program,
previously was hardcoded as #define.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 12:11:01 +11:00
NeilBrown bc17158dcc Introduce devid2kname - slightly different to devid2devnm.
The purpose od devid2devnm is to return a kernel name of an
md device, whether that device is a whole device or a partition,
we want the whole device.  md4, never md4p2.

In one place I was using devid2devnm where I really wanted the
partition if there was one ... and wasn't really interested in it
being an md device.
So introduce a new 'devid2kname' for that case.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-08-01 14:32:04 +10:00
NeilBrown 4bffc964b9 sysfs: fix bugs in new sysfs_wait function.
- 'tv' isn't initialised properly.
- 100?  I'm sure I fixed that already! Seems not.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-02 16:08:34 +10:00
NeilBrown 2eba849621 Manage: check alignment when stopping an array undergoing reshape.
To be able to revert-reshape of raid4/5/6 which is changing
the number of devices, the reshape must has been stopped on a multiple
of the old and new stripe sizes.

The kernel only enforces the new stripe size multiple.

So we enforce the old-stripe-size multiple by careful use of
"sync_max" and monitoring "reshape_position".

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-01 15:10:05 +10:00
NeilBrown efc67e8e9f New function: sysfs_wait
We have several places that wait for activity on a sysfs
file.  Combine most of these into a single 'sysfs_wait' function.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-01 13:28:13 +10:00
NeilBrown dea3786ae2 Grow: fix bug in raid0 -> raid5 conversion.
The moment we change a RAID0 to a RAID5 it will try to recovery.  This
will abort quite quickly as there are not spare devices, but it could
confuse the attempt to freeze the array.

So allow 'freeze' to work even on a recovering array.

Signed-off-by: NeilBrown  <neilb@suse.de>
2013-06-25 15:52:58 +10:00
NeilBrown 1011e8344a Remove lots of unnecessary white space.
Now that I am using white-space mode in Emacs I can see all of this,
and I don't like it :-)

Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-19 12:31:45 +10:00
NeilBrown 64e103fe19 sysfs_read: return devices in same order as in filesystem.
When we read devices from sysfs (../md/dev-*), store them in the same
order that they appear.  That makes more sense when exposed to a
human (as the next patch will).

Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-19 10:33:47 +10:00
NeilBrown e6fc80a895 Detail: report on inactive arrays.
Array can be inactive when e.g. -I is in the process of assembling them.
This change allows --detail to report limited information about
these arrays.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 16:57:10 +10:00
NeilBrown 4dd2df0966 Discard devnum in favour of devnm
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.

So get rid of devnum (a number) and use devnm (a 32char string).
eg.
  md0
  md_d2
  md_home

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-21 17:05:23 +11:00
NeilBrown 6a67848ab6 Grow: fix reshape from RAID5 to RAID1.
Commit 5da9ab9874
       Grow_reshape re-factor
in mdadm-3.2 broke conversion from RAID5 and RAID1 - and we
never noticed.

This fixes it.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-20 12:06:34 +11:00
NeilBrown fe384ca0b9 Grow: set new_data_offset if appropriate 2012-10-04 16:34:21 +10:00
NeilBrown aab15415ed Manage: fix checks for removal from a container.
We must only remove from a container if the device isn't a
member of any member array.
To check we look at the 'holders' directory in sysfs.

We currently skip that check if ->devname is "detached", however
that can never be true since the change that introduced
add_detached().

Also sysfs_unique_holder returns status in 'errno' which isn't
entirely safe as e.g. closedir() is probably allowed to clear it.

So make sysfs_unique_holder return an unambigious value, and us
it to decide what to report.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-24 12:26:03 +10:00
NeilBrown 503975b9d5 Remove scattered checks for malloc success.
malloc should never fail, and if it does it is unlikely
that anything else useful can be done.  Best approach is to
abort and let some super-daemon restart.

So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit.  Then use those
removing all the tests for failure.

Also replace all "malloc;memset" sequences with 'xcalloc'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown e7b84f9d50 Introduce pr_err for printing error messages.
'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": '
cont_err() is also available.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
Jes Sorensen 012a864129 Introduce sysfs_set_num_signed() and use it to set bitmap/offset
mdinfo->bitmap_offset is a signed long and needs to be treated as
such when passed to the kernel.

This resolves the problem with adding internal bitmaps to a 1.0 array.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-30 09:56:22 +10:00
NeilBrown fbdef49811 Bitmap_offset is a signed number
As the bitmap can be before the superblock, bitmap_offset is signed.
But some of the code didn't honour that :-(

Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-04 14:03:45 +10:00
NeilBrown fd324b08db sysfs: fixed sysfs_freeze_array array to work properly with Manage_subdevs.
If the array is already frozen when Manage_subdevs is called we don't
want it to unfreeze the array.
This is because Grow calls Manage_subdevs to add devices to an array
being reshaped, and the array must stay frozen over this call.

So if sysfs_freeze_array find the array to be frozen it returns '0',
meaning that it didn't and cannot freeze it.  Then the caller will not
try to unfreeze, which is good.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-28 17:29:37 +11:00
NeilBrown c0c1acd691 Grow/bitmap: support adding bitmap via sysfs.
Adding a bitmap via ioctl can only add it at a fixed location.
That location is not suitable for 4K-block devices.

So allow setting the bitmap location via sysfs if kernel supports it
and aim to always use 4K alignments.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 14:10:41 +11:00
Jes Sorensen 99f6e52159 get_component_size(): Check read() return value for error before using it
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-03 08:07:39 +11:00
Jes Sorensen 93f1df3355 sysfs_unique_holder(): Check read() return value before using as buffer index
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Thomas Jarosch 9cf014ec40 Fix off-by-one in readlink() buffer size handling
readlink() returns the number of bytes in the buffer.

If we do something like

len = readlink(path, buf, sizeof(buf));
buf[len] = '\0';

we might write one byte past the end of the buffer.

Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-17 11:15:04 +11:00
Adam Kwolek ddb12f6ca6 FIX: Do not unblock array accidentally
When sysfs_set_array() function is called, it tests if array
can be configured using sysfs. Setting metadata_version entry
can accidentally unblock mdmon when array is under reshape.
To avoid this, blocking character '-' is checked and if is is set,
it is used for array test.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-09-21 11:55:08 +10:00