229 Commits

Author SHA1 Message Date
Mariusz Tkaczyk 5f21d67472 mdadm: add map_num_s()
map_num() returns NULL if the key is not defined. This patch adds an
alternative, non-NULL version for cases where NULL is not expected.

There are many printf() calls where map_num() is called on a variable
without NULL verification. It works even if NULL is passed, because
gcc is able to ignore a NULL argument quietly, but the behavior is
undefined. For safety reasons such usages will use map_num_s() now.
It is a potential point of regression.
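
A minimal sketch of the idea, not the code from this patch: the table layout, the example entries and the "unknown" fallback string are all assumptions made for illustration. Only the names map_num() and map_num_s() and the NULL versus non-NULL contract come from the message above.

```c
#include <stddef.h>

/* Hypothetical mapping table entry: a name paired with a numeric key. */
typedef struct {
	const char *name;
	int num;
} mapping_t;

/* Returns the name for num, or NULL when num is not in the table. */
const char *map_num(const mapping_t *map, int num)
{
	for (; map->name != NULL; map++)
		if (map->num == num)
			return map->name;
	return NULL;
}

/* Never returns NULL; the "unknown" fallback string is an assumption. */
const char *map_num_s(const mapping_t *map, int num)
{
	const char *name = map_num(map, num);

	return name ? name : "unknown";
}

/* Example table, terminated by an entry with a NULL name. */
const mapping_t example_levels[] = {
	{ "raid0", 0 },
	{ "raid1", 1 },
	{ NULL, 0 },
};
```

With a wrapper like this, a call such as printf("%s", map_num_s(example_levels, level)) never passes NULL to a %s conversion.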

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-04-04 21:29:43 -04:00
Mariusz Tkaczyk 913f07d1db Create, Build: use default_layout()
This code is duplicated for Build mode so make default_layout() extern
and use it. Simplify the function structure.

This introduces a change for Build mode: for raid0, RAID0_ORIG_LAYOUT
is now returned, the same as for Create.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-04-04 21:20:27 -04:00
Nigel Croxon a042210648 disallow create or grow clustered bitmap with writemostly set
Do not support creating an MD array on a clustered system
(--bitmap=clustered) with disks that have the write-mostly
(--write-mostly) flag set.

Likewise, do not allow growing an MD array from a non-clustered
bitmap to a clustered bitmap when disks have the write-mostly flag set.

The actual result is that the MD array is created successfully.
The expected result is a failure with an
error message stating:
Can not set --write-mostly with a clustered bitmap.
and disks marked write-mostly are not supported with clustered bitmap.

V2:
Added the device name in the error message during creation:
mdadm -CR /dev/md0 -l1 --raid-devices=2 /dev/sda --write-mostly /dev/sdb --bitmap=clustered
mdadm: Can not set /dev/sdb --write-mostly with a clustered bitmap.

Added the array name in the error message when growing:
mdadm --grow /dev/md0 --bitmap=clustered
mdadm: /dev/md0 disks marked write-mostly are not supported with clustered bitmap

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2021-10-08 11:47:55 -04:00
Mateusz Grzonka f7889e5199 Fix error message when creating raid 4, 5 and 10
Change inappropriate error message "at least 2 raid-devices needed for
level 4 or 5" to only mention relevant raid level.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2021-10-08 06:29:51 -04:00
Mateusz Grzonka 5b30a34aa4 Add error handling for chunk size in RAID1
Print error if chunk size is set as it is not supported.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2021-07-16 10:31:41 -04:00
Jakub Radtke 848d71c91d Create: Block automatic enabling bitmap for external metadata
For external metadata, bitmap should be added only when
explicitly set by the administrator.
There could be additional requirements to consider before
enabling the external metadata's functionality
(e.g., kernel support).

Signed-off-by: Jakub Radtke <jakub.radtke@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2021-03-09 17:18:02 -05:00
Jakub Radtke b554ab5c9b Enable bitmap support for external metadata
The patch enables the implementation of a write-intent bitmap for external
metadata.
Configuration of the internal bitmaps for non-native metadata requires the
extension in superswitch to perform an additional sysfs setup before the
array is activated.

Signed-off-by: Jakub Radtke <jakub.radtke@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2021-03-09 17:17:43 -05:00
Mariusz Tkaczyk ce559078a5 Create.c: close mdfd and generate uevent
No change event is generated when mdfd is closed, because open() is
called before udev starts watching the md device.
The device is ready at this stage, so unblock the device, close the fd
and generate an event to give the next layers a chance to work.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2020-11-25 18:06:23 -05:00
Xiao Ni 2ce0917240 Don't create bitmap for raid5 with journal disk
A journal disk and a bitmap can't exist at the same time, so check whether
the array has a journal disk when creating a bitmap.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2020-10-14 11:24:39 -04:00
NeilBrown 329dfc28de Create: add support for RAID0 layouts.
Since Linux 5.4 a layout is needed for RAID0 arrays with
varying device sizes.
This patch makes the layout of an array visible (via --examine)
and sets the layout on newly created arrays.
--layout=dangerous
can be used to avoid setting a layout so that the array
can be used on older kernels.

Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-12-02 16:14:49 -05:00
Mariusz Tkaczyk 22dc741f63 Create: Block rounding size to max
When the passed size is smaller than the chunk size, mdadm rounds it to 0,
but 0 there means the maximum available space.
Block this for every metadata type. Remove the same check from the imsm routine.
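
The check can be sketched roughly as below. This is an illustration only: the function name, the KiB units and the power-of-two chunk assumption are hypothetical and not taken from the patch.

```c
#include <stdint.h>

/*
 * Round size_kib down to a multiple of chunk_kib.
 * Returns 0 and stores the result in *out on success. Returns -1 when a
 * non-zero request would round down to 0, because 0 would otherwise be
 * read as "use all available space".
 * Assumes chunk_kib is a non-zero power of two.
 */
int round_size_to_chunk(uint64_t size_kib, uint64_t chunk_kib, uint64_t *out)
{
	uint64_t rounded = size_kib & ~(chunk_kib - 1);

	if (size_kib != 0 && rounded == 0)
		return -1;
	*out = rounded;
	return 0;
}
```

An explicit request of 0 still passes through, so the existing meaning of "maximum available space" is kept for callers that ask for it deliberately.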

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-04-10 09:39:34 -04:00
Dimitri John Ledkov ebf3be9931 Fix spelling typos.
Signed-off-by: Dimitri John Ledkov <xnox@ubuntu.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-02-11 14:42:50 -05:00
Michal Zylowski b91ad097d6 imsm: Allow create RAID volume with link to container
After 1db03765("Subdevs can't be all missing when create raid device")
raid volume can't be created with link to container. This feature should
not be blocked in Create function.  IMSM code forbids creation of
container with missing disk, so case like all dev's missing is already
handled.

Permit IMSM volume creation when devices are given as link to container.

Signed-off-by: Michal Zylowski <michal.zylowski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-04-10 16:12:00 -04:00
Xiao Ni 1db0376585 Subdevs can't be all missing when create raid device
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-01-26 13:51:01 -05:00
Zhilong Liu 0a6bff09d4 mdadm/util: unify fstat checking blkdev into function
Declare the function fstat_is_blkdev() to consolidate the repeated fstat
checks for block devices. It returns true/1 when the fd is a block
device and false/0 when it isn't.
The fd and devname parameters are required; *rdev is optional.
If rdev is a valid pointer, the device number is assigned to *rdev;
if it is NULL, it is ignored.
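
A sketch of a helper with this contract, built only on the standard fstat() call and the S_ISBLK() macro. The parameter order follows the description above; the error message wording is an assumption, and the real function in util.c may differ in detail.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Returns 1 if fd refers to a block device, 0 otherwise.
 * devname is used only for error messages. If rdev is non-NULL, the
 * device number is stored there on success.
 */
int fstat_is_blkdev(int fd, const char *devname, dev_t *rdev)
{
	struct stat stb;

	if (fstat(fd, &stb) != 0) {
		fprintf(stderr, "fstat failed for %s: %s\n",
			devname, strerror(errno));
		return 0;
	}
	if (!S_ISBLK(stb.st_mode)) {
		fprintf(stderr, "%s is not a block device.\n", devname);
		return 0;
	}
	if (rdev)
		*rdev = stb.st_rdev;
	return 1;
}
```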

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-05 11:04:02 -04:00
NeilBrown cd6cbb08c4 Create: tell udev md device is not ready when first created.
When an array is created the content is not initialized,
so it could have remnants of an old filesystem or md array
etc on it.
udev will see this and might try to activate it, which is almost
certainly not what is wanted.

So create a mechanism for mdadm to communicate with udev to tell
it that the device isn't ready.  This mechanism is the existence
of a file /run/mdadm/created-mdXXX where mdXXX is the md device name.

When creating an array, mdadm will create the file.
A new udev rule file, 01-md-raid-creating.rules, will detect the
presence of that file and set ENV{SYSTEMD_READY}="0".
This is fairly uniformly used to suppress actions based on the
contents of the device.
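
As a small illustration of the naming scheme, the marker path for a given md device name could be built like this. The function name is hypothetical; only the /run/mdadm/created-mdXXX pattern comes from the message.

```c
#include <stdio.h>

/*
 * Build the marker path for an md device name such as "md0".
 * Returns 0 on success, -1 if the buffer is too small.
 */
int created_marker_path(const char *devnm, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/run/mdadm/created-%s", devnm);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```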

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-02 09:41:39 -04:00
Jes Sorensen 5f4cc23926 Create: Remove all attempts to handle md driver older than 0.90.03
More legacy code moved to the bit-bucket.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-04-05 15:32:40 -04:00
Jes Sorensen 98dbf73cba Create: Fixup various whitespace issues
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-04-05 11:57:30 -04:00
Jes Sorensen cf622ec1d8 Create: Fixup bad placement of logical || && in multi-line if statements
These always go at the end of the line, never at the front

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-04-05 11:53:12 -04:00
Zhilong Liu 230a0dde09 mdadm/Create: declaring an existing struct within same function
Create: 'struct stat stb' is declared twice within the same
function. Rename the second 'struct stat' declaration from
stb to stb2.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-04-05 11:47:04 -04:00
Jes Sorensen dae131379f sysfs: Make sysfs_init() return an error code
Rather than have the caller inspect the returned content, return an
error code from sysfs_init(). In addition make all callers actually
check it.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-30 16:52:37 -04:00
Jes Sorensen 9cd39f0155 util: Introduce md_get_array_info()
Remove most direct ioctl calls for GET_ARRAY_INFO, except for one,
which will be addressed in the next patch.

This is the start of the effort to clean up the use of ioctl calls and
introduce a more structured API, which will use sysfs and fall back to
ioctl.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 14:35:41 -04:00
Artur Paszkiewicz e97a7cd011 super1: PPL support
Enable creating and assembling raid5 arrays with PPL for 1.x metadata.

When creating, reserve enough space for PPL and store its size and
location in the superblock and set MD_FEATURE_PPL bit. Write an initial
empty header in the PPL area on each device. PPL is stored in the
metadata region reserved for internal write-intent bitmap, so don't
allow using bitmap and PPL together.

While at it, fix two endianness issues in write_empty_r5l_meta_block()
and write_init_super1().

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:33:52 -04:00
Artur Paszkiewicz 5308f11727 Generic support for --consistency-policy and PPL
Add a new parameter to mdadm: --consistency-policy=. It determines how
the array maintains consistency in case of unexpected shutdown. This
maps to the md sysfs attribute 'consistency_policy'. It can be used to
create a raid5 array using PPL. Add the necessary plumbing to pass this
option to metadata handlers. The write journal and bitmap
functionalities are treated as different policies, which are implicitly
selected when using --write-journal or --bitmap options.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:32:15 -04:00
NeilBrown e22fe3ae15 Introduce enum flag_mode for setting and clearing flags.
We currently use '1' to indicate that a flag (writemostly or failfast)
needs to be set, and '2' to indicate that it needs to be cleared.

Using magic numbers like this is not best practice.

So replace them with values from an enum.

No functional change.
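
A sketch of what such an enum might look like. The enumerator names and the helper function are illustrative assumptions; only the enum name flag_mode and the old meaning of 1 (set) and 2 (clear) come from the message.

```c
/* Enumerator names are illustrative; the values keep the old meaning. */
enum flag_mode {
	FLAG_DEFAULT,	/* 0: leave the flag as it is */
	FLAG_SET,	/* 1: previously the magic number 1 */
	FLAG_CLEAR,	/* 2: previously the magic number 2 */
};

/* Apply mode to the bits in mask and return the updated flags word. */
unsigned int apply_flag(unsigned int flags, unsigned int mask,
			enum flag_mode mode)
{
	switch (mode) {
	case FLAG_SET:
		return flags | mask;
	case FLAG_CLEAR:
		return flags & ~mask;
	case FLAG_DEFAULT:
	default:
		return flags;
	}
}
```

Because the enumerators keep the numeric values 1 and 2, a conversion like this would be consistent with "no functional change".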

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-29 17:12:13 -05:00
NeilBrown 71574efb07 Add failfast support.
Allow per-device "failfast" flag to be set when creating an
array or adding devices to an array.

When re-adding a device which had the failfast flag, it can be removed
using --nofailfast.

failfast status is printed in --detail and --examine output.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-11-28 08:50:36 -05:00
Jes Sorensen 2ec2b7e9d5 mdadm: Make add_internal_bitmap() return 0 on success
add_internal_bitmap() returned 1 on success and 0 on error which is
inconsistent. This changes it to return 0 on success and use more
reasonable error codes on error.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:19:16 -04:00
Guoqing Jiang 82d9485e06 Create: check the node nums when create clustered raid
It doesn't make sense to create a clustered raid
with only 1 node.

Reported-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-09 14:59:01 -04:00
NeilBrown dfd7822ca6 Create: minor fix when adding a journal device
The check of "is there a filesystem here" is still appropriate for a
journal device.

Also set active_disks correctly - even though it is ignored.

Signed-off-by: NeilBrown <neilb@suse.com>
2016-01-14 14:13:17 +11:00
NeilBrown f170a5a9a0 Create: fix regression in setting raid_disk
Recent commit caused 'missing' declarations to not be handled correctly.

Fixes: cc1799c3dd ("Enable create array with write journal (--write-journal DEVICE).")
Signed-off-by: NeilBrown <neilb@suse.com>
2016-01-14 13:22:17 +11:00
Song Liu cc1799c3dd Enable create array with write journal (--write-journal DEVICE).
Specify the write journal device with --write-journal DEVICE

./mdadm --create -f /dev/md0 --assume-clean -c 32 --raid-devices=4 --level=5 /dev/sd[c-f] --write-journal /dev/sdb1
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

Only one journal device is allowed. If multiple --write-journal
are given, mdadm will use the first and ignore others

./mdadm --create -f /dev/md0 --assume-clean -c 32 --raid-devices=4 --level=5 /dev/sd[c-f] --write-journal /dev/sdb1 --write-journal /dev/sdx
mdadm: Please specify only one journal device for the array.
mdadm: Ignoring --write-journal /dev/sdx...
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-19 13:06:12 +11:00
Goldwyn Rodrigues 6d9c7c2551 Increment version for clustered bitmaps
Add BITMAP_MAJOR_CLUSTERED as 5, in order to prevent older kernels
from assembling a clustered device.

In order to maximize compatibility, the major version is set to
BITMAP_MAJOR_CLUSTERED *only* if the bitmap is clustered.

Also add MD_FEATURE_CLUSTERED so that older kernels, which would
otherwise assemble the MD array if the bitmap is corrupted,
return an error.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-09-28 11:47:04 +10:00
Guoqing Jiang 7716570e6d Set home-cluster while creating an array
The home-cluster is stored in the bitmap super block of the
array. The device can be assembled on a cluster with the
cluster name same as the one recorded in the bitmap.

If home-cluster is not specified, it is auto-detected by
loading the corosync cmap library with dlopen().

neilb: allow code to compile when corosync-devel is not installed.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-17 09:06:30 +10:00
Guoqing Jiang 529e2aa573 Add nodes option while creating md
Specifies the maximum number of nodes in the cluster that may use
this device simultaneously. This is equivalent to the number of
bitmaps created in the internal superblock (patches to follow).

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-17 09:04:16 +10:00
Guoqing Jiang 95a05b37e8 Create n bitmaps for clustered mode
For a clustered MD, create bitmaps equal to number of nodes so
each node has an independent bitmap.

Only the first bitmap has the bits set so that the first node
that assembles the device also performs the sync.

The bitmaps are aligned to 4k boundaries.

On-disk format:

0                    4k                     8k                    12k
-------------------------------------------------------------------
| idle                | md super            | bm super [0] + bits |
| bm bits[0, contd]   | bm super[1] + bits  | bm bits[1, contd]   |
| bm super[2] + bits  | bm bits [2, contd]  | bm super[3] + bits  |
| bm bits [3, contd]  |                     |                     |
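
The layout in the diagram can be sketched as an offset calculation. This is an assumption-laden illustration: the function names are hypothetical, and it assumes every node's bitmap (superblock plus bits) occupies the same 4k-aligned span, which is what the diagram suggests but the message does not state as a rule.

```c
#include <stdint.h>

#define BITMAP_ALIGN 4096u

/* Round value up to the next multiple of align (align must be non-zero). */
uint64_t align_up(uint64_t value, uint64_t align)
{
	return (value + align - 1) / align * align;
}

/*
 * Byte offset of the bitmap superblock for a given node.
 * first_bitmap: offset of bitmap 0 (8k in the diagram).
 * bytes_per_node: size of one bitmap superblock plus its bits.
 */
uint64_t node_bitmap_offset(uint64_t first_bitmap, uint64_t bytes_per_node,
			    unsigned int node)
{
	return first_bitmap +
	       (uint64_t)node * align_up(bytes_per_node, BITMAP_ALIGN);
}
```

With first_bitmap at 8k and a per-node size between 4k and 8k, this places bm super[1] at 16k and bm super[3] at 32k, matching the cells drawn above.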

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-17 07:54:03 +10:00
NeilBrown 7a862a020f Don't break long strings onto multiple lines.
It is best to keep strings all together so that they
are easier to search for in the source code.
If a string is so long that it looks ugly on one line,
then maybe it should be broken into multiple lines
for display too.

Only strings which contain a newline can be broken
into multiple lines:

 "It is OK to\n"
 "break this string\n"


Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-12 13:46:53 +11:00
NeilBrown 476066a3d5 DDF: add support of --data-offset when creating array.
Infrastructure is there, so use it.

This requires making sure that ->data_offset is correctly set, even
for containers.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-05-21 11:54:48 +10:00
Artur Paszkiewicz 39917e56cc Create: don't default to bitmap=internal when it is not supported
For large arrays (component size > 100GB) if write-intent bitmap is not
enabled, then it is set by default to "internal", even if the metadata
format does not support internal bitmaps, which causes Create to fail.

This patch adds checking if add_internal_bitmap is set in the
superswitch before setting bitmap_file to "internal".
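
The resulting decision can be sketched as a small predicate. The function and parameter names are hypothetical, the threshold is written in KiB as an assumption about units, and any other conditions Create applies (for example on the RAID level) are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* 100 GiB expressed in KiB; whether the real threshold is GB or GiB is an assumption. */
#define LARGE_COMPONENT_KIB (100ULL * 1024 * 1024)

/*
 * Decide whether to default to an internal bitmap.
 * user_chose_bitmap: some --bitmap= option was given, including "none".
 * has_add_internal_bitmap: the metadata handler provides the hook.
 */
bool default_to_internal_bitmap(uint64_t component_kib, bool user_chose_bitmap,
				bool has_add_internal_bitmap)
{
	if (user_chose_bitmap)
		return false;
	if (!has_add_internal_bitmap)
		return false;
	return component_kib > LARGE_COMPONENT_KIB;
}
```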

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-05-01 10:14:59 +10:00
Artur Paszkiewicz 19ad4b2cb2 Fix race between --create and --incremental
This modifies locking in Create to eliminate a situation where
--incremental can assemble a device between write_init_super() and
add_disk(), which causes Create to fail.

It sporadically occurs e.g. when metadata is written on a device,
causing an udev change event which triggers mdadm --incremental.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-05-01 10:14:53 +10:00
NeilBrown 6f02172d2e Release mdadm-3.3
(and  various cosmetic fixes)

Signed-off-by: NeilBrown <neilb@suse.de>
2013-09-03 14:47:47 +10:00
NeilBrown a7dec3fd92 Make sure NOFILE resource limit is big enough.
Some people want to create truly enormous arrays.
As we sometimes need to hold one file descriptor for each
device, this can hit the NOFILE limit.

So raise the limit if it ever looks like it might be a problem.
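
Raising the limit can be sketched with the standard getrlimit() and setrlimit() calls. The function name and the policy of raising to exactly the needed value are assumptions for illustration, not the logic from this patch.

```c
#include <sys/resource.h>

/*
 * Make sure at least `needed` file descriptors may be open at once.
 * Returns 0 on success, -1 if the limit could not be read or raised.
 */
int ensure_nofile(rlim_t needed)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
		return -1;
	if (lim.rlim_cur >= needed)
		return 0;	/* already big enough */

	lim.rlim_cur = needed;
	if (lim.rlim_max < needed)
		lim.rlim_max = needed;	/* raising the hard limit needs privilege */
	return setrlimit(RLIMIT_NOFILE, &lim);
}
```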

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-30 14:31:09 +10:00
NeilBrown a21e848a55 Create: over-ride "start_ro" setting when creating an array.
If module parameter start_ro is set, arrays start readonly.
This is OK when assembling, but is very surprising when creating
an array as the resync won't start.
So over-ride the setting (unless --read-only was given) and make
arrays RW when created.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-15 11:40:27 +10:00
NeilBrown eca944fa9c create_mddev: add support for /dev/md_XXX non-numeric names.
With the 'devnm' infrastructure fixed, it is quite easy to support
names like "md_home" for md arrays.
This currently defaults to "off" and can be enabled in mdadm.conf with
  CREATE names=yes
This is in case other tools get confused by the new names.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-15 11:03:25 +10:00
NeilBrown 8baab049ce Create: fix bug with --data-offset.
Test for VARIABLE_OFFSET was wrong.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-05-13 17:26:37 +10:00
NeilBrown 748952f73e Create: default to bitmap=internal for large arrays.
Here, "large" means components are 100G or more.  It is
usually beneficial to have write-intent bitmaps on such arrays.
They can be suppressed with --bitmap=none

Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-05 10:36:21 +11:00
NeilBrown 4dd2df0966 Discard devnum in favour of devnm
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.

So get rid of devnum (a number) and use devnm (a 32char string).
eg.
  md0
  md_d2
  md_home
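
A sketch of how an old-style devnum might be turned into a devnm string. The function name is hypothetical, and the encoding of negative numbers (n standing for md_d<-1-n>) is an assumption; the message only says that negative values denote md_d%d devices.

```c
#include <stdio.h>

#define DEVNM_LEN 32

/*
 * Format an old-style devnum into a devnm string.
 * Assumption: a negative devnum n stands for md_d<-1-n>.
 */
void devnum_to_devnm(int devnum, char devnm[DEVNM_LEN])
{
	if (devnum >= 0)
		snprintf(devnm, DEVNM_LEN, "md%d", devnum);
	else
		snprintf(devnm, DEVNM_LEN, "md_d%d", -1 - devnum);
}
```

A name such as md_home has no numeric form at all, which is the motivation for carrying the string instead of the number.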

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-21 17:05:23 +11:00
Lukasz Dorau 066e92f017 Create.c: check if freesize is equal 0
"freesize" can be equal to 0, particularly after rounding to the chunk size.
Creation should be aborted in such a case.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-20 12:12:03 +11:00
NeilBrown 5d5002289c Replace a lot of leading spaces with tabs.
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-10 18:33:26 +11:00
NeilBrown 72ca9bcff3 Allow data-offset to be specified per-device for create
mdadm --create /dev/md0 .... /dev/sda1:1024 /dev/sdb1:2048 ...

The size is in K unless a suffix: K M G is given.
The suffix 's' means sectors.
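
Parsing of the DEVICE:OFFSET form could look roughly like the sketch below. The function name, its return convention and the choice to report the result in 512-byte sectors are assumptions for illustration; only the suffix rules come from the message.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Split "DEVICE:OFFSET" and convert OFFSET to 512-byte sectors.
 * A bare number is in KiB; K, M and G scale it; 's' means sectors.
 * On success returns 0, stores the device name length in *devlen and
 * the offset in *sectors. Returns -1 on malformed input.
 */
int parse_dev_offset(const char *arg, size_t *devlen,
		     unsigned long long *sectors)
{
	const char *colon = strrchr(arg, ':');
	char *end;
	unsigned long long n;

	if (colon == NULL || colon == arg || colon[1] == '\0')
		return -1;
	n = strtoull(colon + 1, &end, 10);
	if (end == colon + 1)
		return -1;	/* no digits after the colon */

	switch (*end) {
	case '\0':
	case 'K':
		n *= 2;			/* KiB to sectors */
		break;
	case 'M':
		n *= 2 * 1024ULL;
		break;
	case 'G':
		n *= 2 * 1024ULL * 1024ULL;
		break;
	case 's':
		break;			/* already in sectors */
	default:
		return -1;
	}
	if (*end != '\0' && end[1] != '\0')
		return -1;		/* trailing junk after the suffix */

	*devlen = (size_t)(colon - arg);
	*sectors = n;
	return 0;
}
```

For the command line above, "/dev/sda1:1024" would yield a 9-character device name and an offset of 2048 sectors.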

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 40c9a66a5c Add --data-offset flag for Create and Grow
This can be used to over-ride the automatic assignment of
data offset.
For --create, it is useful to re-create old arrays where different
   defaults applied.
For --grow it may be able to force a reshape in the reverse direction.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00