342 Commits

Author SHA1 Message Date
NeilBrown 027c099fd1 Assemble: add support for RAID0 layouts.
If you have a RAID0 array with devices of varying sizes
created on a kernel before 5.4, you cannot assemble it on
5.4 or later without explicitly setting the layout.
This is now possible with
  --update=layout-original (For 3.13 and earlier kernels)
or
  --update=layout-alternate (for 3.14 and later kernels)

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-12-02 16:15:56 -05:00
Mariusz Dabrowski b068159891 mdadm: load default sysfs attributes after assembly
Add a new type of line to mdadm.conf that allows specifying values of
sysfs attributes for MD devices that should be loaded after the array is
assembled. Each line is interpreted as a list of structures containing the
sysname of the MD device (md126 etc.) and a list of sysfs attributes and
their values.

Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-07-10 16:12:09 -04:00
Pawel Baldysiak 2b57e4fe04 Assemble: Fix starting array with initial reshape checkpoint
If an array was stopped during reshape initialization,
a "0" checkpoint might be recorded in the metadata.
If an array in this condition (reshape at position 0)
is passed to the kernel, the kernel will refuse to start it.

Treat such an array as normal during assembly; Grow_continue() will
reinitialize and start the reshape.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-02-28 15:48:51 -05:00
Dimitri John Ledkov ebf3be9931 Fix spelling typos.
Signed-off-by: Dimitri John Ledkov <xnox@ubuntu.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2019-02-11 14:42:50 -05:00
Gioh Kim 563ac10865 Assemble: mask FAILFAST and WRITEMOSTLY flags when finding the most recent device
If devices[].i.disk.state has the MD_DISK_FAILFAST or MD_DISK_WRITEMOSTLY
flag set, the device can never be selected as the most recent device.
Both flags should be masked before checking the state.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-12-06 07:52:08 -05:00
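The masking described in 563ac10865 above can be illustrated with a minimal sketch. It is not the code from Assemble.c: the `ex_` names, the struct, and the selection rule (highest event count among devices that are exactly active and in sync) are assumptions made for illustration, and the bit positions should be checked against md_p.h.

```c
#include <assert.h>
#include <stddef.h>

/* Bit positions assumed for this sketch; verify against md_p.h. */
enum {
	EX_DISK_FAULTY      = 0,
	EX_DISK_ACTIVE      = 1,
	EX_DISK_SYNC        = 2,
	EX_DISK_WRITEMOSTLY = 9,
	EX_DISK_FAILFAST    = 10
};

#define EX_POLICY_FLAGS ((1u << EX_DISK_WRITEMOSTLY) | (1u << EX_DISK_FAILFAST))
#define EX_HEALTHY      ((1u << EX_DISK_ACTIVE) | (1u << EX_DISK_SYNC))

/* Hypothetical, trimmed-down view of a member device. */
struct ex_dev {
	unsigned int state;
	unsigned long long events;
};

/* Return the index of the healthy device with the highest event count,
 * or -1 if there is none. The policy flags are cleared before the
 * comparison, so a write-mostly or failfast device can still qualify. */
int ex_most_recent(const struct ex_dev *devs, int n)
{
	int best = -1;

	assert(devs != NULL || n == 0);
	for (int i = 0; i < n; i++) {
		unsigned int core = devs[i].state & ~EX_POLICY_FLAGS;

		if (core != EX_HEALTHY)
			continue;
		if (best < 0 || devs[i].events > devs[best].events)
			best = i;
	}
	return best;
}
```

Without the mask, a device whose state also carries one of the two policy bits would fail the equality test and be skipped, even if it holds the newest metadata.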
Gioh Kim 0833f9c3db Assemble: keep MD_DISK_FAILFAST and MD_DISK_WRITEMOSTLY flag
Before updating the superblock of slave disks, desired_state is
set to the target state of the slave disks. But it forgets
to check the MD_DISK_FAILFAST and MD_DISK_WRITEMOSTLY flags. Then
start_arrays() calls the ADD_NEW_DISK ioctl and passes the state
without MD_DISK_FAILFAST and MD_DISK_WRITEMOSTLY.

Currently this does not cause any problem because the kernel does not
care about the MD_DISK_FAILFAST or MD_DISK_WRITEMOSTLY flags.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-12-06 07:41:13 -05:00
Guoqing Jiang 783a4a93b9 Assemble: set devices to NULL when load_devices can't load device
Since load_devices frees "devices" when it can't find any
device, we should set it to NULL to avoid a double free,
which can be reproduced with the steps below:

mdadm -CR /dev/md/vol -l0 -e 1.2 -n2 /dev/sd[b-c] --assume-clean
mdadm -Ss
mdadm -A /dev/md127 /dev/sd[b-c] --update metadata

Reported-by: Tkaczyk Mariusz <mariusz.tkaczyk@intel.com>
Tested-by: Tkaczyk Mariusz <mariusz.tkaczyk@intel.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-09-27 10:30:19 -04:00
Guoqing Jiang d8b0173894 Assemble: free resources in load_devices
Like other failure cases in load_devices, we need
to free those resources as well.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-09-27 10:29:48 -04:00
Guoqing Jiang 80d1256e98 Assemble: remove the protection when clustered raid do assemble
For the HA product, the RA (resource agent) assembles a clustered raid
by calling the command below:

$MDADM --assemble $mddev --config=$RAIDCONF $MDADM_HOMEHOST

Sometimes a node can't assemble the array because all the nodes
need to contend for the dlm lock, which causes the node to be
fenced in automated tests.

In fact, we don't need the protection since the assemble
cmd called by the RA doesn't change the superblock, so revert
commit 76781701a4 ("Assemble:
provide protection when clustered raid do assemble") to remove
the unnecessary protection.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-07-20 15:40:20 -04:00
Anthony Youngman 4a670aabdc Coverity: Resource leak: fix return without free
Signed-off-by: Anthony Youngman <anthony@youngman.org.uk>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-07-11 13:08:17 -04:00
Guoqing Jiang 898bd1ecef Free map to avoid resource leak issues
1. There are some places which didn't free map as
discovered by coverity.

CID 289661 (#1 of 1): Resource leak (RESOURCE_LEAK)12. leaked_storage: Variable mapl going out of scope leaks the storage it points to.
CID 289619 (#3 of 3): Resource leak (RESOURCE_LEAK)63. leaked_storage: Variable map going out of scope leaks the storage it points to.
CID 289618 (#1 of 1): Resource leak (RESOURCE_LEAK)26. leaked_storage: Variable map going out of scope leaks the storage it points to.
CID 289607 (#1 of 1): Resource leak (RESOURCE_LEAK)41. leaked_storage: Variable map going out of scope leaks the storage it points to.

2. If we call map_by_* inside a loop, then map_free
should be called in the same loop, and it is better
to set map to NULL after free.

3. And map_unlock is always called with map_lock,
if we don't call map_remove before map_unlock,
then the memory (allocated by  map_lock -> map_read
-> map_add -> xmalloc) could be leaked. So we
need to free it in map_unlock as well.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-06-11 06:35:41 -04:00
Mariusz Tkaczyk 7298c9a6fa Assemble.c Don't ignore faulty disk when array is auto assembled.
Since commit 20dc76d15b ("imsm: Set disk slot number") mdadm
sets the slot number for each disk in an imsm array. Now auto-assemble
determines devices using the slot number and ignores devices in the same
slot that have an older generation number.
This causes an infinite loop if a failed device is still visible in the
system (it has metadata, but it is not merged with the existing array).

To avoid this, an out-of-sync device should be added to best[]. Later
mdadm adds it to the container as a spare.

Imsm doesn't support the disk replacement feature, so it can use the
room reserved for replacements.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-06-08 12:47:14 -04:00
Guoqing Jiang 57908e9eba Assemble: cleanup the failure path
Several failure paths share common code before returning,
so simplify them by moving the common code to the end of
the function and jumping to "out" when a failure happens.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-03-08 14:22:25 -05:00
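The cleanup style that 57908e9eba describes is the common C idiom of a single exit label. The sketch below shows the shape of that idiom only; `ex_process` and its buffers are hypothetical and are not taken from Assemble.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies `name` into `result` through two heap
 * buffers. Returns 0 on success and 1 on failure. Every exit path goes
 * through the `out` label, so the frees are written only once. */
int ex_process(const char *name, char *result, size_t result_len)
{
	char *first = NULL;
	char *second = NULL;
	size_t len;
	int rv = 1;	/* assume failure until the end is reached */

	assert(result != NULL);

	if (name == NULL)
		goto out;
	len = strlen(name) + 1;
	if (len > result_len)
		goto out;

	first = malloc(len);
	if (first == NULL)
		goto out;
	memcpy(first, name, len);

	second = malloc(len);
	if (second == NULL)
		goto out;
	memcpy(second, first, len);

	memcpy(result, second, len);
	rv = 0;

out:
	/* free(NULL) is a no-op, so no per-path bookkeeping is needed. */
	free(second);
	free(first);
	return rv;
}
```

Initializing every resource pointer to NULL is what makes the shared exit safe: a failure at any point frees only what was actually allocated.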
Guoqing Jiang 76781701a4 Assemble: provide protection when clustered raid do assemble
The previous patch provides protection for other modes
such as CREATE, MANAGE, GROW and INCREMENTAL. ASSEMBLE
mode also needs protection while a clustered raid is
being assembled.

However, we can only know whether the array is clustered
once the metadata is ready, so lock_cluster is called
after select_devices(). And since the metadata may be re-read
during auto-assembly, the locking is refreshed in that case.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-03-08 14:18:41 -05:00
BingJing Chang 62f1aee7ad mdadm: prevent out-of-date reshaping devices from force assemble
With "--force", we can assemble the array even if some superblocks
appear out-of-date, on the assumption that their data layout still makes
sense. In reshape cases, if two devices claim different reshape progress,
we cannot force them back into the array. The kernel will treat only
one of the values as the reshape progress, while the data on the devices
is still laid out differently. This may lead to disaster if the reshape
goes on.

Reproducible Steps:
mdadm -C /dev/md0 --assume-clean -l5 -n3 /dev/loop[012]
mdadm -a /dev/md0 /dev/loop3
mdadm -G /dev/md0 -n4
mdadm -f /dev/md0 /dev/loop0 # after a period
mdadm -S /dev/md0 # after another period
mdadm -E /dev/loop[01] # make sure that they claim different reshape progress

mdadm -Af -R /dev/md0 /dev/loop[023] # provide too few devices, so that
# force_array() picks non-fresh devices
cat /sys/block/md0/md/reshape_position # You can see that the kernel resumes
# the reshape from the progress of one of them.

Note: mdadm -E reports the position in KB, but reshape_position is in sectors.

In order to prevent disaster, we add logic to prevent devices with
different reshape progress from being added to the array.

Reported-by: Allen Peng <allenpeng@synology.com>
Reviewed-by: Alex Wu <alexwu@synology.com>
Signed-off-by: BingJing Chang <bingjingc@synology.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-02-23 11:05:00 -05:00
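The check that 62f1aee7ad describes can be sketched as a comparison between a reference device and a candidate. The struct, field names, and function names below are assumptions for illustration rather than the code in force_array(); the unit helper assumes 512-byte sectors, so 1 KB corresponds to 2 sectors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down view of a member's reshape state. */
struct ex_member {
	int reshape_active;			/* non-zero while a reshape is recorded */
	unsigned long long reshape_progress;	/* position, in sectors */
};

/* Return 1 if `cand` may be forced into the array alongside `ref`,
 * 0 if the two disagree about the reshape and must not be mixed. */
int ex_reshape_matches(const struct ex_member *ref,
		       const struct ex_member *cand)
{
	assert(ref != NULL && cand != NULL);

	if (ref->reshape_active != cand->reshape_active)
		return 0;
	if (!ref->reshape_active)
		return 1;	/* no reshape on either side: nothing to compare */
	return ref->reshape_progress == cand->reshape_progress;
}

/* Convert a KB value to 512-byte sectors so that it can be compared
 * with a value read from reshape_position. */
unsigned long long ex_kb_to_sectors(unsigned long long kb)
{
	return kb * 2;
}
```

A device that fails this comparison is simply left out of the forced set, which matches the intent stated in the commit message: devices with different reshape progress are not added to the array.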
Andrea Righi 31b6f0cdc1 Assemble: prevent segfault with faulty "best" devices
I was able to trigger this curious problem, which seems to happen only on
one of our servers:

Segmentation fault

This md volume is a raid1 volume made of 2 device mapper (dm-multipath)
devices and the underlying LUNs are imported via iSCSI.

Applying the following patch (see below) seems to fix the problem:

mdadm: /dev/md/10.4.237.12-volume has been started with 2 drives.

But I'm not sure if it's the right fix or if there're some other
problems that I'm missing.

More details about the md superblocks that might help to better
understand the nature of the problem:

dev: 36001405a04ed0c104881100000000000p2
/dev/mapper/36001405a04ed0c104881100000000000p2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 5f3e8283:7f831b85:bc1958b9:6f2787a4
           Name : 10.4.237.12-volume
  Creation Time : Thu Jul 27 14:43:16 2017
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1073729503 (511.99 GiB 549.75 GB)
     Array Size : 536864704 (511.99 GiB 549.75 GB)
  Used Dev Size : 1073729408 (511.99 GiB 549.75 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
   Unused Space : before=8104 sectors, after=95 sectors
          State : clean
    Device UUID : 16dae7e3:42f3487f:fbeac43a:71cf1f63

Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Aug  8 11:12:22 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 518c443e - correct
         Events : 167

   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
dev: 36001405a04ed0c104881200000000000p2
/dev/mapper/36001405a04ed0c104881200000000000p2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 5f3e8283:7f831b85:bc1958b9:6f2787a4
           Name : 10.4.237.12-volume
  Creation Time : Thu Jul 27 14:43:16 2017
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1073729503 (511.99 GiB 549.75 GB)
     Array Size : 536864704 (511.99 GiB 549.75 GB)
  Used Dev Size : 1073729408 (511.99 GiB 549.75 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
   Unused Space : before=8104 sectors, after=95 sectors
          State : clean
    Device UUID : ef612bdd:e475fe02:5d3fc55e:53612f34

Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Aug  8 11:12:22 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : c39534fd - correct
         Events : 167

   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)

dev: 36001405a04ed0c104881100000000000p2
00001000  fc 4e 2b a9 01 00 00 00  01 00 00 00 00 00 00 00  |.N+.............|
00001010  5f 3e 82 83 7f 83 1b 85  bc 19 58 b9 6f 27 87 a4  |_>........X.o'..|
00001020  31 30 2e 34 2e 32 33 37  2e 31 32 2d 76 6f 6c 75  |10.4.237.12-volu|
00001030  6d 65 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |me..............|
00001040  64 50 7a 59 00 00 00 00  01 00 00 00 00 00 00 00  |dPzY............|
00001050  80 cf ff 3f 00 00 00 00  00 00 00 00 02 00 00 00  |...?............|
00001060  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001080  00 20 00 00 00 00 00 00  df cf ff 3f 00 00 00 00  |. .........?....|
00001090  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000010a0  00 00 00 00 00 00 00 00  16 da e7 e3 42 f3 48 7f  |............B.H.|
000010b0  fb ea c4 3a 71 cf 1f 63  00 00 08 00 48 00 00 00  |...:q..c....H...|
000010c0  54 f0 89 59 00 00 00 00  a7 00 00 00 00 00 00 00  |T..Y............|
000010d0  ff ff ff ff ff ff ff ff  9c 43 8c 51 80 00 00 00  |.........C.Q....|
000010e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001100  00 00 01 00 fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
00001110  fe ff fe ff fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
*
00001200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  62 69 74 6d 04 00 00 00  5f 3e 82 83 7f 83 1b 85  |bitm...._>......|
00002010  bc 19 58 b9 6f 27 87 a4  a7 00 00 00 00 00 00 00  |..X.o'..........|
00002020  a7 00 00 00 00 00 00 00  80 cf ff 3f 00 00 00 00  |...........?....|
00002030  00 00 00 00 00 00 00 01  05 00 00 00 00 00 00 00  |................|
00002040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00003100  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00004000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
003ffe00
dev: 36001405a04ed0c104881200000000000p2
00001000  fc 4e 2b a9 01 00 00 00  01 00 00 00 00 00 00 00  |.N+.............|
00001010  5f 3e 82 83 7f 83 1b 85  bc 19 58 b9 6f 27 87 a4  |_>........X.o'..|
00001020  31 30 2e 34 2e 32 33 37  2e 31 32 2d 76 6f 6c 75  |10.4.237.12-volu|
00001030  6d 65 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |me..............|
00001040  64 50 7a 59 00 00 00 00  01 00 00 00 00 00 00 00  |dPzY............|
00001050  80 cf ff 3f 00 00 00 00  00 00 00 00 02 00 00 00  |...?............|
00001060  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001080  00 20 00 00 00 00 00 00  df cf ff 3f 00 00 00 00  |. .........?....|
00001090  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000010a0  01 00 00 00 00 00 00 00  ef 61 2b dd e4 75 fe 02  |.........a+..u..|
000010b0  5d 3f c5 5e 53 61 2f 34  00 00 08 00 48 00 00 00  |]?.^Sa/4....H...|
000010c0  54 f0 89 59 00 00 00 00  a7 00 00 00 00 00 00 00  |T..Y............|
000010d0  ff ff ff ff ff ff ff ff  5b 34 95 c3 80 00 00 00  |........[4......|
000010e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001100  00 00 01 00 fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
00001110  fe ff fe ff fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
*
00001200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  62 69 74 6d 04 00 00 00  5f 3e 82 83 7f 83 1b 85  |bitm...._>......|
00002010  bc 19 58 b9 6f 27 87 a4  a7 00 00 00 00 00 00 00  |..X.o'..........|
00002020  a7 00 00 00 00 00 00 00  80 cf ff 3f 00 00 00 00  |...........?....|
00002030  00 00 00 00 00 00 00 01  05 00 00 00 00 00 00 00  |................|
00002040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00003100  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00004000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
003ffe00

Assemble: prevent segfault with faulty "best" devices

In Assemble(), after context reload, best[i] can be -1 in some cases,
and before checking if this value is negative we use it to access
devices[j].i.disk.raid_disk, potentially causing a segfault.

Check if best[i] is negative before using it to prevent this potential
segfault.

Signed-off-by: Andrea Righi <andrea@betterlinux.com>
Fixes: 69a481166b ("Assemble array with write journal")
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Robert LeBlanc <robert@leblancnet.us>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2018-01-21 16:36:43 -05:00
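The ordering problem fixed by 31b6f0cdc1 is easy to reproduce in miniature: an index taken from best[] has to be tested for a negative value before it is used to subscript devices[]. The sketch below assumes a simplified device struct and a hypothetical counting function; it is not the loop from Assemble().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down device record. */
struct ex_device {
	int raid_disk;	/* role of the device; negative means no data role */
};

/* Count the slots whose best device holds a data role.
 * best[i] is either an index into devices[] or -1 for an empty slot. */
int ex_count_usable(const int *best, int bestcnt,
		    const struct ex_device *devices)
{
	int count = 0;

	assert(best != NULL && devices != NULL);
	for (int i = 0; i < bestcnt; i++) {
		int j = best[i];

		/* Test the index first: devices[-1] is out of bounds. */
		if (j < 0)
			continue;
		if (devices[j].raid_disk >= 0)
			count++;
	}
	return count;
}
```

Swapping the two tests inside the loop would read devices[-1] for every empty slot, which is undefined behaviour and may show up as the segfault described above.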
Jes Sorensen d16a749444 mdadm: Fixup != broken formatting
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-16 14:09:57 -04:00
Jes Sorensen d7be7d8736 mdadm: Fixup more broken logical operator formatting
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-16 13:59:43 -04:00
Jes Sorensen fc54fe7a7e mdadm: Fixup a large number of bad formatting of logical operators
Logical operators never belong at the beginning of a line.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-16 13:52:15 -04:00
Zhilong Liu 9e04ac1c43 mdadm/util: unify stat checking blkdev into function
Declare the function stat_is_blkdev() to consolidate the repeated stat
checks for block devices. It returns true (1) when the path is a
block device and false (0) when it isn't.
devname is a required parameter; rdev is optional. If the dev_t *rdev
pointer is non-NULL, the device number is stored through it; if it is
NULL, it is ignored.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-05 11:05:32 -04:00
Zhilong Liu 0a6bff09d4 mdadm/util: unify fstat checking blkdev into function
Declare the function fstat_is_blkdev() to consolidate the repeated fstat
checks for block devices. It returns true (1) when the descriptor refers
to a block device and false (0) when it doesn't.
fd and devname are required parameters; rdev is optional. If the
dev_t *rdev pointer is non-NULL, the device number is stored through it;
if it is NULL, it is ignored.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-05 11:04:02 -04:00
NeilBrown cd6cbb08c4 Create: tell udev md device is not ready when first created.
When an array is created the content is not initialized,
so it could have remnants of an old filesystem or md array
etc on it.
udev will see this and might try to activate it, which is almost
certainly not what is wanted.

So create a mechanism for mdadm to communicate with udev to tell
it that the device isn't ready.  This mechanism is the existence
of a file /run/mdadm/created-mdXXX where mdXXX is the md device name.

When creating an array, mdadm will create the file.
A new udev rule file, 01-md-raid-creating.rules, will detect the
presence of that file and set ENV{SYSTEMD_READY}="0".
This is fairly uniformly used to suppress actions based on the
contents of the device.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-02 09:41:39 -04:00
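The marker-file naming used by cd6cbb08c4 can be sketched as a small path builder. Only the /run/mdadm/created-mdXXX pattern comes from the commit message; the function name, the rejection of names containing '/', and the return convention are assumptions for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the marker path for an md device name such as "md127".
 * Returns 0 on success, -1 if the name is unsuitable or the buffer
 * is too small to hold the whole path. */
int ex_created_marker_path(const char *devnm, char *buf, size_t len)
{
	int n;

	assert(devnm != NULL && buf != NULL);

	/* A device name is a single path component (assumption). */
	if (devnm[0] == '\0' || strchr(devnm, '/') != NULL)
		return -1;

	n = snprintf(buf, len, "/run/mdadm/created-%s", devnm);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output error or truncation */
	return 0;
}
```

In this scheme the file would be created before the array is started, so that the udev rule can find it when the device appears; when it is removed again is not stated in the commit message above.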
Jes Sorensen 0ef1043ce8 Assemble: Remove obsolete test for kernels older than 2.4
We only support 2.6.15+ at this point

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-04-12 14:50:02 -04:00
Jes Sorensen 94b53b777e Assemble: Clean up start_array()
This is purely cosmetic, no codeflow changes.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-04-12 14:23:45 -04:00
Jes Sorensen 32141c1765 Retire mdassemble
mdassemble doesn't handle container based arrays, no support for sysfs,
etc. It has not been actively maintained for years, so time to send it
off to retirement.

Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-04-11 12:54:26 -04:00
Jes Sorensen b6e60be628 Assemble/Assemble: Get rid of last use of md_get_version()
At this point in the code, we know we have a valid array, and any
recent kernel will return 9003, so no point in querying the kernel for
this.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-04-05 15:47:37 -04:00
Jes Sorensen 6142741d14 Assemble/Assemble: Stop checking kernel md driver version
Any kernel released during the last decade will return 9003 from
md_get_version() so no point in checking that.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-04-05 15:09:18 -04:00
Jes Sorensen dae131379f sysfs: Make sysfs_init() return an error code
Rather than have the caller inspect the returned content, return an
error code from sysfs_init(). In addition make all callers actually
check it.

Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-30 16:52:37 -04:00
Artur Paszkiewicz e6e9dd3f1b Add 'ppl' and 'no-ppl' options for --update=
This can be used with --assemble for super1 and with --update-subarray
for imsm to enable or disable PPL in the metadata.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:35:07 -04:00
Artur Paszkiewicz e97a7cd011 super1: PPL support
Enable creating and assembling raid5 arrays with PPL for 1.x metadata.

When creating, reserve enough space for PPL and store its size and
location in the superblock and set MD_FEATURE_PPL bit. Write an initial
empty header in the PPL area on each device. PPL is stored in the
metadata region reserved for internal write-intent bitmap, so don't
allow using bitmap and PPL together.

While at it, fix two endianness issues in write_empty_r5l_meta_block()
and write_init_super1().

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:33:52 -04:00
Artur Paszkiewicz 2432ce9b32 imsm: PPL support
Enable creating and assembling IMSM raid5 arrays with PPL. Update the
IMSM metadata format to include new fields used for PPL.

Add structures for PPL metadata. They are used also by super1 and shared
with the kernel, so put them in md_p.h.

Write the initial empty PPL header when creating an array. When
assembling an array with PPL, validate the PPL header and, if it is
not correct, allow overwriting it when --force is provided.

Write the PPL location and size for a device to the new rdev sysfs
attributes 'ppl_sector' and 'ppl_size'. Enable PPL in the kernel by
writing to 'consistency_policy' before the array is activated.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
2017-03-29 11:32:49 -04:00
Jes Sorensen c5f71c2417 Introduce random_uuid() helper function
This gets rid of 5 nearly identical copies of the same code, and
reduces the binary size of mdadm by over 700 bytes on x86_64.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-15 15:41:34 -04:00
Jes Sorensen 0c79d8ca10 Assemble: No need for dummy NULL pointer when calling map_update()
assemble_container_content() doesn't need a dummy NULL pointer
variable for calling map_update. Passing NULL directly is sufficient.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:07:36 -04:00
Jes Sorensen 0a8e239c18 Assemble: assemble_container_content(): Avoid superfluous NULL initialization
No need to init avail to NULL since it will only be accessed after
assignment.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:06:28 -04:00
Jes Sorensen 30e19bf805 Assemble: Remove unnecessary NULL pointer checks when calling sysfs_free()
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-08 12:19:03 -05:00
NeilBrown c61b1c0bb5 Release mdadm-3.4
My last release!

Signed-off-by: NeilBrown <neilb@suse.com>
2016-01-28 17:14:56 +11:00
NeilBrown d5ff855d47 super1: allow reshape that hasn't really started to be reverted.
A simple revert doesn't work here because the reshape_position is
in the critical section.
The best approach is to let the reshape progress a bit and then
go backwards.
If that isn't possible, assembling with --update=revert-reshape and
--invalid-backup should work.

Reported-by-tested-by: George Rapp <george.rapp@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2016-01-28 12:57:08 +11:00
Song Liu dbfbca4300 fix bug in assemble
In Assemble, getinfo_super() over-writes journal_clean.  To
ensure correct journal_clean, keep it in a local variable
before getinfo_super().

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-09 07:35:50 +11:00
Song Liu 051f326550 mdadm: refactor write journal code in Assemble and Incremental
As discussed, standalone require_journal() in struct superswitch
is not a very good idea. Instead, journal related information
fits well in struct mdinfo.

This patch simplifies journal support code in Assemble and
Incremental as:

- Add journal_device_required and journal_clean to struct mdinfo;
- Remove function require_journal from struct superswitch;
- Update Assemble and Incremental to use journal_device_required
and journal_clean from struct mdinfo (instead of separate var).

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-22 12:19:09 +11:00
Song Liu 69a481166b Assemble array with write journal
Example output:

./mdadm --assemble /dev/md0 /dev/sd[c-f] /dev/sdb1
mdadm: /dev/md0 has been started with 4 drives and 1 journal.

mdadm checks the superblock for journal devices. If the journal device
is missing or faulty, mdadm shows a warning:

./mdadm --assemble /dev/md0 /dev/sd[c-q] /dev/sdb1
mdadm: Not safe to assemble with missing or stale journal device, consider --force.

The user can insist on starting the array (read-only) with --force:

./mdadm --assemble /dev/md0 /dev/sd[c-q] /dev/sdb1 --force
mdadm: Journal is missing or stale, starting array read only.
mdadm: /dev/md0 has been started with 15 drives.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-19 13:06:15 +11:00
NeilBrown d80f7aa9a1 Assemble: correctly capture error from ->write_bitmap
else 'err' might be undefined.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-05 14:55:31 +10:00
NeilBrown 5997585200 Merge branch 'mdadm-3.3.x'
2015-08-03 16:21:37 +10:00
NeilBrown 8360760457 Assemble: really don't assemble IMSM array without OROM.
The previous patch missed one case.

Also print more useful information when rejecting
a device with IMSM metadata.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-03 16:06:51 +10:00
NeilBrown 7eee461e91 Assemble: don't assemble IMSM array without OROM.
If someone has an IMSM array, and disables RAID in the BIOS
and uses the devices for some other purpose, then they really don't
want mdadm to start syncing the array.

So don't assemble if OROM doesn't confirm it is OK.

There can still be problems for crash-dump not being able to find
the OROM.   Some explicit work-around might be needed for that
rather than a more general workaround that can corrupt data.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-03 15:42:16 +10:00
NeilBrown 9f2e55a421 Assemble: don't assemble IMSM array without OROM.
If someone has an IMSM array, and disables RAID in the BIOS
and uses the devices for some other purpose, then they really don't
want mdadm to start syncing the array.

So don't assemble if OROM doesn't confirm it is OK.

There can still be problems for crash-dump not being able to find
the OROM.   Some explicit work-around might be needed for that
rather than a more general workaround that can corrupt data.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-29 14:38:37 +10:00
NeilBrown 653299b699 Merge branch 'cluster'
Now that 3.3.3 is out, it is time to include the cluster-support code.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-27 11:01:08 +10:00
NeilBrown 86b77ddf87 Assemble: extend --homehost='<ignore>' to allow --name= to ignore homehost
Also make --homehost='<ignore>' work properly.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-24 12:50:54 +10:00
NeilBrown 00f23a8861 Assemble: improve tests for matching --name= request.
If the name in the array has a home-host, then
require that it matches, or is "any", or requested
homehost is "any".

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-22 09:24:36 +10:00
NeilBrown 29a312f2f3 Assemble: really ensure stripe_cache is big enough to handle new chunk size
Earlier patch:
  56fcbcbb6f
calculated the proper chunk size - but didn't use it..

Let's actually use it this time.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-17 13:10:25 +10:00
NeilBrown 56fcbcbb6f Assemble: ensure stripe_cache is big enough to handle new chunk size
If you reshape to a larger chunk size, and need to restart,
it can have problems.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-06-18 15:49:52 +10:00