Commit Graph

3241 Commits

Author SHA1 Message Date
James Clarke 8e2bca513e Fix bus error when accessing MBR partition records
Since the MBR layout only has partition records as 2-byte aligned, the
32-bit fields in them are not aligned. Thus, they cannot be accessed on
some architectures (such as SPARC) by using a "struct MBR_part_record *"
pointer, as the compiler can assume that the pointer is properly aligned.
Instead, the records must be accessed by going through the MBR struct
itself every time.

Signed-off-by: James Clarke <jrtc27@jrtc27.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-10-19 12:38:02 -04:00
Jes Sorensen 089f9d795e super-intel: Reduce excessive parenthesis abuse
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-10-19 12:31:00 -04:00
Mariusz Dabrowski ddab63c7de Allow level migration only for single-array container
IMSM doesn't allow to change RAID level of array in container with two
arrays but array count check is being done too late (after removing disks)
and in some cases (e. g. RAID 0 and RAID 1 migrated to RAID 0) both arrays
become degraded. This patch adds array count check before disks are being
removed.

Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-10-19 11:26:49 -04:00
Mariusz Dabrowski 2d2b0eb7b9 imsm: block chunk size change for RAID 10
Chunk size change of RAID 10 array fails because it is not supported but
invalid values still are being written to metadata and array cannot be
assembled after stop. Operation should be blocked before metadata update.

Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-10-19 11:22:36 -04:00
Guoqing Jiang 119b66a473 super1: make write_bitmap1 compatible with previous mdadm versions
For older mdadm version, v1.x metadata has different bitmap_offset,
we can't ensure all the bitmaps are on a 4K boundary since writing
4K for bitmap could corrupt the superblock, and Anthony reported
the bug about it at below link.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=837964

So let's check about the alignment for bitmap_offset before set
the boundary to 4096 unconditionally. Thanks for Neil's detailed
explanation.

Reported-by: Anthony DeRobertis <anthony@derobert.net>
Fixes: 95a05b37e8 ("Create n bitmaps for clustered mode")
Cc: Neil Brown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-10-19 11:21:15 -04:00
NeilBrown 681b7ae245 Fix some issues found by clang
The clang compiler complained about each of these.

The mdmon.h error will only affect 'far' RAID10 arrays using intel or DDF
metadata, and there is no such thing.

The mdopen.c will cause a problem if there are no free md device
numbers in the first 512.  That is fairly unlikely.

The restripe.c error would only affect the 'test_stripe' command, and
probably doesn't change its behaviour.

The super-intel.c fix is purely cosmetic.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-10-07 11:47:48 -04:00
Artur Paszkiewicz 21e9380b26 imsm: retrieve nvme serial from sysfs
Don't rely on SCSI ioctl for reading NVMe serials - SCSI emulation for
NVMe devices can be disabled in the kernel config. Instead, try to get a
serial from /sys/block/nvme*/device/serial. If that fails for whatever
reason (i.e. no such attribute in old kernels) - fall back to the SCSI
method.

This also moves some SCSI-specific code from imsm_read_serial() to
scsi_get_serial().

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-10-07 11:18:32 -04:00
Mariusz Dabrowski fa219dd26a Fix RAID metadata check
mdadm recognizes devices with partition table as part of an RAID array
and invalid warning message is displayed. After this fix proper warning
messages are being displayed for MBR/GPT disks and devices with RAID
metadata.

Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-09-22 11:35:02 -04:00
Artur Paszkiewicz 676e87a806 imsm: remove redundant characters from some error messages
Fix the cases that produced messages like "mdadm: : The message".

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-09-16 09:50:50 -04:00
Artur Paszkiewicz 83ca7d4527 imsm: do not activate spares for uninitialized member arrays
This fixes some issues when a member array is created with "missing"
devices in a container that has more devices than used in the member
array.

Reported-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-09-15 12:16:07 -04:00
Song Liu 474267015b mdadm: fix a buffer overflow
struct mdp_superblock_1.set_name is 32B long, but struct mdinfo.name
is 33B long. So we need strncpy instead strcpy to avoid buffer
overflow.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-09-12 12:51:12 -04:00
Robert LeBlanc bd1fd72e13 mdopen: Prevent overrunning the devname buffer when copying devnm into it for long md names.
Linux allows for 32 character device names. When using the maximum
size device name and also storing "/dev/", devname needs to be 37
character long to store the complete device name.
i.e. "/dev/md_abcdefghijklmnopqrstuvwxyz12\0"

Signed-off-by: Robert LeBlanc<robert@leblancnet.us>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-25 13:43:31 -04:00
Jes Sorensen 6e88b3b3e5 bitmap: Mark a number of local functions static
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-15 16:35:28 -04:00
Jes Sorensen 34996a5f89 bitmap: Handle errors when reading bitmap info for cluster nodes
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-15 16:21:33 -04:00
Jes Sorensen 9ca0de6241 bitmap: Simplify code for bitmap_file_open()
By switching to open+fstat rather than stat+open the code can be
simplified and avoid duplicating the open handling.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-15 16:16:05 -04:00
Jes Sorensen 00fab7459a super0: Clean up formatting in examine_super0()
No funcionality change - should be purely cosmetic

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-15 15:56:23 -04:00
Jes Sorensen a8cb6604b6 super0: Fix spelling of 'version' in comment and fix formatting
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-15 15:49:59 -04:00
Jes Sorensen 055b766b1c super0: Use random_uuid() in init_super0()
This shaves another 80 bytes off the mdadm binary.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-15 15:48:56 -04:00
Jes Sorensen c5f71c2417 Introduce random_uuid() helper function
This gets rid of 5 nearly identical copies of the same code, and
reduces the binary size of mdadm by over 700 bytes on x86_64.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-15 15:41:34 -04:00
Jes Sorensen 977d12d739 mdadm.h: Fix build problem against newer glibc
Newer glibc requires direct include of sys/sysmacros.h in order to
access makedev().

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-15 11:30:39 -04:00
Song Liu 690e46c320 mdadm: put journal device in right place of --detail
When there is failed HDDs, journal device showed in wrong place
of --detail:

    Number   Major   Minor   RaidDevice State
       4       8       24        -      journal   /dev/sdb8
       1       8       18        1      active sync   /dev/sdb2
       2       8       19        2      active sync   /dev/sdb3
       3       8       21        3      active sync   /dev/sdb5

       0       8       17        -      faulty   /dev/sdb1

This patch fixed the output as:

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       1       8       18        1      active sync   /dev/sdb2
       2       8       19        2      active sync   /dev/sdb3
       3       8       21        3      active sync   /dev/sdb5

       0       8       17        -      faulty   /dev/sdb1
       4       8       24        -      journal   /dev/sdb8

Reported-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-12 10:58:58 -04:00
Song Liu ff3c881f84 mdadm: add man page for --add-journal
Add the following to man page:

--add-journal
      Recreate journal for RAID-4/5/6 array that lost a journal device.
      In the current implementation, this command cannot add a journal
      to an array that had a failed journal. To avoid interrupting
      on-going write opertion --add-journal only works for array in
      Read-Only state.

Reported-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-12 10:57:13 -04:00
Jes Sorensen ad7ac9ac66 lib: Various coding style cleanups
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-11 16:01:00 -04:00
Jes Sorensen 781f7efbac lib: Avoid if and return on the same line
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-11 15:53:29 -04:00
Jes Sorensen 36138e4e4b sysfs: Avoid if and return on the same line
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-11 15:52:48 -04:00
Jes Sorensen 7eef9be219 super1: Avoid if and return on the same line
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-11 15:52:02 -04:00
Jes Sorensen f1bbb5ff6d restripe: Avoid if and return on the same line
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-11 15:51:00 -04:00
Jes Sorensen 9f0ad56be0 util: Never have if and return on the same line
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-11 15:48:47 -04:00
Jes Sorensen 421c6c047e config: Various stylistic cleanups
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-11 15:48:09 -04:00
Jes Sorensen 6a674388f8 config: Use xcalloc() rather than xmalloc()+memset()
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-11 15:32:34 -04:00
Artur Paszkiewicz c012223056 Incremental: don't try to load_container() for a subarray
mdadm -IRs would exit with a non-zero status because of this.

Reported-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-09 10:57:15 -04:00
Zhilong Liu e19a149c72 mdadm:add 'clustered' in typo prompt when specify wrong param for bitmap
mdadm: 'clustered' bitmap has already supported, thus add the
         prompt if users specify wrong value for bitmap param.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-08-02 10:06:43 -04:00
Tomasz Majchrzak 52209d6ee1 Monitor: release /proc/mdstat fd when no arrays present
If md kernel module is reloaded, /proc/mdstat cannot be accessed ("cat:
/proc/mdstat: No such file or directory"). The reason is mdadm monitor
still holds a file descriptor to previous /proc/mdstat instance. It
leads to really confusing outcome of the following operations - mdadm
seems to run without errors, however some udev rules don't get executed
and new array doesn't work.

Add a check if lseek was successful as it fails if md kernel module has
been unloaded - close a file descriptor then. The problem is mdadm
monitor doesn't always do it before next operation takes place. To
prevent it monitor always releases /proc/mdstat descriptor when there
are no arrays to be monitored, just in case driver unload happens in a
moment.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-07-21 11:37:17 -04:00
Tomasz Majchrzak c922221e25 Remove: container should wait for an array to release a drive
A 'faulty' drive is being removed from a container after it has been
released by an array, however there is a race there. The drive is
released asynchronously by a monitor but sometimes it doesn't happen
before container checks it. It results in a container refusing to remove
a drive as it still seems to be a part of some array.

It seems 'ping_monitor' could be a solution here to assure monitor has
had a chance to process the events, however it doesn't resolve the
problem - sometimes an array has to request a release of the drive few
times (as the array is busy) and single 'ping_monitor' call is not
sufficient. As there is no way to query monitor progress, it forces us
to retry a check several times before an error is returned.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-07-21 11:25:16 -04:00
Alexey Obitotskiy 0febb20c45 imsm: properly handle values of sync_completed
The sync_completed can be set to such values:
- two numbers of processed sectors and total during synchronization,
separated with '/';
- 'none' if synchronization process is stopped;
- 'delayed' if synchronization process is delayed.
Handle value of sync_completed not only as numbers but
also check for 'none' and 'delayed'.

Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com>
Reviewed-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-16 13:58:58 -04:00
Alexey Obitotskiy b2be2b628b imsm: add handling of sync_action is equal to 'idle'
After resync is stopped sync_action value become 'idle'.
We treat this case as normal termination of waiting, not as error.

Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com>
Reviewed-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-16 13:58:06 -04:00
Pawel Baldysiak 955aa6cf75 monitor: Make sure that last_checkpoint is set to 0 after sync
In a case of successful completion of a resync (in the last step)
- read_and_act sometimes still reads sync_action as "resync"
but sync_completed already is set to component_size.
When this race occurs, sync operation is
marked as finished, but last_checkpoint is
overwritten with sync_completed. It will cause next
sync operation (ie. reshape) to be reported as complete immediately
after start - mdmon will write successful completion of the reshape
to metadata. This patch sets last_checkpoint to 0 once the sync
is completed to stop it happening.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-16 13:55:09 -04:00
Xiao Ni 8800f85381 MDADM:Check mdinfo->reshape_active more times before calling Grow_continue
When reshaping a 3 drives raid5 to 4 drives raid5, there is a chance that
it can't start the reshape. If the disks are not enough to have spaces for
relocating the data_offset, it needs to call start_reshape and then run
mdadm --grow --continue by systemd. But mdadm --grow --continue fails
because it checkes that info->reshape_active is 0.

The info->reshape_active is got from the superblock of underlying devices.
Function start_reshape write reshape to /sys/../sync_action. Before writing
latest superblock to underlying devices, mdadm --grow --continue is called.
There is a chance info->reshape_active is 0. We should wait for superblock
updating more time before calling Grow_continue.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-16 13:53:45 -04:00
Nikhil Kshirsagar 6e6e98746d The sys_name array in the mdinfo structure is 20 bytes of storage.
Increasing the size of this array to 32 bytes to handle cases with
longer device names.

Signed-off-by: Nikhil Kshirsagar <nkshirsa@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-14 13:38:19 -04:00
Jes Sorensen 26c62b8e76 Monitor: Use sysfs_free() to free object returned by sysfs_read()
We should always use sysfs_free() to release sysfs_* allocated
objects.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-10 14:56:23 -04:00
Mike Lovell 2e466cce45 Change behavior in find_free_devnm when wrapping around.
Newer kernels don't allow for specifying an array larger than 511. This
makes it so find_free_devnm wraps to 511 instead of 2^20 - 1.

Signed-off-by: Mike Lovell <mlovell@bluehost.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-03 15:36:11 -04:00
Mike Lovell 13db17bd1f Use dev_t for devnm2devid and devid2devnm
Commit 4dd2df0966 added a trip through makedev(), major(), and minor() for
device major and minor numbers. This would cause mdadm to fail in operating
on a device with a minor number bigger than (2^19)-1 due to it changing
from dev_t to a signed int and back.

Where this was found as a problem was when a array was created with a device
specified as a name like /dev/md/raidname and there were already 128 arrays
on the system. In this case, mdadm would chose 1048575 ((2^20)-1) for the
array and minor number. This would cause the major and minor number to become
negative when generated from devnm2devid() and passed to major() and minor()
in open_dev_excl(). open_dev_excl() would then call dev_open() which would
detect the negative minor number and call open() on the *char containing the
major:minor pair which isn't a valid file.

Signed-off-by: Mike Lovell <mlovell@bluehost.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-03 15:35:26 -04:00
Pawel Baldysiak df2647fa5b IMSM: retry reading sync_completed during reshape
The sync_completed after restarting a reshape
(for example - after reboot) is set to "delayed" until
mdmon changes the state. Mdadm does not wait for that change with
old kernels. If this condition occurs - it exits and reshape
is not continuing. This patch adds retry of reading sync_complete
with a delay. It gives time for mdmon to change the "delayed" state.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-19 10:44:21 -04:00
Guoqing Jiang 45a87c2f31 super1: add more checks for NodeNumUpdate option
There are some cases which didn't need to check the space
is enough or not for NodeNumUpdate option.

1. for array which does not have clustered bitmap.
2. "--nodes" parameter is 0 (eg, add a disk to clustered raid).
3. if "--nodes" parameter is set to a smaller num than
   current bms->nodes.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:44:51 -04:00
Jes Sorensen 6ac963cef0 Grow: Apply some more consistent formatting to Grow_addbitmap()
This should be purely cosmetic and cause no functional change
... famous last words!

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:27:24 -04:00
Jes Sorensen 4ed129aca7 Grow: Simplify error paths in Grow_addbitmap()
This gets rid of some repeated exit paths, making the code a little
cleaner.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:27:18 -04:00
Jes Sorensen 2ec2b7e9d5 mdadm: Make add_internal_bitmap() return 0 on success
add_internal_bitmap() returned 1 on success and 0 on error which is
inconsistent. This changes it to return 0 on success and use more
reasonable error codes on error.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:19:16 -04:00
Jes Sorensen c152f3610f Grow: Handle failure to load superblock in Grow_addbitmap()
Reported-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 14:30:10 -04:00
Jes Sorensen dac1b1115f Grow: Grow_addbitmap() reduce indentation
This makes the code a little more readable.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 14:27:11 -04:00
Guoqing Jiang bbc24bb350 super1: make the check for NodeNumUpdate more accurate
We missed to check the version is BITMAP_MAJOR_CLUSTERED
or not, otherwise mdadm can't create array with other 1.x
metadatas (1.0 and 1.1).

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-09 14:59:59 -04:00