Commit Graph

3208 Commits

Author SHA1 Message Date
Tomasz Majchrzak c922221e25 Remove: container should wait for an array to release a drive
A 'faulty' drive is being removed from a container after it has been
released by an array, however there is a race there. The drive is
released asynchronously by a monitor but sometimes it doesn't happen
before container checks it. It results in a container refusing to remove
a drive as it still seems to be a part of some array.

It seems 'ping_monitor' could be a solution here to assure monitor has
had a chance to process the events, however it doesn't resolve the
problem - sometimes an array has to request a release of the drive few
times (as the array is busy) and single 'ping_monitor' call is not
sufficient. As there is no way to query monitor progress, it forces us
to retry a check several times before an error is returned.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-07-21 11:25:16 -04:00
Alexey Obitotskiy 0febb20c45 imsm: properly handle values of sync_completed
The sync_completed can be set to such values:
- two numbers of processed sectors and total during synchronization,
separated with '/';
- 'none' if synchronization process is stopped;
- 'delayed' if synchronization process is delayed.
Handle value of sync_completed not only as numbers but
also check for 'none' and 'delayed'.

Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com>
Reviewed-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-16 13:58:58 -04:00
Alexey Obitotskiy b2be2b628b imsm: add handling of sync_action is equal to 'idle'
After resync is stopped sync_action value become 'idle'.
We treat this case as normal termination of waiting, not as error.

Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com>
Reviewed-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-16 13:58:06 -04:00
Pawel Baldysiak 955aa6cf75 monitor: Make sure that last_checkpoint is set to 0 after sync
In a case of successful completion of a resync (in the last step)
- read_and_act sometimes still reads sync_action as "resync"
but sync_completed already is set to component_size.
When this race occurs, sync operation is
marked as finished, but last_checkpoint is
overwritten with sync_completed. It will cause next
sync operation (ie. reshape) to be reported as complete immediately
after start - mdmon will write successful completion of the reshape
to metadata. This patch sets last_checkpoint to 0 once the sync
is completed to stop it happening.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-16 13:55:09 -04:00
Xiao Ni 8800f85381 MDADM:Check mdinfo->reshape_active more times before calling Grow_continue
When reshaping a 3 drives raid5 to 4 drives raid5, there is a chance that
it can't start the reshape. If the disks are not enough to have spaces for
relocating the data_offset, it needs to call start_reshape and then run
mdadm --grow --continue by systemd. But mdadm --grow --continue fails
because it checkes that info->reshape_active is 0.

The info->reshape_active is got from the superblock of underlying devices.
Function start_reshape write reshape to /sys/../sync_action. Before writing
latest superblock to underlying devices, mdadm --grow --continue is called.
There is a chance info->reshape_active is 0. We should wait for superblock
updating more time before calling Grow_continue.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-16 13:53:45 -04:00
Nikhil Kshirsagar 6e6e98746d The sys_name array in the mdinfo structure is 20 bytes of storage.
Increasing the size of this array to 32 bytes to handle cases with
longer device names.

Signed-off-by: Nikhil Kshirsagar <nkshirsa@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-14 13:38:19 -04:00
Jes Sorensen 26c62b8e76 Monitor: Use sysfs_free() to free object returned by sysfs_read()
We should always use sysfs_free() to release sysfs_* allocated
objects.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-10 14:56:23 -04:00
Mike Lovell 2e466cce45 Change behavior in find_free_devnm when wrapping around.
Newer kernels don't allow for specifying an array larger than 511. This
makes it so find_free_devnm wraps to 511 instead of 2^20 - 1.

Signed-off-by: Mike Lovell <mlovell@bluehost.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-03 15:36:11 -04:00
Mike Lovell 13db17bd1f Use dev_t for devnm2devid and devid2devnm
Commit 4dd2df0966 added a trip through makedev(), major(), and minor() for
device major and minor numbers. This would cause mdadm to fail in operating
on a device with a minor number bigger than (2^19)-1 due to it changing
from dev_t to a signed int and back.

Where this was found as a problem was when a array was created with a device
specified as a name like /dev/md/raidname and there were already 128 arrays
on the system. In this case, mdadm would chose 1048575 ((2^20)-1) for the
array and minor number. This would cause the major and minor number to become
negative when generated from devnm2devid() and passed to major() and minor()
in open_dev_excl(). open_dev_excl() would then call dev_open() which would
detect the negative minor number and call open() on the *char containing the
major:minor pair which isn't a valid file.

Signed-off-by: Mike Lovell <mlovell@bluehost.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-06-03 15:35:26 -04:00
Pawel Baldysiak df2647fa5b IMSM: retry reading sync_completed during reshape
The sync_completed after restarting a reshape
(for example - after reboot) is set to "delayed" until
mdmon changes the state. Mdadm does not wait for that change with
old kernels. If this condition occurs - it exits and reshape
is not continuing. This patch adds retry of reading sync_complete
with a delay. It gives time for mdmon to change the "delayed" state.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-19 10:44:21 -04:00
Guoqing Jiang 45a87c2f31 super1: add more checks for NodeNumUpdate option
There are some cases which didn't need to check the space
is enough or not for NodeNumUpdate option.

1. for array which does not have clustered bitmap.
2. "--nodes" parameter is 0 (eg, add a disk to clustered raid).
3. if "--nodes" parameter is set to a smaller num than
   current bms->nodes.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:44:51 -04:00
Jes Sorensen 6ac963cef0 Grow: Apply some more consistent formatting to Grow_addbitmap()
This should be purely cosmetic and cause no functional change
... famous last words!

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:27:24 -04:00
Jes Sorensen 4ed129aca7 Grow: Simplify error paths in Grow_addbitmap()
This gets rid of some repeated exit paths, making the code a little
cleaner.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:27:18 -04:00
Jes Sorensen 2ec2b7e9d5 mdadm: Make add_internal_bitmap() return 0 on success
add_internal_bitmap() returned 1 on success and 0 on error which is
inconsistent. This changes it to return 0 on success and use more
reasonable error codes on error.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 15:19:16 -04:00
Jes Sorensen c152f3610f Grow: Handle failure to load superblock in Grow_addbitmap()
Reported-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 14:30:10 -04:00
Jes Sorensen dac1b1115f Grow: Grow_addbitmap() reduce indentation
This makes the code a little more readable.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-12 14:27:11 -04:00
Guoqing Jiang bbc24bb350 super1: make the check for NodeNumUpdate more accurate
We missed to check the version is BITMAP_MAJOR_CLUSTERED
or not, otherwise mdadm can't create array with other 1.x
metadatas (1.0 and 1.1).

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-09 14:59:59 -04:00
Guoqing Jiang 261b57fe21 super1: don't update node nums if it is not more than 1
We at least need two nodes for cluster raid so make the
check before update node nums.

Reported-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-09 14:59:05 -04:00
Guoqing Jiang 82d9485e06 Create: check the node nums when create clustered raid
It doesn't make sense to create a clustered raid
with only 1 node.

Reported-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-05-09 14:59:01 -04:00
Jes Sorensen 1dcee1c9cb super1: Clear memory allocated for superblock + bitmap before use
load_super1() did not clear memory allocated for the superblock +
bitmap. This causes issues if the superblock does not contain a bitmap
as later checks of bitmap features would rely on the bits being
cleared.

This bug has been around for a long time, but was only exposed in
mdadm-3.4 with the introduction of the clustering code.

Reported-by: Jan Stodola <jstodola@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-04-06 16:13:59 -04:00
Marko Hauptvogel 95b55f1875 Consistent use of metric prefix in manpage
Added the optional K suffix for completeness, as it
is allowed by util.c's parse_size(char*).

Signed-off-by: Marko Hauptvogel <marko.hauptvogel@googlemail.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-04-01 16:12:12 -04:00
Artur Paszkiewicz f96b130224 Introduce stat2kname() and fd2kname()
These are similar to stat2devnm() and fd2devnm() but not limited to md
devices. If the device is a partition they will return its kernel name,
not the whole device's name. For more information see commit:
8d83493 ("Introduce devid2kname - slightly different to devid2devnm.")

Also remove unsued declaration for fmt_devname().

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-30 11:23:52 -04:00
zhilong 6447a12705 mdadm:Add '--nodes' option in GROW mode
mdadm:add '--nodes' option in GROW mode, because
'Cluster nodes' is set 4 by default if the nodes
parameter is not specified when switch bitmap
from none to clustered.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-29 11:41:26 -04:00
Guoqing Jiang 81306e021e Change the option from NoUpdate to NodeNumUpdate
Actually, we need to use NodeNumUpdate here to
ensure there are enough spaces for those nodes.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-24 12:33:27 -04:00
Jes Sorensen 0c79d8ca10 Assemble: No need for dummy NULL pointer when calling map_update()
assemble_container_content() doesn't need a dummy NULL pointer
variable for calling map_update. Passing NULL directly is sufficient.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:07:36 -04:00
Jes Sorensen 0a8e239c18 Assemble: assemble_container_content(): Avoid superfluous NULL initialization
No need to init avail to NULL since it will only be accessed after
assignment.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:06:28 -04:00
Jes Sorensen e9ddbb2be9 Manage: Manage_subdevs(): Remove unnecessary NULL initialization
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:06:18 -04:00
Jes Sorensen fbd3e15c0a Manage: Manage_add(): Avoid NULL initialization of dev_st
dev_st is only ever assigned if array->not_persistent == 0, so move
the second use of it into the same scope where the assignment is
made.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:06:07 -04:00
Jes Sorensen 9d1fbf65a5 mdadm: Cleanup conditionals
Be more consistent in the formatting of conditionals. Don't split on
multiple lines if not needed, don't overflow the 80 character line
length, put the condition operator at the end of the line of
multi-line conditionals, etc.

This should be purely cosmetic.... famous last words!

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:05:47 -04:00
Jes Sorensen 79a16a9b35 super_intel: imsm_manage_reshape(): Fix potential NULL pointer dereference
If sra == NULL we cannot goto abort, as it would result in calls to
sysfs_set_num() which would dereference sra.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:05:27 -04:00
Jes Sorensen 594dc1b8f0 super-intel: Remove excessive NULL/0 variable initialization
This removes a pile of unnecessary NULL/0 initialization of variables.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:04:59 -04:00
Jes Sorensen d209181d96 Manage: Manage_add(): Fix memory leak
sysfs_read() allocates and populates a struct mdinfo, however the code
forgot to free it again, before dropping the reference to the pointer.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 14:03:12 -04:00
Guoqing Jiang 31dbeda730 Grow: goto release if Manage_subdevs failed
If failure happened when add disk to array
by grow mode, need to goto release instead
of continue the reshape.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-22 13:53:10 -04:00
Yi Zhang a58e0da443 Grow: analyse_change add notification about only 2-device can be convert from RAID1 to RAID5
Notify "Can only convert a 2-device array to RAID5" instead of
"Impossibly level change request for RAID1" when convert from
RAID1 to RAID5 if the disk num is not equal two like RAID4/5->RAID1
did.

Signed-off-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-11 12:40:47 -05:00
Pawel Baldysiak 1a6dd6b9c1 super-intel: Simplify for() loop in ahci_enumerate_ports
This patch simplifies for() loop used in
ahci_enumerate_ports(). It makes it more readable.
Similar thing was done in b913501
({platform,super}-intel: Fix two resource leaks).

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-11 12:33:52 -05:00
Pawel Baldysiak b5eece6925 super-intel: Make print_vmd_attached_devs() return int again
This patch reverts a0abe1e
(super-intel: Make print_found_intel_controllers() return void)
and make this function "return int" again.
Also, interpreting the return value is added.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-11 12:33:46 -05:00
Pawel Baldysiak ad2f464602 Grow: close fd earlier to avoid "cannot get excl access" when stopping
If this file descriptor is not closed here, it remains open during
reshape process and stopping process will end up with
"cannot get exclusive access to container".
Once this file descriptor is no longer needed - it can be closed.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-11 12:32:31 -05:00
Hannes Reinecke d31d0f5218 Fix regression during add devices
Commit d180d2aa2a ("Manage: fix test for 'is array failed'.")
introduced a regression which would not allow to re-add new
drivers to a failed array.

Fixes: d180d2aa2a ("Manage: fix test for 'is array failed'.")
Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: Coly Li <colyli@suse.de>
Cc: Neil Brown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-10 11:44:21 -05:00
NeilBrown eddaacc304 ddf: use 64bit 'size', not 32bit 'info->size' for create.
The 'size' field of mdu_disk_info_t is 32bit and should not be used
except for legacy ioctls.  super-ddf got this wrong :-(

This change makes it possible to create ddf arrays which used more than
2TB of each device.

Reported-by: Dan Russell <dpr@aol.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-10 11:42:41 -05:00
Jes Sorensen a0abe1e667 super-intel: Make print_found_intel_controllers() return void
The return value from print_found_intel_controllers() is never used,
so lets make it return void.

Reported-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen 4b3eb4d2c5 super1: Fix potential buffer overflows when copying cluster_name
cmap_get_string() used to retrieve cluster_name does not restrict it's
size. To prevent buffer overflows use the size of the destination
buffer, not strlen() of the source, and null terminate the copied
string.

Fixes: 0aa2f15b ("mdadm: add the ability to change cluster name)"
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen cc5083d114 Manage: Manage_subdevs() fix file descriptor leak
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen de12cdc7eb bitmap: Fix resource leak in bitmap_file_open()
The code would leak 'fd' if locate_bitmap() failed.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen b913501173 {platform,super}-intel: Fix two resource leaks
The code did not free 'dir' allocated by opendir(). An additional
benefit is that this simplifies the for() loops.

Fixes: 60f0f54d ("IMSM: Add support for VMD")
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen efdfcc9e95 Grow: Grow_addbitmap(): Add check to quiet down static code checkers
Grow_addbitmap() is only ever called with s->bitmap_file != NULL, but
not all static code checkers catch this. This adds a check to quiet
down the false positive warnings.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen 12add44564 Grow: Grow_continue_command() remove dead code
All cases where fd2 is used are completed with a close(fd2), so there
is no need to set fd2 = -1 or check for it before exiting.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen 193b6c0b26 load_sys(): Add a buffer size argument
This adds a buffer size argument to load_sys(), rather than relying on
a hard coded buffer size. The old behavior was safe because we knew
the kernel would never return strings overrunning the buffers, however
it was ugly, and would cause code checking tools to spit out warnings.

This caused a Coverity warning over the read into
sra->sysfs_array_state which is only 20 bytes.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-09 11:35:34 -05:00
Jes Sorensen 2a1990c0f4 Manage: Manage_add(): Fix potential NULL pointer dereference
sysfs_read() may return NULL, so we should check the validity of the
pointer before dereferencing it.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-08 12:22:26 -05:00
Jes Sorensen 30e19bf805 Assemble: Remove unnecesary NULL pointer checks when calling sysfs_free()
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-08 12:19:03 -05:00
Jes Sorensen fe112c9eba Incremental: Remove unnecesary NULL pointer checks when calling sysfs_free()
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-03-08 12:19:03 -05:00