298 Commits

Author SHA1 Message Date
NeilBrown ca3b669603 Minor cosmetic fixes in various files.
Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-13 08:00:21 +10:00
NeilBrown 0a8b92a6f6 Fix default size calculations that were recently broken.
commit d04f65f48c
    Change the values for "max size" from -1 to 1.

Messed up 's->size' - leaving it as '1' (MAX_SIZE) in some cases and
causing the array reshape to fail.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-13 08:00:18 +10:00
NeilBrown 50f01ba5a1 Use new struct context and struct shape for Grow_addbitmap
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:22:12 +10:00
NeilBrown 32754b7d84 Use new struct context and struct shape in Grow_reshape
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:22:09 +10:00
NeilBrown d04f65f48c Change the values for "max size" from -1 to 1.
Both are impossible, and '1' allows size to be unsigned,
which is neater.
Also #define MAX_SIZE to be '1' to make it all more explicit.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:20:32 +10:00
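The sentinel convention described in this commit could look roughly like the sketch below. Only MAX_SIZE and its value come from the commit message; the helper resolve_size() and its parameters are hypothetical and are not taken from mdadm.

```c
/* Sentinel meaning "use the largest size available".  A real size of 1
 * is treated as impossible, so the value cannot collide with a genuine
 * request, and the size can be held in an unsigned type. */
#define MAX_SIZE 1

/* Hypothetical helper: turn a requested size into the size to apply.
 * 'available' is the largest size the devices allow. */
unsigned long long resolve_size(unsigned long long requested,
				unsigned long long available)
{
	if (requested == MAX_SIZE)
		return available;
	return requested;
}
```

The sentinel has to be replaced before it reaches code that treats the value as a real size; leaving it as '1' is the failure the later fix 0a8b92a6f6 describes.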
NeilBrown ba728be72f Convert 'quiet' to 'not verbose' in various places.
If we change some functions to accept 'verbose', where <0 means to be
quiet, in place of 'quiet', then we will be able to merge
'quiet' and 'verbose' together for simplicity.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:18:09 +10:00
NeilBrown 503975b9d5 Remove scattered checks for malloc success.
malloc should never fail, and if it does it is unlikely
that anything else useful can be done.  Best approach is to
abort and let some super-daemon restart.

So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit.  Then use those
removing all the tests for failure.

Also replace all "malloc;memset" sequences with 'xcalloc'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
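A minimal sketch of what such non-failing wrappers might look like. The function names come from the commit message; the message text, the exit status and the alloc_failed() helper are assumptions, not mdadm's actual code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: report the failure and give up. */
static void alloc_failed(void)
{
	fprintf(stderr, "memory allocation failure - aborting\n");
	exit(1);
}

void *xmalloc(size_t len)
{
	void *p = malloc(len);

	/* malloc(0) may legitimately return NULL, so only a failed
	 * non-zero request is treated as fatal. */
	if (p == NULL && len != 0)
		alloc_failed();
	return p;
}

/* Replaces a "malloc; memset" pair: the memory comes back zeroed. */
void *xcalloc(size_t num, size_t size)
{
	void *p = calloc(num, size);

	if (p == NULL && num != 0 && size != 0)
		alloc_failed();
	return p;
}

void *xrealloc(void *ptr, size_t len)
{
	void *p = realloc(ptr, len);

	if (p == NULL && len != 0)
		alloc_failed();
	return p;
}

char *xstrdup(const char *str)
{
	size_t len = strlen(str) + 1;
	char *p = xmalloc(len);

	memcpy(p, str, len);
	return p;
}
```

With wrappers like these a caller can write `char *name = xstrdup(devname);` and use the result directly, with no NULL test at the call site.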
NeilBrown e7b84f9d50 Introduce pr_err for printing error messages.
'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": '
cont_err() is also available.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
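One way such macros could be written is sketched below. The value of Name, the exact macro bodies and the indentation used by cont_err() are assumptions made for illustration; the commit message only states that pr_err() stands in for 'fprintf(stderr, Name ": ' and that cont_err() exists.

```c
#include <stdio.h>

/* Program-name prefix; the value here is an assumption for the sketch. */
#define Name "mdadm"

/* Variadic macros relying on the GNU '##__VA_ARGS__' extension
 * (accepted by gcc and clang), so they also work with no arguments
 * after the format string. */
#define pr_err(fmt, ...) fprintf(stderr, Name ": " fmt, ##__VA_ARGS__)

/* Assumed behaviour: continue a message on a new line, indented and
 * without repeating the prefix. */
#define cont_err(fmt, ...) fprintf(stderr, "       " fmt, ##__VA_ARGS__)
```

A call such as `pr_err("cannot open %s\n", devname);` then prints the prefixed message to stderr, and because the macros expand to fprintf() they return the number of characters written.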
NeilBrown c456301a05 Grow: don't print message if unfreezing fails.
This is most likely to happen if the array has been stopped,
in which case the error is pointless.

Reported-by: Patrik Horník <patrik@dsl.sk>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-05-15 12:12:58 +10:00
NeilBrown 385167f364 Grow: fix --layout=preserve to match man page.
I think there was some confusion about what --layout=preserve
actually means, but in any case it wasn't doing what the man
page says it should.
So add some case analysis and make sure it does the right thing,
or complains if it cannot.

Reported-by: Patrik Horník <patrik@dsl.sk>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-05-15 11:59:40 +10:00
NeilBrown b0a658ffbc Grow: failing to set the per-device size is not an error.
Signed-off-by: NeilBrown <neilb@suse.de>
2012-05-03 16:18:22 +10:00
Jes Sorensen 012a864129 Introduce sysfs_set_num_signed() and use it to set bitmap/offset
mdinfo->bitmap_offset is a signed long and needs to be treated as
such when passed to the kernel.

This resolves the problem with adding internal bitmaps to a 1.0 array.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-30 09:56:22 +10:00
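The difference between the signed and unsigned paths can be illustrated with the sketch below. Both helper names are hypothetical and only show the formatting step; they are not the signatures of the real sysfs functions.

```c
#include <stdio.h>

/* Hypothetical helpers: format a value the way it would be written to a
 * sysfs attribute.  Return 0 on success, -1 if the buffer is too small. */
int format_num_unsigned(char *buf, size_t len, unsigned long long val)
{
	int n = snprintf(buf, len, "%llu", val);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

int format_num_signed(char *buf, size_t len, long long val)
{
	int n = snprintf(buf, len, "%lld", val);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A negative offset such as -8 is written as "-8" by the signed helper, whereas pushing the same value through the unsigned helper produces a huge positive number that the kernel cannot interpret as the intended offset.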
Lukasz Dorau b51702b827 fix: correct extending size of raid0 array
Setting "sync_action" to "idle" while extending size of raid0 array
is racy and sometimes fails.
"sync_action" should be set to "frozen" instead.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-23 10:12:33 +10:00
Adam Kwolek 58d26a2a81 FIX: Size change is possible as standalone change only
Size change is possible as standalone change only. To make sure size change
is not requested pass '-1' as size parameter.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-17 12:33:38 +10:00
Adam Kwolek 65a9798b58 FIX: Detect error and rollback metadata
Some error cases when setting the size were not detected.
When an error occurs, stop the size-setting action and roll back the metadata
changes.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-17 12:33:38 +10:00
Adam Kwolek 7e7e9a4d72 FIX: Respect metadata size limitations
When reshape_super() updates metadata with the new size, some metadata
limitations can make the saved value differ from the value requested by the user.
Update the size (read it from metadata) before setting it in md.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-17 12:33:37 +10:00
Adam Kwolek 44f6f18113 FIX: Extend size of raid0 array
For raid0, takeover operation is required for size change.
Add takeover to degraded raid4 before size change and back to raid0 after.
Array information has to be read again from md after takeover.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-17 12:33:37 +10:00
Adam Kwolek 016e00f546 FIX: Support metadata changes rollback
Function reshape_super() guards metadata changes.
It is also used to roll changes back in the error case.
As a change (apply and rollback) may not be bi-directional, reshape_super()
has to know whether the current action is a metadata change that should be guarded
using metadata restrictions, or a metadata rollback
executed because an error occurred.

In the second case the change has to be unconditional.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-17 12:33:37 +10:00
Adam Kwolek 54397ed97a imsm: Execute size change for external metadata
For external metadata the ioctl doesn't set the new size. Set the new size using sysfs.
Put the size-change code into a function so that the same code can be reused during
On-line Capacity Expansion.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-17 12:33:37 +10:00
NeilBrown 5ca3a902fd Grow: print useful error when converting RAID1->RAID5 will fail.
RAID1 can only be converted to RAID0 or RAID5 if the size is
a multiple of 4K as we cannot have chunks smaller than 4K.

If this might happen, report a useful error message.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-22 17:00:57 +11:00
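The precondition described here amounts to a simple divisibility test, sketched below. The helper name is hypothetical and the sketch assumes the size is expressed in bytes; the unit used inside mdadm may differ.

```c
#include <stdbool.h>

/* Smallest chunk size the commit message allows: 4K. */
#define MIN_CHUNK_BYTES 4096ULL

/* Hypothetical check: a RAID1 of this size can only become RAID0 or
 * RAID5 if the size divides evenly into chunks of at least 4K.
 * Assumes 'size_bytes' is in bytes. */
bool size_allows_chunks(unsigned long long size_bytes)
{
	return size_bytes != 0 && size_bytes % MIN_CHUNK_BYTES == 0;
}
```

When the check fails, the conversion can be refused up front with a message explaining why, instead of failing later.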
NeilBrown 0073a6e189 Remove possible crash during RAID6 -> RAID5 reshape.
If a RAID6 array is in a state which doesn't have a
RAID5 equivalent, the code currently dereferences a NULL.

If it does have an equivalent - use that.
If it doesn't but is already in the RAID5-compatible layout
with the Q block last, handle that case,
else require the new layout to be explicitly requested.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-22 15:34:17 +11:00
Adam Kwolek 178950eacc FIX: Changes in '0' case for reshape position verification
Reading sysfs entry that is '0' long should cause an error.
Reshape position cannot be empty.

Absence of reshape position should be ignored. It is possible
that this is a raid0 reshape continuation and it is before takeover.
This means that according to the metadata (changed by mdmon) the array should be
reshaped, but md knows nothing about it at this moment. Reshape continuation
in reshape_array() will change it to raid4, and the reshape position will then appear
in sysfs.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-20 14:10:11 +11:00
Adam Kwolek 1ca90aa648 FIX: Do not try to (continue) reshape using inactive array
When one of arrays is inactive, do not try to continue reshape
on this array. Just skip it.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-09 12:38:15 +11:00
Adam Kwolek e1dd332a09 FIX: restart reshape when reshape process is stopped just between 2 reshapes
When reshape is restarted from '0', the very beginning of the array,
it is possible that for external metadata the reshape and array
configuration doesn't happen.
Check whether md has the same opinion and the reshape is restarted
from 0. If so, this is just a regular reshape start after the reshape
switched to the next array in the metadata.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-09 12:37:40 +11:00
Adam Kwolek f93346ef07 FIX: use md position to reshape restart
When reshape is interrupted, it can happen that the metadata is not saved properly.
As a result the reshape process can be farther along in md than the metadata states.

On reshape restart, use the md position as the start position if it is farther than
the position specified in metadata. Treat the opposite situation as an error.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-09 12:36:41 +11:00
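The rule in this commit can be sketched as a small helper. The function name and parameters are hypothetical, and the sketch assumes positions that increase as the reshape progresses.

```c
/* Hypothetical helper.  'md_pos' is the reshape position reported by md,
 * 'meta_pos' the one recorded in metadata.  Returns 0 and stores the
 * position to restart from in *start, or returns -1 when md is behind
 * the metadata, which is treated as an error.
 * Assumes positions grow as the reshape proceeds. */
int reshape_restart_position(unsigned long long md_pos,
			     unsigned long long meta_pos,
			     unsigned long long *start)
{
	if (md_pos < meta_pos)
		return -1;
	*start = md_pos;
	return 0;
}
```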
Adam Kwolek 78340e26a5 Flush mdmon before next reshape step during container operation
When using the takeover operation for grow purposes, mdadm has to be sure
that mdmon processes all updates and, if necessary, is closed
at the takeover to raid0. If mdmon is late, the next array in the container
is processed and, due to a race condition, mdmon closes itself instead of monitoring
the next reshape operation.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-09 12:20:52 +11:00
Adam Kwolek 59ab9f54a0 FIX: Typo error in fprint command
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-01-30 11:36:25 +11:00
Adam Kwolek 3c20f9899b FIX: mdmon check in reshape_container() can cause a problem
When raid0 reshape is executed mdmon can disappear due to the raid level
takeover operation. If this happens before the mdmon check, mdadm would treat
it as an error condition, which is not correct in this case.

Remove mdmon check from reshape_container() function.
Error condition check will remain using reshape_array() reentry test
for the same array (line 2577).

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-01-30 11:36:25 +11:00
Adam Kwolek 5d1c7cdaca FIX: External metadata sometimes is not updated
External metadata sometimes is not updated.
It can be observed during Capacity Expansion of 2 raid0 arrays.
The new array size is not set, because the metadata is not updated and at the end
of the reshape mdadm doesn't read the new array size from metadata.
This happens when mdmon finishes its work (due to takeover to raid0)
before all metadata updates are processed.

Make sure that all updates are flushed to disk before executing takeover.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-01-30 11:36:25 +11:00
NeilBrown c0c1acd691 Grow/bitmap: support adding bitmap via sysfs.
Adding a bitmap via ioctl can only add it at a fixed location.
That location is not suitable for 4K-block devices.

So allow setting the bitmap location via sysfs if kernel supports it
and aim to always use 4K alignments.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 14:10:41 +11:00
NeilBrown 24daa16fa1 Grow.c: fix lots of white-space issues.
Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 06:59:51 +11:00
NeilBrown ce4783d3d6 Grow: fix reshape-array for shrinking reshapes.
The value in info->array.raid_disks is the total number of
devices, which is the 'after' number when the number is increasing,
and the 'before' number when the number is decreasing.

The code currently assumes it is always the 'after' number - so fix
that.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 06:59:48 +11:00
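The relationship described above can be written out as a short sketch. The struct and function are hypothetical; 'delta_disks' stands for the signed change in the number of devices (after minus before) and is an assumption of this example.

```c
/* Hypothetical pair holding the device count on each side of a reshape. */
struct disk_counts {
	int before;
	int after;
};

/* 'raid_disks' is the total number of devices, i.e. the larger of the
 * two counts: the 'after' number when growing and the 'before' number
 * when shrinking. */
struct disk_counts reshape_disk_counts(int raid_disks, int delta_disks)
{
	struct disk_counts c;

	if (delta_disks >= 0) {
		c.after = raid_disks;
		c.before = raid_disks - delta_disks;
	} else {
		c.before = raid_disks;
		c.after = raid_disks + delta_disks;
	}
	return c;
}
```

For example, a total of 6 with a change of +2 gives 4 before and 6 after, while a total of 6 with a change of -2 gives 6 before and 4 after.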
NeilBrown 27a1e5b5a4 Grow: fix start_reshape for shrinking arrays.
When an array is being reshaped to fewer data devices the relationship
between sync_max and reshape_progress is different to when the number
of devices increases - we need to allow for that when setting
sync_max/sync_min.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 06:59:45 +11:00
Adam Kwolek 97a3490c0d FIX: Add error message in container_reshape()
Add proper error message for container reshape when device cannot be opened.
fd variable operation is moved down to display information what particular
device cannot be opened.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-15 14:34:46 +11:00
Adam Kwolek 2d04d7e5c3 FIX: Do not allow for multiple reshape_array() execution during reshape_container() call
It can happen during reshape restart that reshape_array() can exit without
error (e.g. Grow.c:1915) and reshape is not moved to next array.
reshape_array() is called again for the same device.
Do not allow for such execution and check if last reshaped array is not
the current one.
This patch should not be treated as a solution, but it allows such errors
to be detected.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-15 14:34:36 +11:00
Adam Kwolek 4584621ab4 FIX: Do not continue container reshape when mdmon is absent
When mdmon is absent, metadata is not updated and container_reshape()
can fall into an endless loop. This can cause user data corruption.

In case when mdmon is absent do not continue container reshape process.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-07 11:46:35 +11:00
Jes Sorensen 8e61e0d7f9 Grow_reshape(): Fix another 'sra' leak
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Jes Sorensen 730ae51fdd Grow_restart(): free() offsets after use
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Jes Sorensen e7344e9007 Grow_addbitmap(): don't try to close a file descriptor which failed to open
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
Jes Sorensen 68fe8c6ed0 Grow_Add_device(): dev_open() return a negative fd on error
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-02 10:48:53 +11:00
NeilBrown 446894ea8d Grow: fix check_reshape and open_code it.
check_reshape should not try to parse the subarray string - only
metadata handlers are allowed to do that.

The common code can only interpret a subarray string by passing it to
"container_content" which will then return only the member for that
subarray.

So remove check_reshape and place similar logic explicitly at the two
call-sites.  They are different enough that it is probably clearer to
have explicit code.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-01 15:45:46 +11:00
Jes Sorensen 2641101b2f Add missing return in case of trying to grow sub-array
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-01 14:55:14 +11:00
Jes Sorensen d152f53eaa Fix memory leaks in reshape_array()
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-01 13:33:48 +11:00
Labun, Marcin 81219e70f2 kill-subarray: fix, IMSM cannot kill-subarray with unsupported metadata
container_content retrieves volume information from disks in the
container.  For unsupported volumes the function was not returning
mdinfo. When all volumes were unsupported the function was returning
NULL pointer to block actions on the volumes. Therefore, such volumes
were not activated in Incremental and Assembly. As side effect they
also could not be deleted using kill-subarray since "kill" function
requires to obtain a valid mdinfo from container_content.

This patch fixes the kill-subarray problem by allowing to obtain
mdinfo of all volumes types including unsupported and introducing new
array.status flags.

The changes are as follows:

1. Added MD_SB_BLOCK_VOLUME for blocking an array, other arrays in the
   container can be activated.

2. Added MD_SB_BLOCK_CONTAINER_RESHAPE to block container-wide reshapes
   (like changing disk numbers in arrays).

3. IMSM container_content handler is to load mdinfo for all volumes
   and set both blocking flags in array.state field in mdinfo of
   unsupported volumes.  In case of some errors, all volumes can be
   affected. Only blocked array is not activated (also reshaped as
   result). The container wide reshapes are also blocked since by
   metadata definition they require modifications of both arrays.

4. Incremental_container and Assemble functions check array.state and
   do not activate volumes with blocking bits set.

5. assemble_container_content is changed to check container wide reshapes
   before activating reshapes of assembled containers.

6. Grow_reshape and Grow_continue_command check blocking bits
   before starting reshapes or continuing (-G --continue) reshapes.

7. kill-subarray ignores array.state info and can remove requested array.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-31 11:29:46 +11:00
Adam Kwolek 9ad6f6e65a FIX: Close unused handle in child process during reshape restart
When array reshape (e.g. raid0->raid5 migration) is restarted during
array assembly, file system placed on this array cannot be mounted until
reshape is finished due to "busy" error.

This is caused when reshape is executed on array for external metadata
and array handle is cloned /forked/ to child process environment but not
closed.

Handle can't be closed before executing Grow_continue() because it is
used later in code.

Close unused handle in child process /reshape_container()/.
It is similar to close fd handle in reshape_array() before calling
manage_reshape()/child_monitor() in Grow.c:2290.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-27 15:49:51 +11:00
NeilBrown fde139b91e Grow: Only ping monitor on level change if array is container based.
Pinging the monitor for a NULL container is bad.

Reported-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Tested-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-17 16:51:31 +11:00
Adam Kwolek 3bd58dc65f Always run Grow_continue() for started array.
So far there were 2 reshape continuation cases:
 1. array is started /e.g. reshape was already invoked during initrd
                      start-up stage using "--freeze-reshape" option/
 2. array is not started yet /"normal" assembling array under reshape case/

This patch narrows the continuation cases into a single one. To do this,
the array should be started /set readonly in array_state/ before calling
the Grow_continue() function.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-07 09:46:07 +11:00
Adam Kwolek 6937e6d216 Set correct reshape restart position
This patch version is simplified compared to previous one.
There is no use of freeze_reshape flag in start_reshape(). It is assumed
that for reshape starting condition reshape_progress field contains
0 value /correct start position/. For reshape restart case, it contains
correct restart position. This approach doesn't make start_reshape()
difficult to read/manage and /imho/ kernel changes to change mdstat
reporting behavior are not necessary.

Setting correct position allows user to see it in the mdstat during
reshape restart and reshape process is not reported as resync.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-05 14:00:00 +11:00
Adam Kwolek 2370a4dc02 Remove freeze() call from Grow_continue()
Grow_continue() for external metadata should be executed on array(s) or
a container that are already blocked from monitoring.
An additional call to freeze() is not necessary in such a case.
It only produces a meaningless error message.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-05 13:33:29 +11:00
NeilBrown cc7f63e553 restore_backup() throws core dump
restore_backup() throws core dump during releasing fdlist.
Loop for closing handlers checks next_spare variable,
but iterates disk_count.

Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-05 13:29:16 +11:00