110 Commits

Author SHA1 Message Date
Adam Kwolek e6e9d47b76 Grow: open backup file for reshape as function
Move opening backup file to the function for future reuse during
container reshape.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-03 15:00:16 +11:00
NeilBrown acab7bb189 Create/grow: improve checks on number of devices.
The check on the upper limit of the number of devices was in the wrong place.
As a result, an array with more than 27 devices could not be created without
explicitly setting the metadata, even though the default metadata allows more.

Fixed, and also perform check when growing an array.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-01 14:51:27 +11:00
NeilBrown c82afc17a8 Grow: disallow placing backup file on array being reshaped.
The tests here aren't perfect, but they could catch some cases.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-01 11:58:32 +11:00
NeilBrown 87f26d14f7 Assemble: allow an array undergoing reshape to be started without backup file
Though not having the proper backup file can cause data corruption, it
is not enough to justify not being able to start the array at all.
So allow "--invalid-backup" to be specified which says "just continue
even if a backup cannot be restored".

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-01 11:47:32 +11:00
NeilBrown ff63406404 Grow: give useful message when adding bitmap gives EBUSY.
If adding a bitmap fails with EBUSY, then it is because the array is
currently resyncing/recovering/reshaping.
As this is non-obvious, give a message explaining the fact.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-30 16:34:25 +11:00
NeilBrown b3bd581b1d Fix warning about host-endian bitmaps.
Hostendian bitmaps should be warned about on all arch's.
And fix a speeling mistake.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-30 16:25:26 +11:00
Adam Kwolek 1c009fc218 Compute backup blocks in function.
The evaluation of the number of backup blocks is moved into a function for code reuse.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-30 13:30:22 +11:00
Adam Kwolek 130994cb83 Prepare and free fdlist in functions
Creation of the fd handles table is moved into a function for code reuse.

In manage_reshape(), child_grow() function from Grow.c will be reused.
To prepare parameters for this function, code from Grow.c can be
reused also.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-30 13:27:08 +11:00
Adam Kwolek 62a48395f6 Disk removal support for Raid10->Raid0 takeover
Until now Raid10->Raid0 takeover was possible only if all the mirrors
were removed before md starts the takeover.  Now mdadm, when
performing Raid10->raid0 takeover, will remove all unwanted mirrors
from the array before actual md takeover is called.

Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-29 11:57:51 +11:00
Dan Williams 7f2ba464e4 External reshape (step 2): Freeze container
When growing the number of raid disks the reshape process will promote
container-spares to subarray-spares (later the kernel promotes them to
subarray-members in raid5_start_reshape()).  The automatic spare
promotion that mdmon performs upon seeing a degraded array must be
disabled until the reshape process has been initiated.  Otherwise, mdmon
may start a rebuild before the reshape parameters can be specified.

In the external case we arrange for the monitor to be blocked, and
turn off the safemode delay.
Mdmon is updated to check sync_action is not frozen before initiating
recovery.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-23 16:39:58 +11:00
Dan Williams 7bc7119671 External reshape (step 1): container reshape and ->reshape_super()
In the native metadata case Grow_reshape() and the kernel validate what
reshapes are possible / supported and the kernel handles all the metadata
updates.  In the external case the metadata format may have specific
constraints above this baseline.  External formats also introduce the
constraint of only permitting some reshapes at container scope versus subarray
scope.  For example, imsm changes to 'raiddisks' must be applied to all arrays
in the container.

This operation assumes that its 'st' parameter has been obtained from
super_by_fd() (such that st->subarray is up to date), and that a snapshot of
the metadata has been loaded from the container.

Why a new method, versus extending an existing one?
->validate_geometry: this routine assumes it is being called from Create(),
adding reshape complicates the cases that this routine needs to handle.  Where
we find that checks can be shared between the two cases, those routines are
refactored into common code internal to the metadata handler, i.e. no need to
provide a unified external interface.  ->validate_geometry() also does not
expect to update the metadata.

->update_super: this is meant to update single fields at Assembly() and only at
the container scope.  Reshape potentially wants to update multiple fields at
either container or subarray scope.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-23 16:09:27 +11:00
Dan Williams 72e4a37822 Grow: fix check for raid6 layout normalization
If the user does not specify a layout, don't skip asking about retaining
the non-standard raid6 layout which may be implicitly changed.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-23 15:11:37 +11:00
Dan Williams 4411fb1749 Grow: mark some functions static
Going through the Grow api found some local routines that could be
marked static.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-23 15:08:42 +11:00
NeilBrown 4725bc31fb super_by_fd: return subarray info explicitly.
Rather than hiding this in the 'st', return it explicitly.

In the one case we still need it, copy it into st where needed.
This will disappear in a future patch.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:25 +11:00
NeilBrown a5d85af748 get_info_super: report which other devices are thought to be working/failed.
To accurately detect when an array has been split and is now being
recombined, we need to track which other devices each thinks is
working.

We should never include a device in an array if it thinks that the
primary device has failed.

This patch just allows get_info_super to return a list of devices
and whether they are thought to be working or not.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:25 +11:00
NeilBrown 925211e323 Grow: use raid_disks, not nr_disks
nr_disks is just wrong here - the arrays need room for all disk slots,
even if some are empty, plus spares, plus a possible backup file.
So raid_disks is correct.
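
A minimal sketch of the sizing rule described above, assuming a table of file descriptors with one entry per slot. The names `fdlist_slots` and `alloc_fdlist` are hypothetical and are not taken from Grow.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: number of entries the fd table needs.
 * One per raid slot (even if the slot is empty), one per spare,
 * plus one for a possible backup file. */
int fdlist_slots(int raid_disks, int spares)
{
    assert(raid_disks >= 0 && spares >= 0);
    return raid_disks + spares + 1;
}

/* Hypothetical helper: allocate the table with every entry marked
 * "not open" (-1). Returns NULL if allocation fails. */
int *alloc_fdlist(int raid_disks, int spares)
{
    int n = fdlist_slots(raid_disks, spares);
    int *fdlist = malloc((size_t)n * sizeof(*fdlist));

    if (fdlist == NULL)
        return NULL;
    for (int i = 0; i < n; i++)
        fdlist[i] = -1;
    return fdlist;
}
```

Sizing by the number of currently present disks instead would leave too few entries whenever a slot is empty.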

Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-06 14:40:53 +10:00
NeilBrown 7204495377 Fix writing of second backup superblock during grow
The 'rv' tests were confused and sometimes wrong.
This resulted in not writing the second bsb.

Also fix the test script so that the critical section is long enough
that we have some hope of interrupting it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-05 21:39:17 +10:00
NeilBrown f21e18ca89 Compile with -Wextra by default
This produced lots of warnings, some of which pointed to actual bugs.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-05 13:13:02 +10:00
NeilBrown 5f6ca90a9b Fix restarting of reshaping arrays.
We cannot get stripe_cache_size until after the array has
been activated!!

Signed-off-by: NeilBrown <neilb@suse.de>
2010-07-29 13:50:15 +10:00
NeilBrown b7e734fc22 Fix use of rv in Grow_reshape
1/ an extra local var was declared, which caused the rv setting
   to be lost
2/ a -ve rv was left -ve while we should return 1 on error.
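
The two problems can be illustrated with a small sketch. It is not the code from Grow_reshape; `do_step`, `reshape_buggy` and `reshape_fixed` are hypothetical names used only to show the pattern.

```c
#include <assert.h>

/* Hypothetical stand-in for a step that fails with a negative code. */
int do_step(int should_fail)
{
    return should_fail ? -22 : 0;
}

/* Buggy shape: the inner 'rv' shadows the outer one, so the failure
 * recorded inside the block never reaches the return statement. */
int reshape_buggy(int should_fail)
{
    int rv = 0;
    {
        int rv = do_step(should_fail); /* shadows the outer rv */
        (void)rv;
    }
    return rv; /* always 0, even when the step failed */
}

/* Fixed shape: a single rv, and a negative value is normalised to 1
 * so that callers only ever see 0 (success) or 1 (error). */
int reshape_fixed(int should_fail)
{
    int rv = do_step(should_fail);

    if (rv < 0)
        rv = 1;
    assert(rv == 0 || rv == 1);
    return rv;
}
```

Building with `-Wshadow` is one way to have the compiler point out the first problem.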

Signed-off-by: NeilBrown <neilb@suse.de>
2010-07-29 13:16:01 +10:00
Doug Ledford 0155af90d8 Bugfix: don't issue a read larger than the buffer to hold it
Signed-off-by: Doug Ledford <dledford@redhat.com>
2010-07-22 10:16:31 -04:00
Dan Williams b526e52dc7 Always assume SKIP_GONE_DEVS behaviour and kill the flag
...i.e. GET_DEVS == (GET_DEVS|SKIP_GONE_DEVS)

A null pointer dereference in Incremental.c can be triggered by
replugging a disk while the old name is in use.  When mdadm -I is called
on the new disk we fail the call to sysfs_read().  I audited all the
locations that use GET_DEVS and it appears they can tolerate missing a
drive.  So just make SKIP_GONE_DEVS the default behaviour.

Also fix up remaining unchecked usages of the sysfs_read() return value.

Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-16 17:26:04 -07:00
NeilBrown c03ef02d92 Grow: move error message closer to error cause.
A recent change moved the sysfs_read call away from the check that it
succeeded.  This patch moves the check back next to the sysfs_read
call.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-05-18 12:29:28 +10:00
NeilBrown 200871adf9 Grow: avoid overflow of chunk sizes.
Chunks aren't particularly big, but when you count them in bytes
and multiply them together (as we do for calculating the backup
size for 'grow') they can overflow a 32bit int.

So group the division by 512 more closely with the
chunk size so we would need 30Meg chunks to come close to
overflowing 32bits.
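
A minimal sketch of the idea, assuming two chunk sizes given in bytes whose product is wanted in 512-byte sectors. The function names are hypothetical, and `uint32_t` is used so that the wraparound in the first variant is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Overflow-prone: the byte counts are multiplied first, so the
 * intermediate product can wrap around in 32 bits before the
 * divisions bring it back down. */
uint32_t product_sectors_late_div(uint32_t a_bytes, uint32_t b_bytes)
{
    return a_bytes * b_bytes / 512 / 512;
}

/* Safer: convert each chunk size to sectors first, then multiply.
 * The intermediate value is 512*512 times smaller. */
uint32_t product_sectors_early_div(uint32_t a_bytes, uint32_t b_bytes)
{
    assert(a_bytes % 512 == 0 && b_bytes % 512 == 0);
    return (a_bytes / 512) * (b_bytes / 512);
}
```

With two 64KiB chunks the byte product is exactly 2^32, which wraps to 0 in the first variant, while the second yields 128 * 128 sectors.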

Signed-off-by: NeilBrown <neilb@suse.de>
2010-04-29 16:14:30 +10:00
NeilBrown ebeb366382 Don't attempt to create or read bitmaps where the metadata doesn't support it.
In particular, if the relevant bitmap method is NULL, don't try to
call it, print an error instead.
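
The guard can be sketched as follows, assuming a metadata handler is a struct of function pointers in which optional methods may be NULL. `struct handler`, `try_add_bitmap` and `demo_add_bitmap` are hypothetical and much simpler than mdadm's real handler structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, cut-down metadata handler: add_bitmap is optional
 * and is NULL for formats without bitmap support. */
struct handler {
    const char *name;
    int (*add_bitmap)(int chunk_kib);
};

/* Example implementation for a format that supports bitmaps. */
int demo_add_bitmap(int chunk_kib)
{
    return chunk_kib > 0 ? 0 : 1;
}

/* Call the method only if the format provides it; otherwise report
 * an error instead of dereferencing a NULL function pointer. */
int try_add_bitmap(const struct handler *h, int chunk_kib)
{
    assert(h != NULL);
    if (h->add_bitmap == NULL) {
        fprintf(stderr, "%s metadata does not support bitmaps\n", h->name);
        return 1;
    }
    return h->add_bitmap(chunk_kib);
}
```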

Signed-off-by: NeilBrown <neilb@suse.de>
2010-04-07 09:18:01 +10:00
NeilBrown a847575aa1 Grow: fix recent breakage - lseek return status.
Recent fix to check lseek64 return status got it badly wrong.
It doesn't return 0 on success!!

Fix it.
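
For illustration, a sketch of a correct check, using plain `lseek` with a 64-bit `off_t` rather than `lseek64`. The helpers `seek_to` and `demo_seek` are hypothetical. On success `lseek` returns the resulting offset, and on failure it returns `(off_t)-1`, so testing the result against 0 is only right when seeking to offset 0.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if fd is now positioned at 'offset', -1 otherwise. */
int seek_to(int fd, off_t offset)
{
    assert(offset >= 0);
    /* Success means "returned the offset we asked for", not "returned 0". */
    if (lseek(fd, offset, SEEK_SET) != offset)
        return -1;
    return 0;
}

/* Usage sketch: seek within an anonymous temporary file. */
int demo_seek(off_t offset)
{
    FILE *f = tmpfile();
    int rv;

    if (f == NULL)
        return -1;
    rv = seek_to(fileno(f), offset);
    fclose(f);
    return rv;
}
```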

Signed-off-by: NeilBrown <neilb@suse.de>
2010-03-10 15:21:28 +11:00
NeilBrown be1cabbd29 Grow: fix problem with validating chunk size
When checking if the new chunk size fits in the component size
we were confusing sectors and K, and so getting it wrong.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-03-09 14:14:39 +11:00
NeilBrown fcf5762500 Add _FORTIFY_SOURCE to mdadm.O2 build.
When building mdadm.O2, set _FORTIFY_SOURCE to get more
warnings, and also build mdmon.O2 to find warnings in that
code too.
Then fix the warnings.

Suggested-by: Luca Berra <bluca@comedia.it>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-03-03 10:54:17 +11:00
NeilBrown 53f5035339 Fix warning about unused variable.
Warning only appears with -O2, but is invalid.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-03-03 10:29:24 +11:00
NeilBrown 097075b611 Grow: be more relaxed about timestamp mismatches on backup file.
The backup file has a timestamp which is updated quite separately
from the metadata timestamp.  They should be largely in-sync but
sometimes are not.
So be more generous in the check, and allow it to be over-ridden
by an environment variable.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-24 11:59:11 +11:00
NeilBrown 39bbb39202 Grow: If bitmap interferes with grow, report this.
If a bitmap exists on an array, then current kernels cannot grow
that array.
So when we try to grow an array, test for EBUSY and if a bitmap is
present, report that the bitmap needs to be removed.

Resolves-Debian-Bug: 534571
Signed-off-by: NeilBrown <neilb@suse.de>
2010-01-28 11:48:03 +11:00
NeilBrown 080fd00521 Remove stray debugging printfs
These were never supposed to be released, and due
to a type issue they cause compile problems on
some architectures.

Resolves-Debian-Bug: 567167
Signed-off-by: NeilBrown <neilb@suse.de>
2010-01-28 08:55:18 +11:00
NeilBrown f98841b385 Grow: be more careful when using array.size
As array.size is 32bit we need to prefer the 'component_size'
read from sysfs when that is available.
Grow wasn't always suitably careful.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-26 16:28:35 +11:00
NeilBrown 2ed4f75388 Grow: avoid truncation error when checking size of array.
array.size is only 32bit so it is not safe to multiply it
up before casting to (long long).
Actually, we shouldn't be using array.size here at all, but that
will get fixed in a subsequent patch.
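
The truncation can be sketched as below, assuming a 32-bit size counted in KiB that is converted to bytes. The function names are hypothetical, and `uint32_t` is used so that the wraparound in the wrong variant is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Wrong shape: the multiplication is done in 32 bits and wraps;
 * the widening cast is applied to the already truncated result. */
unsigned long long size_bytes_cast_late(uint32_t size_kib)
{
    return (unsigned long long)(size_kib * 1024u);
}

/* Right shape: widen first, then multiply in 64 bits. */
unsigned long long size_bytes_cast_first(uint32_t size_kib)
{
    return (unsigned long long)size_kib * 1024u;
}
```

For a 4GiB size (4194304 KiB) the first variant wraps to 0, while the second returns 4294967296.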

Reported-by: Andrew Burgess <aab@cichlid.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-26 14:19:26 +11:00
NeilBrown ff94fb86fd Grow: various fixes to recent breakages.
- I forgot to write the second backup-super-block on spares.
- I wasn't adding the data_offset to an offset

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-17 13:15:33 +11:00
NeilBrown 14e5b4d72b Grow: data_offset is in sectors, offsets[] is in bytes - convert
Another missed sectors->bytes conversion.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-16 11:06:44 +11:00
NeilBrown 9ce510be9c Grow: do not allow size changes with other changes.
A change that reduces the size of an array always happens
before any other change.  So it can cause data to be lost.
By themselves these changes are reversible.  But once another
change has started, the data would be permanently lost.
So recommend data integrity be checked between a size change
and any other change.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-06 17:26:47 +11:00
NeilBrown b5ea446ae7 Grow: goto release rather than just return
otherwise we exit with the array frozen.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-06 15:22:14 +11:00
NeilBrown d2505cff5a Grow: restrict to 2.6.32
2.6.31 has a bug which can lead to unsafe reshaping.
So only allow a reshape with 2.6.32.
When the required fixes get into 2.6.31.y, this can be relaxed
slightly.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-06 15:19:39 +11:00
NeilBrown 1b13faf757 Grow: use large block count and make sure stripe cache can hold it.
The bigger the backup is, the faster it goes, to some extent.

16Meg is fairly arbitrary

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-06 14:48:10 +11:00
NeilBrown e380d3be42 Grow: get component_size before using it.
We were using ->component_size while it hadn't been set.
This effectively meant that 'blocks' wasn't multiplied by
16 and reshape was even slower than it should have been.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-06 14:18:49 +11:00
NeilBrown d44453876e Grow: handle array going degraded during reshape.
If an array goes degraded during reshape, we need to
adjust the devices we read from so as not to back up
stale data.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-06 13:56:05 +11:00
NeilBrown 92dcdf7c01 Grow: restore backup to proper location.
The 'arraystart' is in sectors while restore_stripes requires
bytes, so we need a conversion.

Without this, backups get restored to the wrong offset.
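
A minimal sketch of the unit conversion involved, assuming 512-byte sectors. `sectors_to_bytes` is a hypothetical helper, not a function from mdadm.

```c
#include <assert.h>

/* Convert a count of 512-byte sectors to bytes. The assert guards
 * against the result overflowing unsigned long long. */
unsigned long long sectors_to_bytes(unsigned long long sectors)
{
    assert(sectors <= ~0ULL / 512);
    return sectors * 512ULL;
}
```

Passing a sector count where a byte offset is expected places the data 512 times too close to the start of the device, which is why the conversion must happen at the boundary between the two units.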

Reported-by: "KueiHuan Chen" <kueihuan.chen@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-06 13:38:43 +11:00
NeilBrown 9739642288 Grow: update backup-metadata mtime every time we write it.
Originally the backup-metadata was only written once at the
start of a raid5 reshape that made the array bigger.  So we only
set the mtime once.

Now that we can be writing metadata continually during an in-place
reshape, we need to update the mtime more often.

Also, allow the metadata mtime to be slightly in advance of the
array mtime.  Normally the difference will be less than a second,
so 10 minutes should be plenty.  This guards against an old backup
file being used to restart an array.  But starting two reshapes in the
10 minutes is sufficiently unlikely, and the possibility of an
accident is already sufficiently small, that 10 minutes is probably
fine.
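
The 10 minute allowance can be sketched as a simple window check. The name `backup_mtime_ok` is hypothetical, and rejecting a backup that is older than the array metadata is an assumption made for this sketch; the message above only states the allowance for the backup being ahead.

```c
#include <assert.h>

/* Allowance described above: 10 minutes, in seconds. */
#define BACKUP_MTIME_SLACK_SECS (10 * 60)

/* Returns 1 if the backup-metadata mtime is acceptable relative to
 * the array mtime, 0 otherwise. Both are seconds since the epoch. */
int backup_mtime_ok(unsigned long long backup_mtime,
                    unsigned long long array_mtime)
{
    /* Assumption: a backup older than the array metadata is stale. */
    if (backup_mtime < array_mtime)
        return 0;
    /* The backup may be ahead of the array by at most the allowance. */
    return backup_mtime - array_mtime <= BACKUP_MTIME_SLACK_SECS;
}
```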

Thanks to Guy Martin <gmsoft@tuxicoman.be> for discovering and
reporting that .mtime wasn't being updated properly.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-22 10:42:06 +11:00
NeilBrown 24d40069d7 Grow: reject raid-disks reduction in RAID5 etc before 2.6.32
2.6.31 has some bugs with restarting a RAID5 reduction, so
refuse to try unless at least 2.6.32.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-20 16:36:03 +11:00
NeilBrown ea0ebe9685 Assemble: print more verbose messages about restarting a reshape
Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-20 16:23:45 +11:00
NeilBrown 22e305169f Add missing 'continue' in Grow_restart.
Thus we weren't checking the uuid properly.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-20 15:36:49 +11:00
NeilBrown 82f2d6abf0 Grow_restart to handle reducing number of devices in an array.
FIXME this is wrong . what direction does reshape_position move?

If the device count in an array is shrinking, the critical
region is different so the tests need to be different when
restarting.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-16 17:43:51 +11:00
NeilBrown eba7152931 Grow: don't make 'blocks' too large during in-place reshape.
On small (test) arrays, multiplying by 16 can make the 'chunk' size
larger than half the array, which is a problem.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-16 17:02:34 +11:00
NeilBrown 725cac4c56 Grow: ignore error from final wait_backup
The last time wait_backup is called, it might see reshape
finish and so return an error indicator.
But this is not an error, and we must go ahead and prepare
the array for full access.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-12 16:55:19 +11:00