mdadm

Commit Graph

Author	SHA1	Message	Date
NeilBrown	92d1991fff	Fix calculations for max_progress and completed. 'sync_completed' can sometimes have a value which is slightly high. So round-down relevant values to new-chunk size and that is what we want. Subtract from component_size after scaling down rather than before as that is easier. Make sure max_progress never goes negative when reshaping backwards. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-12 10:34:44 +11:00
NeilBrown	d0ab945ee1	Improve determination of when a backup is needed. The current code is wrong. Instead compute where we might eventually need to back up to, and then compare that to how far we have progressed. Also move suspend_point up towards where we might need to backup to, rather than just as far as max_progress - as max_progress can never exceed where we are currently suspended to. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 15:20:40 +11:00
NeilBrown	2f3dd5e4bc	Remove write_range from calculation of max_progress It isn't needed as we always work in multiples of full destination stripes. Also multiply by 'after' disks, not before. We can progress until the point we would write then lines up with where we would read now. We read now from array-address: reshape_progress device-address: read_offset So we write then to device-address: read_offset array-address: read_offset * after.disks Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 15:07:57 +11:00
NeilBrown	ec757320c2	Switch calculations of read_offset and write_offset These were backwards... we read from 'before' and write to 'after'. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 14:57:14 +11:00
NeilBrown	b4f8e38b94	Avoid confusion with 'blocks' number. The 'blocks' number computed by analyse_change is the number of blocks that it makes sense to back-up at a time. It is the smallest number of blocks that is a whole number of stripes in both the old and the new layout. However we are also using it as the smallest amount of progress that can be made at a time, which is wrong as it is always valid to progress a single stripe in the new layout. So change 'blocks' to be called 'backup_blocks' to make it more clear. And pass new_chunk size down so it can be used for 'minimum forward progress' calculations. Also set 'stripes' (the amount actually backed up) from the possibly-scaled 'blocks' number rather than ignoring it and using backup_blocks. Finally, get rid of 'read_range' as it isn't used (or needed). Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 14:51:09 +11:00
NeilBrown	6eb48844a5	Correctly abort level change when reshape_array fails. We were returning too early. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 14:41:49 +11:00
NeilBrown	374ba335cb	Misc reshape_array fixes. 1/ test on spares_needed is backwards 2/ stray white space 3/ reuse a goto instead of explicit exit(). Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 14:41:48 +11:00
NeilBrown	f94eedafc1	Avoid double-unfreeze of arrays during grow. Once we have called reshape_container or reshape_super we have handed on the responsibility for unfreezing the array, so Grow_reshape shouldn't call unfreeze. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 14:41:47 +11:00
NeilBrown	4a8703648a	analyse_change fixes When converting to RAID6, the new layout should match the old layout, not the RAID6 version of the old layout. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 14:41:46 +11:00
NeilBrown	b6b951557d	Fix some typos in fprintf messages. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 14:41:42 +11:00
NeilBrown	3656edcdc6	Grow: fix typo when 'or'ing flags together. '||' should have been '|' Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 14:16:11 +11:00
NeilBrown	0f28668b93	Ensure start_reshape copes with unexpected state We want start_reshape to work no matter what the current values of suspend_lo/suspend_hi are. So initialise suspend_lo very high as this allows suspend_hi to be set to anything. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-11 13:23:16 +11:00
Adam Kwolek	1b3cbac57f	Raid0: execute backward takeover After raid0 reshape is finished backward takeover has to be executed. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-06 19:20:01 +11:00
Adam Kwolek	582496b2c2	Set array size after adding new disks When new disks are added array size has to be set by mdadm as array grows. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-06 18:29:20 +11:00
Adam Kwolek	7477d51305	FIX: Use sysfs to change array parameters For external metadata, parameters have to be changed via sysfs. i.e. change of raid_disks requires handshake mdmon<->md (md_allow_write()) Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-06 18:29:03 +11:00
Adam Kwolek	9ff87c16ce	FIX: Get array information in reshape_array() Uninitialized array structure is used. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-06 18:29:02 +11:00
Adam Kwolek	1bb174ba0b	FIX: get updated information from metadata Metadata is not modified by metadata preparation handler. It has to be read again from array. There are 2 reads required: 1. before 'for' entry to get updated information after reshape_super() call 2. inside 'for' loop to get updated information for every processed array (it can happen /i.e. imsm case/ that container operation is a set of array operations and information in metadata is changed after every loop). Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-06 16:56:05 +11:00
NeilBrown	7443ee8187	Refactor reshape monitoring. Combine all the non-backing-up code into a single function: progress_reshape. It is called repeatedly to monitor a reshape and allow it to happen safely. Have a single separate function 'child_monitor' which performs backups of data and calls progress_reshape to wait for the next backup to be needed. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-06 15:58:32 +11:00
NeilBrown	5da9ab9874	Grow_reshape re-factor Significant rewrite/refactor of Grow_reshape to make it easier to work with externally-managed-metadata. This patch is too big, but we'll just have to live with that. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-06 15:58:00 +11:00
NeilBrown	b5420ef325	Grow: add disks chosen by metadata handler to array for growth. With externally managed container based metadata, the ->reshape_super method must choose any spares that are to be added to the array. They should be prepared so that ->container_content will find them as spares (disk.state == 0) which are assigned to a slot (raid_disk >= 0). We need to take those and add them to the array(s). Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-16 09:07:52 +11:00
NeilBrown	4347544720	Grow: call start/abort_reshape as appropriate when reshaping a container. This means that ->manage_reshape will be called with reshape ready to roll. Also move the current start_reshape call earlier so that it is before the other ->manage_reshape call. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-16 09:07:52 +11:00
NeilBrown	d18bfbe3d0	Grow: make sure rv is set correctly in reshape_container_raid_disks Whenever there is an error, rv must be -1. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-16 09:07:52 +11:00
NeilBrown	47eb4d5a18	Grow: split out start_reshape for initiating reshape via sysfs. Rather than sprinkling various sysfs setting around, put them all in one place. This will make implementing ->manage_reshape easier. This changes behaviour slightly. Previously we would not set 'sync_action' to 'reshape' until we were ready for the process to start. Now we set sync_max to zero and set sync_action to 'reshape' at that time. When we want reshape to actually start we advance sync_max. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-16 09:07:52 +11:00
NeilBrown	7d469585fc	Grow: fix calculation of stripe_cache_size when reshaping. The two places that this was done were different. The original was most correct, though it used odisks rather than odata. So fix that and make them both use the same calculation. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-16 09:07:52 +11:00
NeilBrown	76266030d6	Grow: be more careful about metadata updates. 1/ When we sync_metadata, we must reset ->update_tail else future metadata updates might go direct to the device bypassing mdmon. 2/ When converting to an array with redundancy so we can add disks it is neater to sync_metadata before starting mdmon rather than artificially setting update_tail early. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-16 09:07:51 +11:00
NeilBrown	d7ca196cbd	Grow: check container is idle before freezing it. Before we freeze a container in preparation for growing a subarray, we need to be sure all the subarrays are idle. This test is racy as recovery could start at any moment following a failure. However it is still useful as it stops us from even trying to start a reshape while a reshape or recovery is active. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-16 09:07:51 +11:00
NeilBrown	691a36b76f	Grow: warn if growing an array will make it degraded. Growing an array when there aren't enough spares can make the array degraded. This works but might not be what is wanted. So warn the user in this case and require a --force to go ahead with the reshape. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-09 11:51:13 +11:00
Adam Kwolek	9376b5aac2	FIX: wait_backup() sometimes hangs Sometimes wait_backup() omits transition from reshape to idle state and mdadm seems to be hung. So check the 'complete' count before waiting rather than only after. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-03 21:33:55 +11:00
Adam Kwolek	92a19f1a78	FIX: Honor !reshape state on wait_reshape() entry When the wait_reshape() function starts, it can occur that the reshape has already finished before wait_reshape() starts. This can lead to waiting for a state change inside this function for a long time. To avoid this, before waiting we should test whether the finish conditions have already been reached. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-03 15:10:20 +11:00
Adam Kwolek	e6e9d47b76	Grow: open backup file for reshape as function Move opening backup file to the function for future reuse during container reshape. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-03 15:00:16 +11:00
NeilBrown	acab7bb189	Create/grow: improve checks on number of devices. Check on upper limit of number of devices was in the wrong place. Result was could not create array with more than 27 devices without explicitly setting metadata, even though default metadata allows more. Fixed, and also perform check when growing an array. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-01 14:51:27 +11:00
NeilBrown	c82afc17a8	Grow: disallow placing backup file on array being reshaped. the tests here aren't perfect, but they could catch some cases. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-01 11:58:32 +11:00
NeilBrown	87f26d14f7	Assemble: allow an array undergoing reshape to be started without backup file Though not having the proper backup file can cause data corruption, it is not enough to justify not being able to start the array at all. So allow "--invalid-backup" to be specified which says "just continue even if a backup cannot be restored". Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-01 11:47:32 +11:00
NeilBrown	ff63406404	Grow: give useful message when adding bitmap gives EBUSY. If adding a bitmap fails with EBUSY, then it is because the array is currently resyncing/recovering/reshaping. As this is non-obvious, give a message explaining the fact. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-30 16:34:25 +11:00
NeilBrown	b3bd581b1d	Fix warning about host-endian bitmaps. Hostendian bitmaps should be warned about on all arch's. And fix a speeling mistake. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-30 16:25:26 +11:00
Adam Kwolek	1c009fc218	Compute backup blocks in function. number of backup blocks evaluation is put in to function for code reuse. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-30 13:30:22 +11:00
Adam Kwolek	130994cb83	Prepare and free fdlist in functions fd handles table creation is put in to function for code reuse. In manage_reshape(), child_grow() function from Grow.c will be reused. To prepare parameters for this function, code from Grow.c can be reused also. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-30 13:27:08 +11:00
Adam Kwolek	62a48395f6	Disk removal support for Raid10->Raid0 takeover Until now Raid10->Raid0 takeover was possible only if all the mirrors were removed before md starts the takeover. Now mdadm, when performing Raid10->raid0 takeover, will remove all unwanted mirrors from the array before actual md takeover is called. Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-29 11:57:51 +11:00
Dan Williams	7f2ba464e4	External reshape (step 2): Freeze container When growing the number of raid disks the reshape process will promote container-spares to subarray-spares (later the kernel promotes them to subarray-members in raid5_start_reshape()). The automatic spare promotion that mdmon performs upon seeing a degraded array must be disabled until the reshape process has been initiated. Otherwise, mdmon may start a rebuild before the reshape parameters can be specified. In the external case we arrange for the monitor to be blocked, and turn off the safemode delay. Mdmon is updated to check sync_action is not frozen before initiating recovery. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 16:39:58 +11:00
Dan Williams	7bc7119671	External reshape (step 1): container reshape and ->reshape_super() In the native metadata case Grow_reshape() and the kernel validate what reshapes are possible / supported and the kernel handles all the metadata updates. In the external case the metadata format may have specific constraints above this baseline. External formats also introduce the constraint of only permitting some reshapes at container scope versus subarray scope. For example imsm changes to 'raiddisks' must be applied to all arrays in the container. This operation assumes that its 'st' parameter has been obtained from super_by_fd() (such that st->subarray is up to date), and that a snapshot of the metadata has been loaded from the container. Why a new method, versus extending an existing one? ->validate_geometry: this routine assumes it is being called from Create(), adding reshape complicates the cases that this routine needs to handle. Where we find that checks can be shared between the two cases those routines are refactored into common code internal to the metadata handler, i.e. no need to provide a unified external interface. ->validate_geometry() also does not expect to update the metadata. ->update_super: this is meant to update single fields at Assembly() and only at the container scope. Reshape potentially wants to update multiple fields at either container or subarray scope. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 16:09:27 +11:00
Dan Williams	72e4a37822	Grow: fix check for raid6 layout normalization If the user does not specify a layout, don't skip asking about retaining the non-standard raid6 layout which may be implicitly changed. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 15:11:37 +11:00
Dan Williams	4411fb1749	Grow: mark some functions static Going through the Grow api found some local routines that could be marked static. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 15:08:42 +11:00
NeilBrown	4725bc31fb	super_by_fd: return subarray info explicitly. Rather than hiding this in the 'st', return it explicitly. In the one case we still need it, copy it into st where needed. This will disappear in a future patch. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 19:35:25 +11:00
NeilBrown	a5d85af748	get_info_super: report which other devices are thought to be working/failed. To accurately detect when an array has been split and is now being recombined, we need to track which other devices each thinks is working. We should never include a device in an array if it thinks that the primary device has failed. This patch just allows get_info_super to return a list of devices and whether they are thought to be working or not. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 19:35:25 +11:00
NeilBrown	925211e323	Grow: use raid_disks, not nr_disks nr_disks is just wrong here - the arrays need room for all disk slots, even if some are empty, plus spares, plus a possible backup file. So raid_disks is correct. Signed-off-by: NeilBrown <neilb@suse.de>	2010-08-06 14:40:53 +10:00
NeilBrown	7204495377	Fix writing of second backup superblock during grow The 'rv' tests were confused and sometimes wrong. This resulted in not writing the second bsb. Also fix the test script so that the critical section is long enough that we have some hope of interrupting it. Signed-off-by: NeilBrown <neilb@suse.de>	2010-08-05 21:39:17 +10:00
NeilBrown	f21e18ca89	Compile with -Wextra by default This produced lots of warnings, some of which pointed to actual bugs. Signed-off-by: NeilBrown <neilb@suse.de>	2010-08-05 13:13:02 +10:00
NeilBrown	5f6ca90a9b	Fix restarting of reshaping arrays. We cannot get stripe_cache_size until after the array has been activated!! Signed-off-by: NeilBrown <neilb@suse.de>	2010-07-29 13:50:15 +10:00
NeilBrown	b7e734fc22	Fix use of rv in Grow_reshape 1/ an extra local var was declared, which causes rv setting to be lost 2/ a -ve rv was left -ve while we should return 1 on err. Signed-off-by: NeilBrown <neilb@suse.de>	2010-07-29 13:16:01 +10:00
Doug Ledford	0155af90d8	Bugfix: don't issue a read larger than the buffer to hold it Signed-off-by: Doug Ledford <dledford@redhat.com>	2010-07-22 10:16:31 -04:00

139 Commits