mdadm

Commit Graph

Author	SHA1	Message	Date
Alexander Lyakas	135a31f5ed	Don't consider disks with a valid recovery offset as candidates for bumping up event count When we are looking for a candidate disk to bump up the event count, we consider only disks that have recovery_start==MaxSector. However, after we find one such disk, we agree to accept more disks having same event count, regardless of their recovery_start. Be consistent and don't accept disks with a valid recovery_start at all. Signed-off-by: NeilBrown <neilb@suse.de>	2012-05-15 14:20:42 +10:00
Adam Kwolek	4aecb54a21	FIX: Assembled second array is in read only state during reshape When arrays using external metadata are assembled, and one of the arrays in the container is under reshape, the second array will remain in read only state (not auto read only). This is caused by the fact that the array is frozen and mdmon doesn't have the opportunity to switch the array to r/w mode. Freezing not reshaped array just after it is being assembled allows mdmon to enable it for writing. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-17 12:33:38 +10:00
NeilBrown	e62b778573	Assemble: improve verbose logging when including old devices. Reporting: mdadm: added /dev/loop1 to /dev/md0 as 1 mdadm: added /dev/loop2 to /dev/md0 as 2 mdadm: added /dev/loop0 to /dev/md0 as 0 mdadm: /dev/md0 has been started with 2 drives (out of 3). is confusing - why only 2? Code now reports: mdadm: added /dev/loop1 to /dev/md0 as 1 mdadm: added /dev/loop2 to /dev/md0 as 2 (possibly out of date) mdadm: added /dev/loop0 to /dev/md0 as 0 mdadm: /dev/md0 has been started with 2 drives (out of 3). which is somewhat clearer. Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-22 14:52:21 +11:00
NeilBrown	b720636a58	Assemble: support assembling of a RAID0 being reshaped. This is a bit of a hack and the code needs to be made more general. But this adds the special case of a RAID0 being reshaped which looks like a RAID4 but doesn't need as many devices. Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-07 10:47:34 +11:00
NeilBrown	56d1885944	Assemble: don't use O_EXCL until we have checked device content. If we open with O_EXCL before checking that the device is one that we really want, then that could cause some other process to think the device is busy when it isn't really. This particularly affects running "mdadm -A devname" in parallel for different arrays. One might be looking at a device that it won't end up using while another tries and fails to look at a device that it needs. So delay the O_EXCL until after all identity checks. Multiple "mdadm -As" will still have races, but that is fundamentally racy anyway. Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-07 10:41:24 +11:00
Adam Kwolek	111e9fdaa8	FIX: Array is not run when expansion disks are added When added disk is disk added by expansion and this is last disk added to array, assemble_container_content() will not even try to run such array. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 12:20:51 +11:00
NeilBrown	da8fe5aa9b	Assemble: fix --force assemble during reshape. If we have to --force assembly during reshape, we need to check both the 'before' and 'after' cases to make sure there are enough devices. Reported-by: Richard Herd <2001oddity@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-07 14:06:44 +11:00
NeilBrown	de5a472ea3	Remove avail_disks arg from 'enough'. It can easily be calculated from 'avail' and 'raid_disks', and we will soon have a case where we don't have it easily available to pass in. Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-07 14:04:47 +11:00
NeilBrown	887162637f	Assemble: fix count in "assembled with .. but not started". We need to include the count of pre-existing devices here. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:49:07 +11:00
NeilBrown	576d028002	Assemble: make some plurals conditional. "1 devices" is ugly. Fix it. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:49:07 +11:00
NeilBrown	81a5b4f52f	Remove update_private This field doesn't work any more as ->getinfo_super clears the info structure at an awkward time. So get rid of it and do it differently. The issue is that the metadata handler cannot tell if the uuid it has was randomly generated or explicitly requested, except on the first call. And we don't want to accept explicit requests for IMSM. So when it was auto-generated, make it look distinctive by having the same int copied in all 4 positions. If someone requests a uuid like that, I guess they get away with it. Reported-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-20 10:30:34 +11:00
NeilBrown	a648241517	Resolve some more warnings unused variables when MDASSEMBLE is defined, and a typo in mdadm.8 Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-13 13:24:52 +11:00
Lukasz Dorau	7728e1c635	fix: correct metadata's update communication The problem occurs when array under migration is assembled incrementally. st->update_tail is not initialized in function assemble_container_content() and during reshape the checkpoint information in metadata is not being updated. The value of st->update_tail is now initialized in function assemble_container_content() and during reshape the checkpoint information in metadata is being updated correctly on all disks. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-21 16:17:56 +11:00
Jes Sorensen	518a60f385	Assemble(): don't dup_super() before we need it. Avoid resource leak in case we bail loop early Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-02 10:48:53 +11:00
Jes Sorensen	22472ee1d2	assemble_container_content(): fix memory leak Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-02 10:48:53 +11:00
Jes Sorensen	83366b3352	Fix memory leak Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-01 14:50:44 +11:00
Labun, Marcin	81219e70f2	kill-subarray: fix, IMSM cannot kill-subarray with unsupported metadata container_content retrieves volume information from disks in the container. For unsupported volumes the function was not returning mdinfo. When all volumes were unsupported the function was returning NULL pointer to block actions on the volumes. Therefore, such volumes were not activated in Incremental and Assembly. As side effect they also could not be deleted using kill-subarray since "kill" function requires to obtain a valid mdinfo from container_content. This patch fixes the kill-subarray problem by allowing to obtain mdinfo of all volumes types including unsupported and introducing new array.status flags. There are following changes: 1. Added MD_SB_BLOCK_VOLUME for blocking an array, other arrays in the container can be activated. 2. Added MD_SB_BLOCK_CONTAINER_RESHAPE block container wide reshapes (like changing disk numbers in arrays). 3. IMSM container_content handler is to load mdinfo for all volumes and set both blocking flags in array.state field in mdinfo of unsupported volumes. In case of some errors, all volumes can be affected. Only blocked array is not activated (also reshaped as result). The container wide reshapes are also blocked since by metadata definition they require modifications of both arrays. 4. Incremental_container and Assemble functions check array.state and do not activate volumes with blocking bits set. 5. assemble_container_content is changed to check container wide reshapes before activating reshapes of assembled containers. 6. Grow_reshape and Grow_continue_command checks blocking bits before starting reshapes or continueing (-G --continue) reshapes. 7. kill-subarray ignores array.state info and can remove requested array. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-31 11:29:46 +11:00
Adam Kwolek	3bd58dc65f	Always run Grow_continue() for started array. So far there were 2 reshape continuation cases: 1. array is started /e.g. reshape was already invoked during initrd start-up stage using "--freeze-reshape" option/ 2. array is not started yet /"normal" assembling array under reshape case/ This patch narrows continuation cases in to single one. To do this array should be started /set readonly in to array_state/ before calling Grow_continue() function. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-07 09:46:07 +11:00
Adam Kwolek	a93ada3b7d	Monitor reshaped array Reshape can be run for monitored arrays only /external metadata case/. Before reshape can be executed, make sure that the just-started array/container is monitored. If not, run mdmon for it. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-05 13:59:28 +11:00
Adam Kwolek	6e75048bc5	Add recovery blocked field to mdinfo When a container is assembled while reshape is active on one of its members, the whole container may need to be blocked from monitoring. For this purpose the recovery_blocked field is added to the mdinfo structure. When the metadata handler finds an active reshape in a container it should set the recovery_blocked field to disable whole-container monitoring during reshape. For arrays that don't use containers, the recovery_blocked field has the same value as the reshape_active field e.g. super0/1. In fact, recovery is blocked during reshape for such arrays. For ddf, the metadata handler doesn't set the reshape_active field, so recovery_blocked is not set either. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-05 13:30:50 +11:00
Adam Kwolek	b76b30e0f9	Do not continue reshape during initrd phase During initrd phase continuing reshape will cause file system context lost. This blocks ability to control reshape using checkpoints. To avoid this, during initrd phase assemble has to be executed with '--freeze-reshape' option. This causes that mdadm restores reshape critical section only. Reshape can be continued later after system full boot. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-03 09:15:22 +11:00
Adam Kwolek	3f54bd62dc	Move restore backup code to function Reshape backup should be able to be restored during reshape continuation also. To reuse already existing code it is moved to function. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-09-21 12:17:30 +10:00
Adam Kwolek	910e9fa7f9	FIX: Memory leak during Assembly For fdlist pointer allocated in assemble_container_content() function, free() is never called. This patch fixes this memory leak. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-09-21 11:55:15 +10:00
NeilBrown	b787bec6bd	Don't index past the end of 'best' array in Assemble. The 'best' array only has 'bestcnt' entries allocated, so 'i' should always be "< bestcnt", not "<= bestcnt". Reported-by: "Lawrence, Joe" <Joe.Lawrence@stratus.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-06-17 14:48:33 +10:00
Adam Kwolek	ba53ea59ad	Add reshape restart support for external metadata Patch introduces support for reshape process restart for external metadata using metadata specific data handling methods. It introduces recover_backup() function that restores array to stable state It is equivalent to Grow_restart() functionality for native metadata. Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-06-08 17:11:11 +10:00
NeilBrown	95eeceeb32	getinfo_super now clears the 'info' structure before filling it in. Some code currently clears 'info' before calling getinfo_super, some code doesn't. To be consistent, change it so no caller ever clears 'info', but every getinfo_super function must clear it. Note that ->raid_disk may be meaningful if that 'map' is passed non-NULL. In that case it is copied out before the structure is zeroed. Signed-off-by: NeilBrown <neilb@suse.de>	2011-06-08 15:54:13 +10:00
Adam Kwolek	7af0334155	FIX: Count correctly added devices When array is in reshape state raid_disks field contains final disks number. To know how many disks were added, disk.raid_disk index has to be compared against old disk number computed using delta_disks. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-04-18 10:31:43 +10:00
NeilBrown	a28232b83f	Assemble: improve efficacy of -Af in assembling degraded dirty arrays. If a degraded dirty array has some superblocks which are clean and others that are dirty, and the dirty ones are newer by precisely '1' in the event count, then the current code to force the array to be clean will not work. We need to make sure to find a superblock with most recent event count and force that one to be 'clean'. Reported-by: A J Wyborny <ajwyborny@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-23 12:10:31 +11:00
Adam Kwolek	983fff45a1	FIX: ping_monitor() usage causes memory leaks When for ping_monitor() input devnum2devname() is used, received string pointer should be passed to free() for memory release. It is not made in several places. This use case should have function to avoid memory leak. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-18 12:32:16 +11:00
NeilBrown	b8b8eda804	Remove incorrect use of open_dev open_dev can only be used for md array. To open an arbitrary device, dev_open must be used. Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-10 11:36:47 +11:00
Adam Kwolek	1403201652	FIX: Make expansion counter usable Currently whole array geometry is set in sysfs_set_array(), so none of disks (even for expansion) should fail during sysfs_add_disk() Due to this expansion counter should be used for reshaped array when disk slot is bigger than number of disks in array. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-10 09:58:35 +11:00
Adam Kwolek	b8063f0770	FIX: Block reshaped array monitoring When array under reshape is assembled it has to be disabled from monitoring as soon as possible. It can occur that this is i.e second array in container and mdmon is loaded already. Lack of blocking monitoring can cause change array state to active, and reshape continuation will be not possible. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-10 09:57:39 +11:00
NeilBrown	4968025884	Run Grow_restart/Grow_continue when assembling the content of a container. As containers can now grow, we need to use both Grow_restart (to replay any backup-file) and Grow_continue when assembling the content of a container. Note that we don't pass a backup-file when doing incremental assembly. If such is needed in that case, the assembly will fail. To restart such arrays, explicit assembly is required. Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-08 17:14:00 +11:00
Adam Kwolek	588bebfcc2	Continue reshape after assembling array assemble_container_content() cannot close mdfd handle, as it could be required by reshape continuation. mdfd handle is closed outside this function, when it is no longer necessary. Call to Grow_continue is added for reshape continuation after assembly. In the nearest future, simple condition: if (content->reshape_active) before Grow_continue() call will be replaced by check function for support container operation /reshape/. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-02 12:28:15 +11:00
Adam Kwolek	882029c86d	FIX: disks added beyond array should be counted during reshape During expansion there are more working disks than the array can have. Disks with set raid_disk (not a spare disk) during reshape should be counted to allow array state transition to read_only state. Array reconfiguration to new geometry should be done before reshape will be started. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-27 17:26:42 +11:00
NeilBrown	71204a5029	Various compile fixes. Make "make everything" succeed. This fixed some real bugs. Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 15:48:03 +11:00
NeilBrown	a5d10dcec8	Allow explicitly listed spares to be included by default. When the metadata doesn't identify which array a spare belongs to we normally require an explicit domain match to connect a spare with an array. However when the spare is explicitly listed in argv, it should be safe to include as long as there is no domain conflict. Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 14:44:02 +11:00
NeilBrown	e5508b361d	Allow domain_test to report that no domains were found. Sometime we will need to know the difference between no domains found and domains didn't match. So allow domain_test to return different values and fix up all callers to maintain current behaviour. Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 14:44:02 +11:00
NeilBrown	ac597b1c21	free_super after assembling a container Else the devices are held open. Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 13:07:24 +11:00
NeilBrown	d438679977	Assemble: ignore unknown devices not listed on command line. If we find a device that has no superblock, we currently fail unless in auto_assem mode. However we really should only fail if the device was explicitly listed in the arg list. So add a test for that. Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 13:07:07 +11:00
Czarnowska, Anna	3c7b4a2595	Assemble: allow to assemble container with uuid=0:0:0:0 When there are any arrays in config file the spares with domain not matching any array are not assembled because auto assembly is not attempted. Addition of ARRAY line with uuid=0:0:0:0 in config will work with modified condition for gathering spares. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 10:40:56 +11:00
Anna Czarnowska	ed7fc6b4d9	Assemble: allow to assemble spares on their own If we find spares but no members of given array we create container with just spares. This allows auto assemble to pick up all loose imsm spares when there is no config file. When there is a valid config file and any array is assembled from it we don't try auto assembly so we will not assemble spares that don't match any array. To remedy this we must add ARRAY metadata=imsm UUID=00000000:00000000:00000000:00000000 to config file. This container will include all remaining spares. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-05 13:54:18 +11:00
Anna Czarnowska	26b05aeaed	Assemble: we need to read policy to know array domains Policy must be read on all disks identified as array members to get array's domains list. Currently it is only read on first array member in auto assembly mode. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-05 13:42:59 +11:00
Anna Czarnowska	cbeeb0e5f0	Assemble imsm spares in matching domain only Imsm spare will only be taken if it matches domain of identified members of currently assembled array. This implies that: - spare with null domain will match first array assembled. - if array has null domain then no spare will match If we allow spares to set st they may block assembly of subarrays. This is because in auto-assembly tmpdev->used=0 for a spare not matching any array. If we find such spare before container and set st, the content will not get assembled. We allow uuid_zero match any uuid in assembly as unsuitable spares will be rejected on domain check. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-26 22:08:51 +11:00
Krzysztof Wojcik	a06d022db4	FIX: Bad block verification during assembling array We need to refuse to assemble arrays with bad blocks. Initially there was a condition in the container_content function that returned an error value when the metadata stores information about bad blocks. When the container_content function is called from functions NOT connected with assemble (Kill_subarray, Detail) we get a faulty error return value. Patch introduces new flag in array.status - MD_SB_BBM_ERRORS. It is set in container_content when bad blocks are detected and can be checked by container_content caller. Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-26 21:41:57 +11:00
NeilBrown	87f26d14f7	Assemble: allow an array undergoing reshape to be started without backup file Though not having the proper backup file can cause data corruption, it is not enough to justify not being able to start the array at all. So allow "--invalid-backup" to be specified which says "just continue even if a backup cannot be restored". Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-01 11:47:32 +11:00
Hawrylewicz Czarnowski, Przemyslaw	417f346ee0	fix: assemble for external metadata generates segfault if invalid device found An attempt to invoke super_by_fd() on device that has metadata_version="none" always matches super0 (as test_version is ""). In Assemble() it results in segfault when load_container is invoked (=null for super0). As of now load_container is only started if it points to valid pointer. Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-01 11:06:09 +11:00
NeilBrown	484ae54d16	Assemble: call remove_partitions later. We shouldn't call remove_partitions until we have made a really firm decision to include the device into the array. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-30 16:56:01 +11:00
Dan Williams	dcc4210f58	Assemble: fix assembly in the delta_disks > max_degraded case Incremental assembly works on such an array because the kernel sees the disk as in-sync and that the array is reshaping. Teach Assemble() the same assumptions. This is only needed on kernels that do not initialize ->recovery_offset when activating spares for reshape. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 15:10:01 +11:00
NeilBrown	87477e6d5e	Assemble: get content before testing it. When checking that a container matches the required uuid, we need to call 'getinfo_super' before we have a 'content' to test. Reported-by: "Czarnowska, Anna" <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 11:34:36 +11:00

233 Commits