mdadm

Commit Graph

Author	SHA1	Message	Date
Mateusz Grzonka	b71de056ce	Correct checking if file descriptors are valid In some cases file descriptors equal to 0 are treated as invalid. Fix it. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2021-11-24 07:07:12 -05:00
Mateusz Grzonka	b2e4f08414	Incremental: Close unclosed mdfd in IncrementalScan() In addition to closing mdfd, propagate helpers to manage file descriptors across IncrementalScan(). Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2021-11-24 07:06:12 -05:00
Mariusz Tkaczyk	c7b8547c70	imsm: add verbose flag to compare_super IMSM does more than comparing metadata and errors reported directly from compare_super_imsm can be useful. Add verbose flag to compare_super method and make all not critical error printing configurable. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2021-03-08 10:43:29 -05:00
Mariusz Tkaczyk	69068584f9	Incremental: Remove redundant spare movement logic If policy is set then mdmonitor is responsible for moving spares. This logic is redundant and potentially dangerous: a spare could be moved at the initrd stage depending on the order in which drives appear. Remove it. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2020-12-20 13:45:30 -05:00
Mariusz Tkaczyk	ff6bb131a4	mdadm: Unify forks behaviour If mdadm is run by udev or systemd, it gets a pipe as each stream. Forks in the background may run after an event or service has been processed when udev is detached from pipe. As a result process fails quietly if any message is written. To prevent from it, each fork has to close all parent streams. Leave stderr and stdout opened only for debug purposes. Unify it across all forks. Introduce other descriptors detection by scanning /proc/self/fd directory. Add generic method for managing systemd services. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>	2020-11-25 18:15:55 -05:00
Mariusz Dabrowski	b068159891	mdadm: load default sysfs attributes after assemblation Added new type of line to mdadm.conf which allows to specify values of sysfs attributes for MD devices that should be loaded after the array is assembled. Each line is interpreted as list of structures containing sysname of MD device (md126 etc.) and list of sysfs attributes and their values. Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com> Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2019-07-10 16:12:09 -04:00
NeilBrown	cd72f9d114	policy: support devices with multiple paths. As new releases of Linux sometimes change the name of a path, some distros keep "legacy" names as well. This is useful, but confuses mdadm which assumes each device has precisely one path. So change this assumption: allow a disk to have several paths, and allow any to match when looking for a policy which matches a disk. Reported-and-tested-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2018-12-06 07:43:19 -05:00
Mariusz Tkaczyk	cb8f537135	Incremental: remove external arrays and devices correctly Kernel returns EBUSY when device fail invokes array fail. In external metadata if kernel returns it, mdadm doesn't stop member arrays but it will try to stop container directly. It fails because container still has working arrays, so udev remove is triggered. Try to set faulty state on device in member arrays first. If kernel returns EBUSY, stop this array. After that remove the device from container. In external metadata mdmon has to remove faulty devices from degraded arrays, just remove device from container. Raid5 array doesn't return EBUSY, it allows to remove every device. Mdadm shouldn't block it. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2018-08-03 10:25:12 -04:00
Guoqing Jiang	898bd1ecef	Free map to avoid resource leak issues 1. There are some places which didn't free map as discovered by coverity. CID 289661 (#1 of 1): Resource leak (RESOURCE_LEAK)12. leaked_storage: Variable mapl going out of scope leaks the storage it points to. CID 289619 (#3 of 3): Resource leak (RESOURCE_LEAK)63. leaked_storage: Variable map going out of scope leaks the storage it points to. CID 289618 (#1 of 1): Resource leak (RESOURCE_LEAK)26. leaked_storage: Variable map going out of scope leaks the storage it points to. CID 289607 (#1 of 1): Resource leak (RESOURCE_LEAK)41. leaked_storage: Variable map going out of scope leaks the storage it points to. 2. If we call map_by_* inside a loop, then map_free should be called in the same loop, and it is better to set map to NULL after free. 3. And map_unlock is always called with map_lock, if we don't call map_remove before map_unlock, then the memory (allocated by map_lock -> map_read -> map_add -> xmalloc) could be leaked. So we need to free it in map_unlock as well. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2018-06-11 06:35:41 -04:00
NeilBrown	3bc6f786e1	Incremental: Use ->validate_geometry instead of ->avail_size Since mdadm 3.3 it has not been correct to call ->avail_size if metadata hasn't been read from the device. ->validate_geometry should be used instead. Unfortunately array_try_spare() didn't get the memo, and it can crash when adding a spare with no metadata. So change it to use ->validate_geometry(). Only one place remains that uses ->avail_size(), and that is safe. Also fix a comment with a typo. Reported-and-tested-by: Bjørnar Ness <bjornar.ness@gmail.com> Fixes: `641da74591` ("super1: separate to version of _avail_space1().") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-11-01 17:26:37 -04:00
Song Liu	3b8c712755	mdadm: set journal_clean after scanning all disks Summary: In Incremental.c:count_active(), max_events is tracked to show which devices are up to date. If a device has events==max_events+1, getinfo_super() is called to reload the superblock from this device. getinfo_super1() blindly sets journal_clean to 0, which is wrong. This patch fixes this by tracking max_journal_events for all the disks. After scanning all disks, journal_clean is set if max_journal_events >= max_events-1. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-09-01 11:12:16 -04:00
Tomasz Majchrzak	b13b52c80f	Get failed disk count from array state Recent commit has changed the way failed disks are counted. It breaks recovery for external metadata arrays as failed disks are not part of the array and have no corresponding entries in sysfs (they are only reported for containers) so degraded arrays show no failed disks. Recent commit overwrites GET_DEGRADED result prior to GET_STATE and it is not set again if GET_STATE has not been requested. As GET_STATE provides the same information as GET_DEGRADED, the latter is not needed anymore. Remove GET_DEGRADED option and replace it with GET_STATE option. Don't count number of failed disks looking at sysfs entries but calculate it at the end. Do it only for arrays as containers report no disks, just spares. Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-06-05 11:11:36 -04:00
Alexey Obitotskiy	4b57ecf6ce	Add sector size as spare selection criterion Add sector size as new spare selection criterion. Assume that 0 means there is no requirement for the sector size in the array. Skip disks with unsuitable sector size when looking for a spare to move across containers. Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 14:18:38 -04:00
Alexey Obitotskiy	fbfdcb06dc	Allow more spare selection criteria Disks can be moved across containers in order to be used as a spare drive for rebuild. At the moment the only requirement checked for such disk is its size (if it matches donor expectations). In order to introduce more criteria rename corresponding superswitch method to more generic name and move function parameter to a structure. This change is a big edit but it doesn't introduce any changes in code logic, it just updates function naming and parameters. Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 14:18:36 -04:00
Jes Sorensen	00e56fd953	IncrementalScan: Use md_array_active() instead of md_get_array_info() This eliminates yet another case where GET_ARRAY_INFO was used to indicate whether the array was active. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-05 12:18:29 -04:00
Jes Sorensen	74d293a253	container_members_max_degradation: Switch to using sysfs for disk info With sysfs now providing the necessary active_disks info, switch to sysfs and eliminate one more use of md_get_array_info(). We can do this unconditionally since we wouldn't get here without sysfs being available. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-05 12:06:57 -04:00
Jes Sorensen	c2d1a6ec6b	Incremental: return is not a function Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-05 11:39:58 -04:00
Zhilong Liu	9e04ac1c43	mdadm/util: unify stat checking blkdev into function declare function stat_is_blkdev() to integrate repeated stat checking blkdev operations, it returns 'true/1' when it is a block device, and returns 'false/0' when it isn't. The devname is necessary parameter, rdev is optional, parse the pointer of dev_t rdev, if valid, assigned device number to dev_t *rdev, if NULL, ignores. Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-05 11:05:32 -04:00
Zhilong Liu	0a6bff09d4	mdadm/util: unify fstat checking blkdev into function declare function fstat_is_blkdev() to integrate repeated fstat checking block device operations, it returns true/1 when it is a block device, and returns false/0 when it isn't. The fd and devname are necessary parameters, rdev is optional, parse the pointer of dev_t rdev, if valid, assigned the device number to dev_t *rdev, if NULL, ignores. Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-05 11:04:02 -04:00
Jes Sorensen	6921010d95	Incremental: Use md_array_active() to determine state of array One less call to md_get_array_info() Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-02 10:36:51 -04:00
NeilBrown	cd6cbb08c4	Create: tell udev md device is not ready when first created. When an array is created the content is not initialized, so it could have remnants of an old filesystem or md array etc on it. udev will see this and might try to activate it, which is almost certainly not what is wanted. So create a mechanism for mdadm to communicate with udev to tell it that the device isn't ready. This mechanism is the existence of a file /run/mdadm/created-mdXXX where mdXXX is the md device name. When creating an array, mdadm will create the file. A new udev rule file, 01-md-raid-creating.rules, will detect the presence of that file and set ENV{SYSTEMD_READY}="0". This is fairly uniformly used to suppress actions based on the contents of the device. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-02 09:41:39 -04:00
Jes Sorensen	f8c432bfc9	Incremental: Cleanup some if() statement spaghetti Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-04-25 15:07:26 -04:00
Jes Sorensen	ff4ad24b1c	Incremental: Use md_array_active() where applicable md_get_array_info() == 0 implies an array is active, however this is more correct. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-04-25 14:57:46 -04:00
Jes Sorensen	dae131379f	sysfs: Make sysfs_init() return an error code Rather than have the caller inspect the returned content, return an error code from sysfs_init(). In addition make all callers actually check it. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-30 16:52:37 -04:00
Jes Sorensen	5b13d2e1fb	Incremental: Remove redundant call for GET_ARRAY_INFO The code above just called md_get_array_info() and only reached this point if it returned an error that isn't ENODEV, so it's pointless to check this again here. In addition it was incorrectly retrieving ioctl data into a mdu_bitmap_file_t instead of mdu_array_info_t. Fixes: ("8382f19 Add new mode: --incremental") Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 14:40:36 -04:00
Jes Sorensen	9cd39f0155	util: Introduce md_get_array_info() Remove most direct ioctl calls for GET_ARRAY_INFO, except for one, which will be addressed in the next patch. This is the start of the effort to clean up the use of ioctl calls and introduce a more structured API, which will use sysfs and fall back to ioctl for backup. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 14:35:41 -04:00
Artur Paszkiewicz	e97a7cd011	super1: PPL support Enable creating and assembling raid5 arrays with PPL for 1.x metadata. When creating, reserve enough space for PPL and store its size and location in the superblock and set MD_FEATURE_PPL bit. Write an initial empty header in the PPL area on each device. PPL is stored in the metadata region reserved for internal write-intent bitmap, so don't allow using bitmap and PPL together. While at it, fix two endianness issues in write_empty_r5l_meta_block() and write_init_super1(). Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 11:33:52 -04:00
NeilBrown	e22fe3ae15	Introduce enum flag_mode for setting and clearing flags. We currently use '1' to indicate that a flag (writemostly or failfast) needs to be set, and '2' to indicate that it needs to be cleared. Using magic number like this is not a best-practice. So replaced them with values from a enum. No functional change. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-11-29 17:12:13 -05:00
NeilBrown	71574efb07	Add failfast support. Allow per-device "failfast" flag to be set when creating an array or adding devices to an array. When re-adding a device which had the failfast flag, it can be removed using --nofailfast. failfast status is printed in --detail and --examine output. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-11-28 08:50:36 -05:00
Artur Paszkiewicz	c012223056	Incremental: don't try to load_container() for a subarray mdadm -IRs would exit with a non-zero status because of this. Reported-by: Xiao Ni <xni@redhat.com> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-08-09 10:57:15 -04:00
Jes Sorensen	fe112c9eba	Incremental: Remove unnecessary NULL pointer checks when calling sysfs_free() Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-08 12:19:03 -05:00
NeilBrown	a0d12d51a7	Merge branch 'fix-unlikely-potential-overflows' of https://github.com/sjvs/mdadm	2015-12-21 13:01:10 +11:00
Guoqing Jiang	41dbb4da22	mdadm: let cluster raid could also add disk within incremental mode For cluster raid, the disc.state need to be changed accordingly under incremental mode. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-12-16 13:23:54 +11:00
Bas van Schaik	fa9aca4930	avoid confusion with parameter 'devname' with same name, ensure buffer is large enough for two ints plus extras	2015-12-03 13:48:46 +00:00
Bas van Schaik	a90ed30e74	ensure buffer is large enough for two ints and some extras	2015-12-03 13:48:37 +00:00
Song Liu	051f326550	mdadm: refactor write journal code in Assemble and Incremental As discussed, standalone require_journal() in struct superswitch is not a very good idea. Instead, journal related information fits well in struct mdinfo. This patch simplifies journal support code in Assemble and Incremental as: - Add journal_device_required and journal_clean to struct mdinfo; - Remove function require_journal from struct superswitch; - Update Assemble and Incremental to use journal_device_required and journal_clean from struct mdinfo (instead of separate var). Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-10-22 12:19:09 +11:00
Song Liu	5c6ad21150	Check write journal in incremental If journal device is missing, do not start the array, and shows: ./mdadm -I /dev/sdf mdadm: journal device is missing, not safe to start yet. The array will be started when the journal device is attached with -I ./mdadm -I /dev/sdb1 mdadm: /dev/sdb1 attached to /dev/md/0_0, which has been started. To force start without journal device: ./mdadm -I /dev/sdf --run mdadm: Trying to run with missing journal device mdadm: /dev/sdf attached to /dev/md/0_0, which has been started. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-10-19 13:06:18 +11:00
Goldwyn Rodrigues	9d9202e301	Fix --incremental handling on cluster array. Commit `06bd679317` ("Skip clustered devices in incremental") disabled incremental completely on clustered arrays. What we really want is that mdadm should not start or create a clustered array but still be able to add or readd to an existing device. This would enable udev scripts to automatically add or re-add a device after transient errors. Signed-off-by: NeilBrown <neilb@suse.com>	2015-09-28 14:42:55 +10:00
NeilBrown	5997585200	Merge branch 'mdadm-3.3.x'	2015-08-03 16:21:37 +10:00
NeilBrown	8360760457	Assemble: really don't assemble IMSM array without OROM. Previous patch missed one case. Also print more useful information when rejecting a device with IMSM metadata. Signed-off-by: NeilBrown <neilb@suse.com>	2015-08-03 16:06:51 +10:00
NeilBrown	7eee461e91	Assemble: don't assemble IMSM array without OROM. If someone has an IMSM array, and disables RAID in the BIOS and uses the devices for some other purpose, then they really don't want mdadm to start syncing the array. So don't assemble if OROM doesn't confirm it is OK. There can still be problems for crash-dump not being able to find the OROM. Some explicit work-around might be needed for that rather than a more general workaround that can corrupt data. Signed-off-by: NeilBrown <neilb@suse.com>	2015-08-03 15:42:16 +10:00
NeilBrown	9f2e55a421	Assemble: don't assemble IMSM array without OROM. If someone has an IMSM array, and disables RAID in the BIOS and uses the devices for some other purpose, then they really don't want mdadm to start syncing the array. So don't assemble if OROM doesn't confirm it is OK. There can still be problems for crash-dump not being able to find the OROM. Some explicit work-around might be needed for that rather than a more general workaround that can corrupt data. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-29 14:38:37 +10:00
NeilBrown	653299b699	Merge branch 'cluster' Now that 3.3.3 is out, it is time to include the cluster-support code. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-27 11:01:08 +10:00
NeilBrown	9581efb1ae	mdstat: discard 'dev' field, just use 'devnm' These both have the same value, and have done since the 'devnm' concept was introduced. So discard the pointless duplicate. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-02 08:15:10 +10:00
Guoqing Jiang	06bd679317	Skip clustered devices in incremental We want the clustered devices to be started exclusively by a cluster resource-agent. So, avoid starting using the incremental option. This also skips a clustered md from starting during boot in inactive mode. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:33:18 +10:00
Pawel Baldysiak	4d149ab517	IncRemove: Set "auto-read" only after successful excl open. "mdadm -If" - triggered from udev rules when disk is removed from OS - tries to set array in auto-read-only mode. This can interrupt rebuild process which is started automatically, e.g. if array is mounted and spare disk is available (I/O error is detected faster than removing failed disk by mdadm). This patch prevents "mdadm -If" from setting array into "auto-read-only", by requiring exclusive open to succeed. Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-03-04 15:59:53 +11:00
Jes Sorensen	5d94384e93	IncrementalScan(): Make sure 'st' is valid before dereferencing it Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-03-04 15:56:46 +11:00
NeilBrown	7a862a020f	Don't break long strings onto multiple lines. It is best to keep strings all together so that they are easier to search for in the source code. If a string is so long that it looks ugly on one line, then maybe it should be broken into multiple lines for display too. Only strings which contain a newline can be broken into multiple lines: "It is OK to\n" "break this string\n" Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-12 13:46:53 +11:00
NeilBrown	1ade5cc15a	Consistently print program Name and __func__ in debug messages. Make dprintf() print program name and __func__, so that this messaging is consistent. Also remove all __func__ messages from pr_err(). We shouldn't leak that internal data in error message. If we really want function name there, a new pr_XXX might be wanted. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-12 13:21:17 +11:00
Pawel Baldysiak	d56dd607ba	Change way of printing name of a process Sometimes mdadm prints messages with wrong name "mdmon", and vice versa. This patch solves this problem by changing method of determining process name. Now "Name" will be set in const at start of a program, previously was hardcoded as #define. Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-12 12:11:01 +11:00
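
The two most recent commits in this list (b71de056ce and b2e4f08414) concern file descriptor handling: a descriptor equal to 0 must count as valid, and helpers are used so that descriptors such as mdfd are not left open. The following is a minimal sketch of that idea, not the code from those commits. The names `fd_is_valid()` and `fd_close()` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper (name not taken from the mdadm source).
 * Any non-negative descriptor is valid, including 0. A test such as
 * "if (fd)" or "fd > 0" would wrongly reject descriptor 0.
 */
static inline int fd_is_valid(int fd)
{
	return fd >= 0;
}

/*
 * Hypothetical helper (name not taken from the mdadm source).
 * Close *fd if it is valid and reset it to -1, so that calling this
 * again on the same variable does nothing.
 */
static inline void fd_close(int *fd)
{
	assert(fd != NULL);
	if (fd_is_valid(*fd)) {
		close(*fd);
		*fd = -1;
	}
}
```

With helpers of this shape, every exit path of a function can call `fd_close(&mdfd)` without tracking whether the descriptor was ever opened or has already been closed.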


233 Commits