mdadm

Commit Graph

Author	SHA1	Message	Date
Guoqing Jiang	f4c8a605d2	uuid.c: split uuid stuffs from util.c Currently, 'make raid6check' is build broken since commit `b06815989` ("mdadm: load default sysfs attributes after assemblation"). /usr/bin/ld: sysfs.o: in function `sysfsline': sysfs.c:(.text+0x2707): undefined reference to `parse_uuid' /usr/bin/ld: sysfs.c:(.text+0x271a): undefined reference to `uuid_zero' /usr/bin/ld: sysfs.c:(.text+0x2721): undefined reference to `uuid_zero' Apparently, the compile of mdadm or raid6check are coupled with uuid functions inside util.c. However, we can't just add util.o to CHECK_OBJS which raid6check is needed, because it caused other worse problems. So, let's introduce a uuid.c file which is independent file to fix the problem, all the contents are split from util.c. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2020-05-18 20:10:39 -04:00
Kinga Tanska	42e641abeb	Add support for Tebibytes Adding support for Tebibytes enables display size of volumes in Tebibytes and Terabytes when they are bigger than 2048 GiB (or GB). Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2020-01-21 13:37:14 -05:00
Nigel Croxon	2c2d9c48d2	mdadm: force a uuid swap on big endian The code path for metadata 0.90 calls a common routine fname_from_uuid that uses metadata 1.2. The code expects member swapuuid to be setup and usable. But it is only setup when using metadata 1.2. Since the metadata 0.90 did not create swapuuid and set it. The test (st->ss == &super1) ? 1 : st->ss->swapuuid fails. The swapuuid is set at compile time based on byte order. Any call based on metadata 0.90 and on big endian processors, the --export uuid will be incorrect. Signed-Off-by: Nigel Croxon <ncroxon@redhat.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2019-09-30 15:14:44 -04:00
Michal Zylowski	5a5b3a6725	imsm: Do not require MDADM_EXPERIMENTAL flag anymore Grow feature for IMSM metadata is currently fully supported and tested. Reshape operation is not in experimental state anymore, so usage of this flag is unnecessary. Do not require MDADM_EXPERIMENTAL flag and remove obsolete information from manual. Signed-off-by: Michal Zylowski <michal.zylowski@intel.com> Acked-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com> Acked-by: Roman Sobanski <roman.sobanski@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2018-05-31 11:30:36 -04:00
Guoqing Jiang	1b7eb962db	mdadm: improve the dlm locking mechanism for clustered raid Previously, the dlm locking only protects several functions which writes to superblock (update_super, add_to_super and store_super), and we missed other funcs such as add_internal_bitmap. We also need to call the funcs which read superblock under the locking protection to avoid consistent issue. So let's remove the dlm stuffs from super1.c, and provide the locking mechanism to the main() except assemble mode which will be handled in next commit. And since we can identify it is a clustered raid or not based on check the different conditions of each mode, so the change should not have effect on native array. And we improve the existed locking stuffs as follows: 1. replace ls_unlock with ls_unlock_wait since we should return when unlock operation is complete. 2. inspired by lvm, let's also try to use the existed lockspace first before create a lockspace blindly if the lockspace not released for some reason. 3. try more times before quit if EAGAIN happened for locking. Note: for MANAGE mode, we do not need to get lock if node just want to confirm device change, otherwise we can't add a disk to cluster since all nodes are compete for the lock. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2018-03-08 14:16:42 -05:00
Guoqing Jiang	5339f99606	To support clustered raid10 We are now considering to extend clustered raid to support raid10. But only near layout is supported, so make the check when create the array or switch the bitmap from internal to clustered. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-11-09 11:56:10 -05:00
Pawel Baldysiak	b251424242	Zeroout whole ppl space during creation/force assemble PPL area should be cleared before creation/force assemble. If the drive was used in other RAID array, it might contains PPL from it. There is a risk that mdadm recognizes those PPLs and refuses to assemble the RAID due to PPL conflict with created array. Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-10-02 16:11:42 -04:00
Jes Sorensen	b7a462e561	util: Code is 80 characters wide Lets not make things uglier than they need to be. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-09-29 18:15:23 -04:00
Mariusz Tkaczyk	a822017f30	Detail: correct output for active arrays The check for inactive array is incorrect as it compares it against active array. Introduce a new function md_is_array_active so the check is consistent across the code. As the output contains list of disks in the array include this information in sysfs read. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-08-16 08:19:38 +00:00
Tomasz Majchrzak	9b8fea914f	Detail: don't exit if ioctl has been successful When GET_ARRAY_INFO ioctl is successful, mdadm exits with an error. It breaks udev and no links in /dev/md are created. Also change debug print to error print in the message indicating lack of the link to facilitate debugging similar issues in the future. Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-24 13:28:33 -04:00
Jes Sorensen	d7be7d8736	mdadm: Fixup more broken logical operator formatting Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-16 13:59:43 -04:00
Jes Sorensen	fc54fe7a7e	mdadm: Fixup a large number of bad formatting of logical operators Logical operators never belong at the beginning of a line. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-16 13:52:15 -04:00
Alexey Obitotskiy	4b57ecf6ce	Add sector size as spare selection criterion Add sector size as new spare selection criterion. Assume that 0 means there is no requirement for the sector size in the array. Skip disks with unsuitable sector size when looking for a spare to move across containers. Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 14:18:38 -04:00
Alexey Obitotskiy	fbfdcb06dc	Allow more spare selection criteria Disks can be moved across containers in order to be used as a spare drive for rebuild. At the moment the only requirement checked for such disk is its size (if it matches donor expectations). In order to introduce more criteria rename corresponding superswitch method to more generic name and move function parameter to a structure. This change is a big edit but it doesn't introduce any changes in code logic, it just updates function naming and parameters. Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 14:18:36 -04:00
Zhilong Liu	9e04ac1c43	mdadm/util: unify stat checking blkdev into function declare function stat_is_blkdev() to integrate repeated stat checking blkdev operations, it returns 'true/1' when it is a block device, and returns 'false/0' when it isn't. The devname is necessary parameter, rdev is optional, parse the pointer of dev_t rdev, if valid, assigned device number to dev_t *rdev, if NULL, ignores. Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-05 11:05:32 -04:00
Zhilong Liu	0a6bff09d4	mdadm/util: unify fstat checking blkdev into function declare function fstat_is_blkdev() to integrate repeated fstat checking block device operations, it returns true/1 when it is a block device, and returns false/0 when it isn't. The fd and devname are necessary parameters, rdev is optional, parse the pointer of dev_t rdev, if valid, assigned the device number to dev_t *rdev, if NULL, ignores. Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-05 11:04:02 -04:00
Jes Sorensen	9db2ab4e9b	util: md_array_valid(): Introduce md_array_valid() helper Using md_get_array_info() to determine if an array is valid is broken during creation, since the ioctl() returns -ENODEV if the device is valid but not active. Where did I leave my stash of brown paper bags? Fixes: ("40b054e mdopen/open_mddev: Use md_get_array_info() to determine valid array") Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-03 16:15:16 -04:00
Jes Sorensen	44356754ec	util: Get rid of unused enough_fd() enough_fd() is no longer used, so lets get rid of it. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-04-20 11:53:30 -04:00
Jes Sorensen	3ab8f4bf33	util: Introduce md_array_active() helper Rather than querying md_get_array_info() to determine whether an array is valid, do the work in md_array_active() using sysfs, and fall back on md_get_array_info() if sysfs fails. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-04-20 00:12:34 -04:00
Jes Sorensen	32141c1765	Retire mdassemble mdassemble doesn't handle container based arrays, no support for sysfs, etc. It has not been actively maintained for years, so time to send it off to retirement. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-04-11 12:54:26 -04:00
Jes Sorensen	303949f6f0	util: Finally kill off md_get_version() Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-04-05 15:49:18 -04:00
Jes Sorensen	700483a223	util/set_array_info: Simplify code since md_get_version returns a constant md_get_version() always returns (0 * 1000) + (90 * 100) + 3, so no point in calling it. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-04-05 15:06:24 -04:00
Jes Sorensen	f5c924f441	util/must_be_container: Use sysfs_read(GET_VERSION) to determine valid array Use sysfs_read() instead of ioctl(RAID_VERSION) to determine this is in fact a valid raid array fd. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-04-05 14:01:30 -04:00
Jes Sorensen	018a488238	util: Introduce md_set_array_info() Switch from using ioctl(SET_ARRAY_INFO) to using md_set_array_info() Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 15:43:53 -04:00
Jes Sorensen	d97572f5a5	util: Introduce md_get_disk_info() This removes all the inline ioctl calls for GET_DISK_INFO, allowing us to switch to sysfs in one place, and improves type checking. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 15:23:50 -04:00
Jes Sorensen	9cd39f0155	util: Introduce md_get_array_info() Remove most direct ioctl calls for GET_ARRAY_INFO, except for one, which will be addressed in the next patch. This is the start of the effort to clean up the use of ioctl calls and introduce a more structured API, which will use sysfs and fall back to ioctl for backup. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 14:35:41 -04:00
Jes Sorensen	efa295309f	util: Cosmetic changes Fixup a number of indentation and whitespace issues Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 12:05:12 -04:00
NeilBrown	1ab9ed2afb	Add 'force' flag to hot_remove_disk(). In rare circumstances, the short period that hot_remove_disk() waits isn't long enough to IO to complete. This particularly happens when a device is failing and many retries are still happening. We don't want to increase the normal wait time for "mdadm --remove" as that might be use just to test if a device is active or not, and a delay would be problematic. So allow "--force" to mean that mdadm should try extra hard for a --remove to complete, waiting up to 5 seconds. Note that this patch fixes a comment which claim the previous wait time was half a second, where it was really 50msec. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-28 14:32:35 -04:00
NeilBrown	fdd015696c	Introduce sys_hot_remove_disk() The new hot_remove_disk() will retry HOT_REMOVE_DISK several times in the face of EBUSY. However we sometimes remove a device by writing "remove" to the "state" attributed. This should be retried as well. So introduce sys_hot_remove_disk() to repeat this action a few times. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-28 14:30:49 -04:00
NeilBrown	2dd271fe70	Retry HOT_REMOVE_DISK a few times. HOT_REMOVE_DISK can fail with EBUSY if there are outstanding IO request that have not completed yet. It can sometimes be helpful to wait a little while for these to complete. We already do this in impose_level() when reshaping a device, but not in Manage.c in response to an explicit --remove request. So create hot_remove_disk() to central this code, and call it where-ever it makes sense to wait for a HOT_REMOVE_DISK to succeed. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-28 14:25:23 -04:00
Xiao Ni	ff9239ee31	mdadm: Specify enough length when write to buffer In Detail.c the buffer path in function Detail is defined as path[200], in fact the max length of content which needs to write to the buffer is 287. Because the length of dname of struct dirent is 255. During building it reports error: error: ‘%s’ directive writing up to 255 bytes into a region of size 189 [-Werror=format-overflow=] In function examine_super0 there is a buffer nb with length 5. But it needs to show an int type argument. The length of max number of int is 10. So the buffer length should be 11. In human_size function the length of buf is 30. During building there is an error: output between 20 and 47 bytes into a destination of size 30. Change the length to 47. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-17 15:58:16 -04:00
Mariusz Dabrowski	31208db97e	Always return last partition end address in 512B blocks For 4K disks 'endofpart' is an index of the last 4K sector used by partition. mdadm is using number of 512-byte sectors, so value returned by get_last_partition_end must be multiplied by 8 for devices with 4K sectors. Also, unused 'ret' variable has been removed. Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-12-13 09:09:25 -05:00
Mariusz Dabrowski	41b06495ba	Use disk sector size value to set offset for reading GPT mdadm is using invalid byte-offset while reading GPT header to get partition info (size, first sector, last sector etc.). Now this offset is hardcoded to 512 bytes and it is not valid for disks with sector size different than 512 bytes because MBR and GPT headers are aligned to LBA, so valid offset for 4k drives is 4096 bytes. Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-12-12 14:26:22 -05:00
Pawel Baldysiak	329715091c	Add function for getting member drive sector size This patch introduces the function for getting sector size of given device (fd). Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-11-17 09:24:18 -05:00
James Clarke	8e2bca513e	Fix bus error when accessing MBR partition records Since the MBR layout only has partition records as 2-byte aligned, the 32-bit fields in them are not aligned. Thus, they cannot be accessed on some architectures (such as SPARC) by using a "struct MBR_part_record *" pointer, as the compiler can assume that the pointer is properly aligned. Instead, the records must be accessed by going through the MBR struct itself every time. Signed-off-by: James Clarke <jrtc27@jrtc27.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-10-19 12:38:02 -04:00
Mariusz Dabrowski	fa219dd26a	Fix RAID metadata check mdadm recognizes devices with partition table as part of an RAID array and invalid warning message is displayed. After this fix proper warning messages are being displayed for MBR/GPT disks and devices with RAID metadata. Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-09-22 11:35:02 -04:00
Jes Sorensen	c5f71c2417	Introduce random_uuid() helper function This gets rid of 5 nearly identical copies of the same code, and reduces the binary size of mdadm by over 700 bytes on x86_64. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-08-15 15:41:34 -04:00
Jes Sorensen	9f0ad56be0	util: Never have if and return on the same line Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-08-11 15:48:47 -04:00
Mike Lovell	13db17bd1f	Use dev_t for devnm2devid and devid2devnm Commit `4dd2df0966` added a trip through makedev(), major(), and minor() for device major and minor numbers. This would cause mdadm to fail in operating on a device with a minor number bigger than (2^19)-1 due to it changing from dev_t to a signed int and back. Where this was found as a problem was when a array was created with a device specified as a name like /dev/md/raidname and there were already 128 arrays on the system. In this case, mdadm would chose 1048575 ((2^20)-1) for the array and minor number. This would cause the major and minor number to become negative when generated from devnm2devid() and passed to major() and minor() in open_dev_excl(). open_dev_excl() would then call dev_open() which would detect the negative minor number and call open() on the *char containing the major:minor pair which isn't a valid file. Signed-off-by: Mike Lovell <mlovell@bluehost.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-06-03 15:35:26 -04:00
Jes Sorensen	15d230f730	util: Remove unnecesary NULL pointer checks when calling sysfs_free() Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-08 12:19:03 -05:00
Guoqing Jiang	21f541cc31	Remove dead code about LKF_CONVERT flag Since flags is only set as LKF_NOQUEUE, the code with LKF_CONVERT flag should be delete. Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-07 15:21:04 -05:00
Maxin B. John	986b868817	util.c: include poll.h instead of sys/poll.h This fixes a compile warning when building with musl: In file included from util.c:27:0: | qemux86-64/usr/include/sys/poll.h:1:2: error: #warning redirecting incorrect #include <sys/poll.h> to <poll.h> [-Werror=cpp] | #warning redirecting incorrect #include <sys/poll.h> to <poll.h> | ^ Signed-off-by: Maxin B. John <maxin.john@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-02-08 10:59:00 -05:00
Xiao Ni	1d13b59960	Fix some type comparison problems As `26714713cd` said, 32 bit signed timestamps will overflow in the year 2038. It already changed the utime and ctime in struct mdu_array_info_s from int to unsigned int. So we need to change the values that compared with them to unsigned int too. Signed-off-by : Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-02-08 10:49:22 -05:00
NeilBrown	7071320a18	Assorted fixed for a "make everything" build Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-28 13:28:58 +11:00
Guoqing Jiang	32539f74d2	util: fix wrong return value of cluster_get_dlmlock Actually lksb.sb_status means that a node got the lock or not instead of the return value of dlm_lock. Signed-off-by: Guoqing Jiang <gqjiang@suse.com>	2016-01-27 11:43:02 +11:00
Guoqing Jiang	81a8a69415	mdadm: improve the safeguard for change cluster raid's sb This commit does the following jobs: 1. rename is_clustered to dlm_funs_ready since it match the function better. 2. st->cluster_name can't be use to identify the raid is a clustered or not, we should check the bitmap's version to perform the identification. 3. for cluster_get_dlmlock/cluster_release_dlmlock funcs, both of them just need the lockid as parameter since the cluster name can get by get_cluster_name(). Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-12-17 09:53:37 +11:00
Guoqing Jiang	e80357f825	Make cmap_* also has same policy as dlm_* Let libcmap lib and related funs also only need one-time setup during mdadm running period. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-10-21 11:19:35 +11:00
Guoqing Jiang	d15a1f72bd	Safeguard against writing to an active device of another node Modifying an existing device's superblock or creating a new superblock on an existing device needs to be checked because the device could be in use by another node in another array. So, we check this by taking all superblock locks in userspace so that we don't step onto an active device used by another node and safeguard against accidental edits. After the edit is complete, we release all locks and the lockspace so that it can be used by the kernel space. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-10-21 11:19:05 +11:00
NeilBrown	7d55dca2cc	mdassemble: don't try to perform cluster check. mdassemble is meant to be small and simple, so avoid trying to check for a cluster. Currently it doesn't, but it still includes the code, which doesn't build because the library isn't provided. So just exclude the get_cluster_name code from mdassemble. Signed-off-by: NeilBrown <neilb@suse.com>	2015-08-03 11:53:01 +10:00
Guoqing Jiang	4de9091302	Add a new clustered disk A clustered disk is added by the traditional --add sequence. However, other nodes need to acknowledge that they can "see" the device. This is done by --cluster-confirm: --cluster-confirm SLOTNUM:/dev/whatever (if disk is found) or --cluster-confirm SLOTNUM:missing (if disk is not found) The node initiating the --add, has the disk state tagged with MD_DISK_CLUSTER_ADD and the one confirming tag the disk with MD_DISK_CANDIDATE. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:21:29 +10:00
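
Note on commits 2dd271fe70, fdd015696c and 1ab9ed2afb: these describe retrying a removal a few times while it fails with EBUSY, with a short pause between attempts. The sketch below illustrates that retry pattern only; it is not the mdadm implementation. The names `remove_op_fn`, `retry_while_busy`, `struct fake_dev` and `fake_remove` are hypothetical, and the try count and delay are left to the caller rather than taken from the source.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical callback: returns 0 on success, -1 with errno set on failure. */
typedef int (*remove_op_fn)(void *ctx);

/*
 * Call op() until it succeeds, fails with an error other than EBUSY,
 * or max_tries attempts have been made. Sleeps delay_ms between attempts
 * (not after the last one, so errno still comes from op()).
 * Returns 0 on success, -1 otherwise. max_tries is assumed to be >= 1.
 */
int retry_while_busy(remove_op_fn op, void *ctx,
		     int max_tries, unsigned int delay_ms)
{
	int ret = -1;
	int tries;

	for (tries = 0; tries < max_tries; tries++) {
		ret = op(ctx);
		if (ret == 0 || errno != EBUSY)
			break;
		if (tries + 1 < max_tries) {
			struct timespec ts = {
				.tv_sec = delay_ms / 1000,
				.tv_nsec = (long)(delay_ms % 1000) * 1000000L,
			};
			nanosleep(&ts, NULL);
		}
	}
	return ret;
}

/* Illustrative stand-in for a device that stays busy for a few attempts. */
struct fake_dev {
	int busy_left;
};

int fake_remove(void *ctx)
{
	struct fake_dev *d = ctx;

	if (d->busy_left > 0) {
		d->busy_left--;
		errno = EBUSY;
		return -1;
	}
	return 0;
}
```

A caller that wants a longer overall wait, as the "--force" commit describes, would pass a larger `max_tries` or `delay_ms`; the callback could wrap either an ioctl or a sysfs write.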

304 Commits
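
Note on commits 31208db97e and 41b06495ba: both concern disks whose logical sector size is not 512 bytes. The arithmetic they describe can be sketched as below. The function names `to_512b_sectors` and `gpt_header_offset` are hypothetical and chosen for illustration; the sketch assumes the sector size is a multiple of 512 and relies on the primary GPT header being located at LBA 1.

```c
#include <stdint.h>

/*
 * Convert a count or index expressed in native sectors into 512-byte
 * sectors. For a 4096-byte sector size this multiplies by 8.
 * Assumes sector_size is a multiple of 512.
 */
uint64_t to_512b_sectors(uint64_t native_sectors, unsigned int sector_size)
{
	return native_sectors * (sector_size / 512);
}

/*
 * Byte offset of the primary GPT header. The header sits at LBA 1,
 * so the offset equals one logical sector: 512 bytes on 512-byte
 * disks, 4096 bytes on 4K disks.
 */
uint64_t gpt_header_offset(unsigned int sector_size)
{
	return (uint64_t)sector_size * 1;
}
```

For example, a partition ending at native sector 100 on a 4K disk ends at 512-byte sector 800, and the GPT header on that disk is read from byte offset 4096 rather than 512.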