Commit Graph

44 Commits

Author SHA1 Message Date
Jakub Radtke b090e91075 Modify mdstat parsing for volumes with the bitmap
Current mdstat read functionality is not working correctly
for the volumes with the write-intent bitmap.
It affects rebuild and reshape use cases.

Signed-off-by: Jakub Radtke <jakub.radtke@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2021-03-09 17:17:39 -05:00
Blazej Kucman cab9c67d46 mdmonitor: set small delay once
If mdmonitor is awakened by event, set small delay once
to deal with udev and mdadm.

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2020-10-14 11:33:47 -04:00
Mariusz Tkaczyk e230873391 Monitor: refresh mdstat fd after select
After 52209d6ee1 ("Monitor: release /proc/mdstat fd when no arrays
present") mdstat fd is closed if mdstat is empty or cannot be opened.
It causes that monitor is not able to select on mdstat. Select
doesn't fail because it gets valid descriptor to a different resource.
As a result any new event will be unnoticed until timeout (delay).

Refresh mdstat after wake up, don't poll on wrong resource.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2020-10-14 11:31:39 -04:00
Lidong Zhong 1c294b5d96 Detail: adding sync status for cluster device
On the node with /proc/mdstat is

Personalities : [raid1]
md0 : active raid1 sdb[4] sdc[3] sdd[2]
      1046528 blocks super 1.2 [3/2] [UU_]
        recover=REMOTE
      bitmap: 1/1 pages [4KB], 65536KB chunk

Let's change the 'State' of 'mdadm -Q -D' accordingly
State : clean, degraded
With this patch, it will be
State : clean, degraded, recovering (REMOTE)

Signed-off-by: Lidong Zhong <lidong.zhong@suse.com>
Acked-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2020-04-27 10:33:46 -04:00
Zhilong Liu 35c34037b5 mdadm/mdstat: correct the strncmp number 4 as 6
mdstat: it should be corrected as 6 when strncmp 'resync'.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-10-10 16:38:02 -04:00
Zhilong Liu a9db89956e mdadm/mdstat: fixup a number of '==' broken formatting
This commit doesn't change any codes, just tidy up the
code formatting.

Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-10-10 16:37:05 -04:00
Jes Sorensen d7be7d8736 mdadm: Fixup more broken logical operator formatting
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2017-05-16 13:59:43 -04:00
Tomasz Majchrzak 52209d6ee1 Monitor: release /proc/mdstat fd when no arrays present
If md kernel module is reloaded, /proc/mdstat cannot be accessed ("cat:
/proc/mdstat: No such file or directory"). The reason is mdadm monitor
still holds a file descriptor to previous /proc/mdstat instance. It
leads to really confusing outcome of the following operations - mdadm
seems to run without errors, however some udev rules don't get executed
and new array doesn't work.

Add a check if lseek was successful as it fails if md kernel module has
been unloaded - close a file descriptor then. The problem is mdadm
monitor doesn't always do it before next operation takes place. To
prevent it monitor always releases /proc/mdstat descriptor when there
are no arrays to be monitored, just in case driver unload happens in a
moment.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2016-07-21 11:37:17 -04:00
NeilBrown 9581efb1ae mdstat: discard 'dev' field, just use 'devnm'
These both have the same value, and have done since the
'devnm' concept was introduced.
So discard the pointless duplicate.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-07-02 08:15:10 +10:00
NeilBrown a7a0d8a116 Grow: use mdstat_wait to wait for delayed reshape.
Having a fix time for a wait is clumsy and can make us
wait much too long.
So use mdstat_wait and keep the mdstat_fd open.
This requires an 'mdstat_close' so it doesn't stay open
forever.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-10 11:10:54 +10:00
NeilBrown 4dd2df0966 Discard devnum in favour of devnm
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.

So get rid of devnum (a number) and use devnm (a 32char string).
eg.
  md0
  md_d2
  md_home

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-21 17:05:23 +11:00
NeilBrown 503975b9d5 Remove scattered checks for malloc success.
malloc should never fail, and if it does it is unlikely
that anything else useful can be done.  Best approach is to
abort and let some super-daemon restart.

So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit.  Then use those
removing all the tests for failure.

Also replace all "malloc;memset" sequences with 'xcalloc'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown e7b84f9d50 Introduce pr_err for printing error messages.
'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": '
cont_err() is also available.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown 9dad51d418 Monitor: fix inconsistencies in values for ->percent
->percent sometimes stores negative values recording states
like 'pending' or 'delayed'.
The value '-2' means both 'delayed' and in Monitor, 'unknown'.
Also, '-1' has a meaning but not #define.

So change the #defines to be prefixed with "RESYNC_", instead
of "PROCESS_", add new "_NONE" and "_UNKNOWN", and use correct
value in each location.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-06-04 12:31:40 +10:00
Jes Sorensen d94a4f62bf mdstat_read(): Check return value of dup() before using file descriptor
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-03 08:07:21 +11:00
Krzysztof Wojcik 2d3603ba0c Show DELAYED, PENDING status of resync process in "--detail"
Initially there is no proper translation mdstat's DELAYED/PENDING processes
to "--detail" output.
For example, if we have recover=DELAYED in mdstat, "--detail"
shows "State: recovering" and "Rebuild Status = 0%".
It was incorrect in case of process waiting on checkpoint different
than 0%. In fact rebuild status is differnt than 0% and user is misled.

The patch fix the problem. Current "--detail" command shows
in the exampe: "State: recovering (DELAYED)" and no information
about precentage.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-23 12:06:47 +10:00
Krzysztof Wojcik f6f53092ff FIX: Position calculation in mdstat_by_subdev
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-06 16:07:20 +11:00
NeilBrown 78b10e663c imsm: Prepare reshape_update in mdadm
During Online Capacity Expansion metadata has to be updated to show
array changes and allow for future assembly of array.  To do this
mdadm prepares and sends reshape_update metadata update to mdmon.
The update contains the old and new number of raid disks, and the
indices of the spare disks that will be used to fill the spaces.

This works as follows:
1. reshape_super() prepares metadata update.
2. mdadm discovers the spares and adds them to the array
3. mdadm sends the update to mdmon
4. managemon in prepare_update() allocates required memory for bigger
   device object
5. monitor in process_update() updates the metadata to record the
   new sizes and the newly assigned devices.
6. mdadm initiates the reshape

Based on code From: Adam Kwolek <adam.kwolek@intel.com>

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-16 11:45:21 +11:00
Hawrylewicz Czarnowski, Przemyslaw aa8d7dc714 fix: mdstat_read() incorrectly translates value of mdstat_ent->reshape for recovering
it results in wrong output of mdadm --detail (shows reshaping instead
of recovering)

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-09 08:42:57 +11:00
NeilBrown ab2bb0b621 mdmon: don't copy an invalid chunk_size
As chunk_size in mdstat_ent is never set, we shouldn't copy
it into a->info.array.
In fact, it is safest to get rid of the field altogether.

Reported-by: "Kwolek, Adam" <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-30 18:35:36 +11:00
NeilBrown f94c116f56 detail/wait: better handling of monitoring sync action.
Detail: report reshape and check as well as resync and recovery
Wait: if the resync is pending or delayed, wait for that too.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:25 +11:00
Marcin Labun 0ad6835c98 Fix the count of member devices in mdstat_read function.
Correction of the number of container or volume member devices (devcnt
in struct mdstat_ent). The number after the last devices was counted
towards member of devices.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-07-06 18:02:22 +10:00
NeilBrown 3b57c4661a Add mdstat_by_component
This allows finding the array which contains a given component.
Components are named using the kernel-internal string name such
as "sda1" or "hdb".
Don't return member arrays, only the contain that contains them.

Also tidy up the parsing of 'inactive' arrays in /proc/mdstat.
If we see 'inactive' we need to set 'in_devs' immediately as there
is no level coming.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-06-30 16:55:17 +10:00
Jeff DeFouw b6d7a7fbaa Fix parsing of inactive arrays in /proc/mdstat
They don't have a level, so we should not expect one, and should
expect devices instead.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-06-29 16:42:48 +10:00
NeilBrown 58a4ba2a6b mdmon: don't monitor /proc/mounts to decide when to create .pid file.
Monitoring /proc/mounts and creating a .pid file as soon as /var/run
is writable is racy.  Most distros clean all non-directories from
/var/run early in boot and if mdmon races with this it could
lose the files as soon as they are created.

Instead require that "mdmon --takeover" be run after /var is writable.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-08 17:26:18 +11:00
NeilBrown 5d4d1b26d3 mdmon: allow pid to be stored in different directory.
/var/run probably doesn't persist from early boot.
So if necessary, store in in /lib/init/rw or somewhere else
that does persist.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-04 16:47:28 +11:00
NeilBrown e736b62389 Update copyright dates and remove references to @cse.unsw.edu.au
Also removed 'paper' addresses.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-02 14:35:45 +10:00
NeilBrown 2800528713 Wait for POLLPRI on /proc or /sys files.
From 2.6.30, /proc/mounts and various /sys files will
probably always returns 'readable' to select, so we will need
to wait on POLLPRI to get the 'new data is available' signal.

When using select, this corresponds to an 'exception', so
adjust calls to select accordingly.
In one case we sometimes wait on a socket and sometime on
/proc/mounts, so we need to test which.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-04-14 14:59:24 +10:00
Dan Williams 295646b3d5 mdmon: recreate socket/pid file on SIGHUP
Allow mdmon to start while /var/run/mdadm is readonly.  Later a SIGHUP
can trigger mdmon to drop its pid and socket once /var/run/mdadm is
writable.  Of course one needs the pid to send a HUP, that can be stored
in a distribution specific rw-init directory... For now, rely on a
killall -HUP mdmon to get the files dumped.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:43 -07:00
NeilBrown 0b5ec75e01 Fix mdstat_wait_fd
It didn't necessarily wait for the fd.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 14:55:07 +10:00
Neil Brown 77472ff8d0 Introduce devname2devnum
and use it instead of opencoding.
2008-07-12 20:28:38 +10:00
Neil Brown 1ed3f38758 Remove stopped arrays.
When an array becomes inactive, clean up and forget it.

This involves signalling the manager.
2008-05-27 09:18:39 +10:00
Dan Williams 4fa5aef966 close some memory leaks
From: Dan Williams <dan.j.williams@intel.com>

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:56 +10:00
Neil Brown 549e9569c6 Merge mdmon 2008-05-15 16:48:37 +10:00
Neil Brown aba69144fd Remove spaces/tabs from ends of lines. 2007-12-14 20:13:43 +11:00
Doug Ledford e4dc510628 Mark some files FD_CLOEXEC to protect sendmail from them.
From: Doug Ledford <dledford@redhat.com>

When running with SELinux enabled and using mdadm to monitor devices,
attempts to send emails to an admin will be blocked because mdadm is
holding open /proc/mdstat without setting the FD_CLOEXEC flag.  As a
result, sendmail has an open descriptor to /proc/mdstat after the
popen() call, which SELinux decides isn't really any of sendmail's
business and so sendmail gets denied.
2007-07-09 09:59:54 +10:00
Neil Brown 8382f19bdc Add new mode: --incremental
--incremental allows arrays to be assembled one device at a time.
This is expected to be used with udev.
2006-12-21 17:10:52 +11:00
Neil Brown 8f21823f39 Minor man page and comment fixes
Thanks To: "Scott Weikart" <Scott.W@Benetech.org>
2006-10-09 11:16:33 +10:00
Neil Brown 4f589ad0c5 Just updaqte copyright dates and email address
Signed-off-by: Neil Brown <neilb@suse.de>
2006-05-19 05:25:11 +00:00
Neil Brown 22a8899586 Sort mdstat entries so that composites are well-ordered.
This means that "-Ds" lists arrays in an approprate order
for assembly.

Signed-off-by: Neil Brown <neilb@suse.de>
2006-01-31 00:39:50 +00:00
Neil Brown e5329c3747 mdadm-1.7.0 2004-08-11 02:16:01 +00:00
Neil Brown dd0781e505 mdadm-1.6.0 2004-06-04 12:03:19 +00:00
Neil Brown 98c6faba80 mdadm-1.5.0 2004-01-22 02:10:29 +00:00
Neil Brown e0d1903663 mdadm-0.8 2002-04-04 01:58:32 +00:00