Commit Graph

97 Commits

Author SHA1 Message Date
NeilBrown e98ef22509 mdmon: improve switchroot handling.
The change to get mdmon to re-exec itself from the switchroot
filesystem broke switchroot in various ways.  This fixes it.

If the switchroot path is not '/', mdmon will find the pid and
socket for the monitor, chroot to the new root, and exec mdmon
passing the pid in argv[2] and the socket in stdin.

If the switchroot path is actually a number, mdmon will not chroot,
but will kill that pid before taking over the array.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-04 12:04:18 +11:00
NeilBrown af7ca33487 mdmon: simplify try_kill_monitor
After we SIGTERM the monitor we need to wait for it to finish up.
Rather than the complexity of waiting for every md array to be clean,
we can simply read from the sock connected to the monitor.
When the monitor dies, we will get EOF.  Before then we will block.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-04 12:04:16 +11:00
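
The EOF-based wait described in af7ca33487 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit; `wait_for_eof` is a hypothetical name, and the descriptor is assumed to be a stream socket or pipe whose other end is held by the monitor.

```c
#include <errno.h>
#include <unistd.h>

/* Block until the peer closes its end of `fd` (hypothetical helper).
 * Returns 0 on EOF, -1 on a read error other than EINTR. */
static int wait_for_eof(int fd)
{
	char buf[64];
	ssize_t n;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == 0)
			return 0;	/* peer has exited or closed: EOF */
		if (n < 0 && errno != EINTR)
			return -1;	/* genuine read error */
		/* n > 0: discard stray data; EINTR: retry the read */
	}
}
```

The appeal of this approach is that the kernel does the bookkeeping: the read blocks for as long as the other process holds the connection open, so no per-array polling is needed.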
NeilBrown 3e7312a96c mdmon: remove scan variable from mdmon()
It is redundant: at each place where it is used, it can only
have one possible value.
Also change the related arg to mdmon() to have a more meaningful
name.
And make mdmon() static.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-04 12:04:15 +11:00
NeilBrown 417a4b046d mdmon: fix fd leak and possible buffer overrun.
We normally wouldn't close 'fd', and as 'buf' might not have
had a nul, strstr could have overrun it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-01-29 10:15:15 +11:00
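
417a4b046d fixes two problems: a descriptor that was not always closed, and a buffer passed to strstr() without a guaranteed terminator. A minimal sketch of the safe pattern, assuming a hypothetical helper `file_contains` (not a function in mdadm) and an arbitrary 1024-byte buffer:

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Report whether `needle` occurs in the first bytes of the file at `path`.
 * Returns 1 if found, 0 if not, -1 if the file cannot be opened or read. */
static int file_contains(const char *path, const char *needle)
{
	char buf[1024];
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	n = read(fd, buf, sizeof(buf) - 1);	/* leave room for the NUL */
	close(fd);				/* closed whether read worked or not */
	if (n < 0)
		return -1;
	buf[n] = '\0';				/* read() never adds a terminator */
	return strstr(buf, needle) != NULL;
}
```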
NeilBrown 1373b07d75 mdmon: lock current memory as well as future memory.
mlockall(MCL_FUTURE) only locks mappings that have not yet
been created.  To lock all memory used by the process, we need
 MCL_CURRENT | MCL_FUTURE

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 13:04:16 +11:00
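
The flag combination named in 1373b07d75 looks like this in isolation. `lock_all_memory` is a hypothetical wrapper for illustration; the call commonly fails with EPERM or ENOMEM when the process lacks the privilege or the memlock limit to lock its pages.

```c
#include <sys/mman.h>

/* Lock pages that are already mapped (MCL_CURRENT) as well as pages
 * mapped later (MCL_FUTURE). Returns 0 on success, -1 with errno set. */
static int lock_all_memory(void)
{
	return mlockall(MCL_CURRENT | MCL_FUTURE);
}
```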
Dan Williams 9f1da82421 mdmon: preserve socket over chroot
Connect to the monitor in the old namespace and use that connection for
WaitClean requests when stopping the victim mdmon instance.  This allows
ping_monitor() to work post chroot().

Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:58 -07:00
Dan Williams b928b5a038 mdmon: exec(2) when the switchroot argument is not "/"
Try to execute mdmon from the target namespace.  When used for initramfs
handovers we need to drop all references to the initramfs filesystem for
that memory to be freed.

Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:58 -07:00
Dan Williams 96a8270d46 mdmon: avoid writes in the startup path for mdmon on root arrays
When killing a previous monitor be careful not to cause writes to the
filesystem until the reads necessary to get the monitor operational have
completed.

The code is already prepared for errors creating the pid and socket
files, so simply defer creation of these files until after the first
call to manage().

Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:57 -07:00
Hans de Goede f5df5d69a7 mdmon: fix freeing unallocated memory
mdmon was creating a supertype struct with malloc, and thus not
necessarily getting zero-d memory.

This was causing it to segfault when called like this from the initrd:
/sbin/mdmon /proc/mdstat /sysroot

The problem was that load_super_imsm would get called on the non-zeroed
super struct, which in turn calls free_super_imsm, which checks st->sb,
which should be zero but isn't, and then starts freeing bogus memory.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-24 06:52:06 -07:00
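
The failure mode in f5df5d69a7 can be illustrated with a reduced model. The struct and function names below are hypothetical stand-ins, not mdadm's definitions; the sketch only shows why zeroed allocation makes the "free whatever is already loaded" step safe.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the supertype struct; only the field
 * relevant to the bug is modelled. */
struct supertype_sketch {
	void *sb;	/* loaded superblock, or NULL if none */
};

/* calloc() returns zeroed memory, so sb starts out as NULL. */
static struct supertype_sketch *alloc_supertype(void)
{
	return calloc(1, sizeof(struct supertype_sketch));
}

/* Release any loaded superblock. free(NULL) is a no-op, so this is
 * safe on a freshly allocated struct; with malloc() sb would hold
 * an indeterminate value and this would free bogus memory. */
static void free_super(struct supertype_sketch *st)
{
	free(st->sb);
	st->sb = NULL;
}
```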
NeilBrown e736b62389 Update copyright dates and remove references to @cse.unsw.edu.au
Also removed 'paper' addresses.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-02 14:35:45 +10:00
Dan Williams 1b34f51997 mdmon: update cmdline when scanning
Allows ps -ax | grep mdmon to show:
	mdmon md127
	mdmon md126
...rather than:
	mdmon /proc/mdstat
	mdmon /proc/mdstat

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00
Dan Williams 7da80e6faa mdmon: fix removed disk handling
Use SKIP_GONE_DEVS when reading the container, and correct some confused
logic in manage_new().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00
Dan Williams 5746141e3f mdmon: make switchroot an undecorated option
Simplify the usage from:
	mdmon [--switch-root dir] /device/name/for/container
to...
	mdmon /device/name/for/container [target_dir]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:50 -07:00
Dan Williams 1ffd2840df mdmon: support scanning for containers
When the given container is '/proc/mdstat' then launch an mdmon instance
per container found in /proc/mdstat.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:50 -07:00
Dan Williams 6f4098a6fd mdmon: expand permissible container device names
Allow any path that dereferences to an md device to be used in addition
to the current symbolic md device names.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-20 01:36:50 -07:00
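
One way to test whether an arbitrary path "dereferences to an md device", as 6f4098a6fd puts it, is to stat() it and inspect the device number. This is a sketch under assumptions, not the commit's code: `is_md_device` is a hypothetical name, and only the fixed md major number 9 is checked, so partitionable arrays that use a dynamically assigned major are not recognised.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

#define MD_MAJOR 9	/* major number of standard md block devices on Linux */

/* Return 1 if `path` resolves (following symlinks) to a block device
 * whose major number is MD_MAJOR, 0 otherwise. */
static int is_md_device(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return 0;	/* missing or inaccessible path */
	if (!S_ISBLK(st.st_mode))
		return 0;	/* not a block device */
	return major(st.st_rdev) == MD_MAJOR;
}
```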
Dan Williams c1363b408f mdmon: fix missing ->subarray initialization
This can cause mdmon to fail at startup.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-01-13 15:46:05 -07:00
NeilBrown e8a70c8958 mdmon: pass symbolic name to mdmon instead of device name.
Now that names in /dev are usually created (eventually) by udev,
it isn't really safe to rely on finding a name in /dev to pass to
mdmon to identify which array to monitor.
And it isn't really necessary to have a name in /dev.
So just pass the symbolic name, e.g. md127 or md123.

Change util.c to pass that name, and change mdmon to process the
name sensibly.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-11-20 14:51:42 +11:00
NeilBrown 97f734fde2 A couple of bugfixes found by suse autobuilding:
1/ ia64 appears to have __clone2, not clone.
2/ Including "++" in the arg to a macro is a bad thing to do.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-11-07 14:46:30 +11:00
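
The second point in 97f734fde2 is a general C hazard rather than anything specific to mdadm. The generic sketch below (the names `MAX_MACRO` and `max_int` are illustrative, not taken from the source) shows how an argument with a side effect can be evaluated more than once by a function-like macro.

```c
/* Function-like macro: each argument is substituted textually wherever
 * it appears, so an argument with a side effect may run more than once. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* A function evaluates each argument exactly once. */
static inline int max_int(int a, int b)
{
	return a > b ? a : b;
}
```

With `int i = 5;`, the expression `MAX_MACRO(i++, 3)` increments `i` twice and yields 6, whereas `max_int(i++, 3)` increments it once and yields 5.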
Dan Williams ce744c97bc Assemble: revert preliminary -As support
I have seen the light.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-11-04 20:51:11 +11:00
NeilBrown 40ebbb9cfe util: make env checking more generic
Change the "env_check_mdmon" function to be more generic, accepting
an environment variable name, as we will soon have a new use for it.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-11-04 10:35:43 +11:00
Dan Williams 71d60c480a Preliminary -As support for container member arrays
Given an mdadm.conf like the following allow /dev/imsm and /dev/md/r1 to be
created by "mdadm -As".

DEVICES partitions 
ARRAY /dev/imsm metadata=imsm auto=md UUID=b98f5dbe-aa859e7b-0e369b89-a80986d4 
ARRAY /dev/md/r1 container=/dev/imsm member=0 auto=mdp UUID=3538e39c-b397c2e9-1aa031f9-2bc0eca4 
   spares=1

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-10-28 10:55:31 -07:00
Dan Williams a54d52625a update copyright headers
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-10-28 10:55:29 -07:00
Dan Williams 8aae4219a2 mdmon: suicide prevention
mdmon cannot remove the pidfile at shutdown because it needs to stay
running across the "mount -o remount,ro /" event.  When it relaunches
after a reboot there is a good chance that the pid will match what was
there previously.  The result is that the "take over for unresponsive
mdmon" logic results in self termination.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-10-15 14:43:57 -07:00
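
The message for 8aae4219a2 does not spell out the fix, but the obvious guard against the scenario it describes is to compare the recorded pid with our own before signalling. The sketch below assumes that approach; `should_kill_recorded_pid` is a hypothetical name.

```c
#include <sys/types.h>
#include <unistd.h>

/* Decide whether a pid read from a leftover pidfile should be signalled.
 * A recorded pid equal to our own means the file predates this process,
 * so there is no other monitor to take over from. */
static int should_kill_recorded_pid(pid_t recorded)
{
	if (recorded <= 0)
		return 0;	/* no valid pid recorded */
	if (recorded == getpid())
		return 0;	/* that is us: do not self-terminate */
	return 1;
}
```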
Dan Williams 27dec8fae3 quiet WaitClean()
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-10-15 14:43:57 -07:00
Dan Williams 13047e4c07 mdmon: --switch-root
For raid rootfs we cannot run the array unmonitored for any length of
time.  At least XFS will not mount/replay the journal if the underlying
block device is readonly (FIXME it also seems that XFS does not always
honor the ro status of the backing device as I was able to hit the
BUG_ON(mddev->ro == 1) in md_write_start... but I digress).

So we need to start mdmon in the initramfs before '/' is mounted and
then restart it after the real rootfs is available.  Upon seeing the
--switch-root option, mdmon will kill any victims in the current
/var/run/mdadm directory and then chroot(2) before continuing.

The option is deliberately called 'switch-root' instead of 'chroot' to
hopefully indicate that this is different than doing "chroot mdmon
/dev/imsm".

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-10-15 14:43:57 -07:00
Dan Williams 883a6142e6 mdmon: wait after trying to kill
Now that mdmon handles SIGTERM, if another monitor wants to take over it
should wait until all managed arrays are clean.  So make WaitClean()
available to mdmon and teach try_kill_monitor() to wait on each subarray
in the container.

...since we may be communicating with a dying process, we need to
block SIGPIPE earlier.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-10-15 14:43:57 -07:00
Dan Williams 6144ed4414 mdmon: terminate clean
We generally don't want mdmon to be terminated, but if a SIGTERM gets
through try to leave the monitored arrays in a clean state, block
attempts to mark the array dirty, and stop servicing the socket.

When we are killed by SIGTERM, don't remove the pidfile; let that be
cleaned up by the next monitor.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-10-15 14:43:57 -07:00
Dan Williams 695154b2e7 mdmon: periodically retry to create the socket
If initial socket creation fails with EROFS, set a periodic alarm to wake up
the manager and retry.  Include a kernel patch that will wake us up if
the mount flags are changed.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-10-15 14:15:52 -07:00
Dan Williams 3d2c4fc7b6 trivial warn_unused_result squashing
Made the mistake of recompiling the F9 mdadm rpm which has a patch to
remove -Werror and add "-Wp,-D_FORTIFY_SOURCE -O2" which turns on lots
of errors:

config.c:568: warning: ignoring return value of asprintf
Assemble.c:411: warning: ignoring return value of asprintf
Assemble.c:413: warning: ignoring return value of asprintf
super0.c:549: warning: ignoring return value of posix_memalign
super0.c:742: warning: ignoring return value of posix_memalign
super0.c:812: warning: ignoring return value of posix_memalign
super1.c:692: warning: ignoring return value of posix_memalign
super1.c:1039: warning: ignoring return value of posix_memalign
super1.c:1155: warning: ignoring return value of posix_memalign
super-ddf.c:508: warning: ignoring return value of posix_memalign
super-ddf.c:645: warning: ignoring return value of posix_memalign
super-ddf.c:696: warning: ignoring return value of posix_memalign
super-ddf.c:715: warning: ignoring return value of posix_memalign
super-ddf.c:1476: warning: ignoring return value of posix_memalign
super-ddf.c:1603: warning: ignoring return value of posix_memalign
super-ddf.c:1614: warning: ignoring return value of posix_memalign
super-ddf.c:1842: warning: ignoring return value of posix_memalign
super-ddf.c:2013: warning: ignoring return value of posix_memalign
super-ddf.c:2140: warning: ignoring return value of write
super-ddf.c:2143: warning: ignoring return value of write
super-ddf.c:2147: warning: ignoring return value of write
super-ddf.c:2150: warning: ignoring return value of write
super-ddf.c:2162: warning: ignoring return value of write
super-ddf.c:2169: warning: ignoring return value of write
super-ddf.c:2172: warning: ignoring return value of write
super-ddf.c:2176: warning: ignoring return value of write
super-ddf.c:2181: warning: ignoring return value of write
super-ddf.c:2686: warning: ignoring return value of posix_memalign
super-ddf.c:2690: warning: ignoring return value of write
super-ddf.c:3070: warning: ignoring return value of posix_memalign
super-ddf.c:3254: warning: ignoring return value of posix_memalign
bitmap.c:128: warning: ignoring return value of posix_memalign
mdmon.c:94: warning: ignoring return value of write
mdmon.c:221: warning: ignoring return value of pipe
mdmon.c:327: warning: ignoring return value of write
mdmon.c:330: warning: ignoring return value of chdir
mdmon.c:335: warning: ignoring return value of dup
monitor.c:415: warning: rv may be used uninitialized in this function

...some of these like the write() ones are not so trivial so save those
fixes for the next patch.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-10-15 14:15:52 -07:00
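
The warnings listed in 3d2c4fc7b6 all come from discarded return values. The wrappers below sketch how the two most frequent cases, posix_memalign() and write(), are typically checked; `alloc_aligned` and `write_all` are hypothetical helpers, not functions from mdadm.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* posix_memalign() reports failure through its return value (an errno
 * code), not through errno itself. Returns NULL on failure. */
static void *alloc_aligned(size_t align, size_t size)
{
	void *p = NULL;

	if (posix_memalign(&p, align, size) != 0)
		return NULL;
	return p;
}

/* write() may be short or interrupted, so loop until everything is out.
 * Returns 0 once all `len` bytes are written, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: retry */
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```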
NeilBrown 5775279572 Remove .sock file when removing .pid file for mdmon 2008-09-18 16:43:59 +10:00
Dan Williams 295646b3d5 mdmon: recreate socket/pid file on SIGHUP
Allow mdmon to start while /var/run/mdadm is readonly.  Later a SIGHUP
can trigger mdmon to drop its pid and socket once /var/run/mdadm is
writable.  Of course one needs the pid to send a HUP, that can be stored
in a distribution specific rw-init directory... For now, rely on a
killall -HUP mdmon to get the files dumped.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-09-15 20:58:43 -07:00
Dan Williams 7bc1962f8c mdmon: remove devices from container
Once the monitor thread has kicked a drive from all managed arrays mdadm
-r is permitted.  We are guaranteed that the drive is marked failed at
this point, so allow the drive to be re-added as a spare.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-19 14:55:12 +10:00
Dan Williams 16ddab0daf mdmon: don't fork if DEBUG 2008-07-24 17:26:24 -07:00
NeilBrown 9fe3204317 mdmon: fork and run as a daemon.
start_mdmon now waits for mdmon to complete initialisation and,
importantly, listen on the socket, before continuing.

Signed-off-by: Neil Brown <neilb@suse.de>
2008-07-18 16:37:20 +10:00
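
The handshake described in 9fe3204317, where the starter waits until the daemon has finished initialising, is often built on a pipe. The sketch below shows that pattern under assumptions: the function names are hypothetical, `init_ok` and `init_fail` are stand-ins for real initialisation (such as listening on the socket), and the child exits where a real daemon would enter its main loop.

```c
#include <sys/types.h>
#include <unistd.h>

/* Stand-in initialisation callbacks for illustration only. */
static int init_ok(void)   { return 0; }
static int init_fail(void) { return -1; }

/* Fork a child and block until it reports that initialisation is done.
 * Returns the child's pid on success, -1 on failure. The caller is
 * responsible for reaping the child. */
static pid_t start_and_wait(int (*child_init)(void))
{
	int pfd[2];
	char status;
	pid_t pid;
	ssize_t n;

	if (pipe(pfd) != 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(pfd[0]);
		close(pfd[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: initialise, then report the outcome to the parent */
		close(pfd[0]);
		status = (child_init() == 0) ? 0 : 1;
		if (write(pfd[1], &status, 1) != 1)
			_exit(2);
		close(pfd[1]);
		_exit(status);	/* a real daemon would run its main loop here */
	}
	/* parent: blocks until the child writes its status or dies */
	close(pfd[1]);
	n = read(pfd[0], &status, 1);
	close(pfd[0]);
	if (n != 1 || status != 0)
		return -1;
	return pid;
}
```

If the child dies before reporting, the parent's read returns 0 (EOF) and the start is treated as a failure, so the parent never hangs on a crashed child.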
Dan Williams 2cc98f9ea5 mdmon: close small window of invalid mon_tid
There is a small chance that the manager tries to wake the monitor before
mon_tid is set.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-14 14:59:39 -07:00
Neil Brown c7c149300b Keep container device open in monitor
... so that it cannot be stopped while there are active arrays.
I don't know where that second 'close' came from ....
2008-07-12 20:27:42 +10:00
Neil Brown bfa44e2e7a Revise message passing code.
More here
2008-07-12 20:27:40 +10:00
Neil Brown 4d43913ce0 Remove mgr_pipe for communicating from manage to monitor.
Data is being passed in shared memory, so the pipe is only being
used as a wakeup.  This can more easily be done with a thread-signal.
2008-07-12 20:27:40 +10:00
Neil Brown 2f64e61a50 Remove mon_pipe for communicating from monitor to manager
The returned value was never used, and we don't really want
this return path anyway as writing to a pipe could conceivably
block, and the monitor must not block.
2008-07-12 20:27:40 +10:00
Dan Williams 5b65005fc8 imsm: reenable mdmon
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-16 15:36:41 -07:00
Neil Brown e0d6609fe6 Exit when there are no more arrays to manage. 2008-05-27 09:18:41 +10:00
Neil Brown 5869a76c90 Remove supertype->devfd
It is never used.
2008-05-27 09:18:40 +10:00
Neil Brown 1ed3f38758 Remove stopped arrays.
When an array becomes inactive, clean up and forget it.

This involves signalling the manager.
2008-05-27 09:18:39 +10:00
Neil Brown 5d19760db0 Discard 'array_list' in mdmon
The container has an ->arrays field that we should be using.
2008-05-27 09:18:36 +10:00
Dan Williams 3e70c845e2 add infrastructure to receive higher order commands, like remove_device
From: Dan Williams <dan.j.williams@intel.com>

Each md_message encapsulates a single command.  A command includes an 'action'
member which describes what if any data comes after the action.  Communication
with the monitor involves updating the active_cmd pointer and then writing to
mgr_pipe.  Pass/fail status is returned via mon_pipe.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:54 +10:00
Dan Williams b109d92863 start fleshing out socket code, ping monitor to see if it is alive
From: Dan Williams <dan.j.williams@intel.com>

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:52 +10:00
Neil Brown 549e9569c6 Merge mdmon 2008-05-15 16:48:37 +10:00