mdmon: man page

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-24 18:45:57 -07:00 · 2009-02-24 18:45:57 -07:00 · 7675959b0f
parent 140d3685fb
commit 7675959b0f
1 changed files with 138 additions and 0 deletions
--- a/mdmon.8
+++ b/mdmon.8
@@ -0,0 +1,138 @@
+.\" See file COPYING in distribution for details.
+.TH MDMON 8 "" v3.0-devel2
+.SH NAME
+mdmon \- monitor MD external metadata arrays
+
+.SH SYNOPSIS
+
+.BI mdmon " CONTAINER [NEWROOT]"
+
+.SH OVERVIEW
+The 2.6.27 kernel adds support for external metadata arrays.
+External metadata means that user space handles all updates to the metadata.
+The kernel's responsibility is to notify user space when a "metadata event"
+occurs, such as a disk failure or a clean-to-dirty transition.  In important
+cases the kernel waits for user space to act on these notifications.
+
+.SH DESCRIPTION
+.P
+.B Metadata updates:
+.P
+To service metadata update requests, a daemon, mdmon, is introduced.
+Mdmon polls the sysfs namespace, looking for changes in the
+.BR array_state ,
+.BR sync_action ,
+and per\-disk
+.B state
+attributes.  When a change is detected, it calls a handler specific to the
+metadata type to modify the metadata.  The following actions
+are taken:
+.RS
+.TP
+.B array_state \- inactive
+Clear the dirty bit for the volume and let the array be stopped.
+.TP
+.B array_state \- write\-pending
+Set the dirty bit for the array and then set
+.B array_state
+to
+.BR active .
+Writes
+are blocked until user space writes
+.BR active .
+.TP
+.B array_state \- active-idle
+The safe mode timer has expired, so set the array state to clean to block writes to the array.
+.TP
+.B array_state \- clean
+Clear the dirty bit for the volume.
+.TP
+.B array_state \- readonly
+This is the initial state of every array.  mdmon takes one of three actions:
+.RS
+.TP
+1/
+Transition the array to read\-auto, keeping the dirty bit clear, if the metadata
+handler determines that the array does not need resyncing or other modification.
+.TP
+2/
+Transition the array to active if the metadata handler determines that a resync or
+some other manipulation is necessary.
+.TP
+3/
+Leave the array read\-only if the volume is marked to not be monitored; for
+example, the metadata version has been set to "external:\-dev/md127" instead of
+"external:/dev/md127".
+.RE
+.TP
+.B sync_action \- resync\-to\-idle
+Notify the metadata handler that a resync may have completed.  If a resync
+process is idled before it completes, this event allows the metadata handler to
+checkpoint the resync.
+.TP
+.B sync_action \- recover\-to\-idle
+A spare may have completed rebuilding, so tell the metadata handler about the
+state of each disk.  This is the metadata handler's opportunity to clear any
+"out\-of\-sync" bits and clear the volume's degraded status.  If a recovery
+process is idled before it completes, this event allows the metadata handler to
+checkpoint the recovery.
+.TP
+.B <disk>/state \- faulty
+A disk failure kicks off a series of events.  First, notify the metadata
+handler that a disk has failed, and then notify the kernel that it can unblock
+writes that were dependent on this disk.  After unblocking the kernel, this disk
+is set to be removed* from the member array.  Finally, the disk is marked failed
+in all other member arrays in the container.
+.IP
+* Note: This behavior differs slightly from native MD arrays, where
+removal is reserved for a
+.B mdadm \-\-remove
+event.  In the external metadata case the container holds the final
+reference on a block device, and a
+.B mdadm \-\-remove <container> <victim>
+call is still required.
+.RE
+
+.P
+.B Containers:
+.P
+External metadata formats, like DDF, differ from the native MD metadata
+formats in that they define a set of disks and a series of sub\-arrays
+within those disks.  MD metadata, in comparison, defines a 1:1
+relationship between a set of block devices and a raid array.  For
+example, to create 2 arrays at different raid levels on a single
+set of disks, MD metadata requires that the disks be partitioned and that
+each array then be created with a subset of those partitions.  The
+supported external formats perform this disk carving internally.
+.P
+Container devices simply hold references to all member disks and allow
+tools like mdmon to determine which active arrays belong to which
+container.  Some array management commands, like disk removal and disk
+add, are now only valid at the container level.  Attempts to perform
+these actions on member arrays are blocked with error messages like:
+.IP
+"mdadm: Cannot remove disks from a 'member' array, perform this
+operation on the parent container"
+.P
+Containers are identified in /proc/mdstat with a metadata version string
+"external:<metadata name>".  Member devices are identified by
+"external:/<container device>/<member index>", or by
+"external:\-<container device>/<member index>" if the array is to remain read\-only.
+
+.SH OPTIONS
+.TP
+CONTAINER
+The
+.B container
+device to monitor.  It can be a full path like /dev/md/container, a simple md
+device name like md127, or /proc/mdstat, which tells mdmon to scan for
+containers and launch an mdmon instance for each one found.
+.TP
+[NEWROOT]
+To support an external metadata raid array as the rootfs, mdmon needs
+to be started in the initramfs environment.  Once the initramfs environment
+mounts the final rootfs, mdmon needs to be restarted in the new namespace.  When
+NEWROOT is specified, mdmon will terminate any mdmon instances that are running
+in the current namespace, chroot(2) to NEWROOT, and continue monitoring the
+container.
+
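Illustrative note, not part of the patch: the metadata version strings described under "Containers" can be classified with a few lines of code. The sketch below is a minimal Python illustration and is not how mdmon itself parses them; the names MetadataVersion and parse_metadata_version are hypothetical, and the only formats assumed are the three the man page lists ("external:<metadata name>", "external:/<container device>/<member index>", and the "-" variant for arrays that stay read-only).

```python
from typing import NamedTuple, Optional


class MetadataVersion(NamedTuple):
    """Hypothetical result type; not an mdadm/mdmon structure."""
    kind: str                  # "container", "member", or "other"
    name: str                  # metadata name (container) or container device (member)
    member: Optional[str]      # member index, only set for member arrays
    monitored: Optional[bool]  # member arrays only: False when marked with "-"


def parse_metadata_version(text: str) -> MetadataVersion:
    """Classify a metadata version string using the formats in the man page."""
    prefix = "external:"
    if not text.startswith(prefix):
        # Native MD metadata (for example "1.2") is not external.
        return MetadataVersion("other", text, None, None)

    rest = text[len(prefix):]
    if rest[:1] in ("/", "-"):
        # "/" marks a monitored member array, "-" one that stays read-only.
        monitored = rest[0] == "/"
        container, _, member = rest[1:].partition("/")
        return MetadataVersion("member", container, member or None, monitored)

    # Anything else after "external:" is taken to be the metadata name.
    return MetadataVersion("container", rest, None, None)
```

For example, parse_metadata_version("external:-md127/0") reports a member of md127 with index "0" that is not to be monitored, while "external:ddf" is reported as a container. The device name md127 and the metadata name ddf are sample values only.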