The cmd_filter patch merged for 2.6.27 broke retrieving the serial
number via an ioctl to /dev/sgN. In debugging this I found that other
utilities like sdparm simply run the ioctl on /dev/sdX. So just convert
to that for protection in numbers, but scream on the mailing list for
the inconvenience grr...
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Using buffered IO risks non-atomic updates to parts of the
device that we don't actually want to write to. This isn't in
general safe.
So switch to O_DIRECT for all that IO and make sure we have
properly aligned buffers.
The returned value was never used, and we don't really want
this return path anyway as writing to a pipe could conceivably
block, and the monitor must not block.
This really should be done in mdadm, not mdmon.
We ensure the device won't be suddenly commited as a hot-spare
using O_EXCL, then check the 'holders' sysfs directory
to make sure it is only in use once.
When loading the metadata for a subarray (super_by_fd), we set
->subarray to be the name read from md/metadata_version so that
getinfo_super can return info about the correct array.
With this we can differentiate between a container and
an array within the container by looking at ->subarray[0].
Only one superswitch should be externally visible for each
general type. Others which handle different flavours
(e.g. container/data-array) should be internal only.
Code in manager can now just call queue_metadata_update with a
(freeable) buf holding the update, and it will get passed to the
monitor and written out.
'container_member' isn't really a well defined concept.
Each metadata might enumerate members differently, so just
let each format /mdX/YYYY as appropriate.
I want the metadata handler to have more control over the 'version',
particularly for arrays which are members of containers.
So discard st->text_version and instead use info->text_version
which getinfo_super can initialise.
mark_dirty is just a special case of mark_clean - with sync_pos == 0.
mark_sync is not required. We don't modify the metadata when sync
finishes. Only when the array becomes non-writeable at which point we
use mark_clean to record how far the resync progressed.
From: Dan Williams <dan.j.williams@intel.com>
Each md_message encapsulates a single command. A command includes an 'action'
member which describes what if any data comes after the action. Communication
with the monitor involves updating the active_cmd pointer and then writing to
mgr_pipe. Pass/fail status is returned via mon_pipe.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
From: Dan Williams <dan.j.williams@intel.com>
Added curr_state as a parameter to set_disk. Handlers look at this to
record components failures, and set global 'degraded' or 'failed'
status.
When reading the state as faulty:
1/ mark the disk failed in the metadata
2/ write '-blocked' to the rdev state to allow the kernel's failure
mechanism to advance
3/ the kernel will take away the drive's role in remove_and_add_spares()
4/ once the disk no longer has a role writing 'remove' to the rdev state
will get the disk out of array.
There is a window after writing '-blocked' where the kernel will return
-EBUSY to remove requests. We rely on the fact that the disk will
continue to show faulty so we lazily wait until the kernel is ready to
remove the disk. If the manager thread needs to get the disk out of the
way it can ping the monitor and wait, just like the replace_array()
case.
[buglet fix: swap the parameters of attr_match in read_dev_state]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
From: Dan Williams <dan.j.williams@intel.com>
1/ Block attempts to add/remove devices from container members
2/ Forward add/remove requests to containers
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
From: Dan Williams <dan.j.williams@intel.com>
Metadata handlers set mdinfo.resync_start depending on the state of the
array. By default mdadm assumes the array is dirty and needs a full
resync.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
From: Dan Williams <dan.j.williams@intel.com>
The following now work:
--examine
--examine --brief
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Create a ddf array by naming the device /dev/ddf* or
specifying metadata 'ddf'.
If ddf is specified with no level, assume a container (indeed,
anything else would be wrong).
**Need to use text_Version to set external metadata...
More ddf support
Load a ddf container. Now
--examine /dev/ddf
works.
super-ddf: fix compile warning
From: Dan Williams <dan.j.williams@intel.com>
super-ddf.c:723: format %lu expects type long unsigned int, but argument 3 has type unsigned int
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The current model for creating arrays involves writing
a superblock to each device in the array.
With containers (as with DDF), that model doesn't work.
Every device in the container may need to be updated
for an array made from just some the devices in a container.
So instead of calling write_init_super for each device,
we call it once for the array and have it iterate over
all the devices in the array.
To help with this, ->add_to_super now passes in an 'fd' and name for
the device. These get saved for use by write_init_super. So
add_to_super takes ownership of the fd, and write_init_super will
close it.
This information is stored in the new 'info' field of supertype.
As part of this, write_init_super now removes any old traces of raid
metadata rather than doing this in common code.