Commit Graph

1087 Commits

Author SHA1 Message Date
NeilBrown d28c1a7383 Release 3.0.3
Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-22 12:05:22 +11:00
NeilBrown 0eb26465c0 Free some malloced memory that wasn't being freed.
As mdadm is normally a short-lived program it isn't always necessary
to free memory that was allocated, as the 'exit()' call will
automatically free everything.  But it is more obviously correct if
the 'free' is there.
So this patch add a few calls to 'free'

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-22 11:00:56 +11:00
NeilBrown 1799c9e8f8 super-intel: Fix compilation of mdassemble.
Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-20 13:50:23 +11:00
NeilBrown 1dfcc211b1 testreshape5 fixes.
We seem to need a 'udevadm settle', and possibly the 'sync'..

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-20 08:02:53 +11:00
NeilBrown 151ea1a33d tests/imsm: allow for rounding of array size.
IMSM rounds array size to a multiple of 1024K, so our tests must
assume this.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-20 08:00:35 +11:00
NeilBrown 5ac6db12f9 mdopen: only use 'dev' as chosen name if it is a full path.
Otherwise using names like "r0" causes problem.  They are
handled sufficiently by other paths in the code.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 17:11:15 +11:00
NeilBrown 8a0a0ded4a Assemble: handle container members better
When looking for a specific member, don't accept a
different member, but step on to the next one.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 17:08:04 +11:00
NeilBrown 7636b5a8bb Assemble: print verbose messages when finding members in containers
.. so that "-Av" gives more hints at what is going on.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 17:04:12 +11:00
NeilBrown 8f1b2bbbb9 Detail: list containers before members.
To allow "--assemble --scan" to have a chance, list
containers before members in --detail --scan output.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 17:00:52 +11:00
NeilBrown 00eb571675 test/ddf: don't insist that mdadm.conf is always in the same order.
When created by different process, the order could reasonably
be different.  So sort before compare

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 16:58:38 +11:00
NeilBrown 453e3b41d0 test/raid6integ: correct type
ddf-zero-restart was misspelled.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 16:57:16 +11:00
NeilBrown 2e48e34945 test: udev-settle before testing device.
I think we sometime get way ahead of udev and devices disappear
and appear almost at random.  So add some settling.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 16:56:13 +11:00
Mike Frysinger d16c7af6d8 mdadm(8): fix spurious space after -e header
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 13:15:48 +11:00
Zdenek Behan 9a36a9b713 Monitor: add option to specify rebuild increments
ie. the percent increments after which RebuildNN event is generated

This is particulary useful when using --program option, rather than
(only) syslog for alerts.

Signed-off-by: Zdenek Behan <rain@matfyz.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 13:13:58 +11:00
NeilBrown 1373b07d75 mdmon: lock current memory as well as future memory.
mlockall(MCL_FUTURE) only locks mappings that have not yet
been created.  To lock all memory used by the process, we need
 MCL_CURRENT | MCL_FUTURE

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-19 13:04:16 +11:00
NeilBrown 5d504f4278 Merge git://github.com/djbw/mdadm 2009-10-19 12:52:58 +11:00
Dan Williams 9f1da82421 mdmon: preserve socket over chroot
Connect to the monitor in the old namespace and use that connection for
WaitClean requests when stopping the victim mdmon instance.  This allows
ping_monitor() to work post chroot().

Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:58 -07:00
Dan Williams b928b5a038 mdmon: exec(2) when the switchroot argument is not "/"
Try to execute mdmon from the target namespace.  When used for initramfs
handovers we need to drop all references to the initramfs filesystem for
that memory to be freed.

Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:58 -07:00
Dan Williams 96a8270d46 mdmon: avoid writes in the startup path for mdmon on root arrays
When killing a previous monitor be careful not to cause writes to the
filesystem until the reads necessary to get the monitor operational have
completed.

The code is already prepared for errors creating the pid and socket
files, so simply defer creation of these files until after the first
call to manage().

Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:57 -07:00
Dan Williams aae5a11207 Detail: export MD_UUID from mapfile
The load_super() from an mdadm --detail call may race against an mdmon
update.  When this happens the load_super sees an inconsistent metadata
block and returns an error.  The fallback path to use the map file
contents lacks uuid reporting, so provide __fname_from_uuid for
generically printing a uuid.

Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:57 -07:00
Dan Williams d2b9eb5993 imsm: regression test for prodigal array member scenario
Provide a test to sanity check assembly and reassembly in the presence
of conflicting family number information.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:53 -07:00
Dan Williams 6e46bf344b imsm: add --update=uuid support
When disks have conflicting container memberships (same container ids
but incompatible member arrays) --update=uuid can be used to move
offenders to a new container id by changing 'orig_family_num'.

Note that this only supports random updates of the uuid as the actual
uuid is synthesized.  We also need to communicate the new
'orig_family_num' value to all disks involved in the update.  A new
field 'update_private' is added to struct mdinfo to allow this
information to be transmitted.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:53 -07:00
Dan Williams 955e9ea139 ddf: prevent superblock being zeroed on --update
The full fix would be to support updating ddf metadata, but this minimal
fix just prevents the superblock from being zeroed when someone
inadvertently passes an unsupported --update option during assembly.

Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:53 -07:00
Dan Williams e683ca88ac imsm: fix/support --update
Fix init_super_imsm() to return an empty mpb when info == NULL, and
teach store_super_imsm() to simply write out the passed in mpb.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=523320

Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:53 -07:00
Dan Williams f796af5d5e imsm: fix spare record writeout race
imsm_activate_spare() in the manager thread may race against
write_super_imsm_spares() in the monitor thread.  Give
write_super_imsm_spares() its own private mpb buffer to prevent
confusing the manager.

This change uncovered cases where spares were not being assembled due to
a failed metadata version number check.  Spares can freely associate
across metadata version number, so reduce the scope of the version check
in the spare assembly case.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:53 -07:00
NeilBrown 2b9aa337af Fix null-dereference in set_member_info
set_member_info would try to dereference ->metadata_version, without
checking that it isn't NULL.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-01 12:51:04 +10:00
NeilBrown 0e90271e53 Add missing space in "--detail --brief" output.
We need a space between the device name and the word "level"..

Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-01 12:38:31 +10:00
Dan Williams a2b9798159 imsm: disambiguate family_num
This is a result of trawling through the Windows implementation to learn
the mechanism of how it disambiguates family_num.  It is a continuation
of commit 148acb7b "imsm: fix family number handling" which introduced a
regression when reassembling a container with stale disks and rebuilt
members.

When rebuilding, a new family number is assigned to protect against the
"prodigal array member" problem.  It prevents a former family member
from returning to the system and causing a rebuild to go the wrong
direction.  However, this invalidates looking at the generation number to
determine the most up-to-date disk when comparing across family numbers.
Instead the assembly logic looks for agreement between a disk's local
family membership compared against a global list of all families in the
system.  Whenever a disk's local metadata does not match a family number
on the global list that family number is marked offline.

It is possible that this logic results in multiple incompatible but
valid family numbers existing in a container.  In this case mdadm.conf
cannot be consulted because it only records the uuid which is generated
from static fields in the metadata.  The metadata lacks the data needed
to disambiguate "local" versus "foreign".  The "foreign" array in this
case requires updating to change its container-id information
(orig_family_num), and possibly the member array names.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-30 11:45:41 -07:00
Dan Williams 51725a7c25 imsm: kill close() of component device
None of the other formats close the passed in fd at load, and this
becomes a problem when trying to support --update where we need O_EXCL
protection across the entire operation.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-30 11:44:38 -07:00
Dan Williams 25ed7e5924 imsm: cleanup disk status tests
Add is_failed(), is_configured(), and is_spare() helpers to clean up
disk status flag testing.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-28 14:40:59 -07:00
NeilBrown 58ad57f684 Release mdadm-3.0.2
Just one bugfix.
2009-09-25 18:19:07 +10:00
NeilBrown 40d28f0d1b super0: fix crash on assemble if homehost is not set.
If homehost is not set - typically during early boot,
and assemble of v0.90 metadata arrays will crash.

Reported-by: Paweł Sikora <pluto@agmk.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-09-25 17:56:22 +10:00
NeilBrown d8419fe9e9 Release mdadm-3.0.1
Just bugfixes.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-09-25 17:08:19 +10:00
NeilBrown 2de8abd572 testreshape5 - flush devices between tests.
We need to flush the block devices before reading different data.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-09-25 16:57:01 +10:00
NeilBrown ae80545ac3 Merge branch 'master' of git://github.com/djbw/mdadm 2009-09-25 14:11:11 +10:00
Hans de Goede f5df5d69a7 mdmon: fix freeing unallocated memory
mdmon was creating a supertype struct with malloc, and thus not
necessarily getting zero-d memory.

This was causing it to segfault when called like this from the initrd:
/sbin/mdmon /proc/mdstat /sysroot

The problem was that  load_super_imsm would get called on the non-zero'd
super struct, whcih in turn calls free_super_imsm, which checks st->sb,
which should be zero but isn't and then starts freeing bogus memory.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-24 06:52:06 -07:00
Dan Williams cf53434e5c imsm: clear CONFIGURED_DISK for failed drives
Synchronizing with what the Windows driver does.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-15 11:35:28 -07:00
Dan Williams ee5aad5ae2 imsm: kill USABLE_DISK flag
'USABLE_DISK' is not a 'persistent' status flag it is an internal status
flag used for the in memory representation of the disk in the Windows
driver.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-15 11:35:28 -07:00
Dan Williams ed57a7e8ba Examine: don't count containers as spares
mdadm -Ebs will include containers in the scanned device list.
Examine() falsely thinks they are spares when MD_DISK_SYNC is not set.
This could be fixed by forcing all formats to set this flag for
container devices, but this flag is currently used by imsm to identify
free-floating spares.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-15 11:35:28 -07:00
Dan Williams 436305c690 Detail: fix for an imsm container with a spare
Spares for imsm arrays do not have any info about the container in their
metadata records.  If Detail() inadvertantly picks such a device for
->get_array_info() it will end up with less than useful info for the
container.  So, continue to read from the disks until a non-spare device
is found.

This bug was found by timeouts waiting for udev to create the
user-friendly container name.  To detect future UUID reporting problems
and a debug print to the timeout case in wait_for().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-15 11:34:20 -07:00
Dan Williams ee836c39b5 Examine: fixup output in the presence of containers with spares
If we dump any 'spare' or 'device' information for a container in the
'brief' case then we need a newline before printing member array info.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-15 11:34:20 -07:00
Dan Williams 709743c554 imsm: fix spare promotion
1/ Fix an off by one error when detecting whether the device allocation
   loop succeeded or not
2/ Update ->num_raid_devs before copying to avoid a segmentation fault

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-15 11:34:20 -07:00
NeilBrown 2a17c77bdb Add a missing 'closedir'.
Thanks to David Binderman for finding and reporting it.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-09-11 16:10:24 +10:00
NeilBrown 7cbeb80e90 super1: remove fd leak when opening /dev/urandom
As reported in
   https://bugzilla.novell.com/show_bug.cgi?id=527722

I forgot to close the fd after reading the random number.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-08-13 15:02:39 +10:00
NeilBrown 4737ae25de Exmaine/brief: put member arrays after container arrays.
A previous patch moved move the '--examine --brief' reporting of
member arrays to before their containers.  This breaks "mdadm -As"
assembly.  So put them back, but still fix the problem addressed by
previous patch.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-08-07 14:17:40 +10:00
NeilBrown 823f06865e Merge branch 'master' of git://github.com/djbw/mdadm 2009-08-07 13:45:38 +10:00
Dan Williams 3ef383aa96 Assemble: fix handling of empty container
# mdadm --create /dev/md/ddf /dev/sd[b-e] -n 4 -e ddf
mdadm: container /dev/md/ddf prepared.
# mdadm -Ss
mdadm: stopped /dev/md126
# mdadm -As
mdadm: Container /dev/md/ddf0 has been assembled with 4 drives
Segmentation fault

Reported-by: Artur Wojcik <artur.wojcik@intel.com>
Reported-by: Jacek Danecki <jacek.danecki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-04 10:17:23 -07:00
Dan Williams 7e8545e954 imsm: fix spare-uuid assignment
imsm spares do not have container membership by default so we associate
them with the first container found in the configuration file.  Some
ARRAY lines do not specify the metadata type so we cannot assume that
_cst will always be valid.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-07-31 17:11:42 -07:00
Dan Williams 969c255511 platform: relax rom scanning alignment for ahci platforms
The PCI-3.0 Firmware specification allows for option-roms to have
512-byte alignment rather than 2048-byte.  As there does not appear to
be a reliable method to detect a PCI-3.0 compliant BIOS from userspace
we allow the imsm platform detection code to presume that a system
modern enough to have an Intel AHCI controller does not have
dangerous/legacy ISA regions in the option-ROM memory space.

An environment variable to disable this behaviour, IMSM_SAFE_OROM_SCAN,
is added in case this presumption is ever proven wrong.

Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-07-31 17:11:41 -07:00
Dan Williams 148acb7baa imsm: fix family number handling
The family_number field can change.  The option-rom will change the
family number when it starts a rebuild process (flags a container for
rebuild).  This was not seen previously as mdadm would usually start the
rebuild process, preserving the family number.

This is the mechanism that helps to prevent a prodigal array member from
being returned to its original system and cause a rebuild to go in the
wrong direction.  With the change we will end up with a container that
will fail to assemble unless the device with the incompatible family
number is left out of the assembly.

So, take several actions:
1/ Convert uuid generation to use orig_family_num, being careful to
   preserve the existing uuid in the case where orig_family_num is not
   set (i.e. previous mdadm created imsm arrays)
2/ Set orig_family_num at Create.  For arrays created by mdadm prior to
   this release orig_family_num will be zero, so set it to family_num at
   the first metadata write.
3/ Add checks for orig_family_num to compare_super_imsm
4/ Update the family number when initiating rebuild
5/ The option-rom mixes some random data into the family number, add
   this functionality to the mdadm implementation.

Reported-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-07-31 17:11:41 -07:00