Commit Graph

217 Commits

Author SHA1 Message Date
NeilBrown 9698df15d9 Avoid using BLKFLSBUF.
Now that we use O_DIRECT for all device IO, BLKFLSBUF is not needed to
ensure we get current data, and it can impose a cost if any flush-out
is needed.  So remove it.

To be safe, add O_DIRECT to one place where it isn't currently used:
when reading a bitmap.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-05 16:00:55 +11:00
NeilBrown 6d388a8816 MISC: Add --examine-badblocks option
This will list the contents of the bad-blocks log, if one is present.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-12-05 12:56:31 +11:00
NeilBrown 72e7fb13f0 Incremental: support replacement devices.
These need to be counted in the number of 'active' devices.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-24 12:06:51 +11:00
NeilBrown aacb2f816a Assemble: add support for replacement devices.
Need to possibly collect 2 devices for each slot, and
original and a replacement.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-24 09:48:18 +11:00
NeilBrown 24c7bc8432 Report replacement devices correctly with --detail and --examine
--detail needs to be read to report 2 devices in each slot,
and --examine need to report if the device is the original or
the replacement.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-23 17:16:16 +11:00
NeilBrown 5d5002289c Replace a lot of leading spaces with tabs.
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-10 18:33:26 +11:00
NeilBrown 72ca9bcff3 Allow data-offset to be specified per-device for create
mdadm --create /dev/md0 .... /dev/sda1:1024 /dev/sdb1:2048 ...

The size is in K unless a suffix: K M G is given.
The suffix 's' means sectors.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown cb19a251a5 super1: reserve at least 2 chunks for reshape headroom.
sometimes 0.1% isn't enough, though mostly only in testing.

We need one chunk for a successful reshape, so reserve 2.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 5e88ab2e2f New RESHAPE_NO_BACKUP flag to track when backup action is needed.
Some arrays (raid10) never need a backup file, so during assembly
we can avoid the whole Grow_continue check in that case.
Achieve this using a flag set by the metadata handler.

Also get "mdadm -I" to fail if a backup process would be
needed.  It currently does fail as the kernel rejects things,
but it is nicer to have this explicit.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 80bf913592 Add space_before/space_after fields to mdinfo
These will be needed to guide changes to data_offset during reshape.
Only set them for super1 for now.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 8fe1c44f82 super1: add new_offset field.
The 'new_offset' is used for reshaping to avoid the need
for a backup file.

For now we only report the value when it is set.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 83cd1e97cb Add data_offset arg to ->init_super and use it in super1.c
So if ->data_offset is already set, use that rather than
computing one.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
NeilBrown af4348ddd1 Add data_offset arg to ->validate_geometry.
This is needed to return correct available size.  It isn't
really used yet.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
NeilBrown 387fcd593c Add data_offset arg to ->avail_size
This is currently only useful for 1.x metadata and will allow an
explicit --data-offset request on command line.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
NeilBrown 7103b9b88d Handles spaces in array names better.
1/ When printing the "name=" entry for --brief output,
   enclose name in quotes if it contains spaces etc.
   Quotes are already supported for reading mdadm.conf

2/ When a name is used as a device name, translate spaces
   and tabs to '_', as well as the current translation of
   '/' to '-'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
NeilBrown 88af981fa5 super1: ensure bitmap doesn't overlap bad block log.
If a bad block log already exists when adding a bitmap,
make sure the bitmap stays before the log.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 17:07:20 +10:00
NeilBrown 688e99a77d Allow --update to add or remove space for a bad block list.
--update=bbl will add a bad block list to each device.
--update=no-bblk will remove the bad block list providing that it
is empty.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 17:07:13 +10:00
NeilBrown bf95d0f38c Bad block log 2012-10-03 17:07:11 +10:00
Maciej Naruszewicz 80730bae52 Add MD_ARRAY_SIZE for --examine --export
An additional pair of key=value for --examine --export.

Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-02 16:42:25 +10:00
NeilBrown ba728be72f Convert 'quiet' to 'not verbose' in various places.
If we change some functions to accept 'verbose', where <0 means to be
quiet, in place of 'quiet', then we will be able to merge
'quiet' and 'verbose' together for simplicity.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:18:09 +10:00
NeilBrown 503975b9d5 Remove scattered checks for malloc success.
malloc should never fail, and if it does it is unlikely
that anything else useful can be done.  Best approach is to
abort and let some super-daemon restart.

So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit.  Then use those
removing all the tests for failure.

Also replace all "malloc;memset" sequences with 'xcalloc'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
NeilBrown e7b84f9d50 Introduce pr_err for printing error messages.
'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": '
cont_err() is also available.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-09 17:14:16 +10:00
majianpeng 4687f16027 mdadm: Fix Segmentation fault.
In function write_init_super1():
If "rv = store_super1(st, di->fd)" return error and the di is the last.
Then the di = NULL && rv > 0, so exec:
if (rv)
    fprintf(stderr, Name ": Failed to write metadata to%s\n",
     	 di->devname);
will be segmentation fault.

Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-05-29 09:21:51 +10:00
NeilBrown d9751e06a6 super1: fix choice of data_offset.
While it is nice to set a high data_offset to leave plenty of head
room it is much more important to leave enough space to allow
of the data of the array.
So after we check that sb->size is still available, only reduce the
'reserved', don't increase it.

This fixes a bug where --adding a spare fails because it does not have
enough space in it.

Reported-by: nowhere <nowhere@hakkenden.ath.cx>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-05-15 09:51:03 +10:00
Jes Sorensen 34a13953fa Fix sign extension of bitmap_offset in super1.c
fbdef49811 incorrectly tried to fix sign
extension of the bitmap offset. However mdinfo->bitmap_offset is a u32
and needs to be converted to a 32 bit signed integer before the sign
extension.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-30 09:56:22 +10:00
NeilBrown 508a7f16b2 super1: leave more space in front of data by default.
The kernel is growing the ability to avoid the need for a
backup file during reshape by being able to change the data offset.

For this to be useful we need plenty of free space before the
data so the data offset can be reduced.

So for v1.1 and v1.2 metadata make the default data_offset much
larger.  Aim for 128Meg, but keep a power of 2 and don't use more
than 0.1% of each device.

Don't change v1.0 as that is used when the data_offset is required to
be zero.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-04 14:04:28 +10:00
NeilBrown fbdef49811 Bitmap_offset is a signed number
As the bitmap can be before the superblock, bitmap_offset is signed.
But some of the code didn't honour that :-(

Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-04 14:03:45 +10:00
NeilBrown d4633e06df Examine: fix array size calculation for RAID10.
RAID10 arrays with an odd number of devices had the arraysize
reported wrongly by --examine due to a rounding error.

Reported-by: Chris Francy <zoredache@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-22 14:43:09 +11:00
Jes Sorensen 0a2f189415 super1.c: use ROUND_UP/ROUND_UP_PTR
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-21 08:06:35 +11:00
Jes Sorensen 308340aa58 Use struct align_fd to cache fd's block size for aligned reads/writes
This uses a struct to cache the block size for aligned reads/writes,
to avoid repeated ioctl(BLKSSZGET) calls.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-21 08:01:20 +11:00
Jes Sorensen 3c0bcd4609 Use 4K buffer alignment for superblock allocations
To better accommodate 4K sector drives, use 4K buffer alignment for
superblock buffers.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-21 08:01:04 +11:00
Jes Sorensen 2de0b8a2b4 match_metadata_desc1(): Use calloc instead of malloc+memset
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-21 08:00:50 +11:00
Jes Sorensen 1afa9308d2 init_super1() memset full buffer allocated for superblock
Avoid possibly using stale data in bitmap and misc area of superblock.
In addition, remove superfluous memsets already covered by memset of
full superblock.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-21 08:00:26 +11:00
Jes Sorensen 4122675629 Define and use SUPER1_SIZE for allocations
Use a #define rather than calculate the size of the superblock buffer
on every allocation.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-21 08:00:07 +11:00
Jes Sorensen b2bfdfa0fe super1.c don't keep recalculating bitmap pointer
We just calculated the pointer to the bitmap, so use it instead of
recalculating.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-21 07:42:48 +11:00
NeilBrown 911cead7f1 super1: support superblocks up to 4K.
The current 1024 byte limit on 1.x superblocks limits us to
384 devices.  Sometimes people want more.

The kernel is already prepared for superblocks up to 4K,
so enable that in mdadm  allowing up to
   (4096-256)/2 == 1920
devices (active plus spare).

Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-08 15:40:52 +11:00
Jes Sorensen 4011421332 Print error message if failing to write super for 1.x metadata
In addition remove attempt to print an error message if
write_init_super() fails, as this is handled in the various
write_init_super() functions. This avoids a segfault on error.

Reported by Jim Meyering in
https://bugzilla.redhat.com/show_bug.cgi?id=795461

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-23 08:55:19 +11:00
Jes Sorensen d669228f29 Use posix_memalign() for memory used to write bitmaps
This makes super[01].c properly align buffers used for the bitmap
using posix_memalign() to make sure the writes don't fail in case the
bitmap is opened using O_DIRECT.

This is based on https://bugzilla.redhat.com/show_bug.cgi?id=789898
and an initial patch by Alexander Murashkin.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-16 14:16:03 +11:00
NeilBrown 6ef89052d8 super1: make aread/awrite always use an aligned buffer.
A recently change to write_bitmap1 meant awrite would sometimes
write from a non-aligned buffer which of course break.

So change awrite (and aread) to always use their own aligned
buffer to ensure safety.

Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-07 11:55:18 +11:00
Alexander Lyakas d59770567c getinfo_super1: Use MaxSector in place of sb->size
when deciding whether the array is clean or dirty, compare
sb->resync_offset against MaxSector and not against sb->size

With RAID6 resyncing and subsequent drive failures, it is possible to
reach the case, in which sb->resync_offset==sb->size. This happens
when resync is aborted due to drive failures, and immediately a
rebuild of a spare starts. In this case, mdadm was considered the
array as clean, while kernel was considering the array as dirty. It is
better for mdadm also to consider the array as dirty in this case.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-02-07 10:15:20 +11:00
NeilBrown c0c1acd691 Grow/bitmap: support adding bitmap via sysfs.
Adding a bitmap via ioctl can only add it at a fixed location.
That location is not suitable for 4K-block devices.

So allow setting the bitmap location via sysfs if kernel supports it
and aim to always use 4K alignments.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 14:10:41 +11:00
NeilBrown b6db6fab11 super1: use awrite when writing a new bitmap.
This ensures it will succeed on 4K block devices like DASD.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 14:09:56 +11:00
NeilBrown adbb382b55 super1 - fix for bigendian machines.
devflags is a single byte so endian conversions are now wanted.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 14:07:47 +11:00
NeilBrown cb0997242c super1: getinfo_super should set write-mostly flag.
Otherwise it is not preserved when you re-add a device to
an array.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-20 15:01:53 +11:00
Adam Kwolek 6e75048bc5 Add recovery blocked field to mdinfo
When container is assembled while reshape is active on one of its member
whole container can be required to be blocked from monitoring.
For such purpose field recovery blocked is added to mdinfo structure.

When metadata handler finds active reshape in container it should set
recovery_blocked field to disable whole container monitoring during
reshape.

For arrays that doesn't use containers, recovery_blocked field
has the same value as reshape_active field e.g. super0/1.
In fact,recovery is blocked during reshape for such arrays.
For ddf, metadata handler doesn't set reshape_active field,
so recovery_blocked is not set also.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-05 13:30:50 +11:00
Doug Ledford 16715c01f7 Fix readding of a readwrite drive into a writemostly array
If you create a two drive raid1 array with one device writemostly, then
fail the readwrite drive, when you add a new device, it will get the
writemostly bit copied out of the remaining device's superblock into
it's own.  You can then remove the new drive and readd it as readwrite,
which will work for the readd, but it leaves the stale WriteMostly1 bit
in devflags resulting in the device going back to writemostly on the
next assembly.

The fix is to make sure that A) when we readd a device and we might have
filled the st->sb info from a running device instead of the device being
readded, then clear/set the WriteMostly1 bit in the super1 struct in
addition to setting the disk state (ditto for super0, but slightly
different mechanism) and B) when adding a clean device to an array (when
we most certainly did copy the superblock info from an existing device),
then clear any writemostly bits.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-09-19 13:06:38 +10:00
NeilBrown 6218489119 super1: fix spacing for 'Flags' field in --examine.
Signed-off-by: NeilBrown <neilb@suse.de>
2011-08-02 13:36:08 +10:00
Scott Schaefer 9a88e7b6d5 --add incorrectly sets writemostly
Origin: vendor, http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=628667
Bug-Debian: http://bugs.debian.org/628667

Signed-off-by: NeilBrown <neilb@suse.de>
2011-08-02 13:27:32 +10:00
Luca Berra 3b7e9d0cbe Fix some type-aliasing issues.
Warnings for these are reported with -Wstrict-aliasing=2, and
avoiding the cast is certainly an improvement.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-17 14:38:14 +10:00
Luca Berra e4c72d1dc6 Fix some compiler warnings.
Original by Luca, with various changes by Neil

Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-17 14:35:06 +10:00