This patch rollback one change connected with mdadm-OROM
compatibility:
adding ':0' at the end of disk serial number if disk is
detected as failed.
Current mdadm's implementation does not distinguish two
cases when disk is marked as failed:
1. If disk is really failed- disconnected, broken
2. Just marked as failed by mdadm- using "-f" option
Second case is not yet fully handled and compatible with
IMSM standard.
Changing serial number of existing, operational disk causes
problems in "thunderdome" and "load_super" functions that use
serial numbers to disks comparisons and searching.
The change must be recalled until full support will be
developed.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Honor ignore_hw_compat to load metadata from disk attached to non-IMSM
controller or when there are no IMSM OROM/EFI capabilities.
Used only for guessing and examining metadata format.
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
During adding spare disks to raid0, spare metadata is not written.
This is due to exit form sync_metadata() on empty updates_pending flag.
When mdmon is absent indicate sync_metadata() to flush changes to disks.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
While last_checkpoint is counter in per disk units, checkpoints
should be stored in the same manner.
Restoring from checkpoint should should recalculate checkpoint in to
array position (reshape_progress).
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm has a convention in some areas of passing a device name
if error messages about it are interesting, or NULL if not.
Follow this convention with find_intel_hba_capability so that it
doesn't complain when not appropriate - and so that it doesn't
have to go and find a device name that it wasn't given.
Signed-off-by: NeilBrown <neilb@suse.de>
OROM/EFI capabilities are retrieved based on disk's controller type.
1/ alloc_super no longer retrieves OROM capabilities
2/ find_imsm_capability replaces find_imsm_orom
3/ new function find_intel_hba_capability gets disk's HBA and relevant
capability
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Function find_intel_hba_capability attaches HBA information
to intel_super structure based on fd of the component disk.
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
compare_super_imsm verifies that the component disks use the same type of HBA
in platform dependent environment. Otherwise print-out error message and block
the action.
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Arrays exceeding the OROM/EFI maximum number of supported disk are
blocked in validate_geometry_imsm_orom function.
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Print-out error message when volume geometry fails to comply with
OROM/EFI controller's capabilities.
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Container_content_imsm calls validate_goemtry_imsm_orom to verify that
the array parameters are supported by controller's OROM/EFI.
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
The function uses find_intel_device and find_imsm_capability to present
AHCI and SAS controller capabilities taken from OROM or EFI.
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
During reshape for dirty volumes reshape_progress has to be calculated
also. To keep the same logic for array creation:
not setting info->resync_start = MaxSector when first condition is
true,
resync_start is initialized by MaxSector to allow proper array
initialization.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Array state has to be managed during reshape based on consistent flag.
To achieve this existing code will be reused. Currently existing code for
blocks_per_unit calculation can be removed
and existing code can be reused also.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Some grow operations must be applied to a whole container. These
are performed one array at a time, so only one array appears to
be reshaping.
When re-assembling such an array, we need to make sure that
when the reshape finished, we move on to the next array.
So require metadata to set ->reshape_active = 2 in that case,
and use reshape_container to complete the reshape.
Signed-off-by: NeilBrown <neilb@suse.de>
Variables declaration moved a little bit up,
to not mix declaration and code.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
For general migration, blocks per unit are required for all disks,
not for per-member.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If a reshape (migration) is happening, we might need to modify the information
with provide to md so that it can cope with the reshape.
For example, if a migration from 4-device RAID0 to 5-device RAID0 is
happening, we need to tell md that it is reshape from degraded
5-device RAID4 to degraded 6-device RAID4 so md doesn't handle direct
reshape of RAID0.
There may be other migrations supported by IMSM that need special
treatment here.
Signed-off-by: NeilBrown <neilb@suse.de>
When we cannot find spare devices for grow operation we should
print error message.
This patch changes debug error message to 'stderr' print.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Defect description:
When we create an redundant array in mdadm and then degrade it
by disk removing, Option ROM and Windows OS does not detect any array.
Reason:
Metadata created and updated after degrading array is not compatible
with IMSM standard.
This patch synchronizes the metadata according IMSM requirements.
Following inconsistencies have been fixed:
- reset all fields in imsm_dev during creation to avoid random values
- init dev status during creation to proper state
- not reset CONFIGURED_DISK flag when disk is missing
- add ":0" suffix to the serial number for missing/failed disks
- update medatada signature after takeover operation
- mark map state as degraded after raid0->raid10 takeover
Note:
Patch reworked after Dan Willams review.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When reshape is started imsm stores new size in metadata.
mdadm requires "old" size to proper initialization restarted array.
When reshape is in progress getinfo_super_imsm_volume() should report
computed array size value instead array size stored in metatda.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
reshape prodess cannot be restarted due to no checkpoint information
in mdinfo.
When metadata is read during reshape process or reshape restart,
rehape_progress (mdinfo field) has to be initialized to value
stored in metadata. This allows start reshape from stored
in metadata checkpoint.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When chunk size is not set from command line we need to guess it
depending on metadata given on command line or found on listed devices.
Validate_geometry sets the default for it's metadata if chunk is not set.
For external metadata chunk is set only when creating in a container.
For imsm validate_geometry_imsm_orom is responsible for finding default
chunk depending on container metadata loaded. Container will already know
which controller it is attached to, and have this controllers orom
available.
do_default_chunk indicates that we need to find default chunk and
if validate_geometry fails for some metadata it tells us to reset chunk
that may have been set.
Current solution would set default chunk correctly for imsm only if
container device was given on command line. With the list of devices
chunk was always set to 512.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Spares that are specified on container can be used by any array in container.
this means that for every array in container they should be reported.
This let caller know how many spare devices (not used in any array)
are still available.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Delta_disk can be set to UnSet value.
This can a cause to pass wrong parameter to reshape_super().
To avoid such situations raid_disks and delta_disks parameters
have to be passed to reshape_super() separately.
It will be up to reshape_super() function validation
and usage of this parameters to avoid not valid values.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Because IMSM_ORD_REBUILD is set in second map not first.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
chunk is in K so size must be converted to K before it is rounded.
Otherwise we may get wrong freesize returned
resulting in creation failure.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
During metadata printout in '-E' option failed disk map field
information is missing. Add this information to mdadm '-E' option
output.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Use single enum definition/migration type for all migrations.
Using separate definitions causes limitation for number of changes
in metadata implementation during single update for migration/reshape.
Single CH_MIGRATION enum allows for many mtadata parameters change
in single update. It will be possible to change i.e. chunk size together
with raid level. In current implementation 2 metadata updates would be
required for such action, one using CH_CHUNK_MIGR and second using
CH_LEVEL_MIGRATION migration type.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Transition raid0 to raid5 is not possible
due to wrong condition in imsm_analyze_change().
Current condition blocks migration possibility instead allow for it.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
On wrong starting container reshape conditions, mdadm returns information on console:
mdadmimsm: Operation is not allowed on this container
it is changed to:
mdadm: (imsm) Operation is not allowed on this container
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If initial resync or recovery of a redundant array is not finished
before it is stopped then during assembly md will start it as inactive.
Writing readonly to array_state in assemble_container_content
fails because md thinks the array is during reshape.
In fact we only have a reshape if both current and previous map
states are the same. Otherwise we may have resync or recovery.
Setting reshape_active in such cases causes the issue.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
In metadata size is set already during migration initialization.
There is no reason for second /the same/ calculation.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Before reshape finalization migration is still present in metadata.
After patch 'imsm: FIX: crash during getting map'
function get_imsm_map() returns correct value,
this means in our case from second (start) map.
We should calculate map size basing on first (final) map.
For this we should request it by setting second function parameter to '0'
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Array size calculation is made in the same way in few places in code.
Make function imsm_set_device_size() for this common code.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Some debug strings remains as they were introduced,
before code was moved to separate function.
Information displayed by debug information in not all cases
was correct.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
imsm_num_data_members() can indicate error by returning 0 value
In such case size cannot be set based on 0 value.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When a->last_checkpoint variable can reach array end,
reshape finalization can be put in to single place.
There is no need to reset migration variables.
imsm_set_disk() will call end_migration() and this sets
all migration variables to required values.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When get_imsm_map() is called with second_map parameter == '-1'
and array is not in migration state NULL pointer is returned.
This is wrong. '-1' means return map as migration record points.
'-1' can be passed to get_imsm_map() from imsm_num_data_members().
imsm_num_data_members() is called to get current map members based
on migr_state information
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Too big map was copied (outside allocated memory) and this causes
mdmon crash for 2 raid0 arrays in container.
Map of correct (smaller) size should be copied,
to not overwrite any internal data.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When expansion is run on 2 raid0 arrays in container no update
is sent to mdmon because mdmon is off (mdadm performs update)
Memory size for first reshaped array is allocated to satisfy memory
requirements for expanded maps.
Memory for second device is allocated using old disks number, as in
metadata there is no information about this array reshape.
When mdmon initiates second array reshape it overwrites internal
structures and crashes). There is no place to keep expanded maps.
To avoid this situation during loading metadata, allocated memory
should be performed using the maximum used disks number in particular
container.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When second array reshape is about to start external metadata should
be updated by mdmon in imsm_set_array_state(). For this purposes
imsm_progress_container_reshape() is reused.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Reshape is started from second array, so it causes imsm incompatibility
and problems during second array start.
Reshape should be started in arrays metadata order.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adding spare disks to imsm container fails due to problem with writing
new_dev to sysfs. This problem was caused by not closed handle
(opened exclusively) in Manage.c:803.
Disk handle was not closed by free_imsm().
This is due to not released disk_mgmt_list in free_imsm_disks().
Proper release of imsm metadata allows for spare adding without problems.
Memory leak was fixed also.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
imsm_reshape_super() currently allows for expansion when requested
raid_disks number is the same as current.
This is wrong. Existing in code condition is too weak.
We should allow for expansion when new disks_number is greater
than current only.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Add support for raid1 to raid0 takeover operation in user space.
This patch includes support for native and imsm metadata.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
get_disk_controller_domain recognizes Intel (R) SAS controller (isci).
The function returns three different strings that differentiate disk attached
to AHCI, ISCI or unknown controller types to create separate domains
for each case.
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>