|Summary:||Enhancement request to Software RAID to do Data Scrubbing|
|Product:||Red Hat Enterprise Linux 5||Reporter:||Colin.Simpson|
|Component:||mdadm||Assignee:||Doug Ledford <dledford>|
|Status:||CLOSED ERRATA||QA Contact:||BaseOS QE <qe-baseos-auto>|
|Version:||5.0||CC:||dkovalsk, k.georgiou, msusta, riek|
|Fixed In Version:||Doc Type:||Bug Fix|
The Linux software raid stack supports data scrubbing (reading disks in the raid array and looking for bad sectors, and when bad sectors are found using information from other disks or from parity to rewrite the bad sectors with good data). However, the mdadm package did not make use of this functionality. This package adds a cron job to /etc/cron.weekly to check disks for bad sectors and repair them when found.
|Last Closed:||2009-09-02 11:52:26 UTC||Type:||---|
Description Colin.Simpson 2007-03-26 13:04:00 UTC
Description of problem:
Disks that run as part of a Software RAID can develop bad blocks on sectors that are never accessed. When a disk in the array fails and you replace the drive, the rebuild can fail because of previously hidden bad blocks on the remaining disks (we've recently been bitten by this). As disks get larger, this problem becomes more likely. On suitably up-to-date kernels it can be mitigated by so-called "Data Scrubbing". This is a very serious issue: without being scrubbed, a RAID 5 can be less reliable than a RAID 0 with 2 drives (this statistic comes from one of the links below).

Debian has a script, checkarray, that they run weekly from cron (I'm told). It simply calls

echo check > /sys/block/mdX/md/sync_action

for each of the Software RAID arrays.

See:
http://www.gentoo-wiki.com/HOWTO_Install_on_Software_RAID#Data_Scrubbing
http://www.ashtech.net/~syntax/blog/archives/53-Data-Scrub-with-Linux-RAID-or-Die.html
http://linux-raid.osdl.org/index.php/RAID_Administration

A similar script should probably be added to RHEL and Fedora.
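For illustration, the per-array command quoted in the description could be wrapped in a loop along the following lines. This is only a minimal sketch, not Debian's checkarray and not anything shipped by Red Hat. The function name start_checks and its optional root-directory argument are hypothetical, added so the loop can be tried against a scratch directory instead of the real /sys/block.

```shell
#!/bin/sh
# Sketch: ask every md array to start a consistency check.
# start_checks and its optional root argument are hypothetical names;
# on a real system the root is /sys/block.
start_checks() {
    root="${1:-/sys/block}"
    for action in "$root"/md*/md/sync_action; do
        # The glob stays literal when no arrays exist, and the file is
        # only writable by root on a real system; skip in both cases.
        [ -w "$action" ] || continue
        # Do not interrupt a resync, recovery or check already running.
        if [ "$(cat "$action")" = "idle" ]; then
            echo check > "$action"
            echo "check requested: $action"
        fi
    done
}

start_checks "$@"
```

Run as root with no arguments, this would request a check on each idle array and leave busy arrays alone.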
Comment 1 Colin.Simpson 2007-10-29 15:24:05 UTC
Any thoughts on this ticket?
Comment 2 Doug Ledford 2008-06-14 16:52:49 UTC
The check capability is present in rhel5 already, but we don't automatically initiate check events as those can have negative impacts on both performance and power consumption. It is left to the user to initiate an event if they choose. I would highly recommend initiating an event prior to any planned modifications of the array. However, I can certainly see shipping a cron.weekly script that simply defaults to off, but can be enabled by the user for exactly this purpose.
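A cron.weekly script that defaults to off, as suggested above, might look roughly like this. It is a sketch under stated assumptions, not the script that was eventually shipped: the configuration path /etc/sysconfig/raid-scrub-example, the ENABLED variable and the scrub_enabled function are all hypothetical names chosen for the example.

```shell
#!/bin/sh
# Sketch of an opt-in weekly scrub, meant to live in /etc/cron.weekly.
# CONFIG, ENABLED and scrub_enabled are hypothetical names.
CONFIG="${CONFIG:-/etc/sysconfig/raid-scrub-example}"

scrub_enabled() {
    # Off by default; only an explicit ENABLED=yes in the config file
    # turns the weekly check on.
    ENABLED=no
    [ -r "$1" ] && . "$1"
    [ "$ENABLED" = "yes" ]
}

if scrub_enabled "$CONFIG"; then
    for action in /sys/block/md*/md/sync_action; do
        # Skip when no arrays exist or the file is not writable.
        [ -w "$action" ] || continue
        # Only start a check on arrays that are currently idle.
        if [ "$(cat "$action")" = "idle" ]; then
            echo check > "$action"
        fi
    done
fi
```

With no configuration file present the script does nothing, which matches the "defaults to off" behaviour described in the comment. A user would opt in by creating the file with the single line ENABLED=yes.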
Comment 3 RHEL Product and Program Management 2008-07-21 23:11:31 UTC
This request was evaluated by Red Hat Product Management for inclusion, but this component is not scheduled to be updated in the current Red Hat Enterprise Linux release. If you would like this request to be reviewed for the next minor release, ask your support representative to set the next rhel-x.y flag to "?".
Comment 4 Colin.Simpson 2008-07-22 09:00:55 UTC
I'm not so bothered about this making it into a RH minor release, but I think it should be on your radar for a future major release. Should I (or can you, as I'm not sure exactly how) put this forward as a suggestion to the Fedora team, so that it may make it into a RH release down the line?
Comment 8 Ruediger Landmann 2009-05-21 05:51:49 UTC
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: The Linux software raid stack supports data scrubbing (reading disks in the raid array and looking for bad sectors, and when bad sectors are found using information from other disks or from parity to rewrite the bad sectors with good data). However, the mdadm package did not make use of this functionality. This package adds a cron job to /etc/cron.weekly to check disks for bad sectors and repair them when found.
Comment 9 Matěj Šusta 2009-07-22 15:01:47 UTC
Small note to relnotes:
- change sectors to blocks
- actual version of script just runs "check", which means that the array will be checked for consistency, but nothing will be repaired
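As background to the check-versus-repair question, the result of a check can be read back from sysfs. As far as the kernel's md documentation describes it, writing "check" counts inconsistencies in mismatch_cnt without rewriting them, while read errors found along the way are still corrected from the other disks or parity; writing "repair" additionally rewrites mismatched data. The sketch below only reads state. The function name report_mismatches and its optional root-directory argument are hypothetical.

```shell
#!/bin/sh
# Sketch: print the current action and mismatch count of each md array.
# report_mismatches and its optional root argument are hypothetical;
# on a real system the root is /sys/block.
report_mismatches() {
    root="${1:-/sys/block}"
    for md in "$root"/md*/md; do
        # Skip when the glob matched nothing or the file is unreadable.
        [ -r "$md/mismatch_cnt" ] || continue
        # Strip the trailing /md, then the leading path, to get e.g. md0.
        dev="${md%/md}"
        dev="${dev##*/}"
        echo "$dev: action=$(cat "$md/sync_action") mismatch_cnt=$(cat "$md/mismatch_cnt")"
    done
}

report_mismatches "$@"
```

A non-zero mismatch_cnt after a completed check would be the cue to investigate further before deciding whether to write "repair".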
Comment 10 Matěj Šusta 2009-07-24 08:34:14 UTC
/me slaps his face, to read better next time, please ignore comment #9
Comment 15 errata-xmlrpc 2009-09-02 11:52:26 UTC
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2009-1382.html