Scsistuff Tech tip #1: (Solvability and hair pulling index - 10!!!! This was a very difficult problem to solve.)
MD1000s with SATA drives are experiencing drives falling offline
(timing out) because the wrong type of SATA drive is installed. The issue is
not related to the backplane, RAID controller, EMM, or cabling. In a nutshell,
a non-enterprise drive has a very long timeout period compared with what the
RAID controller expects. When a command issued by the RAID controller causes
the drive to run into a problem, the drive goes into a very long timeout period
(relative to the speed of the RAID controller) and does not respond to
repeated requests from the controller. The controller concludes the drive is
offline and marks it as offline. By the time the SATA drive is ready again,
the RAID controller has already marked that drive as failed and brought the
spare drive online (rebuild), if so configured. Reseating the drive has no
effect because the RAID controller has already marked the drive as failed.
Fix: Use Dell enterprise SATA drives or SATA SSDs, which are not prone to
this timeout issue, or use SAS drives.
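To make the timing mismatch easier to picture, here is a rough Python sketch of the behavior described above. It is only a toy model, not how any real controller firmware works. The class names, the 8-second controller timeout, and the 60-second and 5-second drive recovery times are all made-up values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Drive:
    name: str
    # Assumed value: seconds the drive stays unresponsive while it retries internally.
    recovery_seconds: float


@dataclass
class Controller:
    # Assumed value: seconds the controller waits before giving up on a drive.
    timeout_seconds: float
    has_hot_spare: bool = True
    failed: set = field(default_factory=set)

    def issue_command(self, drive: Drive) -> str:
        """Model one command that sends the drive into its internal recovery."""
        if drive.name in self.failed:
            return "failed"
        if drive.recovery_seconds > self.timeout_seconds:
            # The drive is still busy recovering when the controller gives up.
            self.failed.add(drive.name)
            return "failed, rebuilding to spare" if self.has_hot_spare else "failed"
        return "online"

    def reseat(self, drive: Drive) -> str:
        """Reseating does not clear the controller's failed mark in this model."""
        return "failed" if drive.name in self.failed else "online"


controller = Controller(timeout_seconds=8.0)
desktop = Drive("non-enterprise SATA", recovery_seconds=60.0)
enterprise = Drive("enterprise SATA", recovery_seconds=5.0)

print(desktop.name, "->", controller.issue_command(desktop))
print(desktop.name, "after reseat ->", controller.reseat(desktop))
print(enterprise.name, "->", controller.issue_command(enterprise))
```

The point of the sketch is that the drive itself is healthy in both cases. The only difference is whether it answers before the controller stops waiting.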
Scsistuff Tech tip #2 ( June 15, 2013)