I asked this question back in April on the stable list with no response ( http://lists.freebsd.org/pipermail/freebsd-stable/2012-April/067305.html ). I've now been seeing the same behavior on 9.0-release, and I thought it would be good to ask again here.

There is a failure mode for SATA disks (Seagate Barracuda ST3000DM001 disks, in this case) that the mps driver doesn't handle very well. If a disk is slow to respond, or is unresponsive altogether, I'd like it to be removed from the bus, and the zpool that it's a part of to be degraded.

The way things are now, mps will just report a lot of "SCSI command timeout on device" messages. Any I/O on the affected zpools will hang for an excessive amount of time (sometimes forever). We typically configure our storage volumes as a pool of mirrors, with the expectation that availability will be maintained if any redundant disk(s) should fail. Unfortunately, availability is actually made *worse* on highly-redundant mirrors when mps won't give up on an unresponsive device.

It's possible that I'm overlooking an obvious solution, or some relevant configuration options for the driver. Can anyone offer some insight on this?

Thanks,

- .Dustin

I think I've had the same issues, but resolved them by avoiding certain disks and firmware (those were disks that hadn't really failed; I don't know whether a genuinely failing disk will cause such problems in the future).

Here is a related mailing list thread about it: http://lists.freebsd.org/pipermail/freebsd-fs/2012-January/013477.html

On 06/05/2012 12:54 AM, Dustin Wenz wrote:
> I asked this question back in April on the stable list with no response ( http://lists.freebsd.org/pipermail/freebsd-stable/2012-April/067305.html ). I've now been seeing the same behavior on 9.0-release, and I thought it would be good to ask again here.
>
> There is a failure mode for SATA disks (Seagate Barracuda ST3000DM001 disks, in this case) that the mps driver doesn't handle very well. If a disk is slow to respond, or is unresponsive altogether, I'd like it to be removed from the bus and degrade the zpool that it's a part of.
>
> The way things are now, mps will just report a lot of "SCSI command timeout on device" messages. Any I/O on the affected zpools will hang for an excessive amount of time (sometimes forever). We typically configure our storage volumes as a pool of mirrors, with the expectation that availability will be maintained if any redundant disk(s) should fail. Unfortunately, availability is actually made *worse* on highly-redundant mirrors when mps won't give up on an unresponsive device.
>
> It's possible that I'm overlooking an obvious solution, or some relevant configuration options for the driver. Can anyone offer some insight on this?
>
> Thanks,
>
> - .Dustin
