I am managing a server with two solid state drives configured in mdadm RAID1. The server is running RHEL6 with an ext4 filesystem.
This evening the server went offline shortly after nightly backups began, and the console reported disk errors.
Upon logging into the console, I found that one of the disks had been marked failed by mdadm and the file-system was set to read-only.
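For reference, this is roughly how the degraded state presents itself on a typical mdadm mirror (the array name /dev/md0 below is an assumption, not taken from the affected server):

    # Quick view of all arrays; a failed member is flagged with (F)
    # and the mirror's status line drops from [UU] to [U_]
    cat /proc/mdstat

    # Detailed per-array view, including which member is marked faulty
    mdadm --detail /dev/md0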
Is there a way that I can configure mdadm to fail the drive before the file-system is re-mounted as read-only? I would much rather run as a single-disk system for a short time (until a replacement disk can be installed) than immediately kick the file-system into read-only mode -- which would guarantee an outage.
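One thing worth noting: the remount-read-only reaction comes from ext4's own error policy, not from mdadm, so the two can be tuned independently. A minimal sketch of the relevant knobs, assuming the array is /dev/md0 and the suspect member is /dev/sdb1 (both hypothetical names here); whether errors=continue is an acceptable risk for your data is a separate judgment call:

    # Show the on-error behavior baked into the filesystem
    # (typically "Continue", "Remount read-only", or "Panic")
    tune2fs -l /dev/md0 | grep -i 'errors behavior'

    # Change the default so ext4 keeps running after an error instead
    # of remounting read-only (this trades an immediate outage for the
    # risk of operating on a filesystem that has seen errors)
    tune2fs -e continue /dev/md0

    # The same behavior can also be set per-mount in /etc/fstab:
    #   /dev/md0  /data  ext4  defaults,errors=continue  1 2

    # Manually failing and removing a suspect member ahead of time,
    # so the array degrades cleanly instead of surfacing I/O errors
    mdadm /dev/md0 --fail /dev/sdb1
    mdadm /dev/md0 --remove /dev/sdb1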
There were errors on /dev/sdb too, but they weren't present on the console screen when I logged in. The system went read-only, so nothing was logged to /var/log/messages, and this is a critically important production server, so I didn't spend the extra time to read through dmesg -- maybe I'll have to do that next time. I think the drives are actually fine, since a server reboot fixed the problem immediately and no errors were recorded in the SMART data. Since this is such an important production system, I'm leaning towards just immediately replacing the entire motherboard.
– Elliot B. Mar 19 '18 at 17:00
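For the SMART and dmesg checks mentioned in the comment above, a quick verification pass might look like the following (smartmontools assumed installed; /dev/sda and /dev/sdb assumed to be the two mirror members):

    # Overall health verdict per drive
    smartctl -H /dev/sda
    smartctl -H /dev/sdb

    # Full attribute dump; reallocated and pending sectors are the
    # usual early-failure indicators
    smartctl -a /dev/sdb | grep -i -e reallocated -e pending

    # The kernel ring buffer often still holds the I/O errors that
    # never reached /var/log/messages once the filesystem went read-only
    dmesg | grep -i -e ata -e sdb -e 'I/O error'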