
So I've got several disks on several servers that became inaccessible after a brief network delay. According to mount and /proc/mounts, the disks are still mounted rw. When I run sudo or try to access a faulty mount, I get:

sudo: unable to stat /var: Input/output error

When I reboot, the problem is fixed. However, I don't see what options I have for detecting these errors...

I guess dmesg shows some information, but that information doesn't disappear once the error is gone...

Currently the servers are running CentOS 6 and 7, and the filesystems are mostly XFS and NFS.

Any ideas?

4
  • 3
    I'm not entirely sure what you're asking — it seems if you want to detect the error, you've already found a way: stat'ing the mount fails with an I/O error. And yes, dmesg doesn't vanish when the error is fixed, dmesg is basically a log. There are probably some other ways to detect this; what type of network storage is this? (I mean, other than NFS — how are the XFS disks over the network?) Commented Sep 14, 2018 at 15:42
  • Do you need to reboot? What happens if you unmount and mount again? Commented Sep 14, 2018 at 15:45
  • So what I eventually want to achieve is to implement a check in my monitoring system. At first I had something like grepping for ro in /proc/mounts and piping the result to wc -l. Since that check didn't catch this error, I'm looking for something else. Commented Sep 14, 2018 at 15:56
  • You might have been confused by the effect of the mount option errors=remount-ro available for ext2/3/4 and maybe a few other fs. Commented Sep 14, 2018 at 17:46

1 Answer

1

This is not a read-only filesystem, therefore it is not listed as ro in /proc/mounts.

The Input/output error either means what it says, an error in reading or writing, or it means the system tried to access a sector that doesn't exist (because of an error in the information about how many blocks should be present on the disk).

If that happens often enough to warrant monitoring to detect the error, it happens often enough to find and fix the reason why this happens.

2
  • So only ext2/3/4 switch to RO on issues? We're fixing the issue but we still need a way to monitor this. What I noticed is that the disk wasn't actually accessible. According to mount, however, the mount was fine... Commented Sep 14, 2018 at 18:21
  • 1
    It is not RO, it is I/O error. And the cause for the I/O error might be that the system still considers the partition mounted, but has no access to the underlying disk (network, SAN, whatever). Commented Sep 15, 2018 at 7:34
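Since the mount table still reports these filesystems as mounted rw, one possible monitoring check is to actually touch each mount point and see whether the stat fails, as it did for /var in the question. The following is only a minimal sketch, assuming Python 3 is available on the hosts. The names `parse_mounts`, `probe`, `find_failing_mounts` and `WATCHED_TYPES` are hypothetical, and the 5-second timeout is an arbitrary choice. The stat runs in a child process so that a hung NFS mount cannot block the checker itself.

```python
import os
import subprocess
import sys

# Assumption: only these filesystem types are of interest (from the question).
WATCHED_TYPES = {"xfs", "nfs", "nfs4"}

# Child program: stat the path and report the errno through the exit status.
_CHILD = (
    "import os, sys\n"
    "try:\n"
    "    os.stat(sys.argv[1])\n"
    "except OSError as e:\n"
    "    sys.exit(e.errno or 1)\n"
)

def parse_mounts(text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # /proc/mounts escapes spaces in paths as \040.
            entries.append((fields[1].replace("\\040", " "), fields[2]))
    return entries

def probe(path, timeout=5.0):
    """Return None if path can be stat'ed, otherwise a short error string."""
    child = subprocess.Popen([sys.executable, "-c", _CHILD, path])
    try:
        rc = child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Do not wait again: a child stuck in uninterruptible I/O may never exit.
        child.kill()
        return "stat timed out"
    if rc == 0:
        return None
    return os.strerror(rc) if rc > 0 else "probe killed by signal %d" % -rc

def find_failing_mounts(mounts_text, fs_types=WATCHED_TYPES, timeout=5.0):
    """Map each failing mount point of a watched type to its error string."""
    failing = {}
    for mount_point, fs_type in parse_mounts(mounts_text):
        if fs_type in fs_types:
            error = probe(mount_point, timeout)
            if error is not None:
                failing[mount_point] = error
    return failing
```

A monitoring plugin could read /proc/mounts, pass its contents to `find_failing_mounts`, and alert when the returned dictionary is non-empty. A stat can sometimes be answered from cache, so this may not catch every failure; reading or listing a file on the mount would be a stricter variant of the same idea.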
