
Inode usage goes from 1% to 100% on a single file creation on a RAID array on Debian.

After a clean boot:

sudo cryptsetup luksOpen /dev/RaidVG/LVMVol CVol
sudo mount /dev/mapper/CVol /mnt/raid/

Checking inode usage:

$ df -ih
Filesystem           Inodes IUsed IFree IUse% Mounted on
/dev/mapper/CVol   117M    11  117M    1% /mnt/raid

Then any touch on /mnt/raid fails, saying the disk is full.

Inode usage has jumped to 100%:

$ df -ih
Filesystem           Inodes IUsed IFree IUse% Mounted on
/dev/mapper/CVol   117M  117M     0  100% /mnt/raid

Counting files inside /mnt/raid returns:

$ find | cut -d/ -f2 | uniq -c | sort -n
      1 .
   6033 d1
  14070 d2
  31211 d3
 145866 d4
 184352 d5

fsck can't seem to finish:

$ sudo fsck /dev/mapper/CVol
fsck from util-linux 2.33.1
e2fsck 1.44.5 (15-Dec-2018)
/dev/mapper/CVol contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found.  Create<y>? no
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Signal (6) SIGABRT si_code=SI_TKILL

Also, df -h returns wrong values; in reality more than 1T is in use:

 $ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/CVol  1.8T   77M  1.7T   1% /mnt/raid

I don't really know what to do or where to look. My file system is "read only", but is there a risk of losing data here? How can I fix the problem and write to this disk again?

EDIT
smartctl -a

=== START OF INFORMATION SECTION ===
Vendor:               WD
Product:              My Passport
Revision:             1028
Compliance:           SPC-4
User Capacity:        2,000,365,289,472 bytes [2.00 TB]
Logical block size:   512 bytes
LU is resource provisioned, LBPRZ=0
Rotation Rate:        5400 rpm
Serial number:        WX22A30FX287
Device type:          disk
Local Time is:        Fri Mar 28 12:10:55 2025 GMT
SMART support is:     Unavailable - device lacks SMART capability.

=== START OF READ SMART DATA SECTION ===
Current Drive Temperature:     0 C
Drive Trip Temperature:        0 C

Error Counter logging not supported

No self-tests have been logged

mdadm --examine-badblocks

Bad-blocks list is empty in /dev/sda1
Bad-blocks list is empty in /dev/sdb1

fdisk -l /dev/sda /dev/sdb

Disk /dev/sda: 1.8 TiB, 2000365289472 bytes, 3906963456 sectors
Disk model: My Passport 
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x1dfd4f21

Device     Boot Start        End    Sectors  Size Id Type
/dev/sda1        2048 3906963455 3906961408  1.8T fd Linux raid autodetect


Disk /dev/sdb: 1.8 TiB, 2000365289472 bytes, 3906963456 sectors
Disk model: My Passport 
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x9f2cb37d

Device     Boot Start        End    Sectors  Size Id Type
/dev/sdb1        2048 3906963455 3906961408  1.8T fd Linux raid autodetect

pvs

  PV         VG     Fmt  Attr PSize  PFree
  /dev/md0   RaidVG lvm2 a--  <1.82t    0

vgs

  VG     #PV #LV #SN Attr   VSize  VFree
  RaidVG   1   1   0 wz--n- <1.82t    0

lvs

  LV     VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  LVMVol RaidVG -wi-ao---- <1.82t

cat /proc/mdstat

Personalities : [raid]
md0 : active raid sdb1[0] sda1[1]
      1953348608 blocks super 1.2 [2/2] [UU]
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>
  • Is there anything in dmesg / syslog / journal? fsck says the filesystem has errors, and it was aborted. If you can't rule out filesystem corruption, then that might be it… but before trying fsck you should rule out storage errors; can't fix filesystems sitting on broken storage. Check smartctl -a for all drives and mdadm --examine-badblocks for md raid. Commented Mar 28 at 9:19
  • Gut feel, you need to get your data off that filesystem as soon as possible. Then flatten it and rebuild it (on top of the LUKS volume) from scratch. Commented Mar 28 at 9:27
  • Two things to double check are that (a) your Physical Volume and Volume Group are constrained to the correct disks/partitions, and (b) none of your partitions overlap. Commented Mar 28 at 9:29
  • smartctl reports "SMART support is: Unavailable - device lacks SMART capability.", and the bad-blocks list is empty for both drives. Commented Mar 28 at 12:15
  • @Chris Davies how do I check what you said? Commented Mar 28 at 12:16

1 Answer


The ext4 kernel code will mark block groups as unusable if it finds the block or inode allocation bitmaps corrupted. This will reduce the free inodes and blocks counters. If all of the group inode bitmaps are corrupted, then presumably the free inode count will go to zero.

This would very likely result in EXT4-fs error messages on the console, which you can see with dmesg -T | less.
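
To get a rough idea of how many block groups the kernel has flagged, the log can be filtered and the distinct group numbers counted. This is only a sketch: count_corrupt_groups is a hypothetical helper name, and it assumes the messages look like the "Corrupt inode bitmap - block_group = N" line the asker quotes in the comments under this answer.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print the number of
# distinct block groups reported with a corrupt inode bitmap.
# Assumes lines contain "Corrupt inode bitmap - block_group = N".
count_corrupt_groups() {
    grep 'Corrupt inode bitmap' |
        sed -n 's/.*block_group = \([0-9][0-9]*\).*/\1/p' |
        sort -un |
        wc -l |
        tr -d ' '
}

# Example usage (needs permission to read the kernel log):
#   sudo dmesg | count_corrupt_groups
```

If the count is close to the total number of block groups in the filesystem, that would be consistent with the free inode counter dropping to zero.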

Running e2fsck is really the only option to fix this. Before running it, you should make a copy of the filesystem to another device.

  • You are right. I've got hundreds of "EXT4-fs error (device dm-1): ext4_validate_inode_bitmap:100: comm touch: Corrupt inode bitmap - block_group = 0, inode_bitmap = 1274". When running fsck I get "Signal (6) SIGABRT si_code=SI_TKILL". What can I do? Commented Mar 29 at 18:47
  • Have you tried running e2fsck with backup group descriptors? For example, e2fsck -b 32768 /dev/mapper/CVol, or other backup locations at 32768 × n for n = 3, 5, 7, 9, 25, 27, 49, 81, 125, and so on (powers of 3, 5, and 7). If you have the space on another disk, try copying the whole device with "dd" to a new partition or image file and run e2fsck on that, in case LUKS is causing problems. Commented Apr 16 at 4:51
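
The list of backup locations in the last comment can be generated rather than typed by hand. A minimal sketch, assuming the sparse_super layout (backups in group 1 and in groups that are powers of 3, 5, and 7) and 4 KiB blocks, which give 32768 blocks per group; backup_superblocks is a hypothetical helper and the group count of 100 is an illustrative value, not this filesystem's.

```shell
#!/bin/sh
# Hypothetical helper: print candidate backup superblock block numbers.
# $1 = blocks per group (32768 assumes 4 KiB blocks)
# $2 = total number of block groups (illustrative value below)
backup_superblocks() {
    blocks_per_group=$1
    total_groups=$2
    {
        # Group 1 holds the first backup, if the filesystem has more than one group.
        if [ "$total_groups" -gt 1 ]; then
            echo 1
        fi
        # Remaining backups sit in groups that are powers of 3, 5, and 7.
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$total_groups" ]; do
                echo "$g"
                g=$((g * base))
            done
        done
    } | sort -n | while read -r g; do
        echo $((g * blocks_per_group))
    done
}

backup_superblocks 32768 100
```

Each printed number is a candidate for e2fsck -b. If the primary superblock is still readable, dumpe2fs on the device should also list the "Backup superblock at" locations, which avoids having to assume the block size.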
