I found out that to search by file size in bytes, I use the 'c' suffix.

So I can look for files whose size is exactly 1000 bytes with: find . -size 1000c

But what about other units such as MB, GB or even bits? Which letters do I need to use?

  • This is one of those questions where the documentation of (the particular implementation of) the tool you're using is likely to give the best answer. Commented Apr 19, 2024 at 10:26
  • @ilkkachu, well, not always: for toybox's find, for instance, you may be better off reading the source code instead. Commented Apr 21, 2024 at 9:57
  • @StéphaneChazelas, well, I would say that's a bit of a bug in the tool itself. (IMO, telling someone to RTFM is a bit of a blunt way to tell someone they haven't done their homework in reading the docs. But telling someone to RTFS is a bit of a blunt way to tell someone the software author hasn't done their homework in writing the docs...) Commented Apr 21, 2024 at 15:29

3 Answers

POSIX only specifies no suffix or a c suffix. With no suffix, values are interpreted as 512-byte blocks; with a c suffix, values are interpreted as byte counts, as you’ve determined.

Some implementations support more suffixes; for example GNU find supports

  • b for 512-byte blocks
  • c for bytes
  • w for 2-byte words
  • k for kibibytes
  • M for mebibytes
  • G for gibibytes

  • Note that what suffixes are supported is not the end of the story. There are variations on how they are handled. See my answer for details. Commented Apr 21, 2024 at 9:56

POSIXly:

find . -size  1000c # files whose size¹ is exactly 1000 bytes (not characters)
find . -size -1000c # strictly less than 1000 bytes (0 - 999)
find . -size +1000c # strictly more than 1000 bytes (1001 - ∞)
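
Because both + and - are strict comparisons, an inclusive byte range needs each bound shifted by one. The following is only a minimal sketch: find_size_between is a hypothetical helper name (not part of any find implementation), and it assumes MIN and MAX are plain decimal byte counts.

```shell
# find_size_between DIR MIN MAX
# Hypothetical helper: list regular files under DIR whose size in bytes
# is between MIN and MAX inclusive, using only the POSIX c suffix.
find_size_between() {
    dir=$1 min=$2 max=$3
    if [ "$min" -gt 0 ]; then
        # +N is strict, so "at least MIN" becomes "more than MIN-1";
        # -N is strict, so "at most MAX" becomes "less than MAX+1".
        find "$dir" -type f -size "+$((min - 1))c" -size "-$((max + 1))c"
    else
        # No lower bound is needed when MIN is 0.
        find "$dir" -type f -size "-$((max + 1))c"
    fi
}
```

For instance, find_size_between . 1000 2000 would run find . -type f -size +999c -size -2001c.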

Then, using POSIX sh syntax, you can do:

EiB=$((1024*(PiB=1024*(TiB=1024*(GiB=1024*(MiB=1024*(KiB=1024)))))))
 EB=$((1000*( PB=1000*( TB=1000*( GB=1000*( MB=1000*( kB=1000)))))))

find . -size "$(( 12 * GiB ))c" # exactly 12GiB (12,884,901,888 bytes)
find . -size "$(( 12 * GB  ))c" # exactly 12GB (12,000,000,000 bytes)
find . -size "-$(( 12 * GB ))c" # 0 - 11,999,999,999 bytes
...
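
If you would rather not keep a set of unit variables around, the same arithmetic can be wrapped in a small function. This is a sketch only: to_bytes is a hypothetical name, the list of units is deliberately short, and it assumes the shell's arithmetic is at least 64 bits wide.

```shell
# to_bytes COUNT UNIT
# Hypothetical helper: print COUNT of UNIT expressed in bytes,
# ready to be combined with the c suffix of find -size.
to_bytes() {
    case $2 in
        B)   echo "$(( $1 ))" ;;
        kB)  echo "$(( $1 * 1000 ))" ;;
        MB)  echo "$(( $1 * 1000 * 1000 ))" ;;
        GB)  echo "$(( $1 * 1000 * 1000 * 1000 ))" ;;
        KiB) echo "$(( $1 * 1024 ))" ;;
        MiB) echo "$(( $1 * 1024 * 1024 ))" ;;
        GiB) echo "$(( $1 * 1024 * 1024 * 1024 ))" ;;
        *)   echo "to_bytes: unknown unit: $2" >&2; return 1 ;;
    esac
}
```

It would be used as find . -size "-$(to_bytes 12 GiB)c" for files strictly smaller than 12GiB.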

Without the c suffix, beware the behaviour can be surprising:

find . -size  1000 # files whose size, in number of 512-byte units (rounded *up*)
                   # is 1000. So, that's file whose size in bytes ranges from
                   # 1000*512-511 (999*512+1) to 512*1000
find . -size -1000 # files whose size is 999*512 bytes or less
find . -size +1000 # files whose size is 1000*512+1 bytes or more
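
The rounding can be checked with plain shell arithmetic. The two function names below are hypothetical and only model the rule described above; they do not call find.

```shell
# blocks_of SIZE
# Number of 512-byte units a file of SIZE bytes counts as, rounded up.
blocks_of() {
    echo "$(( ($1 + 511) / 512 ))"
}

# size_range_for_blocks N
# Lowest and highest byte sizes matched by a POSIX "-size N" (no suffix).
size_range_for_blocks() {
    if [ "$1" -eq 0 ]; then
        echo "0 0"    # only empty files count as 0 units
    else
        echo "$(( ($1 - 1) * 512 + 1 )) $(( $1 * 512 ))"
    fi
}

size_range_for_blocks 1000   # prints: 511489 512000
```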

That's it for the POSIX specification of the find utility.

Now, various find implementations support additional suffixes but beware the same suffixes can be interpreted differently by different implementations.

As noted by @StephenKitt, GNU find supports cwbkMG for byte, word, 512-byte unit, kibibyte, mebibyte, gibibyte. It still rounds the way POSIX find requires, though, so find . -size -12G for instance is not the same as our find . -size "-$((12 * GiB))c" from above: it matches files whose size in gibibytes (rounded up) is strictly less than 12, that is, files that are 11GiB or less.

For instance, find . -size -1G only finds empty files (files of size 0). A one byte file is considered to be 1GiB as sizes are rounded up to the next GiB.
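
To see why, here is a small model of that rounding in shell arithmetic. It is a sketch of the behaviour described above, not GNU find's actual code, and both function names are hypothetical.

```shell
# rounded_units SIZE UNIT_BYTES
# SIZE expressed in whole units of UNIT_BYTES, rounded up.
rounded_units() {
    echo "$(( ($1 + $2 - 1) / $2 ))"
}

# gnu_size_less_than SIZE N UNIT_BYTES
# Succeeds if a file of SIZE bytes would match "-size -N<unit>"
# when sizes are rounded up to whole units before comparing.
gnu_size_less_than() {
    [ "$(rounded_units "$1" "$3")" -lt "$2" ]
}

GiB=$((1024 * 1024 * 1024))
gnu_size_less_than 0 1 "$GiB" && echo "0 bytes: matches -size -1G"
gnu_size_less_than 1 1 "$GiB" || echo "1 byte: does not match -size -1G"
```

In the same model, a file of exactly 11GiB counts as 11 units and matches -size -12G, while one byte more counts as 12 units and does not.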

busybox find supports cwbk suffixes but interprets them differently from GNU find. It's also currently not POSIX compliant for its handling of sizes without suffixes.

For busybox find, find . -size -12G is like find . -size "-$(( 12 * GiB ))c", and find . -size -1 is for sizes ranging from 0 to 511 instead of just 0.

toybox find (as found on Android for instance) behaves like busybox find in that regard (and is also not POSIX compliant). Other differences are that suffixes are case insensitive there, that TPE for tebibyte, pebibyte and exbibyte are also supported, and that an additional d (decimal) suffix can be used to specify that the units are powers of 1000 rather than 1024. For instance -size 1kd finds files that are exactly 1000 bytes (1 kilobyte) instead of 1024 bytes (1 kibibyte) for -size 1k.

In toybox find, the suffix handling is done as part of its atolx() function which is not only used for find. Note however that since that supports 0xffff hexadecimal numbers, there's a conflict for cbedCBED that are also hexadecimal digits. -size -0x2c is not for less than 0x2 bytes, but for less than 0x2c (44) 512-byte units. And -size 010c is treated as -size 8c (octal), another POSIX non-conformance.

FreeBSD/DragonFly BSD find supports ckMGTP (not bwE). While it behaves as required by POSIX without a suffix, it behaves like busybox/toybox and not GNU find when there's a suffix².

sfind or the find builtin of the bosh shell behave like FreeBSD's except suffixes are case insensitive and bwE are supported and octal/decimal numbers and some product arithmetic expressions (such as 6x12x8k) are accepted.

As far as I can tell, the find implementations of OpenBSD, NetBSD, Illumos, Solaris, AIX and HP-UX only support no suffix for 512-byte units or c for bytes, as POSIX requires.

A summary table:

| | Traditional/POSIX | GNU | FreeBSD | sfind | busybox | toybox |
|---|---|---|---|---|---|---|
| suffixes | c | cwbkMG | ckMGTP | cwbkmgtpeCWBKMGTPE | cwbk | cwbkmgtpeCWBKMGTPE (+d) |
| number format | decimal | decimal | decimal | dec/oct/hex/expr | decimal | dec/oct/hex |
| -size $n | ($n-1)*512+1 .. $n*512 | ($n-1)*512+1 .. $n*512 | ($n-1)*512+1 .. $n*512 | ($n-1)*512+1 .. $n*512 | $n*512 | $n*512 |
| -size -$n | 0 .. ($n-1)*512 | 0 .. ($n-1)*512 | 0 .. ($n-1)*512 | 0 .. ($n-1)*512 | 0 .. $n*512-1 | 0 .. $n*512-1 |
| -size +$n | ($n*512)+1 .. ∞ | ($n*512)+1 .. ∞ | ($n*512)+1 .. ∞ | ($n*512)+1 .. ∞ | ($n*512)+1 .. ∞ | ($n*512)+1 .. ∞ |
| -size ${n}c | $n | $n | $n | $n | $n | $n |
| -size -${n}c | 0 .. $n-1 | 0 .. $n-1 | 0 .. $n-1 | 0 .. $n-1 | 0 .. $n-1 | 0 .. $n-1 |
| -size +${n}c | $n+1 .. ∞ | $n+1 .. ∞ | $n+1 .. ∞ | $n+1 .. ∞ | $n+1 .. ∞ | $n+1 .. ∞ |
| -size $n$unit | N/A | ($n-1)*$unit+1 .. $n*$unit | $n*$unit | $n*$unit | $n*$unit | $n*$unit |
| -size -$n$unit | N/A | 0 .. ($n-1)*$unit | 0 .. $n*$unit-1 | 0 .. $n*$unit-1 | 0 .. $n*$unit-1 | 0 .. $n*$unit-1 |
| -size +$n$unit | N/A | $n*$unit+1 .. ∞ | $n*$unit+1 .. ∞ | $n*$unit+1 .. ∞ | $n*$unit+1 .. ∞ | $n*$unit+1 .. ∞ |

So, in short, for portability, your best bet is to use the c suffix, decimal only numbers without leading zeros and compute the units manually.
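
One way to enforce the "decimal only, no leading zeros" part is to normalise the number before handing it to find. This is a sketch under that assumption; portable_size_arg is a hypothetical helper name.

```shell
# portable_size_arg SIGN BYTES
# Hypothetical helper: print a -size argument such as "-1000c".
# SIGN is "+", "-" or empty; BYTES must consist of decimal digits only.
portable_size_arg() {
    case $1 in
        ''|+|-) ;;
        *) echo "portable_size_arg: bad sign: $1" >&2; return 1 ;;
    esac
    case $2 in
        ''|*[!0-9]*)
            # rejects empty input, hexadecimal (0x2c) and suffixed values (12G)
            echo "portable_size_arg: not a decimal number: $2" >&2
            return 1 ;;
    esac
    n=$2
    # Strip leading zeros so that no implementation reads the value as octal.
    while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]; do
        n=${n#0}
    done
    printf '%s%sc\n' "$1" "$n"
}
```

For example, find . -size "$(portable_size_arg - 010)" runs find . -size -10c, whichever implementation is installed.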

For completeness, the L glob qualifier of zsh (with kmgt case insensitive, but pP is for 512-byte unit, not pebibyte) behaves like POSIX/GNU find (*(LM-12) expands to files whose size is between 0 and 11 mebibytes for instance).


¹ That's the size as reported in the st_size attribute of the structure returned by lstat(), whose meaning for non-regular files can vary between systems.

² There's the same kind of distinction in FreeBSD find/sfind for the -Xtime predicates where for instance -mtime +1 matches on files that are 2 days old or older (age 86400*2 - ∞) while -mtime +1d matches on files that are more than one day old (age 86400.000000001 - ∞). With GNU find, see also ! -newermt -1day (or 1 day ago or yesterday).

To add to what Stephen Kitt mentioned, beware that GNU find rounds up the size to the specified granularity before comparing!

If you do

truncate --size=1000 dummy_file_1000
truncate --size=1024 dummy_file_1024

then

find . -size 1k 
find . -size 1024c

will not give the same result!

See find command: -size behavior

In short - find . -size 1k will list every file with size∈[1,1024], whereas find . -size 1024c will only list files where the actual size is exactly 1024 bytes.
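
The comparison is easiest to reproduce in a scratch directory, so that unrelated files and the directory entry itself don't show up in the results. A minimal sketch, assuming mktemp and a truncate that accepts -s are available; demo_dir is just an illustrative variable name.

```shell
# Work in a scratch directory and only look at regular files.
demo_dir=$(mktemp -d)
truncate -s 1000 "$demo_dir/dummy_file_1000"
truncate -s 1024 "$demo_dir/dummy_file_1024"

# Byte-exact match: only dummy_file_1024 is listed.
find "$demo_dir" -type f -size 1024c

# With GNU find, sizes are rounded up to whole KiB first, so both files
# count as 1k and both are listed. Implementations that do not round
# (busybox, toybox, FreeBSD) would list only dummy_file_1024.
find "$demo_dir" -type f -size 1k
```

Remove the directory with rm -rf "$demo_dir" once you are done.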

  • You may have intended to test find . -size 1024c. The default is a generous 1024b, aka 0.5 MB. Commented Apr 19, 2024 at 12:51
  • Oops. Yes indeed. Every time I use find -size I'm surprised by some odd behaviour. This is not the first time I get fooled by that peculiarity. Commented Apr 19, 2024 at 13:08
  • I dumbed down myself when I tried it in my home directory, because I could not intuit the actual difference. Turned out that -maxdepth 1 -name 'dummy_file*' was a slightly better idea ;-) Note to self: sandbox everything! Commented Apr 19, 2024 at 13:33
  • The answer would benefit from actually showing the results and discussing the differences. Commented Apr 19, 2024 at 13:43
