Is it a sparse file?

The other day someone asked how a file could be far bigger than the filesystem it resides on. My first guess was a sparse file. Let’s see…

[somedude@tsys logs]$ df -h
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/vg_system-lv_root  8.0G  2.4G  5.7G  30% /
devtmpfs                        32G     0   32G   0% /dev
tmpfs                           32G     0   32G   0% /dev/shm
tmpfs                           32G  290M   32G   1% /run
tmpfs                           32G     0   32G   0% /sys/fs/cgroup
/dev/sda1                      509M  162M  347M  32% /boot
/dev/mapper/vg_oracle-lv_u01   410G   95G  316G  24% /u01
/dev/mapper/vg_system-lv_tmp   8.0G   33M  8.0G   1% /tmp
/dev/mapper/vg_system-lv_data   81G   48G   33G  60% /data
/dev/mapper/vg_system-lv_var   8.0G  1.6G  6.5G  19% /var
tmpfs                          6.3G     0  6.3G   0% /run/user/0
tmpfs                          6.3G     0  6.3G   0% /run/user/3021
tmpfs                          6.3G     0  6.3G   0% /run/user/10809

The file in question, stdout.txt, resides on the /data mount point. Looking at the size of the file:

[somedude@tsys logs]$ ls -lh stdout.txt
-rw-r--r-- 1 root root 95G Oct 18 09:06 ./stdout.txt

Its size is reported as 95GB, which is 14GB more than the entire 81GB /data filesystem. Following various suggestions, let’s look at the find man page:

%S File’s sparseness. This is calculated as (BLOCKSIZE*st_blocks / st_size). The exact value you will get for an ordinary file of a certain length is system-dependent. However, normally sparse files will have values less than 1.0, and files which use indirect blocks may have a value which is greater than 1.0. The value used for BLOCKSIZE is system-dependent, but is usually 512 bytes. If the file size is zero, the value printed is undefined. On systems which lack support for st_blocks, a file’s sparseness is assumed to be 1.0.
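
The formula quoted above can also be reproduced by hand. Here is a minimal sketch, assuming GNU coreutils stat (its %b, %B and %s format sequences) and awk. The function name sparseness is made up for illustration and is not part of any tool.

```shell
# Compute a file's sparseness as described for find's %S:
# (block size * allocated blocks) / apparent size.
# Assumption: GNU stat; sparseness() is a hypothetical helper name.
sparseness() {
    # %b = allocated blocks, %B = bytes per block counted by %b, %s = size in bytes
    stat -c '%b %B %s' -- "$1" |
        awk '{ if ($3 == 0) print "undefined"; else printf "%.4f\n", $1 * $2 / $3 }'
}

# Demo: a 100 MiB file created as a single hole, with no data written to it
demo=$(mktemp)
truncate -s 100M "$demo"
sparseness "$demo"
rm -f "$demo"
```

On a filesystem that supports sparse files, the demo should print a value at or very close to 0, since almost no blocks are allocated for the file.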

So, let’s take a look:

[somedude@tsys logs]$ find . -type f -printf "%S\t%p\n"
0.0617542            ./stdout.txt
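
To pick out only the sparse candidates in a larger directory tree, the same %S output can be filtered. A rough sketch, assuming GNU find and awk. list_sparse is a hypothetical helper, and the 1.0 threshold simply follows the man page's rule of thumb. Filesystems with transparent compression may also report values below 1.0, so treat the results as candidates rather than proof.

```shell
# List non-empty regular files whose sparseness is below 1.0.
# Assumption: GNU find (-printf); list_sparse is a hypothetical helper name.
# File names containing tabs or newlines are not handled.
list_sparse() {
    # -size +0 skips empty files, for which %S is undefined
    LC_ALL=C find "${1:-.}" -type f -size +0 -printf '%S\t%p\n' |
        awk -F'\t' '$1 < 1.0'
}
```

Running list_sparse . in the logs directory should print the stdout.txt line shown above, since 0.0617542 is below the threshold.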

So, what is the real size of that file? The ls man page says:

-s, --size print the allocated size of each file, in blocks

[somedude@tsys logs]$ ls -lsh stdout.txt
5.9G -rw-r--r-- 1 root root 95G Oct 18 07:33 stdout.txt

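So the file actually takes up only 5.9GB on disk, which agrees with the sparseness ratio find reported (0.0617542 × 95GB ≈ 5.9GB). The rest of its 95GB apparent size is holes with no blocks allocated.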
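
du offers another way to see both numbers side by side. A small sketch, assuming GNU du and its --apparent-size option. show_sizes is a hypothetical helper name, and the demo file is only there to reproduce the effect on a small scale.

```shell
# Print the apparent size and the allocated size of a file.
# Assumption: GNU du; show_sizes is a hypothetical helper name.
show_sizes() {
    # --apparent-size reports the length that ls -l shows;
    # plain du reports the space actually allocated on disk
    printf 'apparent:  %s\n' "$(du -h --apparent-size -- "$1" | cut -f1)"
    printf 'allocated: %s\n' "$(du -h -- "$1" | cut -f1)"
}

# Demo: a 100 MiB file that consists of a single hole
demo=$(mktemp)
truncate -s 100M "$demo"
show_sizes "$demo"
rm -f "$demo"
```

For stdout.txt this should report roughly 95G apparent and 5.9G allocated, in line with the ls -lsh output above.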