When is an empty directory not an empty directory?

For all you UNIX hackers out there … a conundrum:

% ls -l tmp
total 4
-rw------- 1 user group 1363 Feb 4 2003 getchap-times
% du -k tmp
1914 tmp

Last night I was trying to copy a directory structure between machines. It failed for some reason (permissions somewhere, I guess), so I tried to clean out some directories holding large numbers of ‘temporary’ files that hadn’t been deleted. There were a couple of directories like this, and selectively deleting files from a very full directory is not trivial: something like ‘rm search*.out’ fails if too many files match ‘search*.out’ [1]. I know I could have written a quick C program, but life is short, and so after many partial-match commands of the form ‘rm search23*.out’ I started to get there.
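One way round the overlong command line is to let ‘find’ do the pattern matching, so the shell never expands the wildcard. The following is only a sketch: it assumes the local ‘find’ supports the POSIX ‘-exec … {} +’ form, and the function name delete_matching is made up for illustration.

```shell
# delete_matching DIR PATTERN (hypothetical helper)
# Remove regular files directly inside DIR whose names match PATTERN.
# The pattern is passed quoted, so find matches it, not the shell.
delete_matching() (
    cd "$1" || exit 1
    # "! -name . -prune" stops find descending into subdirectories;
    # "{} +" hands rm the names in batches that fit the argument limit.
    find . ! -name . -prune -type f -name "$2" -exec rm -f {} +
)
```

It would be called as: delete_matching tmp 'search*.out' (the quotes around the pattern matter).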

Then I turned to a second directory, which seemed to be a biggy. However, it turned out to have just a dozen files in it. I deleted all but one, assuming that one must be enormous, but ‘ls -l’ gave:

% ls -l tmp
total 4
-rw------- 1 user group 1363 Feb 4 2003 getchap-times

Just 1363 bytes? I thought I must have been mistaken when I’d thought the directory was full, so I checked ‘du’ again:

% du -k tmp
1914 tmp

“Ah!”, I thought, “hidden files”, so a quick ‘ls -la’:

% ls -la tmp
total 3832
drwxrwxrwx 2 user group 1942016 Dec 14 06:37 .
drwxr-xr-x 5 user group 1536 Apr 22 2006 ..
-rw------- 1 user group 1363 Feb 4 2003 getchap-times

No hidden files … but boy, look at ‘.’. The directory itself is the biggy. It contains one file of 1.3K, but is itself 1.9 meg big!
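The figure against ‘.’ is the size of the directory's own entry table, and ‘find’ can hunt for other directories like it. A rough sketch, where big_dirs is a made-up helper name and the threshold is given in the 512-byte units that POSIX ‘find -size’ counts in:

```shell
# big_dirs ROOT BLOCKS (hypothetical helper)
# List directories under ROOT whose own size is more than BLOCKS
# 512-byte blocks. This is the size of the directory itself, not of the
# files inside it.
big_dirs() {
    # -type d keeps only directories; -size +N means "more than N blocks"
    find "$1" -type d -size +"$2" -exec ls -ld {} +
}
```

For example, big_dirs . 2048 would report directories bigger than 1 meg, which would include the 1942016-byte one above (3793 blocks).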

Several years ago (so there will have been reboots, file system checks, etc. since then) the directory had loads of small files in it, which I deleted (using the same laborious method as in the other directory last night). To hold so many files, the directory itself had to store an entry for each one, leading to its whopping 1.9 meg size (there were a LOT of little files; UNIX is good at being scalable). But when the files are deleted, UNIX does not reclaim this space.
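Since a freshly made directory starts out small, one common workaround is to move the surviving entries into a new directory and swap it into place. This is a sketch only, not something tried on the machine described here; rebuild_dir is a made-up name, and the new directory gets default ownership and permissions rather than those of the old one.

```shell
# rebuild_dir DIR (hypothetical helper)
# Replace DIR with a fresh directory holding the same entries.
rebuild_dir() (
    cd "$1" || exit 1
    dir=$(pwd -P)                 # absolute path of the old directory
    new=$dir.rebuild.$$           # temporary sibling directory
    mkdir "$new" || exit 1
    # Move every top-level entry, dot files included, without a shell glob
    find . ! -name . -prune -exec mv {} "$new"/ \;
    cd / || exit 1
    # rmdir refuses to remove a non-empty directory, so nothing is lost
    # if any of the moves failed; the swap only happens on success.
    rmdir "$dir" && mv "$new" "$dir"
)
```

Nothing else should be writing to the directory while this runs.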

I guess whoever coded the ‘unlink’ system call never thought directories would get so big, and so didn’t try to reclaim the space, and there is no ‘garbage collection’ on the file system.

This was on SunOS 5.8. Is it the same on other versions of UNIX, or on Linux?

So next time you are wondering where all your space has gone … maybe it is in those empty directories. :-/

  1. The ‘search*.out’ is of course expanded by the shell into a very, very big command line!