In this tutorial, we will learn how to find large files and large folders on a Linux system. These commands are very useful and save a lot of time, especially when you need to free up disk space urgently or are trying to understand a sudden increase in disk usage. The culprit can be a log file, a backup file, a ZIP archive, etc.
II. Find large files with find
Before starting, I created empty files of a predefined size (90 MB, 100 MB, 200 MB, and 1 GB) in the “/home/flo” directory.
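To reproduce a similar test setup, here is a minimal sketch. It assumes GNU coreutils (for truncate) and uses a scratch directory instead of “/home/flo”. The three larger byte counts are taken from the find output shown later in this tutorial; the name File90MB and its exact size are assumptions made for this example.

```shell
# Scratch directory standing in for /home/flo (assumption for this example)
demo_dir=$(mktemp -d)

# truncate sets the apparent size without writing any data,
# so these files are sparse and use almost no real disk space
truncate -s 90M        "$demo_dir/File90MB"    # hypothetical name and size
truncate -s 112640000  "$demo_dir/File100MB"
truncate -s 204800000  "$demo_dir/File200MB"
truncate -s 1126400000 "$demo_dir/File1GB"

# Show the result; remove the directory with rm -rf "$demo_dir" when done
ls -l "$demo_dir"
```

Because the files are sparse, find sees their full size, but du reports almost nothing for them. Write real data (with dd, for example) if you also want the du commands to pick them up.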
The find command allows you to search for items on a Linux machine. It includes a parameter called “-size”, which filters the results by file size: useful for our case!
So, the command below will search for all files larger than 100 MB on the system:
find / -type f -size +100M
Some explanations are in order:
- /: the location to search from, here systemwide
- -type f: filter on files only
- -size +100M: include all files larger than 100 MB
For example, the following result is returned:
/tmp/latest.zip
/proc/kcore
/var/log/apache2/access.log
/home/flo/File100MB
/home/flo/File200MB
/home/flo/File1GB
I know that these files are larger than 100 MB. Of course, one can specify a different size, expressed in gigabytes for example. To find files larger than 1 GB:
find / -type f -size +1G
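A system-wide search run as a regular user usually prints many “Permission denied” messages, and it also walks virtual filesystems, which is why an entry like /proc/kcore can show up even though it takes no disk space. As a sketch, assuming GNU find, the search can be wrapped in a small helper. The name find_large is made up for this example.

```shell
# Hypothetical helper: list regular files larger than a given size.
#   $1 = starting directory (default: /)
#   $2 = threshold in find's -size syntax (default: 100M)
find_large() {
    # -xdev keeps find on the filesystem of the starting directory (skips /proc, /sys, ...)
    # 2>/dev/null hides error messages such as "Permission denied"
    find "${1:-/}" -xdev -type f -size "+${2:-100M}" 2>/dev/null
}

# Try it on a scratch directory rather than the whole system
scratch=$(mktemp -d)
truncate -s 150M "$scratch/big.img"     # sparse 150 MiB file
truncate -s 10M  "$scratch/small.img"   # sparse 10 MiB file
find_large "$scratch" 100M
```

Keep in mind that -xdev also skips real partitions mounted below the starting point (a separate /home, for instance), so those need their own search.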
We can also search under a specific tree. For example, in “/home/”:
find /home/ -type f -size +100M
This time, there are fewer files in the output because the search was only done under “/home/”:
/home/flo/File100MB
/home/flo/File200MB
/home/flo/File1GB
The problem is that this output does not indicate the file size, only its path. By adding the “-printf” parameter, we can customize the output to show the size in bytes (%s) followed by the path (%p):
find /home/flo/ -type f -size +100M -printf '%s %p\n'
112640000 /home/flo/File100MB
204800000 /home/flo/File200MB
1126400000 /home/flo/File1GB
We can go a little further by sorting the files from largest to smallest:
find /home/flo/ -type f -size +100M -printf '%s %p\n' | sort -nr
The output is then simpler to analyze, with the largest file listed first.
If there are a lot of files, how can we obtain a “Top 10” or a “Top 20” of the largest files? The answer is the head command, which we append like this:
find /home/flo/ -type f -size +100M -printf '%s %p\n' | sort -nr | head -10
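Sizes in bytes are not very pleasant to read. One possible refinement, assuming GNU coreutils is installed, is to pass the sorted list through numfmt, which converts the first column to K/M/G units. The helper name top_files is made up for this example.

```shell
# Hypothetical helper: the N largest files over 100M under a directory,
# with human-readable sizes.
#   $1 = directory, $2 = number of results (default: 10)
top_files() {
    # Sort on the raw byte counts first, keep the top N,
    # and only then convert column 1 to K/M/G units
    find "$1" -type f -size +100M -printf '%s %p\n' 2>/dev/null |
        sort -nr |
        head -n "${2:-10}" |
        numfmt --to=iec --field=1
}

# Demo on a scratch directory with sparse files
scratch=$(mktemp -d)
truncate -s 300M "$scratch/a.img"
truncate -s 2G   "$scratch/b.img"
truncate -s 5M   "$scratch/c.img"
top_files "$scratch" 2
```

The conversion comes last on purpose: sort -n compares plain numbers, so it has to run while the sizes are still in bytes.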
Note: When specifying the size, you can use k for KB (lowercase with GNU find), M for MB, and G for GB. These are binary units: 1M is 1,048,576 bytes.
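One detail about these units is worth checking by hand: GNU find rounds each file size up to a whole number of units before comparing. The sketch below, assuming GNU find and GNU truncate, illustrates the effect on a scratch directory with made-up file names.

```shell
scratch=$(mktemp -d)
truncate -s 100M      "$scratch/exactly100M"    # 104857600 bytes
truncate -s 104857601 "$scratch/justover100M"   # one byte more
truncate -s 1         "$scratch/onebyte"

# Lists only justover100M: exactly100M counts as 100 units of 1 MiB,
# and 100 is not greater than 100
find "$scratch" -type f -size +100M

# Lists nothing: the 1-byte file is rounded up to 1 unit, and -1M means
# "fewer than 1 unit", so only empty files could match
find "$scratch" -type f -size -1M

# Lists onebyte: the c suffix compares exact byte counts
find "$scratch" -type f -size -1048576c
```

When a byte-exact threshold matters, the c suffix is the safer choice.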
III. Find large folders
In addition to the find command, there is also the du command, which can be used to find the largest folders on a machine. The name of this command stands for “disk usage”: it should be interesting in relation to our need of the day!
du -a | sort -nr | head -n 5
Some explanations about this command:
- du -a: -a (for --all) lists files as well as folders in the output
- sort -nr: sort by looking at the numerical values (size, n) and reverse the result (r).
- head -n 5: display only the first five results, i.e., the Top 5 largest entries
The result includes both first-level folders and sub-folders.
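Without a path, du starts from the current directory. As a sketch, assuming GNU du, the same pipeline can be wrapped in a helper that takes the directory as a parameter. The name top_entries and the demo files are made up for this example.

```shell
# Hypothetical helper: the N largest entries (files and folders) under a directory.
#   $1 = directory (default: current directory), $2 = number of results (default: 5)
top_entries() {
    # -a lists files as well as folders, -x stays on one filesystem;
    # GNU du prints sizes in 1 KiB blocks by default, hence the numeric sort
    du -a -x "${1:-.}" 2>/dev/null | sort -nr | head -n "${2:-5}"
}

# Demo on a scratch directory. Real data is written this time, because du
# counts allocated blocks and would report almost nothing for sparse files
scratch=$(mktemp -d)
mkdir "$scratch/logs"
head -c 512K /dev/urandom > "$scratch/logs/app.log"
head -c 64K  /dev/urandom > "$scratch/notes.txt"
top_entries "$scratch" 3
```

The first line is normally the starting directory itself, since its total includes everything below it.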
To perform an analysis only on the first-level folders (/var, /etc, /home, /tmp, etc.), with sizes that are easy for a human to read (expressed in GB or MB), we will use this syntax, where -s prints a single total per folder and -h prints human-readable sizes:
du -hs /* | sort -rh | head -5
At a glance, we can identify the largest root directories.
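The /* pattern does not match hidden entries, and du will also descend into other mounted filesystems. A possible variant, assuming GNU du, relies on --max-depth=1 and -x instead. The name first_level and the demo tree are made up for this example.

```shell
# Hypothetical helper: size of each first-level folder under a directory.
#   $1 = directory (default: /), $2 = number of results (default: 5)
first_level() {
    # --max-depth=1 reports the starting directory and its direct sub-folders only,
    # -x stays on one filesystem, -h prints human-readable sizes (sorted with sort -h)
    du -h -x --max-depth=1 "${1:-/}" 2>/dev/null | sort -rh | head -n "${2:-5}"
}

# Demo on a scratch tree
scratch=$(mktemp -d)
mkdir -p "$scratch/var/cache" "$scratch/home"
head -c 300K /dev/urandom > "$scratch/var/cache/pkg.bin"
head -c 100K /dev/urandom > "$scratch/home/doc.bin"
first_level "$scratch"
```

Compared with du -hs /*, the output contains one extra line: the grand total for the starting directory itself.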
However, this can lack precision: it would be useful to identify which subdirectories are the largest. In theory, the “/home/flo” directory containing the files I intentionally created should come out. In this case, we need to adapt the command slightly, using -S so that each directory is counted without the size of its subdirectories:
du -Sh | sort -rh | head -5
The result is satisfactory: the “/home/flo” directory comes out as expected.
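To see what -S changes, here is a small comparison on a scratch tree. It is a sketch assuming GNU du, and the directory and file names are made up for this example.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/parent/child"
head -c 100K /dev/urandom > "$scratch/parent/own.bin"
head -c 900K /dev/urandom > "$scratch/parent/child/nested.bin"

# Default: "parent" includes everything below it, so it ranks above "child"
du -h "$scratch" | sort -rh

# With -S (--separate-dirs): each directory counts only the files directly
# inside it, so "child" now ranks above "parent"
du -Sh "$scratch" | sort -rh
```

This is what makes -S handy for locating the directory that actually holds the large files, rather than one of its ancestors.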
Once we have identified the largest directories, we can carry out a targeted search for large files in these directories using the find command mentioned previously.
IV. Conclusion
Being able to identify large files and folders on a Linux machine is a simple administrative task, but one you have to know how to do! Using the find and du commands, you can perform this search easily. They are available on different Linux distributions: Debian, Ubuntu, Rocky Linux, etc.
To be able to easily use these commands without having to memorize them completely, you can create your own Linux command aliases.
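As an illustration, here is what this could look like in “~/.bashrc”. This is only a sketch: the names bigdirs and bigfiles are made up for this example, and the second one is written as a shell function because an alias cannot take parameters in the middle of a command.

```shell
# Fixed command: an alias is enough
alias bigdirs='du -hs /* 2>/dev/null | sort -rh | head -5'

# Command with parameters: a shell function is more flexible
#   $1 = directory (default: current directory)
#   $2 = threshold in find's -size syntax (default: 100M)
bigfiles() {
    find "${1:-.}" -type f -size "+${2:-100M}" -printf '%s %p\n' 2>/dev/null |
        sort -nr | head -10
}
```

After reloading the shell configuration (source ~/.bashrc), running bigdirs gives the Top 5 of the root directories, and bigfiles /var/log 50M lists the largest files over 50M under “/var/log”.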