
Reclaiming space on Linux Machines

Chris Gillatt
Cloud/Systems Engineer with an addiction to coffee, tech and writing.

If you’ve ever managed a Linux server, whether it’s a virtual machine, a physical server, or a router with Debian or Red Hat roots, you’ll know that reclaiming disk space is an ongoing battle. Over time, logs accumulate, redundant files pile up, and unused packages sit around taking up space. In this article, we’ll dive into some helpful tips and tricks to help you identify and reclaim space on your Linux server.

What’s Eating Your Space?

df and du are your friends. These two utilities are a good first port of call when investigating space use, but it’s important to keep in mind that they work quite differently:

  • df: Reads the superblock (filesystem metadata) to show total usage at a high level. Think of it as a broad overview (see the sample run below).
  • du: Goes through each directory and file individually to tally up usage, giving a more granular report.
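
For the broad view, df -h lists each mounted filesystem with its size, usage, and mount point. Here’s a sample run (the filesystem names and values are illustrative):

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        40G   37G  1.1G  98% /
tmpfs           2.0G     0  2.0G   0% /dev/shm

A Use% in the high nineties is your cue to start digging with du.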

To get started, here are a few useful commands:

Summary of Current Directory

du -smch *
  • -s: Summarises the disk usage for each specified argument.
  • -m: Shows the output in megabytes.
  • -c: Displays a grand total at the end.
  • -h: Outputs the sizes in human-readable format. Note that -m and -h both set the display unit; with GNU du the last one given takes effect, so -h wins here, which is why the sample output below mixes K, M, and G.

This command gives a readable breakdown of usage in the current directory. Here’s an example output of that command:

$ sudo du -smch *
4.9M	backups
1.5G	cache
161M	lib
4.0K	local
0	lock
3.9M	log
20K	mail
4.0K	opt
0	run
124K	spool
100M	swap
44K	tmp
1.7G	total

Now it should be straightforward to see where your space is going. cd into the directory using the most space and run the exact same command again, repeating until you find the problem. Then it’s just a case of removing any unwanted garbage.

Including Hidden Files

du -sch .* *
  • .* *: Includes both hidden (.*) and visible (*) files.
  • -s, -c, -h: Same as above.

Useful when dealing with hidden files that might be taking up significant space.
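
One thing to watch: the .* glob also matches the special entries . and .., so the parent directory can sneak into your totals. You can see this by letting the shell expand the pattern (the file names here are illustrative):

$ echo .* *
. .. .bashrc .cache backups cache lib log

The next command tidies this up.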

Sorted Output, Excluding the Special . and .. Entries

du -sch .[!.]* * | sort -h
  • .[!.]*: Matches hidden files excluding . and .. (common dotfiles).
  • | sort -h: Sorts the output in ascending order by human-readable size.

This one excludes the special . and .. entries while still showing other hidden files, sorted by size.

Using find and du Together to Track Down Large Files and Directories

Long-lived systems tend to accumulate “forgotten” files. For instance, large log files, old backups, or misconfigured apps might silently hoard space. Here’s how to find them:

Listing Large Files in /var

find /var -type f -size +10000k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
  • /var: Target directory to search in.
  • -type f: Finds only files (ignores directories).
  • -size +10000k: Lists files larger than 10,000 KiB, roughly 10 MB.
  • -exec ls -lh {} \;: Executes ls -lh on each matching file, showing detailed size in human-readable format.
  • | awk '{ print $9 ": " $5 }': Outputs the filename and file size.

This command searches for files over 10 MB in /var, a typical directory for log files and caches.
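
Note that the awk step assumes filenames without spaces, since it prints the ninth whitespace-separated field of ls -l output. If you have GNU find (standard on both Debian and RHEL families), a variant using its -printf action skips ls and awk entirely and copes better with awkward names; -size +10M is roughly equivalent to +10000k:

find /var -type f -size +10M -printf '%s bytes\t%p\n'

Here %s prints the file size in bytes and %p the path.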

Top 10 Largest Files

find /var -type f -print0 | xargs -0 du | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}
  • -print0: Outputs file paths separated by null characters (avoids issues with special characters).
  • xargs -0 du: Feeds the null-separated paths into du for detailed sizes.
  • sort -n: Sorts output numerically by file size.
  • tail -10: Retrieves the last 10 entries, representing the largest files.
  • cut -f2: Extracts just the filenames.
  • xargs -I{} du -sh {}: Re-runs du to show sizes in human-readable format.

Great for pinpointing the space hogs on your system. Adjust the directory path if needed.
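
If your sort supports the -h flag (any modern GNU coreutils does), a shorter route to much the same answer is to have du report every file and directory and sort the human-readable sizes directly:

du -ah /var 2>/dev/null | sort -rh | head -n 10

-a makes du list individual files as well as directories, sort -rh orders human-readable sizes largest first, and head -n 10 keeps the top ten.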

Top 10 Largest Directories (Handling “Permission Denied” Errors)

find /var -type d -print0 2>/dev/null | xargs -0 du 2>/dev/null | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {} 2>/dev/null
  • -type d: Finds directories only.
  • 2>/dev/null: Redirects permission-denied errors to /dev/null to avoid cluttering output.
  • | xargs -0 du: Feeds null-separated directories into du to calculate sizes.
  • sort -n | tail -10 | cut -f2: Same as the previous command, used for listing the largest directories.

This finds the biggest directories while bypassing common permission errors. Just replace /var with the directory you’re interested in.

Cleaning Up

Once you’ve identified the space hogs, it’s time to clean up. But be careful: deleting the wrong file can have serious consequences. Here’s a safe way to start:

Deleting Files Older Than 10 Days

sudo /usr/bin/find . -mtime +10 -type f -exec rm {} \;
  • . -mtime +10: Searches for files older than 10 days in the current directory.
  • -type f: Targets files only.
  • -exec rm {} \;: Executes rm to delete each matching file.

This will remove files older than 10 days in the current directory, perfect for clearing out old logs. Adjust +10 to the age limit you’re comfortable with; a dry-run variant is sketched below.
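
Before pointing rm at anything, preview exactly what will match by swapping -exec rm {} \; for -print; this is a harmless dry run of the same search:

sudo /usr/bin/find . -mtime +10 -type f -print

Once you’re happy with the list, rerun it with the rm form above. GNU find also has a built-in -delete action you can use instead of -exec rm, though the preview step is worthwhile either way.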

Additional Tools for Disk Space Management on Debian and RHEL

Both RHEL-based systems (e.g., CentOS, Rocky) and Debian-based systems (e.g., Ubuntu, Debian itself) have their own methods for package management and maintenance.

RHEL-based Systems (Red Hat Enterprise Linux / CentOS / Rocky Linux)

yum clean all

This clears out cached packages and headers, freeing up space used by old packages.
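
On current RHEL-family releases, where dnf has replaced yum as the package manager, the equivalent is:

dnf clean all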

Logrotate
By default, RHEL systems use logrotate to manage log sizes. Ensure it’s configured to archive or delete logs appropriately. Configurations can be checked or updated in /etc/logrotate.conf.
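
As a minimal sketch, a drop-in file such as /etc/logrotate.d/myapp (the path and application name here are hypothetical) might look like this:

/var/log/myapp/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}

This rotates the matching logs weekly, keeps four compressed generations, and quietly skips logs that are missing or empty.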

Debian-based Systems (Debian / Raspberry Pi OS / loads more)

apt clean

This removes local copies of retrieved packages, freeing up space.
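
It’s also worth removing packages that were pulled in as dependencies and are no longer needed by anything:

sudo apt autoremove

Review the list apt presents before confirming, as it will occasionally propose something you still want.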

journalctl --vacuum-size=100M

This limits systemd’s journal logs to a maximum of 100 MB, useful anywhere journal logs grow quickly (journald ships with systemd, so this isn’t Debian-specific).
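
If you’d rather cap the journal by age instead of size, journald supports that too:

journalctl --vacuum-time=2weeks

This removes archived journal entries older than two weeks.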


Keeping Your System Lean and Healthy with Automation

Setting up a cron job for periodic clean-up is an effective way to keep your system from running out of space unexpectedly. Here’s how you can set up a cron job to automatically delete files older than 10 days in a specified directory.

Example Cron Job to Delete Old Files

  1. Edit the cron file: Open the crontab editor by running:
    crontab -e
    

Important: don’t forget the -e! Running crontab with a filename (or piped input) instead replaces your entire crontab, potentially losing any existing crons in there!

  2. Add the cron job: Insert the following line to delete files older than 10 days from /path/to/directory every week:

    0 2 * * 0 find /path/to/directory -type f -mtime +10 -exec rm {} \;
    
    • 0 2 * * 0: Runs the job every Sunday at 2:00 AM.
    • find /path/to/directory -type f -mtime +10 -exec rm {} \;: Deletes files older than 10 days from the specified directory.
  3. Save and exit: After adding the line, save and close the editor.
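
If you’d like a record of each run, a variant that appends the job’s output to a log file looks like this (the log path is illustrative, and the crontab’s owner needs write access to it):

0 2 * * 0 find /path/to/directory -type f -mtime +10 -exec rm {} \; >> /var/log/cleanup.log 2>&1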

Testing the Cron Job

To test the job without waiting for the schedule, you can run the command directly in the terminal:

find /path/to/directory -type f -mtime +10 -exec rm {} \;

Checking Cron Logs

To confirm the cron job is running, you can check the cron logs:

grep CRON /var/log/syslog

On some systems, the cron logs may be located in /var/log/cron.log.
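
On systemd machines without a traditional syslog, the journal holds the same information. The service name varies by family (cron on Debian-based systems, crond on RHEL-based ones):

journalctl -u cron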

Beyond cron, automated monitoring tools like Nagios or Prometheus can alert you to spikes in disk usage before they become critical.



Troubleshooting Disk Space Issues

df Shows 100% Usage, but you can’t find the files!

If you’ve used df to check your disk usage and it reports 100% usage, but you can’t locate any large files consuming the space, the issue could be with inodes rather than the actual disk space.

What Is an Inode?

An inode (index node) is a data structure in Linux that stores information about a file, like its owner, permissions, and location on the disk. Every file and directory on a filesystem has an associated inode. While the disk might still have free space, if your filesystem has used up all its inodes, you won’t be able to create new files. Essentially, a mass of small files can exhaust the inode table and block new files from being written, even when they don’t come close to filling the volume’s free space.

To check inode usage, you can use the df command with the -i flag:

df -i
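
Here’s what that can look like on a machine that has run out of inodes (the numbers are illustrative):

$ df -i
Filesystem      Inodes   IUsed  IFree IUse% Mounted on
/dev/sda1      2621440 2621440      0  100% /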

If you see 100% inode usage on a particular filesystem, this is likely the cause of your “full disk” issue. From there, track down where the heavy inode use lives (a rough sketch for doing so follows below) and resume your clear-down using the methods shown earlier. I’ve seen this issue catch out even the most senior of engineers, and it’s easy to forget about.
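
As a rough sketch for finding the culprit, you can count filesystem entries under each top-level directory of the affected mount (swap /var for whichever filesystem df -i flagged):

for d in /var/*/; do echo "$(find "$d" 2>/dev/null | wc -l) $d"; done | sort -n | tail -5

The directories with the highest counts are your likely inode hogs; in practice they’re often session stores, cache directories, or mail queues full of tiny files.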