As a full-stack developer working extensively on Linux, I find that complete visibility into open files and file handles is critical for building performant and stable applications. The Linux kernel exposes nearly everything as a file – devices, sockets, pipes – and these resources need to be monitored carefully. This guide dives into Linux tools, techniques and best practices, from a developer's lens, for checking open files and optimizing resource usage.
File Descriptors in Linux
At the heart of open files in Linux is the concept of file descriptors. A file descriptor is simply an integer associated with an open file or other input/output resource; the kernel uses it to identify and operate on that resource on behalf of the process.
Some key points about file descriptors:
- Allocated by the kernel when a file is opened or a new resource is created
- Needed for processes to read, write and manipulate files
- Subject to a per-process limit on the maximum number of open descriptors
- Closed automatically when process exits or by explicit close calls
- Can leak when not properly closed, leading to resource exhaustion
For developers working with sockets, databases, RPC calls and more, it is crucial to monitor whether file handles are being closed correctly.
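To make the descriptor concept concrete, here is a minimal Python sketch (the /tmp path is just a placeholder) showing that every open file boils down to a small integer the kernel hands out:

import os

# High-level file objects wrap an integer descriptor
f = open("/tmp/example.txt", "w")
print(f.fileno())   # e.g. 3 - the underlying file descriptor
f.close()           # releases the descriptor back to the kernel

# The os module works with raw descriptors directly
fd = os.open("/tmp/example.txt", os.O_RDONLY)
print(fd)           # another small integer
os.close(fd)        # forgetting this call is how descriptors leak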
Now let's explore the primary tools and techniques available to manage file descriptors.
lsof – List Open Files
The lsof (list open files) command is the go-to tool for any analysis related to open files. It's available on virtually all Linux and UNIX-like systems.
Some quick examples on how I routinely use lsof for development tasks:
# Which process has this log file open currently
lsof /var/log/myapp.log
# List files opened by the process with PID 6789
lsof -p 6789
# See open Internet and UNIX domain sockets
lsof -i -U
# Continuously monitor file activity in intervals
lsof -r 5
In my experience, creatively combining various lsof arguments is key to debugging issues around file descriptors – most commonly "Too many open files" errors.
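To see that failure mode first hand, here is a throwaway Python sketch that deliberately leaks descriptors until the kernel refuses to hand out more – it will climb up to your soft limit, so only run it in a disposable shell:

leaked = []
try:
    while True:
        # Keep every descriptor alive so none are released
        leaked.append(open("/dev/null"))
except OSError as e:
    # Typically errno 24: "Too many open files" (EMFILE)
    print(f"Failed after {len(leaked)} open files: {e}")
finally:
    for f in leaked:
        f.close()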
Next, let's explore some of the most useful lsof arguments:
Recurring Monitoring
The -r switch is invaluable when you need to sample the set of open files repeatedly. Recurring output can reveal transient spikes that a one-off snapshot misses.
lsof -r 5
Here the value 5 tells lsof to repeat the listing every 5 seconds.
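The same recurring sampling is easy to script when you want to log or alert on it. A small sketch, assuming lsof is on the PATH and using the example PID from earlier:

import subprocess
import time

PID = 6789        # the process under observation (example PID from above)
INTERVAL = 5      # seconds between samples, mirroring lsof -r 5

while True:
    result = subprocess.run(
        ["lsof", "-p", str(PID)], capture_output=True, text=True
    )
    # Subtract the header line to get the number of open descriptors
    count = max(len(result.stdout.splitlines()) - 1, 0)
    print(f"{time.strftime('%H:%M:%S')}  open files: {count}")
    time.sleep(INTERVAL)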
File Type Filtering
Linux exposes everything as files. We can filter to only the types of open resources we care about using:
# Only network sockets
lsof -i
# Files deleted on disk but still held open (link count below 1)
lsof +L1
# Named pipes (FIFOs), filtered on the TYPE column of the default output
lsof | awk '$5 == "FIFO"'
This allows focusing only on resource handles relevant to the problem being diagnosed.
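Because sockets, pipes and regular files all show up as descriptors, I sometimes classify what a process actually holds straight from /proc. A rough Python sketch (Linux only; the fd_types helper is just illustrative):

import os
from collections import Counter

def fd_types(pid="self"):
    """Classify each descriptor by what its /proc symlink points at."""
    counts = Counter()
    for fd in os.listdir(f"/proc/{pid}/fd"):
        try:
            target = os.readlink(f"/proc/{pid}/fd/{fd}")
        except OSError:
            continue  # descriptor closed between listdir and readlink
        if target.startswith("socket:"):
            counts["socket"] += 1
        elif target.startswith("pipe:"):
            counts["pipe"] += 1
        else:
            counts["file"] += 1
    return counts

print(fd_types())   # e.g. Counter({'file': 3, 'socket': 1})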
Command and User Filtering
lsof can also narrow its output to processes run by a particular command or user, which often highlights the faulty process quickly:
# Open files for processes whose command name begins with nginx
lsof -c nginx
# Open files belonging to the postgres user
lsof -u postgres
This restricts the output to the matching processes only.
In summary, lsof and its rich set of arguments make it an indispensable tool for me, offering insights into file activity that utilities like ps, top or netstat do not provide.
Now let's look at another useful option – field output with -F:
Customizing lsof Output
The -F option switches lsof to field output: each selected field is printed on its own line, prefixed by a single identifying character. This format is designed for parsing by other programs, which makes it easy to reshape the data into whatever view you need.
For example, displaying just the core details:
# Core fields useful for debugging
lsof -Fpcfn
p - process ID
c - command
f - file descriptor
n - file name
Another view I use – all open files under /var, organized by process ID:
# Recursively list open files under /var, then sort numerically on the PID column
lsof +D /var | sort -k 2 -n
Lists files open anywhere under /var
Sorts the output by PID (column 2 of the default layout)
Shows the command, PID, FD and path columns
This sorting groups together all the files in use by each process, making correlation easy.
In essence, -F output can be reshaped into views that match our mental models – making it easier to spot patterns.
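Since each field arrives on its own line prefixed by its identifier, -F is really meant to be consumed by scripts. A minimal sketch of parsing it from Python, assuming lsof is installed and on the PATH:

import subprocess
from collections import defaultdict

# p = PID, c = command, n = file name; each field arrives on its own line
out = subprocess.run(["lsof", "-Fpcn"], capture_output=True, text=True).stdout

files_by_pid = defaultdict(list)
pid = cmd = None
for line in out.splitlines():
    if not line:
        continue
    tag, value = line[0], line[1:]
    if tag == "p":
        pid = value
    elif tag == "c":
        cmd = value
    elif tag == "n":
        files_by_pid[(pid, cmd)].append(value)

for (pid, cmd), names in list(files_by_pid.items())[:5]:
    print(pid, cmd, len(names), "open files")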
Now that we've covered listing open files in detail, let's discuss limiting maximum file handles per process.
Tuning Maximum File Descriptors
Linux allows tuning the maximum number of file handles available per process via the ulimit command.
Viewing the current limits for a shell:
# Soft limit – typically defaults to 1024
ulimit -Sn
# Hard limit – the ceiling the soft limit can be raised to
ulimit -Hn
Temporarily raising the soft limit for the current shell (it cannot exceed the hard limit):
ulimit -Sn 10000
Making it persistent via /etc/security/limits.conf:
# For user john
john soft nofile 10000
# For all users
* soft nofile 10000
* hard nofile 20000
Why tune the max open files limit? Services often fail once they hit the ulimit, unable to open new sockets or files. Temporarily increasing the soft limit is helpful when debugging "too many open files" errors.
Additionally, for long-running production services, setting higher hard limits leaves room for future growth and prevents availability issues.
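The same limits can also be inspected and raised from inside an application. A minimal Python sketch using the standard resource module (the 10000 target is only an example and cannot exceed the hard limit without privileges):

import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft} hard={hard}")   # e.g. soft=1024 hard=4096

# Raise the soft limit up to (at most) the existing hard limit
target = min(10000, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))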
Now that we have covered listing and managing file descriptors, let's go over some best practices around file handling.
Programming Best Practices for File Handles
When writing applications that deal with files and sockets, keeping a few best practices in mind will prevent descriptor leaks:
Use Resource Managers
Languages like Python provide context managers that auto-close sockets and files:
import requests  # needed for the HTTP example below

with open("/tmp/test.txt") as f:
    data = f.read()
# No need to explicitly close the file

with requests.get(url) as resp:   # Response objects are context managers too
    print(resp.status_code)       # Connection released automatically on exit
Leverage built-in context managers to handle closure cleanly and avoid leaks.
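When the number of handles is not known up front, the standard library's contextlib.ExitStack gives the same guarantee for a whole batch; a small sketch with made-up paths:

from contextlib import ExitStack

paths = ["/tmp/a.log", "/tmp/b.log", "/tmp/c.log"]   # example paths

with ExitStack() as stack:
    files = [stack.enter_context(open(p)) for p in paths]
    # ... work with all the files ...
# Every file is closed here, even if one open() or read fails midway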
Signal Handlers
Signals such as SIGTERM or SIGINT can terminate a process before cleanup runs (SIGKILL cannot be caught at all). Implement handlers for the catchable signals to manage file closures:
import signal
import sys

def handler(signum, frame):
    print(f"Signal {signum} received!")
    # Close any open handles here before exiting
    sys.exit(0)

signal.signal(signal.SIGTERM, handler)
This ensures that even unexpected exits still close their FDs, avoiding descriptor exhaustion over time.
Connection Pooling
Managing individual sockets and files becomes inefficient at scale. Pooling, using libraries like Apache Commons DBCP, allows descriptors to be reused:
BasicDataSource pool = new BasicDataSource();
pool.setUrl("jdbc:postgresql://localhost/mydb");  // example JDBC URL
pool.setMaxTotal(100);  // cap on pooled connections (and their descriptors)

// Borrow from the pool instead of opening a new connection per request
try (Connection con = pool.getConnection()) {
    // run queries; the connection returns to the pool when closed
}
Pooling caps the number of descriptors in use while reusing them, lowering resource overhead.
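The same pattern works in any language. Here is a deliberately simplified Python illustration of pooling behind a bounded queue – connect_fn stands in for whatever actually opens the socket or database connection:

import queue

class SimplePool:
    """Toy connection pool: a fixed set of reusable handles behind a queue."""

    def __init__(self, connect_fn, size=10):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(connect_fn())   # descriptors opened once, up front

    def acquire(self):
        return self._pool.get()            # blocks instead of opening a new handle

    def release(self, conn):
        self._pool.put(conn)               # hand the descriptor back for reuse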
To summarize, engineering applications judiciously around file descriptors and monitoring for leaks is vital for Linux infrastructure stability.
Now that we have covered various facets of files and processes – it's also helpful to peek under the hood in procfs.
Exploring /proc for File Stats
The /proc virtual filesystem on Linux exposes insightful metrics on system resource usage – including file and process statistics.
As a developer, I routinely poke around /proc to correlate metrics with application behavior.
Some particularly useful file and descriptor related endpoints:
/proc/sys/fs/file-nr
Gives a holistic view of current file handles allocated on the system:
# cat /proc/sys/fs/file-nr
135227 0 839459
135227 – file handles currently allocated
0 – allocated but currently unused handles
839459 – system-wide maximum (the fs.file-max setting)
This summary makes it easy to gauge overall system health with respect to file handle usage.
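These counters are easy to consume from code as well; a short Python sketch that reads them:

# Read the three file-nr counters directly
with open("/proc/sys/fs/file-nr") as f:
    allocated, unused, maximum = map(int, f.read().split())

print(f"allocated={allocated} unused={unused} max={maximum}")
print(f"headroom: {maximum - allocated} handles")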
/proc/<PID>/limits
The limits file exposes per process handle restrictions:
# grep "Max open files" /proc/31201/limits
Max open files            10000                10000                files
The two values are the soft and hard limits respectively. Troubleshooting processes hitting the max open files limit becomes much easier with this view.
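Checking this for an arbitrary PID is worth scripting too; a quick Python sketch using the example PID from above:

PID = 31201   # example PID from above

with open(f"/proc/{PID}/limits") as f:
    for line in f:
        if line.startswith("Max open files"):
            # Columns: limit name, soft limit, hard limit, units
            print(line.rstrip())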
/proc/<PID>/fd
This special directory provides the list of file descriptors allocated per process:
# Directly list file descriptors per PID
ls -l /proc/31201/fd
total 0
lrwx------ 1 root root 64 Jan 20 07:23 0 -> /dev/null
lrwx------ 1 root root 64 Jan 20 07:23 1 -> /dev/null
lrwx------ 1 root root 64 Jan 20 07:23 10 -> /tmp/foo
lrwx------ 1 root root 64 Jan 20 07:23 11 -> /tmp/bar
Debugging leaky or exhausted file handles becomes much simpler using /proc.
Combine with lsof for targeted profiling of troublesome processes.
Conclusion
Managing file descriptors and open files is critical for building and monitoring production Linux environments. Language and framework abstractions make it easy to lose track of stale handles, which hurts reliability and performance.
In summary here are my top strategies as a Linux developer:
- Instrument applications upfront with trace logging and metrics on files/sockets
- Create visibility into file utilization early via dashboards and alarms
- Master usage of lsof + procfs for debugging files and processes
- Set appropriate soft and hard limits on open files with ulimit
- Build apps defensively leveraging automatic closures and pooling
- Keep OS and applications updated to prevent descriptor leaks
Getting a rigorous handle on file usage helps applications withstand load spikes and prevents stability headaches!
Let me know if you have any other best practices around efficient file handling in Linux.