As a full-stack developer working extensively on Linux, I need complete visibility into open files and file handles to build performant and stable applications. The Linux kernel exposes almost everything as a file, including devices, sockets and pipes, so these handles need to be monitored carefully. This guide dives into Linux tools, techniques and best practices, from a developer's lens, for checking open files and optimizing resource usage.

File Descriptors in Linux

At the heart of open files in Linux is the concept of file descriptors. A file descriptor is simply a non-negative integer associated with an open file or other input/output resource. Processes pass it to the kernel in system calls such as read() and write() to refer to that resource.

Some key points about file descriptors:

  • Allocated by the kernel when a file is opened or a new resource is created
  • Needed for processes to read, write and manipulate files
  • Subject to a per-process limit on the maximum number of open handles
  • Closed automatically when process exits or by explicit close calls
  • Can leak when not properly closed, leading to resource exhaustion

As developers working with sockets, databases, RPC calls and more, it becomes crucial to monitor whether file handles are being closed correctly.
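To make this concrete, here is a minimal Python sketch (Linux-only; /tmp/example.txt is just a placeholder path) showing that a descriptor is nothing more than a small integer, and that /proc/self/fd lists every handle the process currently holds:

import os

# A descriptor is just a small integer handed out by the kernel.
fd = os.open("/tmp/example.txt", os.O_WRONLY | os.O_CREAT)
print("kernel returned descriptor:", fd)              # typically 3

# /proc/self/fd lists every handle this process currently holds.
print("open descriptors:", os.listdir("/proc/self/fd"))

# Forgetting this close() call is exactly what a descriptor leak looks like.
os.close(fd)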

Now let's explore the primary tools and techniques available to manage file descriptors.

lsof – List Open Files

The lsof (list open files) command is the go-to tool for any analysis related to open files. It's available on pretty much all Linux and UNIX-like systems.

Some quick examples of how I routinely use lsof for development tasks:

# Which process has this log file open currently
lsof /var/log/myapp.log

# List files opened by the process with PID 6789 
lsof -p 6789

# See open network and Unix domain sockets
lsof -i -U

# Continuously monitor file activity in intervals
lsof -r 5

In my experience, creatively combining various lsof arguments is key to debugging issues around file descriptors – most commonly "Too many open files" errors.

Next, let's explore some particularly useful lsof arguments:

Recurring Monitoring

The -r switch is invaluable when you need to repeat the listing at a fixed interval and sample open files continuously. This can reveal transient spikes that a one-off snapshot would miss.

lsof -r 5

Here the value 5 denotes repeating every 5 seconds.

File Type Filtering

Because Linux exposes so much as files, lsof output can be noisy. We can narrow it to only the types of open resources we care about:

# Only network sockets
lsof -i

# Only Unix domain sockets
lsof -U

# Files deleted on disk but still held open (link count below 1)
lsof +L1

This allows focusing only on resource handles relevant to the problem being diagnosed.

Command Filtering

lsof cannot filter processes by Linux capability sets (such as CAP_NET_ADMIN or CAP_DAC_OVERRIDE) directly, but the -c option restricts output to processes whose command name begins with a given string, which often highlights the faulty service just as quickly:

# Open files for nginx processes
lsof -c nginx

# Open files for postgres processes
lsof -c postgres

This filters output to the matching processes only.

In summary, lsof combined with its filtering arguments is an indispensable tool for me. It offers insights into file activity not provided by utilities like ps, top or netstat.

Now let's look at another useful option: machine-readable field output with -F.

Customizing lsof Output

The -F option makes lsof emit selected fields one per line, each prefixed with an identifying character, in a format designed for post-processing by scripts.

For example, displaying just the core details:

# Useful fields for debugging
lsof -Fpcfn

p - process ID
c - command
f - file descriptor
n - file name

Another view I use, scoped to a single directory tree:

lsof -Fpcfn +D /var

Lists files open anywhere under /var
Emits one field per line, grouped per process
Shows the PID, command, FD and path fields

This per-process grouping makes it easy to correlate all files in use by each process.

In essence, -F produces output that scripts can reshape into views matching our mental models, making it easier to spot patterns.
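As an illustration of that kind of post-processing, here is a minimal Python sketch (the field list and the ten-process cutoff are arbitrary choices) that parses lsof -F output and ranks processes by how many files they hold open:

import subprocess
from collections import Counter

# Run lsof in field-output mode: each line is one field,
# prefixed with p (PID), c (command), f (descriptor) or n (name).
out = subprocess.run(["lsof", "-F", "pcfn"],
                     capture_output=True, text=True).stdout

counts = Counter()
pid, cmd = None, None
for line in out.splitlines():
    if not line:
        continue
    tag, value = line[0], line[1:]
    if tag == "p":          # a new process set begins
        pid = value
    elif tag == "c":
        cmd = value
    elif tag == "f":        # one f field per open file
        counts[(pid, cmd)] += 1

for (pid, cmd), n in counts.most_common(10):
    print(f"{n:6d}  {cmd} (pid {pid})")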

Now that we've covered listing open files in detail, let's discuss limiting maximum file handles per process.

Tuning Maximum File Descriptors

Linux allows tuning the maximum number of file handles available per process via the ulimit shell built-in.

Viewing the current limits for a shell:

# Soft limit (typical default is 1024)
ulimit -Sn

# Hard limit (the ceiling the soft limit can be raised to)
ulimit -Hn

Temporarily raising the soft limit (it can only be raised as far as the hard limit without root privileges):

ulimit -Sn 10000

Making it persistent via /etc/security/limits.conf (applied at the next login):

# For user john  
john soft nofile 10000

# For all users   
* soft nofile 10000 
* hard nofile 20000

Why tune the max open files limit? Services often fail once they hit the limit, unable to open new sockets or files. Temporarily increasing the soft limit is helpful while debugging "too many open files" errors.

Additionally, for long-running production services, setting higher hard limits leaves room for future growth and prevents availability issues.
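The same adjustment can also be made programmatically at process startup. Here is a minimal sketch using Python's standard resource module; the 10000 target is an arbitrary figure, and the code assumes the hard limit is a finite number (typically the case for open files on Linux):

import resource

# Inspect the current per-process limits (equivalent to ulimit -Sn / -Hn).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft} hard={hard}")

# Raise the soft limit towards an arbitrary target, never past the hard limit.
target = min(10000, hard)
if soft < target:
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))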

Now that we have covered listing and managing file descriptors, let's go over some best practices around file handling.

Programming Best Practices for File Handles

While coding apps that deal with files and sockets, keeping some best practices in mind will prevent descriptor leaks:

Use Resource Managers

Languages like Python provide context managers that auto-close sockets and files:

with open("/tmp/test.txt") as f:
   data = f.read() 
   # No need to explicitly close 

requests.get(url) as resp:
   print(resp.status_code) # Closed automatically

Leverage built-in context managers to handle closure cleanly and avoid leaks.
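When the number of resources is not known up front, the standard library's contextlib.ExitStack gives the same guarantee. A minimal sketch, with purely hypothetical log paths:

from contextlib import ExitStack

paths = ["/tmp/a.log", "/tmp/b.log", "/tmp/c.log"]  # hypothetical paths

# Every file opened through the stack is closed when the block exits,
# even if an exception is raised part-way through the loop.
with ExitStack() as stack:
    files = [stack.enter_context(open(p, "a")) for p in paths]
    for f in files:
        f.write("rotated\n")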

Signal Handlers

Signals like SIGTERM or SIGINT can terminate a process before cleanup runs (SIGKILL cannot be caught at all). Implement signal handlers to manage file closures on the signals you can catch:

import signal
import sys

def handler(signum, frame):
    print(f"Signal {signum} received!")
    # Close open handles here before exiting
    sys.exit(0)

signal.signal(signal.SIGTERM, handler)

This ensures that even unexpected exits still close their FDs, avoiding descriptor exhaustion over time.

Connection Pooling

Managing individual sockets and files becomes inefficient at scale. Pooling with libraries like Apache Commons DBCP allows descriptors to be reused:

// org.apache.commons.dbcp2.BasicDataSource
BasicDataSource pool = new BasicDataSource();
pool.setUrl("jdbc:postgresql://localhost/mydb");  // example JDBC URL
pool.setMaxTotal(100);                            // cap concurrent connections

try (Connection con = pool.getConnection()) {
    // Reuse a pooled connection instead of opening one per request
}

Pooling caps the number of descriptors in use, while reuse lowers per-request overhead.
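The same idea can be sketched in Python using a bounded queue as the pool; sqlite3 and the /tmp/app.db path are just stand-ins for a real database client:

import queue
import sqlite3

POOL_SIZE = 5  # hard cap on connections, and therefore on their descriptors
pool = queue.Queue(maxsize=POOL_SIZE)
for _ in range(POOL_SIZE):
    pool.put(sqlite3.connect("/tmp/app.db", check_same_thread=False))

def run_query(sql):
    con = pool.get()        # block until a pooled connection is free
    try:
        return con.execute(sql).fetchall()
    finally:
        pool.put(con)       # return it for reuse instead of closing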

To summarize, engineering applications judiciously around file descriptors and monitoring for leaks is vital for Linux infrastructure stability.

Now that we have covered various facets of files and processes, it's also helpful to peek under the hood in procfs.

Exploring /proc for File Stats

The /proc virtual filesystem on Linux exposes insightful metrics on system resource usage, including file and process statistics.

As a developer, I routinely poke around /proc to correlate metrics with application behavior.

Some particularly useful file and descriptor related endpoints:

/proc/sys/fs/file-nr

Gives a holistic view of current file handles allocated on the system:

# cat /proc/sys/fs/file-nr
135227  0       839459

Allocated file handles
Allocated but currently unused (free) handles
System-wide maximum (fs.file-max)

This summary gives a quick read on overall system health in terms of file handle usage.
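As a quick illustration, here is a short Python sketch that reads this file and reports system-wide utilization:

# Compute system-wide file handle utilization from /proc/sys/fs/file-nr.
allocated, free, maximum = map(int, open("/proc/sys/fs/file-nr").read().split())
in_use = allocated - free
print(f"in use: {in_use} / {maximum} ({100 * in_use / maximum:.1f}%)")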

/proc/<PID>/limits

The limits file exposes per-process resource limits, including the cap on open files:

# /proc/31201/limits 
Max open files            10000                10000                files

This makes it easier to troubleshoot processes that are hitting their max open files limit.

/proc/<PID>/fd

This special directory lists every file descriptor the process currently holds, as symlinks to the underlying files:

# Directly list file descriptors per PID
ls -l /proc/31201/fd

total 0
lrwx------ 1 root root 64 Jan 20 07:23 0 -> /dev/null
lrwx------ 1 root root 64 Jan 20 07:23 1 -> /dev/null
lrwx------ 1 root root 64 Jan 20 07:23 10 -> /tmp/foo
lrwx------ 1 root root 64 Jan 20 07:23 11 -> /tmp/bar  

Debugging and correlating leaky or exhausted file handles becomes simpler using /proc.
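Putting this to work, here is a minimal Python sketch that walks /proc and ranks processes by open descriptor count; it assumes enough privilege (typically root) to read other users' fd directories:

import os

counts = []
for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        n = len(os.listdir(f"/proc/{pid}/fd"))    # one entry per open descriptor
        with open(f"/proc/{pid}/comm") as f:
            comm = f.read().strip()
        counts.append((n, pid, comm))
    except (PermissionError, FileNotFoundError):
        continue                                  # process exited or not readable

# Top 10 descriptor consumers
for n, pid, comm in sorted(counts, reverse=True)[:10]:
    print(f"{n:6d}  {comm} (pid {pid})")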

Combine with lsof for targeted profiling of troublesome processes.

Conclusion

Managing file descriptors and open files is critical for building and monitoring production Linux environments. Languages and tools make it easy to lose track of stale handles, which hurts reliability and performance.

In summary here are my top strategies as a Linux developer:

  • Instrument applications upfront with trace logging and metrics on files/sockets
  • Create visibility into file utilization early via dashboards and alarms
  • Master lsof and procfs for debugging files and processes
  • Set appropriate soft and hard limits on open files with ulimit
  • Build apps defensively leveraging automatic closures and pooling
  • Keep OS and applications updated to prevent descriptor leaks

Getting a rigorous handle on file usage ensures apps withstand load spikes and prevents stability headaches!

Let me know if you have any other best practices around efficient file handling in Linux.
