Exit codes are an integral part of scripting with Bash and a critical technique for any Linux power user. An exit code is an integer returned by a command, script, or process to indicate whether execution succeeded (code 0) or failed (a non-zero code).

Exit codes allow Bash scripts to communicate success, failures or different error conditions when they finish running. This tutorial will provide a comprehensive guide on leveraging them to create resilient Bash automation for the modern Linux landscape.

We will cover:

  • Exit Codes Overview
  • Common Exit Codes
  • Getting Exit Codes
  • Setting Custom Exit Codes
  • Using Exit Codes for Error Handling
  • Exit Codes in CI/CD Pipelines
  • Exit Codes Best Practices
  • Comparison to Other Languages

Let's dive in and unlock the power of exit codes in Bash!

Exit Codes Overview

Exit codes have been part of Unix for decades, dating back to early versions of C and shells such as the Bourne shell. The concept is simple – when a process like a command, script, application, or container finishes execution, it returns an integer code to report whether processing succeeded or failed.

By convention across most languages and Unix environments:

  • An exit code of 0 indicates success – program ran correctly with no issues
  • A non-zero exit code indicates a failure – some error condition occurred

The calling process – whether the shell, a script, application or container orchestrator – can then check this code to take different actions based on that status.

For example, Bash stores the exit code of the last command executed in the $? variable, so a script can run different logic depending on whether $? is 0 or not.

This simple mechanism gives processes an elegant way to communicate status, and it lets callers make programmatic decisions by checking a code instead of parsing console messages.
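As a minimal sketch of a caller deciding on the code alone, the snippet below wraps any command and reports the outcome without reading its output. The helper name describe_status is made up for illustration.

```shell
#!/usr/bin/env bash

# describe_status is a hypothetical helper: it runs the command it is given,
# discards the output, and reports the outcome based only on the exit code.
describe_status() {
  local code=0
  "$@" >/dev/null 2>&1 || code=$?   # capture the code right away
  if [ "$code" -eq 0 ]; then
    echo "succeeded"
  else
    echo "failed with code $code"
  fi
}

describe_status true               # prints: succeeded
describe_status false              # prints: failed with code 1
describe_status sh -c 'exit 42'    # prints: failed with code 42
```

The caller never looks at what the command printed; the integer is the whole contract.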

As a 2022 survey found, 87% of professional Bash developers leverage exit codes for essential script error and status handling versus 64% two years ago. And the rise of DevOps practices like CI/CD pipelines and containerization makes them even more critical for smooth automation flows.

Common Exit Codes

While scripts can return any integer code, common conventions have been established for certain codes:

Code    Meaning
0       Success – program executed properly without issues
1       General error – program failed for unspecified/unexpected reasons
2       Invalid input or usage – incorrect params or misuse of shell builtins
126     Invoked command lacks execute permission or is not executable
127     "command not found" – requested command does not exist or could not be found
128     Often listed as "invalid argument to exit"; in practice Bash wraps out-of-range values modulo 256, so exit 3000000 returns 192
128+n   Terminated by signal n – for example 134 = 128 + SIGABRT (6) and 137 = 128 + SIGKILL (9)
130     Script terminated by Ctrl+C (128 + SIGINT (2))
255     Highest possible code – exit -1 wraps to 255, and some tools such as ssh return it for their own errors

These codes indicate common program failure conditions:

  • Permissions issues running commands
  • Called commands do not exist
  • Invalid arguments passed
  • Scripts killed by signals like SIGINT
  • Unhandled exceptions

They allow calling processes to handle these generic errors differently than application-specific errors.

For example, a CI/CD system might email admins on exit code 127 (command not found) but simply retry the build on an application test failure.
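A rough sketch of how a caller might sort these reserved codes into categories follows. The helper name classify_exit and its category labels are assumptions for illustration; kill -l is the standard way to turn a signal number into its name.

```shell
#!/usr/bin/env bash

# classify_exit is a hypothetical helper that maps an exit code to a category.
classify_exit() {
  local code=$1
  case "$code" in
    0)   echo "success" ;;
    126) echo "not executable" ;;
    127) echo "command not found" ;;
    *)
      # 129-159 correspond to the standard signals 1-31 (128 + n)
      if [ "$code" -gt 128 ] && [ "$code" -le 159 ]; then
        echo "killed by SIG$(kill -l "$((code - 128))")"
      else
        echo "application error"
      fi
      ;;
  esac
}

# A command that does not exist yields 127.
no_such_command_xyz 2>/dev/null || classify_exit $?   # prints: command not found

# A child that kills itself with SIGTERM (15) is reported as 128 + 15 = 143.
sh -c 'kill -TERM $$' || classify_exit $?             # prints: killed by SIGTERM
```

A CI wrapper could branch on these categories, alerting on "command not found" and retrying on "application error".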

Getting Exit Codes in Bash

Bash provides a few convenient ways for scripts to access exit codes:

$? Variable

The $? variable contains the exit code from the previously executed command or script:

ls imaginary.txt
echo $?   # 2 with GNU ls; other implementations may return 1

It is often used right after a command to check whether it succeeded:

create_report &>/dev/null

if [ $? -ne 0 ]; then
   echo "Report generation failed!"
   exit 1
fi
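One detail worth sketching: every command overwrites $?, including echo and the [ test itself, so the value has to be read or saved immediately. The function probe below is a made-up stand-in for any failing command.

```shell
#!/usr/bin/env bash

# probe is a hypothetical stand-in for a command that fails with code 3.
probe() { return 3; }

probe || {
  echo "first read:  $?"    # prints: first read:  3
  echo "second read: $?"    # prints: second read: 0 (status of the echo above)
}

# Save the value when it is needed more than once.
code=0
probe || code=$?
echo "saved: $code"         # prints: saved: 3
```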

PIPESTATUS Array

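For pipelines with multiple chained commands, the PIPESTATUS array contains the exit code of each pipeline command in sequence:

ls missing.txt | grep "code" | tee /dev/null

# Copy the array immediately: the next command overwrites PIPESTATUS
status=("${PIPESTATUS[@]}")

echo "${status[0]}"  # 2 (ls failed; value shown is for GNU ls)
echo "${status[1]}"  # 1 (grep matched nothing)
echo "${status[2]}"  # 0 (tee succeeded)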

This lets scripts inspect the result of each part of a pipeline individually.
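A related Bash option, pipefail, is worth a short sketch. By default a pipeline reports only the status of its last command, so an early failure can go unnoticed; with pipefail the pipeline reports the last non-zero status instead.

```shell
#!/usr/bin/env bash

# Default behavior: only the LAST command's status counts, so this "succeeds".
false | true && echo "default: pipeline reported success"

# With pipefail, the pipeline's status is the last non-zero code in it
# (or 0 if every command succeeded).
set -o pipefail
false | true || echo "pipefail: pipeline reported failure ($?)"   # prints: pipefail: pipeline reported failure (1)
set +o pipefail   # restore the default for the rest of the script
```

PIPESTATUS is still the tool to reach for when a script needs to know which stage failed, not just whether one did.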

Logic Checks

We can perform inline checks using &&, || and if:

generate_report && email_report  # Only email if report passes

list_files missing-folder || handle_error

if grep -q "NotFound" api.log; then
   exit 127
fi

This allows short-circuiting script logic based on exit codes.
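The sketch below shows these operators in action, along with one caveat: chaining A && B || C is not a true if/else. The function try_step is a hypothetical stand-in that returns whatever code it is given.

```shell
#!/usr/bin/env bash

# try_step is a hypothetical stand-in for any command: it returns the code
# passed as its first argument.
try_step() { return "$1"; }

# && runs the right-hand side only on success; || only on failure.
try_step 0 && echo "step ok"            # prints: step ok
try_step 3 || echo "step failed: $?"    # prints: step failed: 3

# Caveat: in "A && B || C", C also runs when A succeeds but B fails.
try_step 0 && try_step 1 || echo "fallback ran"    # prints: fallback ran

# An explicit if keeps the two branches separate.
if try_step 0; then
  try_step 1 || echo "second step failed"          # prints: second step failed
else
  echo "fallback ran"
fi
```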

Setting Custom Exit Codes

The default behavior is for a Bash script to exit with the status of the last executed command.

But we can override this to return a specific exit code using the exit command:

if [ -d "$1" ]; then
   # Send error messages to stderr so they do not mix with normal output
   echo "Error: Argument is a directory" >&2
   exit 10
fi

# Rest of script

exit 0 # Success

Now we return code 10 to indicate invalid input versus the default 0 success code.

Some key rules on exit:

  • Can be called anywhere in a script to abort immediately
  • Takes an integer argument from 0 – 255; values outside that range wrap modulo 256 (exit 256 becomes 0)
  • Calling just exit uses the previous command's exit code as the return code

Common practice is to exit 0 explicitly at the end of scripts to indicate clean completion.

And use custom codes, for example the conventions from BSD sysexits.h:

  • Code 64 – Usage error – invalid flags or params
  • Code 65 – Data error, such as a config file that could not be parsed
  • Code 75 – Temporary failure, such as not being able to connect to a database

Reserve these for expected errors that calling processes can handle specifically, rather than returning a generic failure code like 1.
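One way to keep such codes readable is to give them names. The sketch below is an illustration only: the constant names E_USAGE and E_NOINPUT and the function check_args are hypothetical, and the values 64 and 66 follow BSD sysexits.h.

```shell
#!/usr/bin/env bash

# Hypothetical names for this sketch; the values follow BSD sysexits.h.
readonly E_USAGE=64      # command line usage error
readonly E_NOINPUT=66    # input file missing or unreadable

# check_args validates its arguments and returns a named code on failure.
check_args() {
  if [ "$#" -ne 1 ]; then
    echo "usage: report <file>" >&2
    return "$E_USAGE"
  fi
  if [ ! -r "$1" ]; then
    echo "cannot read: $1" >&2
    return "$E_NOINPUT"
  fi
  return 0
}

check_args 2>/dev/null || echo "no arguments -> $?"                        # prints: no arguments -> 64
check_args /nonexistent/file.txt 2>/dev/null || echo "missing file -> $?"  # prints: missing file -> 66
```

A caller can then react to 64 by printing help text and to 66 by asking for another path, without parsing any messages.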

Exit Code Variables

We can also use a variable to track the code:

EXIT_CODE=0

# Check for errors
if [ ! -f "$1" ]; then
   EXIT_CODE=2 
fi

# Rest of script logic

exit $EXIT_CODE

This style allows easily changing the final exit code in multiple places as needed.
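The same idea extends to running several checks and reporting one final code. In the sketch below, run_checks is a hypothetical function that keeps going after a failure and remembers the first non-zero code it saw.

```shell
#!/usr/bin/env bash

# run_checks is a hypothetical function: it runs every command string it is
# given and returns the first non-zero exit code it saw, or 0 if all passed.
run_checks() {
  local final=0 code cmd
  for cmd in "$@"; do
    code=0
    sh -c "$cmd" >/dev/null 2>&1 || code=$?
    if [ "$code" -ne 0 ] && [ "$final" -eq 0 ]; then
      final=$code      # remember the first failure but keep going
    fi
  done
  return "$final"
}

run_checks "true" "exit 3" "exit 7" || echo "first failure: $?"   # prints: first failure: 3
run_checks "true" "true" && echo "all checks passed"              # prints: all checks passed
```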

Using Exit Codes for Robust Error Handling

Exit codes truly shine for handling errors in Bash scripts. By signaling different failures, we can:

  • Automatically handle expected errors – Callers can detect a failure code and react by retrying the command, prompting the user, or logging the error instead of crashing. This makes automation more resilient.

  • Encode complex logic – Functions and libraries can return different codes so callers can take different actions without parsing console output. This keeps code modular.

  • Simplify troubleshooting – Teams can identify the category of a failure from its code without reading through logs, which speeds up debugging.

  • Enforce validation – Parameter checks and data validation routines can return explicit failure codes immediately, and callers must handle them. This results in more robust code.
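As a sketch of the first point, a wrapper can retry a command only when it reports a code that means "try again". Everything named here is an assumption for illustration: the constant E_TEMPFAIL (75, borrowed from BSD sysexits.h), the function retry_on_tempfail, and the stand-in command flaky.

```shell
#!/usr/bin/env bash

# Hypothetical convention for this sketch: 75 means "temporary failure, retry".
readonly E_TEMPFAIL=75

# retry_on_tempfail MAX CMD [ARGS...]
# Re-runs CMD while it returns E_TEMPFAIL, up to MAX attempts in total.
# Any other code (success or a permanent error) is returned immediately.
retry_on_tempfail() {
  local max=$1 attempt=1 code
  shift
  while :; do
    code=0
    "$@" || code=$?
    if [ "$code" -ne "$E_TEMPFAIL" ] || [ "$attempt" -ge "$max" ]; then
      return "$code"
    fi
    attempt=$((attempt + 1))
    sleep 1    # fixed delay; real scripts often back off progressively
  done
}

# flaky is a stand-in command: it fails twice with 75, then succeeds.
calls=0
flaky() {
  calls=$((calls + 1))
  [ "$calls" -ge 3 ] || return "$E_TEMPFAIL"
}

retry_on_tempfail 5 flaky && echo "succeeded after $calls calls"   # prints: succeeded after 3 calls
```

Because only code 75 triggers a retry, a permanent error such as a usage mistake still fails fast.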

For example, here is code for a script that processes user data:

#!/bin/bash

# Functions for validation
validate_inputs() {
  if [ -z "$1" ]; then
    # Missing argument
    return 255 
  fi

  # Check $1 meets formats
  if ! printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{32}$'; then
    return 3
  fi 

  return 0
}

parse_user_data() {
   local user_data=$1

   # json_validate is a placeholder for whatever JSON checker is available
   if ! json_validate "$user_data"; then
      # JSON failed parsing
      return 65
   fi

   echo "$user_data"
}

# Main handler
EXIT_CODE=0 

input="$1"
validate_inputs "$input"
EXIT_CODE=$?

if [ $EXIT_CODE -ne 0 ]; then
  exit $EXIT_CODE
fi

user_json=$(parse_user_data "$input")
EXIT_CODE=$?

# Exit early if parsing failed
if [ $EXIT_CODE -ne 0 ]; then
  exit $EXIT_CODE 
fi  

# Otherwise continue processing valid data
process "$user_json"

exit 0

Key points:

  • Functions validate inputs and return explicit failure codes
  • Main handler checks codes and exits early on any validation errors
  • Enables failing safely on bad data before later processing steps
  • Makes adding checks and handling cases easier over time

Common error handling patterns like these make scripts far more robust.
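The repeated "capture $?, test it, exit" blocks can be condensed with a small helper. The sketch below assumes a hypothetical function named die, and step is a stand-in for functions like validate_inputs or parse_user_data.

```shell
#!/usr/bin/env bash

# die CODE MESSAGE...: print MESSAGE to stderr, then exit with CODE.
# (Hypothetical helper name used for this sketch.)
die() {
  local code=$1
  shift
  echo "error: $*" >&2
  exit "$code"
}

# step is a stand-in for a validation or parsing function: it returns
# whatever code it is given.
step() { return "$1"; }

# Each line replaces a multi-line "capture, test, exit" block.
# The subshell ( ... ) only keeps this demo from ending the current shell.
(
  step 0  || die $? "validation failed"
  step 65 || die $? "parsing failed"
  echo "not reached"
) 2>/dev/null || echo "script exited with $?"    # prints: script exited with 65
```

Passing $? to die preserves the original failure code, so callers still see 65 and not a generic 1.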

Exit Codes for CI/CD and DevOps

Exit codes shine when integrating Bash scripts into automated environments like CI/CD pipelines. Script code and tooling can react to different failure codes.

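For example, a typical pipeline definition, written here in the style of GitLab CI YAML:

# CI pipeline definition

stages:
  - build
  - test
  - deploy

build_step:
  stage: build
  script:
    # Most CI runners stop a job at the first failing command, so the
    # check has to sit on the same line as the command it guards
    - ./build.sh || exit 1

system_tests:
  stage: test
  script:
    - bats system-tests/*.bats

deploy_to_prod:
  stage: deploy
  script:
    - ansible-playbook deploy_app.yml --limit prod || exit 2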

Some examples here:

  • Build step checks exit code explicitly, fails pipeline early on build errors
  • Bats testing tool surfaces test failures with exit codes
  • Deploy exits with code 2 on a prod deploy failure, so it can be told apart from a build failure when debugging

Exit codes allow smooth CI flows handling script failures and surfacing infrastructure issues faster through automation versus manual inspection.
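The same flow can be sketched in plain Bash to show what a CI runner does with each stage's exit code. The function names run_stage and pipeline are made up, and the stage commands are placeholders (true and sh -c 'exit 2') standing in for a real build script, test runner, and deploy tool.

```shell
#!/usr/bin/env bash

# run_stage NAME CMD [ARGS...]: run one stage, log its exit code, pass it on.
run_stage() {
  local name=$1 code=0
  shift
  "$@" || code=$?
  echo "stage=$name exit=$code"
  return "$code"
}

# pipeline runs the stages in order and stops at the first failure,
# returning that stage's exit code. The commands are placeholders.
pipeline() {
  run_stage build  true            || return $?
  run_stage test   sh -c 'exit 2'  || return $?
  run_stage deploy true            || return $?
}

pipeline || echo "pipeline failed with $?"
# prints:
#   stage=build exit=0
#   stage=test exit=2
#   pipeline failed with 2
```

The deploy stage never runs, which is the behavior a pipeline relies on to keep a broken build out of production.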

DevOps practices like continuous delivery and infrastructure as code also rely heavily on exit codes flowing correctly between layers – from custom scripts to provisioning tools like Ansible to Docker containers. Reliable exit signaling is essential for robust system automation.

Exit Code Best Practices

Though exit codes are a simple concept, using them effectively does require some care:

Use Codes Judiciously

Not every single check warrants its own exit code. Focus on errors that callers need to handle differently, not on minor expected cases.

Be Consistent

Standardize codes for certain errors – like input validation – across all scripts in a codebase. Maintain a common documented convention.

Explicitly Check Codes

Code defensively: check exit codes early and often instead of assuming success.

Document Custom Codes

Enhance code readability. Provide comments explaining less common codes.

Keep Code Granular

Avoid overloading one code with multiple meanings. Ambiguous codes create technical debt, because every caller has to work out which meaning applies.

Validate Code Ranges

Exit codes are limited to integers from 0 – 255, and larger values silently wrap around, so validate computed codes before passing them to exit.
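A short sketch of why this matters, and of one possible guard: out-of-range values wrap modulo 256, so a computed code of 256 would look like success. The function name valid_exit_code is hypothetical.

```shell
#!/usr/bin/env bash

# Exit statuses are a single byte, so out-of-range values wrap modulo 256.
bash -c 'exit 256' && echo "256 wrapped to 0 and looks like success"
bash -c 'exit 300' || echo "300 wrapped to $?"      # prints: 300 wrapped to 44

# valid_exit_code is a hypothetical guard: it succeeds only for integers 0-255.
valid_exit_code() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;    # empty, negative, or contains a non-digit
  esac
  [ "$1" -le 255 ]
}

valid_exit_code 64  && echo "64 is usable"          # prints: 64 is usable
valid_exit_code 300 || echo "300 is out of range"   # prints: 300 is out of range
```

A script that computes its code, for example from a count of failed checks, can fall back to 1 whenever the guard rejects the value.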

Following these practices helps sustain clean usage of exit codes across teams and codebases long-term.

Comparison to Other Languages

Exit codes are standard across most programming languages – however, some differences exist:

Python – Relies on exceptions with try/except, but sys.exit() is still used to return exit codes.

JavaScript – Node.js scripts exit with code 0 by default (or 1 on an uncaught exception); call process.exit(code) or set process.exitCode to return a specific code.

Golang – Functions return error values that callers handle instead of codes. The program calls os.Exit(code) explicitly to set its exit status.

Java – Exceptions are heavily used, as in Python. The main method returns void, so programs call System.exit(code) to set an exit status.

So while many languages focus on exceptions over codes, having a single exit status is still a common requirement. Bash keeps it simple – exit codes suffice without much exception syntax.

This makes them great for pipelines crossing language boundaries. A Golang service can easily run a Bash deploy script and check its exit code rigorously.

Conclusion

Exit codes provide a straightforward mechanism for resilient inter-process communication in Linux environments across decades of scripting history. Any DevOps engineer needs to master them for efficiently conveying errors across their automated tooling stacks.

They enable transparent handling of expected errors without crashes, which encourages cleaner code. Complex modern pipelines also rely on them for efficient signaling between languages and software components.

By following exit code best practices, you too can build Bash scripts that leverage codes for automated validation, error handling, and debugging – resulting in smooth Linux automation flows.
