Environment variables enable Python developers to separate configuration from code. Techniques like os.environ, shell exports, system-level configuration, and .env files let you leverage environment variables safely and effectively.

This complete guide covers all aspects of using environment variables in Python, from basic concepts to production best practices.

What are Environment Variables and Why do They Matter?

Environment variables are dynamic values that can affect running processes on a system. They customize aspects of the environment without changing code.

For example, PATH specifies the directories to search for executables. Python processes inherit many environment variables set by the operating system.

Additionally, custom environment variables allow you to configure applications dynamically. Benefits include:

Configuration

  • Override hardcoded application defaults
  • Change app behavior across environments

Security

  • Omit secret keys and credentials from code
  • Isolate sensitive values like API tokens

Services Integration

  • Interface with supporting tools like containers and task queues
  • Pass runtime configs to microservices

Isolation

  • Prevent conflicts between competing services
  • Containerize apps with unique environments

In summary, key reasons to use environment variables are:

  1. Configuration: Dynamic runtime configuration for code
  2. Security: Store secrets safely separated from code
  3. Integration: Interface between code, services, and tools
  4. Isolation: Provide unique envs for isolated processes

Understanding these motivations helps cement best practices as we dive deeper.

Reading and Accessing Environment Variables in Python

Python provides easy access to environment variables via the built-in os module.

The os.environ mapping contains environment variable names mapped to their values. For example:

import os

print(os.environ) # Prints all variables
# environ({
#     'HOME': '/Users/name',
#     'SHELL': '/bin/bash',
#     'CUSTOMVAR': 'Hello'
# })

You can access a particular variable's value as a dict key (a missing key raises KeyError):

home_dir = os.environ['HOME'] # Get HOME variable

Or use getenv() to retrieve values:

home_dir = os.getenv('HOME')

getenv() allows safely handling unset variables by providing defaults:

key = os.getenv('UNDEFINED_KEY', 'default')
print(key) # Prints 'default'

Additional useful functions include (demonstrated in the sketch after this list):

  • os.environ.get(key, default): Get var or return a default
  • key in os.environ: Check if var is set
  • os.environ.keys(): Get all defined keys
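A minimal sketch of these helpers in action (LOG_LEVEL is an illustrative variable name, not something the os module defines):

import os

# Safe lookup with a fallback default
log_level = os.environ.get('LOG_LEVEL', 'INFO')

# Membership test before use
if 'HOME' in os.environ:
    print('HOME is set to', os.environ['HOME'])

# Enumerate every defined variable name
for key in sorted(os.environ.keys()):
    print(key)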

So reading env vars in Python via os.environ is very straightforward.

Setting and Modifying Environment Variables

To set new environment variable keys and values, write to the os.environ dict:

import os

os.environ['DEBUG'] = '1' # Set a new key (values must be strings)

os.environ['FOO'] = 'custom_value' # Create or modify a key

print(os.environ['FOO']) # Prints new value

However, some downsides to manually altering os.environ include:

  • Changes last only for the current process and any child processes spawned afterward – they disappear when your Python program exits (see the sketch below).
  • Modifying OS-level variables from code can break other tooling and cause inconsistencies.
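A minimal sketch of the child-process behavior (DEMO_FLAG is a made-up variable):

import os
import subprocess
import sys

os.environ['DEMO_FLAG'] = 'on' # Affects only this process and its children

# A child spawned afterward inherits the modified environment
subprocess.run([sys.executable, '-c',
                "import os; print(os.environ.get('DEMO_FLAG'))"]) # Prints: on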

So how do you set up environment variables cleanly?

Permanently Setting Environment Variables with Exports

For permanent, inheritable environment variables, developers often export variables from shell scripts.

For example, consider a bash script envvars.sh:

#!/bin/bash

export CUSTOM_KEY="123abc"
export ENABLE_DEV_MODE=true

Source the script (source envvars.sh) to apply the exports to your current shell; child processes launched afterward inherit them. Executing it directly (./envvars.sh) would set the variables only in a short-lived subshell.

To validate, check from Python:

import os
print(os.environ['CUSTOM_KEY']) # Prints 123abc

The exports persist until your shell session ends.

You can automate running such scripts on login with dotfiles that customize your environment.

However, exports must be reapplied for every new shell session and user. Complex projects often demand more robust configuration.

Setting Variables at the System Level

You may see advice to define environment variables in Python's distutils configuration files (.pydistutils.cfg). This does not work: those files only configure distutils build and install options, and distutils itself is deprecated (removed from the standard library in Python 3.12). For permanent, machine-wide variables, use your operating system's own mechanisms instead.

For example:

Linux/Mac

Add entries to /etc/environment (applies to all users) or exports to a shell profile like ~/.bashrc:

DEBUG=1
OTHERVAR=/some/path

Windows

Persist variables across sessions with setx:

setx DEBUG 1
setx OTHERVAR c:/some/path

Variables set this way appear in os.environ for every new Python process:

import os

print(os.environ['DEBUG']) # Prints 1

This approach has multiple advantages:

  • Variables are centrally managed by the operating system
  • Changes apply to all new processes, not just Python
  • Works globally without per-session setup

In summary, OS-level configuration provides the cleanest way to manage permanent environment variables shared across projects and environments.

Loading Variables from .env Files

Another popular approach is using .env files that store environment variables as key-value pairs:

DB_HOST=localhost
DB_PASS=foobar123

Python can load these files with the python-dotenv module:

import os
from dotenv import load_dotenv

load_dotenv() # Load variables from a .env file in the working directory

db_pass = os.getenv('DB_PASS')

Key benefits of this approach are:

  • Separate files from code for variables
  • Can load different files per environment (staging.env, prod.env)
  • Often used with Docker containers
  • Easy to integrate with frameworks like Flask

.env files combined with dotenv provide an easy way to inject configuration on a per-project basis during local development.
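For per-environment files, you can point load_dotenv at an explicit path – a small sketch, where APP_ENV is an illustrative switch of your own choosing:

import os
from dotenv import load_dotenv

# Pick a file per environment, e.g. staging.env or prod.env
app_env = os.getenv('APP_ENV', 'dev')
load_dotenv(f'{app_env}.env')

print(os.getenv('DB_HOST'))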

Setting Environment Variables in CI/CD Pipelines

For testing and deployment automation, pipelines need access to environment variables.

Most CI/CD systems like GitHub Actions, Travis CI, and CircleCI have syntax for injecting secret variables into running builds.

For example, in a .github/workflows/main.yml file:

jobs:
  build:
    env:  
      SUPER_SECRET: ${{ secrets.SUPER_SECRET }}

This securely passes the GitHub Secret into your code as SUPER_SECRET without exposing it directly.

So for dynamic configs, injecting values as CI/CD secrets is the standard modern practice.

Your Python code can then access these variables like any other:

import os

api_token = os.getenv('SUPER_SECRET') # From GH Actions secret

This workflow allows secrets to be managed securely by the CI/CD system.
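When a variable is mandatory, it helps to fail fast with a clear message rather than let a None value propagate – a brief sketch using the secret name from above:

import os

api_token = os.environ.get('SUPER_SECRET')
if api_token is None:
    raise RuntimeError('SUPER_SECRET is not set; check your pipeline secrets')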

Best Practices for Production Secrets Management

Speaking of secrets – how do you securely manage sensitive credentials and keys for production?

Common bad practices are:

✘ Checking secrets into version control

✘ Hardcoding unencrypted passwords into config files

✘ Reusing the same production secrets across multiple environments

Instead, here are some recommendations:

Abstract Secret Access Into Helper Modules

Centralize all secret usage in secure helpers. Don't fetch values directly:

# bad
import os
password = os.getenv('DB_PASSWORD')

# good
from secureutils import get_db_password

password = get_db_password() # Abstracted access
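A minimal sketch of what such a helper module might contain (secureutils and its logic are illustrative, not a published library):

# secureutils.py
import os

class MissingSecretError(RuntimeError):
    """Raised when a required secret is not configured."""

def get_db_password() -> str:
    # Centralize lookup and validation; swap in a vault client here later
    password = os.environ.get('DB_PASSWORD')
    if not password:
        raise MissingSecretError('DB_PASSWORD is not set')
    return password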

Use Secrets Managers Like HashiCorp Vault

Tools like Vault generate dynamic secrets on demand. Servers fetch secrets at runtime without hardcoding.

For example, Vault can inject SQL credentials into Python apps securely.

Utilize Per-Environment Variables

Production vs staging vs QA environments should have different configs.

Name variables by environment like DB_PASSWORD_STAGING and DB_PASSWORD_PROD.
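A sketch of selecting the per-environment value at runtime (APP_ENV and the suffix scheme are illustrative):

import os

env = os.getenv('APP_ENV', 'staging').upper() # e.g. STAGING or PROD
db_password = os.environ[f'DB_PASSWORD_{env}'] # KeyError if missing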

Combining these best practices – secret abstraction, vaults, and environment namespacing – leads to robust secrets management.

Environment Variable Hygiene Recommendations

Generally, maintaining good "environment hygiene" gives you confidence your configs are correct. Here are some top recommendations:

  • Document all variables used in your project clearly. Unknown configs can be dangerous.

  • Define explicit variable namespaces like MYPROJ_DEBUG=1. Avoid generic names like DEBUG=1 that could collide.

  • Consider dotenv files for local development, shell exports for test/staging, and production-only configuration (such as a secrets manager) for prod.

  • Validate and sanitize any externally provided environment variable data before using it in business logic (see the sketch after this list).

  • In CI/CD pipelines, lint files for invalid references during PR reviews. This catches new secrets being erroneously added.
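A small validation sketch (MYPROJ_TIMEOUT and its accepted range are illustrative):

import os

raw = os.getenv('MYPROJ_TIMEOUT', '30')
try:
    timeout = int(raw)
except ValueError:
    raise SystemExit(f'MYPROJ_TIMEOUT must be an integer, got {raw!r}')
if not 1 <= timeout <= 300:
    raise SystemExit('MYPROJ_TIMEOUT out of range (1-300 seconds)')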

Adopting these habits early prevents nasty surprises down the line!

Containerizing Python Applications with Environment Variables

Container technologies like Docker have first-class support for injecting environment variables at runtime.

For example, a Dockerfile might ship a .env file for runtime loading and bake in build-time values:

# Dockerfile

# Ship a .env file for python-dotenv to load at runtime
COPY ./.env /app/.env

# Bake build-time values in via build args
ARG DEBUG=0
ARG DB_HOST=localhost
ENV DEBUG=$DEBUG
ENV DB_HOST=$DB_HOST

And pass additional variables during docker run:

docker run -e "EXTRA_VAR=123" my-python-app 

Inside the containerized app code:

import os

debug = os.getenv("DEBUG") # From the Dockerfile ENV
extra = os.getenv("EXTRA_VAR") # Set by docker run -e

print(f"extra={extra}")

So Docker provides a few options – env files, Dockerfile variables, and runtime flags – for configuring containers.

Additional Libraries for Environment Variables

Beyond built-in os.environ, third-party Python libraries expand environment variable support.

For example:

python-dotenv: Robust .env file parsing

from dotenv import load_dotenv

load_dotenv(verbose=True) 

django-environ: Seamless Django config integration

python-decouple: Auto-casts string env vars to ints, bools, etc. (see the sketch below)
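A quick python-decouple sketch (assuming the package is installed; DEBUG and PORT are illustrative settings):

from decouple import config

# Strings from the environment are cast to real Python types
DEBUG = config('DEBUG', default=False, cast=bool)
PORT = config('PORT', default=8000, cast=int)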

So depending on your specific workflow, extended functionality beyond the basics may be helpful.

Troubleshooting Issues with Environment Variables

Let's discuss how to debug environment issues if things go wrong:

  • First, check if expected variables are set from Python with print(os.environ).

  • Inspect differences between OS-level and Python-level variables. A mismatch indicates a scoping issue.

  • Test that CI/CD, docker-compose, etc. actually provide the values you expect.

  • Enable verbose logging for libraries like dotenv during load.

  • Consider using a package like environs to validate env var types.

  • Triple check for typos in names when accessing variables.

  • Clear out cached/compiled artifacts and fully reinstall dependencies.

Slowing down and methodically comparing configuration across levels often surfaces problems quickly.

And tools like environs can prevent issues upfront by type checking.
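A brief environs sketch (assuming the package is installed; the variable names are illustrative):

from environs import Env

env = Env()
env.read_env() # Also reads a .env file if one is present

debug = env.bool('DEBUG', False) # Parses 'true'/'1' into a real bool
max_conn = env.int('MAX_CONNECTIONS', 10) # Raises if the value isn't an int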

Putting it All Together: A Workflow Example

Consider a web API with the following environments:

  • Local development machine
  • Shared dev/test resources
  • Staging on Kubernetes
  • Production cluster

Here is one way to manage configurations across the full workflow:

  • Local Dev: .env file loaded by python-dotenv
  • Shared Env: variables exported from setenv.sh
  • Staging: Kubernetes ConfigMaps
  • Production: HashiCorp Vault-injected variables

This hybrid combines complementary approaches at each layer.

Locally, developers enjoy rapid .env-based iteration.

The shared layer relies on inherited shell exports.

Staging operates inside Kubernetes, leveraging native ConfigMaps.

And finally, production secures credentials via Vault secrets.

Transitioning between environments is smoothed by abstraction – the app code just accesses vars the same way, while underlying sources differ.
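A sketch of that abstraction – the application reads os.environ the same way everywhere, regardless of which layer populated it (the keys are illustrative):

import os

def get_config() -> dict:
    # The source (dotenv, exports, ConfigMap, Vault) is invisible here
    return {
        'db_host': os.environ.get('DB_HOST', 'localhost'),
        'db_pass': os.environ.get('DB_PASS', ''),
        'debug': os.environ.get('DEBUG', '0') == '1',
    }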

Blending various techniques like this is extremely powerful for real-world applications.

Key Takeaways

Some core lessons around managing environment variables:

  • Use os.environ to access vars in Python code
  • Choose the right approach per use case – .env files, shell exports, system-level config, vaults, etc.
  • Abstract secret access into secure helper modules
  • Documentation and hygiene practices prevent surprises
  • CI/CD pipelines should provide dynamic, tokenized variables
  • Containers enable easy injection of configs
  • Validate and sanitize input variable data
  • Apply environment variable best practices from development through production

Learning these key ideas will help you build robust applications!

The ability to separate configuration from code makes environment variables invaluable. Both user applications and systems-level programs rely on them.

Hopefully this guide gives you a firm grasp of the full environment variable landscape. Let me know if you have any other questions!
