Intermediate

You have a Python library that works perfectly on your machine running Python 3.11. Then a user files an issue — it crashes on 3.9. Another user is on 3.12. You fix the 3.9 bug and accidentally break 3.11 compatibility. Sound familiar? Testing across multiple Python versions manually means juggling virtual environments, remembering which one to activate, and running your test suite in each — a process so tedious it simply doesn’t happen. Bugs slip through. Users get hurt.

tox solves this by automating multi-environment testing in a single command. It reads a configuration file that lists which Python versions and dependencies to test against, creates isolated virtual environments for each one, installs your package into them, and runs your test suite in every environment — reporting failures per environment. One command, all Python versions, zero manual juggling.

In this article you will learn how to install and configure tox, write a tox.ini file from scratch, run tests across Python 3.9 through 3.12, pass environment variables and extra dependencies, run only a subset of environments, integrate tox with pytest and coverage, use the modern pyproject.toml configuration style, and build a real-world tox setup for a small utility library. By the end you will have a test automation setup that works identically on your laptop and in CI.

Running tox: Quick Example

Before diving into configuration details, here is a minimal tox setup that runs pytest across two Python versions. Create a small project directory, add a module, a test, and a tox.ini file, then run tox.

# project layout
# mylib/
# ├── mylib/
# │   └── utils.py
# ├── tests/
# │   └── test_utils.py
# ├── tox.ini
# └── setup.py (or pyproject.toml)

# mylib/utils.py
def add(a, b):
    return a + b

def greet(name):
    return f"Hello, {name}!"

# tests/test_utils.py
from mylib.utils import add, greet

def test_add():
    assert add(2, 3) == 5

def test_greet():
    assert greet("Alice") == "Hello, Alice!"

# tox.ini
[tox]
envlist = py39, py311

[testenv]
deps = pytest
commands = pytest tests/

# Run from the project root
$ tox

py39 create: /home/user/mylib/.tox/py39
py39 installdeps: pytest
py39 inst: /home/user/mylib/.tox/.tmp/package/1/mylib-0.1.0.tar.gz
py39 run-test: pytest tests/
============================= test session starts ==============================
platform linux -- Python 3.9.18, pytest-8.2.0
collected 2 items
tests/test_utils.py ..                                                   [100%]
============================== 2 passed in 0.12s ===============================
py311 create: /home/user/mylib/.tox/py311
py311 installdeps: pytest
py311 inst: /home/user/mylib/.tox/.tmp/package/1/mylib-0.1.0.tar.gz
py311 run-test: pytest tests/
============================= test session starts ==============================
platform linux -- Python 3.11.8, pytest-8.2.0
collected 2 items
tests/test_utils.py ..                                                   [100%]
============================== 2 passed in 0.12s ===============================
___________________________________ summary ___________________________________
  py39: commands succeeded
  py311: commands succeeded
  congratulations :)

Tox created two completely separate virtual environments (one for Python 3.9, one for 3.11), installed your package and pytest into each, ran the test suite, and reported results. The key parts are envlist (which environments to run), deps (what to install), and commands (what to execute). The transcript above follows the older tox 3 log format; tox 4 performs the same steps but labels them differently and ends with an OK or FAIL line per environment. Everything below digs into these and more.

What Is tox and Why Use It?

Tox is a generic virtualenv management and test command-line tool. At its core it does three things: creates isolated virtual environments, installs specified dependencies into each, and runs your commands inside them. It was originally designed for testing Python packages across multiple interpreter versions, but it handles any environment-based task — linting, type checking, building docs, running formatters.

The key insight is that tox installs your package into each environment from a source distribution, the same way a user would install it with pip install mylib. This means your tests run against the installed package, not the raw source files. If you forget to list a dependency in your setup.py or pyproject.toml, tox will catch it — the test environment simply won’t have that import available.
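To make those steps concrete, here is a rough sketch of what happens for a single environment. It is not how tox is actually implemented (tox relies on virtualenv and builds a distribution of your project first); plan_environment and its arguments are hypothetical names, and the function only returns the commands as strings rather than running them.

```python
def plan_environment(envname, deps, commands, workdir=".tox"):
    """Return the shell steps a tox-like runner would perform (simplified)."""
    # "py39" -> "python3.9": the first digit is the major version
    digits = envname[2:]
    interpreter = f"python{digits[0]}.{digits[1:]}"
    env_dir = f"{workdir}/{envname}"
    steps = [
        f"{interpreter} -m venv {env_dir}",             # 1. isolated environment
        f"{env_dir}/bin/pip install {' '.join(deps)}",  # 2. test dependencies
        f"{env_dir}/bin/pip install .",                 # 3. your package itself
    ]
    # 4. run each command with the environment's own executables
    steps += [f"{env_dir}/bin/{command}" for command in commands]
    return steps


for step in plan_environment("py39", ["pytest"], ["pytest tests/"]):
    print(step)
```

Step 3 is the one that catches undeclared dependencies: the package is installed from its metadata, so anything missing from that metadata is missing from the environment too.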

Approach	What it solves	What it misses
Manually activate venvs	Isolation	Repeatability, automation
pytest only	Test running	Multi-version, missing deps detection
tox	Multi-version + isolation + automation	Requires Python versions to be installed
tox + pyenv	Everything	Slightly more setup upfront

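Install tox into a dedicated virtual environment, or with a tool such as pipx, rather than into the project venv you are testing, so that tox's own dependencies never mix with your project's:

# install_tox.sh
# --break-system-packages is only needed where the system Python is marked
# as externally managed (PEP 668); prefer one of the isolated options below
pip install tox --break-system-packages
# or into a dedicated tools venv
python -m venv ~/.venvs/tox && ~/.venvs/tox/bin/pip install tox
# or, if pipx is available
pipx install tox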
tox --version

tox 4.15.0 from /home/user/.local/lib/python3.11/site-packages/tox/__init__.py

Tox 4 (released 2022) changed several configuration defaults from tox 3. This article uses tox 4 conventions throughout. The most important difference: tox 4 builds your package in an isolated PEP 517 environment by default, so a project that has only a pyproject.toml (and no setup.py) works out of the box.

Figure: tox spins up a fresh venv for each envlist entry. No shared state, no surprises.

The tox.ini Configuration File

The tox.ini file lives in your project root alongside setup.py or pyproject.toml. It uses INI syntax with sections that map to environments. Understanding the full set of options unlocks tox’s real power.

[tox] — Global Settings

# tox.ini — full global section
[tox]
# Environments to run when you type just "tox"
envlist = py39, py310, py311, py312

# Minimum tox version required
minversion = 4.0

# Skip missing Python interpreters instead of failing
skip_missing_interpreters = true

# Where to store environment data (default: {toxinidir}/.tox)
toxworkdir = {toxinidir}/.tox

skip_missing_interpreters = true is most useful on developer machines: if you have only Python 3.11 and 3.12 installed, tox skips the 3.9 and 3.10 environments with a warning rather than failing the run. Set it to false in CI, where every interpreter in envlist should be present and a missing one ought to fail the build.
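If you want to know in advance which environments would be skipped on your machine, a small script can check for the interpreters on PATH. This is only an approximation under stated assumptions: interpreter_for and split_by_availability are hypothetical helpers, and tox's own discovery is more thorough (it also understands the Windows py launcher, for example).

```python
import shutil


def interpreter_for(envname):
    """Map a tox-style name such as 'py310' to the executable 'python3.10'."""
    digits = envname[2:]  # "310"
    return f"python{digits[0]}.{digits[1:]}"


def split_by_availability(envlist, which=shutil.which):
    """Return (available, missing) environment names based on a PATH lookup."""
    available, missing = [], []
    for env in envlist:
        found = which(interpreter_for(env))
        (available if found else missing).append(env)
    return available, missing


available, missing = split_by_availability(["py39", "py310", "py311", "py312"])
print("would run:", available)
print("would be skipped:", missing)
```

The which parameter exists only so the lookup can be replaced in tests; by default it is shutil.which from the standard library.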

[testenv] — The Base Environment

The [testenv] section defines defaults inherited by all environments. Any specific environment like [testenv:py39] inherits everything from [testenv] and can override individual values.

# tox.ini — complete testenv section
[tox]
envlist = py39, py311, py312

[testenv]
# Dependencies to install (separate from your package's requirements)
deps =
    pytest>=8.0
    pytest-cov
    requests-mock

# The command to run
commands =
    pytest {posargs:tests/} --cov=mylib --cov-report=term-missing

# Environment variables to set inside the test environment
setenv =
    PYTHONPATH = {toxinidir}/src
    APP_ENV = testing

# Pass these variables from your shell into the environment
passenv =
    HOME
    CI
    GITHUB_*

# How to build and install the package: sdist (default), wheel, editable, or skip
# Use "package = skip" for environments that don't need it (e.g., linting)
package = wheel

# Run tox and pass extra pytest args via posargs
$ tox -- -k test_add -v

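py39 run-test: pytest -k test_add -v --cov=mylib --cov-report=term-missing
...
tests/test_utils.py::test_add PASSED

The {posargs} placeholder is how you forward extra arguments to the underlying command. Everything after -- on the tox command line becomes {posargs}, and the text after the colon (tests/ here) is a default used only when no extra arguments are given. Because supplied arguments replace that default, include the test path yourself if you still want it, for example tox -- tests/ -x. This lets you run a single test or pass -x to stop on first failure without changing tox.ini.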
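The substitution rule is easy to get wrong, so here is a small stand-alone imitation of it. It is a sketch of the behaviour, not tox's real parser, and expand_posargs is a hypothetical name.

```python
import re
import shlex

# Matches {posargs} and {posargs:some default}
_POSARGS = re.compile(r"\{posargs(?::([^}]*))?\}")


def expand_posargs(command, posargs):
    """Fill in the placeholder: given arguments win, otherwise use the default."""
    def _replace(match):
        if posargs:
            return shlex.join(posargs)
        return match.group(1) or ""
    return _POSARGS.sub(_replace, command)


template = "pytest {posargs:tests/} --cov=mylib"
print(expand_posargs(template, []))                        # plain "tox"
print(expand_posargs(template, ["-k", "test_add", "-v"]))  # "tox -- -k test_add -v"
```

With no extra arguments the default tests/ is used; with arguments, the default disappears and only what followed -- is inserted.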

Per-Environment Overrides

Sometimes a specific Python version needs different dependencies or commands. Define a named environment section to override just those values:

# tox.ini — per-environment overrides
[tox]
envlist = py39, py311, py312, lint, typecheck

[testenv]
deps =
    pytest
    pytest-cov
commands = pytest tests/ --cov=mylib

# py39 needs a backport not required on 3.11+
[testenv:py39]
deps =
    pytest
    pytest-cov
    importlib-metadata>=4.0

# Linting environment — no package install needed
[testenv:lint]
package = skip
deps =
    ruff
    black
commands =
    ruff check mylib/ tests/
    black --check mylib/ tests/

# Type checking
[testenv:typecheck]
package = skip
deps = mypy
commands = mypy mylib/ --strict

# Run only the lint environment
$ tox -e lint

# Run multiple specific environments
$ tox -e py311,typecheck

# List all configured environments
$ tox list

default environments:
py39        -> [no description]
py311       -> [no description]
py312       -> [no description]
lint        -> [no description]
typecheck   -> [no description]

The package = skip setting tells tox not to build and install your package for that environment. This speeds up linting and type checking runs significantly since they only need the source files, not a full package installation.
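The inheritance rule from this section (a named environment starts from [testenv] and overrides individual keys) can be imitated with the standard library's configparser. This is a simplified model for building intuition, not tox's loader; effective_settings is a hypothetical helper and the INI text is a trimmed copy of the example above.

```python
import configparser

TOX_INI = """
[testenv]
deps =
    pytest
    pytest-cov
commands = pytest tests/ --cov=mylib

[testenv:lint]
package = skip
deps =
    ruff
    black
commands = ruff check mylib/ tests/
"""


def effective_settings(ini_text, envname):
    """Start from [testenv], then apply overrides from [testenv:<envname>]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    settings = dict(parser["testenv"])
    section = f"testenv:{envname}"
    if parser.has_section(section):
        settings.update(parser[section])
    return settings


print(effective_settings(TOX_INI, "py311")["deps"].split())
print(effective_settings(TOX_INI, "lint")["deps"].split())
```

Note that an override replaces the whole value: the lint environment's deps do not include pytest, which is why the py39 section above has to repeat pytest and pytest-cov.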

Figure: tox.ini: four lines to test against Python 3.9, 3.10, 3.11, and 3.12 simultaneously.

Integrating pytest and Coverage

Tox and pytest work seamlessly together. The most useful addition is coverage reporting — knowing not just that your tests pass, but that they actually exercise your code.

# tox.ini — pytest + coverage setup
[tox]
envlist = py311, py312

[testenv]
deps =
    pytest>=8.0
    pytest-cov

commands =
    pytest tests/ \
        --cov=mylib \
        --cov-report=term-missing \
        --cov-report=html:htmlcov \
        --cov-fail-under=80

# Separate environment to combine coverage from all Python versions
[testenv:coverage-report]
package = skip
deps = coverage[toml]
commands =
    coverage combine
    coverage report --fail-under=80
    coverage html

To combine coverage data across all Python version environments, have each environment write its own data file by setting COVERAGE_FILE, suppress the per-environment report, and merge the files in a separate step:

# tox.ini: combined coverage across Python versions
[testenv]
deps =
    pytest
    pytest-cov

setenv =
    COVERAGE_FILE = {toxinidir}/.coverage.{envname}

commands =
    pytest tests/ --cov=mylib --cov-report=

[testenv:coverage-report]
package = skip
deps = coverage
setenv =
    COVERAGE_FILE = {toxinidir}/.coverage

commands =
    coverage combine .coverage.py311 .coverage.py312
    coverage report --show-missing

$ tox -e py311,py312,coverage-report

Name                 Stmts   Miss  Cover   Missing
--------------------------------------------------
mylib/utils.py          12      1    92%   45
mylib/parser.py         30      4    87%   22-25
--------------------------------------------------
TOTAL                   42      5    88%

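The combined coverage report aggregates line hit data from every Python version. A line that only executes under one version's sys.version_info branch is now properly credited, giving you a truer picture of what the test suite actually exercises. Because each environment runs the copy of the package installed under its own .tox directory, you may also need a [paths] section in your coverage configuration so that coverage combine maps those locations back to mylib/.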
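Conceptually, combining is a union of the executed lines recorded by each environment. The sketch below illustrates that idea with plain dictionaries; it does not use coverage.py's data format or API, and the line numbers are made-up illustration data.

```python
def combine_hits(per_env_hits):
    """Union the executed line numbers per file across all environments."""
    combined = {}
    for hits in per_env_hits.values():
        for filename, lines in hits.items():
            combined.setdefault(filename, set()).update(lines)
    return combined


def percent_covered(executed, executable):
    """Share of executable lines that were hit, as a whole-number percentage."""
    return round(100 * len(executed & executable) / len(executable))


# Hypothetical data: line 3 only runs on 3.11, line 4 only on 3.12
executable = {1, 2, 3, 4, 5}
per_env = {
    "py311": {"mylib/utils.py": {1, 2, 3}},
    "py312": {"mylib/utils.py": {1, 2, 4}},
}
combined = combine_hits(per_env)
print(percent_covered(per_env["py311"]["mylib/utils.py"], executable))
print(percent_covered(combined["mylib/utils.py"], executable))
```

Each environment on its own under-reports the version-specific branch it never takes; the union credits both.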

Using pyproject.toml Instead of tox.ini

Modern Python projects often consolidate all tool configuration into pyproject.toml. Tox 4 supports this natively — put your tox configuration in the [tool.tox] table and delete tox.ini:

# pyproject.toml — tox config inside the project file
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "mylib"
version = "0.1.0"
requires-python = ">=3.9"
dependencies = ["requests>=2.28"]

[tool.tox]
legacy_tox_ini = """
[tox]
envlist = py39, py311, py312
skip_missing_interpreters = true

[testenv]
deps =
    pytest
    pytest-cov
commands = pytest tests/ --cov=mylib --cov-report=term-missing

[testenv:lint]
package = skip
deps = ruff
commands = ruff check mylib/ tests/
"""

The legacy_tox_ini key holds an INI string, using the same syntax as a standalone tox.ini, inside the TOML file. Tox reads it transparently. Newer tox 4 releases (4.21 and later) also offer a native TOML configuration format that avoids the embedded string, but legacy_tox_ini remains the most compatible approach for projects that also need to support older tox versions.
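To see that the embedded string really is ordinary INI, you can pull it out of the TOML document and parse it yourself. The sketch below assumes Python 3.11+ for the standard-library tomllib (falling back to the third-party tomli backport if it is installed); read_envlist is a hypothetical helper, not part of tox.

```python
import configparser

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older interpreters

PYPROJECT = '''
[project]
name = "mylib"
version = "0.1.0"

[tool.tox]
legacy_tox_ini = """
[tox]
envlist = py39, py311, py312

[testenv]
deps = pytest
commands = pytest tests/
"""
'''


def read_envlist(pyproject_text):
    """Extract the INI string from [tool.tox] and return the envlist entries."""
    data = tomllib.loads(pyproject_text)
    ini_text = data["tool"]["tox"]["legacy_tox_ini"]
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return [name.strip() for name in parser["tox"]["envlist"].split(",")]


print(read_envlist(PYPROJECT))
```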

Environment Variables and Secrets

Test environments often need credentials or configuration that should not be committed to source control. Tox provides setenv for constants and passenv for forwarding values from your shell:

# tox.ini — handling secrets and config
[testenv]
deps = pytest

# Set constants needed by tests
setenv =
    APP_ENV = testing
    DATABASE_URL = sqlite:///test.db
    LOG_LEVEL = WARNING

# Pass secrets from the shell environment
passenv =
    AWS_ACCESS_KEY_ID
    AWS_SECRET_ACCESS_KEY
    GITHUB_TOKEN
    CI
    CODECOV_TOKEN

commands = pytest tests/

# In your shell before running tox:
export GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx
tox -e py311

# tests/test_api.py — accessing the passed variable
import os

def test_token_available():
    token = os.environ.get("GITHUB_TOKEN")
    # In CI this will be the real token; locally it must be set
    assert token is not None, "GITHUB_TOKEN not set in environment"

Tox deliberately strips most environment variables from the test environment by default. This prevents hidden dependencies on your local shell configuration — the same isolation that makes tox results trustworthy also means you must explicitly declare every environment variable your tests need. Use passenv = * only as a last resort during debugging; it defeats tox’s isolation guarantee.
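The filtering can be pictured as an allow-list applied to your shell environment, with setenv values layered on top. The sketch below is a simplified model under that assumption: build_test_environ is a hypothetical name, and real tox also passes a small set of variables by default that are not modelled here.

```python
import fnmatch


def build_test_environ(shell_env, passenv, setenv):
    """Keep only variables matching a passenv pattern, then apply setenv."""
    env = {
        name: value
        for name, value in shell_env.items()
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in passenv)
    }
    env.update(setenv)  # constants from tox.ini override forwarded values
    return env


shell = {
    "HOME": "/home/user",
    "GITHUB_TOKEN": "ghp_example",
    "GITHUB_SHA": "abc123",
    "EDITOR": "vim",  # not listed in passenv, so it is dropped
}
print(build_test_environ(shell, ["HOME", "GITHUB_*"], {"APP_ENV": "testing"}))
```

A wildcard such as GITHUB_* forwards every matching variable, which is convenient in CI but worth keeping narrow for secrets.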

Running tox in GitHub Actions CI

The real payoff of a tox configuration is running it automatically on every push. GitHub Actions has first-class support for matrix builds across Python versions:

# .github/workflows/tests.yml
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.9", "3.10", "3.11", "3.12"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install tox
        run: pip install tox tox-gh-actions

      - name: Run tox
        run: tox

# tox.ini — gh-actions section maps matrix Python to tox env
[tox]
envlist = py39, py310, py311, py312, lint

[gh-actions]
python =
    3.9: py39
    3.10: py310
    3.11: py311
    3.12: py312

[testenv]
deps = pytest
commands = pytest tests/

[testenv:lint]
package = skip
deps = ruff
commands = ruff check mylib/

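The tox-gh-actions plugin detects that it is running under GitHub Actions (via the GITHUB_ACTIONS environment variable), looks at the version of the Python interpreter that is running tox, and selects the matching tox environments from the [gh-actions] mapping. When the matrix job runs Python 3.11, tox runs only the py311 environment rather than all of them, which is more efficient than running the full envlist on every matrix node. Note that an environment absent from the mapping, such as lint here, is not selected by any matrix job; list it next to a version (for example 3.12: py312, lint) if it should run in CI.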
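The selection logic amounts to a lookup from "major.minor" to a list of environment names. The following is a minimal sketch of that idea, not the plugin's real code: parse_python_mapping and envs_for are hypothetical helpers, and the mapping text adds lint to the 3.12 line purely for illustration.

```python
import sys

MAPPING = """
3.9: py39
3.10: py310
3.11: py311
3.12: py312, lint
"""


def parse_python_mapping(text):
    """Turn 'version: env, env' lines into {version: [env, ...]}."""
    mapping = {}
    for line in text.strip().splitlines():
        version, _, envs = line.partition(":")
        mapping[version.strip()] = [e.strip() for e in envs.split(",") if e.strip()]
    return mapping


def envs_for(mapping, version_info):
    """Select environments for the interpreter that is running the job."""
    key = f"{version_info[0]}.{version_info[1]}"
    return mapping.get(key, [])


mapping = parse_python_mapping(MAPPING)
print(envs_for(mapping, sys.version_info))
```

Keeping the versions as strings matters: treated as numbers, 3.10 and 3.1 would collide.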

Real-Life Example: Testing a String Utilities Library

# string_utils_project/
# ├── strutils/
# │   ├── __init__.py
# │   ├── transform.py
# │   └── validate.py
# ├── tests/
# │   ├── test_transform.py
# │   └── test_validate.py
# ├── pyproject.toml
# └── tox.ini

# strutils/transform.py
import re

def slugify(text: str) -> str:
    """Convert a string to a URL-friendly slug."""
    text = text.lower().strip()
    text = re.sub(r'[^\w\s-]', '', text)   # drop punctuation; \w keeps Unicode letters
    text = re.sub(r'[\s_-]+', '-', text)   # collapse whitespace, underscores, hyphens
    text = re.sub(r'^-+|-+$', '', text)    # trim leading and trailing hyphens
    return text

def truncate(text: str, max_length: int, suffix: str = "...") -> str:
    """Truncate text to max_length characters."""
    if len(text) <= max_length:
        return text
    return text[:max_length - len(suffix)] + suffix

# strutils/validate.py
import re

def is_valid_email(email: str) -> bool:
    """Basic email format validation."""
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email))

def is_strong_password(password: str) -> bool:
    """Check password has 8+ chars, upper, lower, digit, special."""
    if len(password) < 8:
        return False
    has_upper = any(c.isupper() for c in password)
    has_lower = any(c.islower() for c in password)
    has_digit = any(c.isdigit() for c in password)
    has_special = any(c in '!@#$%^&*()_+-=[]{}|;:,.<>?' for c in password)
    return all([has_upper, has_lower, has_digit, has_special])

# tests/test_transform.py
import pytest
from strutils.transform import slugify, truncate

@pytest.mark.parametrize("text,expected", [
    ("Hello World", "hello-world"),
    ("  Python 3.11  ", "python-311"),
    ("café & bistro!", "café-bistro"),  # \w matches é, so the accent survives
])
def test_slugify(text, expected):
    assert slugify(text) == expected

def test_truncate_short():
    assert truncate("hello", 10) == "hello"

def test_truncate_long():
    assert truncate("hello world", 8) == "hello..."

def test_truncate_custom_suffix():
    assert truncate("hello world", 8, suffix="…") == "hello w…"

# tests/test_validate.py
import pytest
from strutils.validate import is_valid_email, is_strong_password

@pytest.mark.parametrize("email,valid", [
    ("user@example.com", True),
    ("bad-email", False),
    ("missing@tld.", False),
    ("ok+tag@sub.domain.org", True),
])
def test_is_valid_email(email, valid):
    assert is_valid_email(email) == valid

def test_strong_password():
    assert is_strong_password("Secure@123") is True
    assert is_strong_password("weakpass") is False
    assert is_strong_password("NoSpecial1") is False

# tox.ini — full production config for strutils
[tox]
envlist = py39, py310, py311, py312, lint, typecheck, coverage-report
skip_missing_interpreters = true

[gh-actions]
python =
    3.9: py39
    3.10: py310
    3.11: py311
    3.12: py312

[testenv]
deps =
    pytest>=8.0
    pytest-cov
setenv =
    COVERAGE_FILE = {toxinidir}/.coverage.{envname}
commands =
    pytest tests/ --cov=strutils --cov-report=

[testenv:lint]
package = skip
deps = ruff
commands = ruff check strutils/ tests/

[testenv:typecheck]
package = skip
deps = mypy
commands = mypy strutils/ --strict

[testenv:coverage-report]
package = skip
deps = coverage
setenv =
    COVERAGE_FILE = {toxinidir}/.coverage
commands =
    coverage combine
    coverage report --show-missing --fail-under=90

$ tox -e py311,lint,coverage-report

py311 create: .tox/py311
py311 run-test: pytest tests/ --cov=strutils --cov-report=
============================= test session starts ==============================
collected 11 items
tests/test_transform.py ......                                           [ 54%]
tests/test_validate.py .....                                             [100%]
============================== 11 passed in 0.18s ==============================

lint run-test: ruff check strutils/ tests/
All checks passed!

coverage-report run-test: coverage combine && coverage report --show-missing --fail-under=90
Name                      Stmts   Miss  Cover   Missing
-------------------------------------------------------
strutils/transform.py        12      0   100%
strutils/validate.py         10      0   100%
-------------------------------------------------------
TOTAL                        22      0   100%
___________________________________ summary ___________________________________
  py311: commands succeeded
  lint: commands succeeded
  coverage-report: commands succeeded
  congratulations :)

This configuration gives you a complete quality gate: unit tests across Python versions, linting with ruff, strict type checking with mypy, and a combined coverage report with a minimum threshold. Add this to GitHub Actions with the matrix config shown earlier and every pull request will automatically validate against all supported Python versions before merging.

Frequently Asked Questions

What happens if a Python version isn't installed?

With skip_missing_interpreters = true, tox prints a warning and marks that environment as skipped rather than failing. The final summary shows SKIPPED for missing interpreters and only fails if an installed environment's tests actually fail. Without that setting, tox exits with an error if any interpreter in envlist cannot be found. On developer machines, use skip_missing_interpreters = true; in CI, use false to enforce that all required versions are present.

When does tox recreate environments?

Tox caches virtual environments in the .tox/ directory and reuses them across runs for speed. It recreates an environment only when deps change, the Python interpreter changes, or you pass --recreate (or -r). If your tests behave strangely after a dependency upgrade, run tox -r to force a clean rebuild. You can also delete the entire .tox/ directory — tox will rebuild everything from scratch on the next run.

How do I run a single test with tox?

Use the {posargs} placeholder in your commands and pass arguments after -- on the command line. For example, tox -e py311 -- tests/test_transform.py::test_slugify -v runs only that one test with verbose output. The -- separator tells tox everything after it should be forwarded as {posargs} rather than interpreted as tox options. This is the cleanest way to do rapid test-driven development while keeping tox's isolation.

Should I pin dependency versions in tox.ini?

For library projects, leave deps unpinned (e.g., pytest>=8.0) so tox installs the latest compatible versions — this surfaces breakage from upstream changes early. For application projects where you want reproducible builds, pin exact versions (e.g., pytest==8.2.1) or use a requirements-test.txt file referenced with deps = -r requirements-test.txt. The -r syntax in tox deps works the same way as pip install -r.

What changed between tox 3 and tox 4?

Tox 4 is a ground-up rewrite that changed several defaults. Packages are now always built in an isolated environment through PEP 517/518 build backends (what tox 3 enabled with isolated_build = true) rather than by invoking setup.py directly. The package option (for example package = wheel or package = skip) controls how the project is built and installed. Plugins such as tox-gh-actions need a release that supports tox 4. If an environment installs your project, the project still needs a pyproject.toml or setup.py that describes how to build it; if yours has neither, add a minimal pyproject.toml with a [build-system] table. Check the official upgrade guide when migrating.

Conclusion

Tox turns multi-environment testing from a manual, error-prone process into a single tox command. You learned how to write a tox.ini with envlist, deps, and commands; create per-environment overrides for linting, type checking, and version-specific dependencies; integrate pytest coverage across Python versions; use setenv and passenv for environment variables; configure tox-gh-actions for CI matrix builds; and build a complete test pipeline for a real utility library.

Extend the string utilities example by adding a docs environment that builds Sphinx documentation, a security environment that runs bandit, or a benchmark environment that runs pytest-benchmark. Each new environment is just a few lines in tox.ini. Official documentation: tox.wiki.
