Intermediate

You are working with a JSON API response. You need a value that is several levels deep, something like response["data"]["users"][0]["profile"]["email"]. Any key in that chain might be missing, any list might be empty, and the response structure might change between API versions. So you add a try-except block, or a series of .get() calls with defaults, and suddenly five lines of boilerplate are doing the work of one conceptually simple operation.

Glom is a Python library that replaces this pattern with a clean, declarative approach. You describe the path to the data you want as a spec, and glom walks the structure and extracts it, raising a clear error if something is missing or returning a default you specify. It handles dicts, lists, objects, and nested combinations. The same spec language also supports data transformation, aggregation, and restructuring into a new shape.

This article covers installation, basic path access, default values, list traversal, Coalesce for fallbacks, restructuring with spec dicts, the T object for method calls, and a real-world API response processing example. By the end you will have a practical toolkit for working with complex nested data without defensive boilerplate cluttering your code.

Accessing Nested Python Data with glom: Quick Example

Here is the same data access done the naive way vs. the glom way:

# quick_glom.py
from glom import glom

data = {
    "user": {
        "profile": {
            "name": "Alice",
            "contact": {
                "email": "alice@example.com"
            }
        }
    }
}

# Naive way - crashes if any key is missing
email_naive = data["user"]["profile"]["contact"]["email"]

# Glom way - clear path spec
email_glom = glom(data, "user.profile.contact.email")

print("Naive:", email_naive)
print("Glom: ", email_glom)

Output:

Naive: alice@example.com
Glom:  alice@example.com

The path "user.profile.contact.email" is a dot-separated string that glom resolves step by step. If any key is missing, glom raises a descriptive GlomError that tells you exactly where the path broke — unlike a bare KeyError that gives you only the missing key with no context about where in the structure it was expected. The sections below show how to handle missing keys gracefully, traverse lists, and restructure data.
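To make "resolves step by step" concrete, here is a toy resolver in plain Python. It is not glom's implementation: the names resolve and PathError are hypothetical, and the sketch only illustrates why walking one segment at a time lets an error report where the path broke.

```python
class PathError(LookupError):
    """Hypothetical error type that records where the walk stopped."""


def resolve(target, path):
    """Walk `target` one dot-separated segment at a time."""
    parts = path.split(".")
    current = target
    for idx, part in enumerate(parts):
        try:
            if isinstance(current, dict):
                current = current[part]          # key access for dicts
            elif isinstance(current, (list, tuple)):
                current = current[int(part)]     # integer index for sequences
            else:
                current = getattr(current, part)  # attribute access otherwise
        except (KeyError, IndexError, ValueError, AttributeError) as exc:
            walked = ".".join(parts[:idx]) or "<root>"
            raise PathError(
                f"could not access {part!r} (segment {idx} of {path!r}) "
                f"after walking {walked!r}"
            ) from exc
    return current


data = {"user": {"profile": {"contact": {"email": "alice@example.com"}}}}
print(resolve(data, "user.profile.contact.email"))  # alice@example.com

try:
    resolve(data, "user.profile.phone.number")
except PathError as err:
    message = str(err)
    print(message)
```

The error text names both the failing segment and the part of the path that was walked successfully, which is the kind of context a bare KeyError lacks.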

data['a']['b']['c']['d'] -- one missing key from a KeyError. glom knows where it broke.

What Is glom and When Should You Use It?

Glom is a data access and transformation library built around the idea of a spec — a declarative description of what you want from a data structure. At its simplest the spec is a dot-path string. At its most powerful it is a nested combination of path specs, transformations, defaults, and conditionals.

Approach                    | Missing Key Behavior     | List Traversal     | Restructuring
data["a"]["b"]              | Unhelpful KeyError       | Manual loops       | Manual
data.get("a", {}).get("b")  | Returns default silently | Manual loops       | Manual
glom(data, "a.b")           | Descriptive GlomError    | Built-in with Iter | Built-in with Spec

Use glom when: you are working with nested API responses, processing complex config files, restructuring data from one shape to another, or when you want missing-key errors that tell you exactly where the path broke rather than just which key was missing.

Installing glom

# terminal
pip install glom

Output:

Successfully installed glom-23.5.0 boltons-23.1.1

Glom depends on boltons (a utilities library from the same author) but has no other heavy dependencies. Import with from glom import glom for basic use, or from glom import glom, Coalesce, T, Iter, Spec for the full toolkit.

Handling Missing Keys with Default Values

Glom raises GlomError when a path cannot be resolved. To return a fallback value instead, pass the default keyword argument:

# glom_defaults.py
from glom import glom, GlomError

user = {
    "name": "Bob",
    "address": {
        "city": "Chicago"
    }
}

# Key exists - returns value
city = glom(user, "address.city", default="Unknown")
print("City:", city)

# Key missing - returns default instead of raising
zip_code = glom(user, "address.zip", default="N/A")
print("Zip:", zip_code)

# Intermediate key ("country") missing mid-path - also returns default
country = glom(user, "address.country.name", default="Unknown")
print("Country:", country)

# Without default - raises PathAccessError, a subclass of GlomError
try:
    phone = glom(user, "contact.phone")
except GlomError as e:
    # The message text varies between glom versions, so print only the type
    print("Error:", type(e).__name__)

Output:

City: Chicago
Zip: N/A
Country: Unknown
Error: PathAccessError

The error raised here is a PathAccessError, a subclass of GlomError. Its message reports which part of the path could not be accessed, along with context about the target, which is far more useful than a bare KeyError: 'contact'. When you want a silent default, pass the default keyword argument. When you want to handle the failure yourself, let it raise and catch GlomError.

KeyError: 'contact'. Sure. But which contact? In which object? On which request?

Traversing Lists with glom

When a path includes a list, glom can collect a value from every element of the list using the [spec] list-comprehension syntax inside the spec:

# glom_lists.py
from glom import glom

api_response = {
    "data": {
        "articles": [
            {"id": 1, "title": "Python Tips", "author": {"name": "Alice"}, "views": 1200},
            {"id": 2, "title": "Async Guide",  "author": {"name": "Bob"},   "views": 890},
            {"id": 3, "title": "Type Hints",   "author": {"name": "Carol"}, "views": 2100},
        ]
    }
}

# Get all titles
titles = glom(api_response, ("data.articles", ["title"]))
print("Titles:", titles)

# Get all author names (a dot path works inside the list spec too)
authors = glom(api_response, ("data.articles", ["author.name"]))
print("Authors:", authors)

# Get first article's title
first_title = glom(api_response, "data.articles.0.title")
print("First title:", first_title)

# Get count of articles
count = glom(api_response, ("data.articles", len))
print("Count:", count)

Output:

Titles: ['Python Tips', 'Async Guide', 'Type Hints']
Authors: ['Alice', 'Bob', 'Carol']
First title: Python Tips
Count: 3

The tuple form (path_to_list, [spec_for_each_element]) is glom’s list comprehension pattern. The list spec ["title"] means “extract the title key from every element.” Integer indexing like "data.articles.0.title" accesses the first element directly. The len function as a spec applies the function to the result of the preceding path — a clean way to get a count without intermediate variables.

Using Coalesce for Fallback Chains

Coalesce tries a series of specs in order and returns the first one that succeeds without raising. This is the glom equivalent of a or b or c but for nested path access:

# glom_coalesce.py
from glom import glom, Coalesce

records = [
    {"id": 1, "display_name": "Alice Chen"},
    {"id": 2, "full_name": "Bob Torres"},
    {"id": 3, "first_name": "Carol", "last_name": "White"},
    {"id": 4},
]

for record in records:
    # Try display_name, then full_name, then first_name, then "Unknown"
    name = glom(record, Coalesce("display_name", "full_name", "first_name", default="Unknown"))
    print(f"ID {record['id']}: {name}")

Output:

ID 1: Alice Chen
ID 2: Bob Torres
ID 3: Carol
ID 4: Unknown

Coalesce is particularly valuable when working with heterogeneous data sources where different records use different field names for the same concept. Instead of a chain of .get() calls that returns None silently on failure, Coalesce raises if all options fail (unless you provide a default), making silent failures visible.

Restructuring Data with Spec Dicts

Glom can restructure data by using a dict as the spec. Each key in the spec dict becomes a key in the output, and each value is a spec for what to put there:

# glom_restructure.py
from glom import glom, Coalesce

raw_api_data = {
    "results": [
        {
            "user_id": "u_001",
            "user_data": {
                "personal": {"first": "Alice", "last": "Chen"},
                "contact": {"email": "alice@example.com", "phone": "555-0101"},
            },
            "subscription": {"plan": "pro", "active": True},
        },
        {
            "user_id": "u_002",
            "user_data": {
                "personal": {"first": "Bob", "last": "Torres"},
                "contact": {"email": "bob@example.com"},
            },
            "subscription": {"plan": "free", "active": True},
        },
    ]
}

# Spec dict defines the output shape
user_spec = {
    "id":    "user_id",
    "name":  ("user_data.personal", lambda p: f"{p['first']} {p['last']}"),
    "email": "user_data.contact.email",
    "phone": Coalesce("user_data.contact.phone", default="N/A"),
    "plan":  "subscription.plan",
}

users = glom(raw_api_data, ("results", [user_spec]))
for user in users:
    print(user)

Output:

{'id': 'u_001', 'name': 'Alice Chen', 'email': 'alice@example.com', 'phone': '555-0101', 'plan': 'pro'}
{'id': 'u_002', 'name': 'Bob Torres', 'email': 'bob@example.com', 'phone': 'N/A', 'plan': 'free'}

The spec dict maps output keys to specs for their values. The lambda inside the tuple ("user_data.personal", lambda p: ...) first extracts the nested dict, then applies the lambda to produce a full name. This is data reshaping expressed declaratively — the spec describes the output shape, and glom handles the traversal and assembly.

Spec dict: describe the shape you want. glom figures out how to get there.

Using the T Object for Method Calls

The T object lets you include method calls and attribute access in a spec without switching to a lambda:

# glom_T_object.py
from glom import glom, T

data = {
    "message": "  Hello, World!  ",
    "tags": ["Python", "Tutorial", "Beginner"],
    "score": 87.654321,
}

# T.method() calls the method on the extracted value
trimmed = glom(data, ("message", T.strip()))
print("Trimmed:", trimmed)

# Chain calls
upper_trimmed = glom(data, ("message", T.strip().upper()))
print("Upper trimmed:", upper_trimmed)

# T also supports indexing: T[0] takes the first element of the list
first_tag = glom(data, ("tags", T[0]))
print("First tag:", first_tag)

# Round a float
rounded = glom(data, ("score", lambda x: round(x, 2)))
print("Rounded:", rounded)

Output:

Trimmed: Hello, World!
Upper trimmed: HELLO, WORLD!
First tag: Python
Rounded: 87.65

The T object is a proxy that records method calls and attribute accesses, then replays them on the actual extracted value. T.strip().upper() means “call strip() then upper() on whatever value comes before this in the spec pipeline.” For simple transformations, T is cleaner than a lambda. For complex transformations, a regular function or lambda is more readable.

Real-Life Example: Processing a GitHub API Response

Here is a realistic use case: extracting structured data from a GitHub-style API response that has nested users, labels, and assignees:

# process_github_issues.py
from glom import glom, Coalesce

# Simulated GitHub API response for repository issues
github_response = {
    "repository": "myorg/myproject",
    "total_count": 3,
    "items": [
        {
            "number": 42,
            "title": "Fix login timeout bug",
            "state": "open",
            "user": {"login": "alice_dev", "type": "User"},
            "labels": [{"name": "bug"}, {"name": "priority-high"}],
            "assignees": [{"login": "bob_dev"}],
            "comments": 5,
            "created_at": "2024-01-10T09:00:00Z",
            "body": "Users are getting logged out after 5 minutes of inactivity...",
        },
        {
            "number": 43,
            "title": "Add dark mode support",
            "state": "open",
            "user": {"login": "carol_dev", "type": "User"},
            "labels": [{"name": "enhancement"}],
            "assignees": [],
            "comments": 12,
            "created_at": "2024-01-11T14:30:00Z",
            "body": None,
        },
        {
            "number": 44,
            "title": "Update dependencies",
            "state": "closed",
            "user": {"login": "bot_user", "type": "Bot"},
            "labels": [{"name": "maintenance"}],
            "assignees": [{"login": "alice_dev"}, {"login": "carol_dev"}],
            "comments": 0,
            "created_at": "2024-01-12T08:00:00Z",
            "body": "Automated PR to update package versions.",
        },
    ]
}

issue_spec = {
    "id":          "number",
    "title":       "title",
    "state":       "state",
    "author":      "user.login",
    "is_human":    ("user.type", lambda t: t == "User"),
    "labels":      ("labels", [("name")]),
    "assignees":   ("assignees", [("login")]),
    "comments":    "comments",
    "description": Coalesce("body", skip=None, default="(no description)"),  # skip=None treats an explicit None as missing
}

issues = glom(github_response, ("items", [issue_spec]))

for issue in issues:
    assignee_str = ", ".join(issue["assignees"]) if issue["assignees"] else "Unassigned"
    print(f"#{issue['id']} [{issue['state'].upper()}] {issue['title']}")
    print(f"   Author: {issue['author']} | Labels: {', '.join(issue['labels'])}")
    print(f"   Assigned to: {assignee_str} | Comments: {issue['comments']}")
    print()

Output:

#42 [OPEN] Fix login timeout bug
   Author: alice_dev | Labels: bug, priority-high
   Assigned to: bob_dev | Comments: 5

#43 [OPEN] Add dark mode support
   Author: carol_dev | Labels: enhancement
   Assigned to: Unassigned | Comments: 12

#44 [CLOSED] Update dependencies
   Author: bot_user | Labels: maintenance
   Assigned to: alice_dev, carol_dev | Comments: 0

The spec dict handles both the path traversal and the data transformation in one place. Coalesce supplies the default description when body cannot be resolved; an explicit None, as in issue #43, is only replaced when Coalesce is also given skip=None, because by default it skips errors rather than values. The label list extraction ("labels", [("name")]) turns the list of label objects into a list of name strings. This pattern scales cleanly: add more fields to the spec dict and they are extracted for every item in the list.

Frequently Asked Questions

How does glom compare to jmespath?

JMESPath is a query language for JSON that specializes in filtering and selecting data from JSON documents. It is powerful for filter and projection queries but is limited to JSON-compatible data and does not support transformation in place. Glom works on any Python object (dicts, objects, lists, combinations), supports transformation via callables and the T object, and integrates naturally into Python code. Use jmespath for read-only JSON queries; use glom when you need Python-native data transformation.

Does glom work on regular Python objects, not just dicts?

Yes. Glom resolves paths using attribute access for regular objects and key access for dicts. A path like "user.name" works whether user is a dict (user["name"]) or an object (user.name). You can mix them in the same path: "response.data.users.0.email" works even if response is an object, data is a dict, users is a list, and each user is another dict.

Can glom write values as well as read them?

Yes, using glom.assign(target, path, value). This sets a value at a nested path, creating intermediate dicts if needed with the missing=dict parameter. For example, glom.assign(data, "user.profile.bio", "New bio", missing=dict) creates the nested structure if it does not already exist. This is useful for building nested data structures incrementally.

Is glom significantly slower than direct dict access?

Yes, glom has overhead compared to direct bracket access because it resolves specs at runtime. For tight loops over millions of records, direct access is faster. For typical API response processing (hundreds to thousands of records), the overhead is negligible. Glom’s value is in correctness and maintainability — clear errors, readable specs, and less boilerplate — not raw performance.

How do I get more context when a GlomError occurs?

Glom's error messages already include the target object and the path that failed. For additional context, wrap the glom call in a try-except and log the full exception: the traceback includes the spec that was being processed and the point of failure. You can also use glom.Path to build paths programmatically from individual segments, for example when a key itself contains a dot; the Path then appears in the error message.

Conclusion

Glom replaces fragile nested dict access with a declarative path spec system that provides clear errors, graceful defaults, and built-in data transformation. The dot-path syntax handles simple access. Coalesce handles fallback chains. Spec dicts reshape data in a single pass. The T object adds method calls without lambda clutter. Together, these tools make working with complex nested data from APIs and config files significantly cleaner.

Try extending the GitHub issues example by adding a filter spec to select only open issues assigned to a specific user, then restructure the output into a flat CSV-ready format. Once you start expressing data transformations as specs rather than nested loops, you will find glom showing up in every project that touches external API data.

See the official glom documentation for the full spec API including Iter, Check, and Fill.