JSON (JavaScript Object Notation) is the universal language of data exchange on the web. REST APIs return JSON. Configuration files are written in JSON. NoSQL databases store JSON. When you fetch data from any web service — weather APIs, payment processors, social media platforms — you’re almost certainly receiving JSON. Knowing how to read and write JSON in Python is one of the most practical skills you can have.
Python makes JSON handling easy with its built-in json module. You can parse a JSON string into a Python dictionary with one function call, and serialize Python data back to JSON with another. No installation required — import json is all you need. The module handles the translation between Python types (dicts, lists, strings, numbers, booleans) and their JSON equivalents automatically.
In this article we’ll cover reading JSON from strings and files, writing JSON to strings and files, pretty-printing, handling nested structures, working with real API data, customizing serialization for Python objects, and error handling. By the end, you’ll be comfortable parsing any JSON structure you encounter and serializing your Python data to clean, readable JSON output.
Reading JSON in Python: Quick Example
Here’s how to parse a JSON string and work with the resulting Python data in just a few lines:
# quick_json.py
import json
json_string = '{"name": "Alice", "age": 30, "languages": ["Python", "SQL"]}'
# Parse JSON string -> Python dict
data = json.loads(json_string)
print(type(data)) # <class 'dict'>
print(data['name']) # Alice
print(data['languages']) # ['Python', 'SQL']
# Serialize Python dict -> JSON string
output = json.dumps(data, indent=2)
print(output)
Output:
<class 'dict'>
Alice
['Python', 'SQL']
{
  "name": "Alice",
  "age": 30,
  "languages": [
    "Python",
    "SQL"
  ]
}
json.loads() parses a JSON string (the “s” stands for “string”), while json.dumps() serializes to a string. The indent=2 argument to dumps() pretty-prints with 2-space indentation. For reading and writing files directly, use json.load() and json.dump() (without the “s”).
What Is JSON and How Does It Map to Python?
JSON is a text-based data format derived from JavaScript object syntax. It stores data as key-value pairs (objects), ordered lists (arrays), strings, numbers, booleans, and null. Python’s json module automatically converts between JSON types and Python types.
| JSON Type | Python Type | Example |
|---|---|---|
| object | dict | {"key": "value"} |
| array | list | [1, 2, 3] |
| string | str | "hello" |
| number (int) | int | 42 |
| number (float) | float | 3.14 |
| true / false | True / False | true |
| null | None | null |
One important difference: JSON only supports string keys in objects, while Python dicts can have any hashable key. When you serialize a Python dict with integer keys, the json module automatically converts them to strings. Keep this in mind when working with round-trip serialization.
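A quick round trip makes this pitfall concrete — the dict below is just illustrative:

```python
import json

# Integer keys survive serialization only as strings
scores = {1: "gold", 2: "silver"}

encoded = json.dumps(scores)
print(encoded)  # {"1": "gold", "2": "silver"}

decoded = json.loads(encoded)
print(decoded == scores)  # False -- the keys came back as strings
print(decoded)            # {'1': 'gold', '2': 'silver'}
```

If you need the original integer keys back, convert them explicitly, e.g. {int(k): v for k, v in decoded.items()}.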
Reading JSON from a File
Reading JSON from a file is extremely common — configuration files, data exports, and API response caches are often stored as JSON files. Use json.load() (no “s”) with a file object.
# read_json_file.py
import json
# First, create a sample JSON file to read
sample_data = {
    "app": "MyApp",
    "version": "2.1.0",
    "database": {
        "host": "localhost",
        "port": 5432,
        "name": "myapp_db"
    },
    "features": ["auth", "notifications", "analytics"]
}

with open('config.json', 'w', encoding='utf-8') as f:
    json.dump(sample_data, f, indent=2)

# Now read it back
with open('config.json', 'r', encoding='utf-8') as f:
    config = json.load(f)

print('App:', config['app'])
print('DB host:', config['database']['host'])
print('DB port:', config['database']['port'])
print('Features:', ', '.join(config['features']))
Output:
App: MyApp
DB host: localhost
DB port: 5432
Features: auth, notifications, analytics
Always open files with encoding='utf-8' — JSON is defined as UTF-8 by default and many JSON files use Unicode characters. The with statement ensures the file is properly closed even if an error occurs during parsing.
Writing JSON to a File
Serializing Python data to a JSON file is just as straightforward. The json.dump() function writes directly to a file object, which is more efficient than creating a string with json.dumps() and then writing it.
# write_json_file.py
import json
# Python data to serialize
user_data = {
    "users": [
        {"id": 1, "name": "Alice", "active": True, "score": 98.5},
        {"id": 2, "name": "Bob", "active": False, "score": 72.0},
        {"id": 3, "name": "Charlie", "active": True, "score": 85.25},
    ],
    "total": 3,
    "generated": "2026-04-16"
}

# Write with pretty printing and sorted keys
with open('users.json', 'w', encoding='utf-8') as f:
    json.dump(user_data, f, indent=2, sort_keys=True)

print('Written to users.json')

# Verify by reading it back
with open('users.json', 'r', encoding='utf-8') as f:
    content = f.read()

print(content[:300])
Output:
Written to users.json
{
  "generated": "2026-04-16",
  "total": 3,
  "users": [
    {
      "active": true,
      "id": 1,
      "name": "Alice",
      "score": 98.5
    },
    ...
  ]
}
The sort_keys=True option outputs keys in alphabetical order, which makes JSON diffs much cleaner in version control — you won’t see spurious changes just because Python iterated dict keys in a different order. Use it for any JSON file that will be committed to a git repository.
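A small sketch shows the effect — two dicts holding the same data but built in different key orders serialize identically only with sort_keys=True:

```python
import json

a = {"name": "Alice", "id": 1}
b = {"id": 1, "name": "Alice"}

# Without sort_keys, output follows insertion order, so the strings differ
print(json.dumps(a) == json.dumps(b))  # False

# With sort_keys=True, both produce the same string
print(json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True))  # True
```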
Working with Real API Data
The most common use of JSON in Python is parsing data from REST APIs. Here’s how to fetch and parse real JSON data from a public practice API:
# api_json.py
import json
import urllib.request
# Fetch a list of users from JSONPlaceholder (a free practice REST API)
url = 'https://jsonplaceholder.typicode.com/users'
with urllib.request.urlopen(url) as response:
    raw = response.read().decode('utf-8')

users = json.loads(raw)
print(f'Fetched {len(users)} users\n')

for user in users[:3]:  # Show first 3 users
    name = user.get('name', 'Unknown')
    email = user.get('email', 'Unknown')
    city = user.get('address', {}).get('city', 'Unknown')
    company = user.get('company', {}).get('name', 'Unknown')
    print(f'{name} | {email} | {city} | {company}')
Output:
Fetched 10 users
Leanne Graham | Sincere@april.biz | Gwenborough | Romaguera-Crona
Ervin Howell | Shanna@melissa.tv | Wisokyburgh | Deckow-Crist
Clementine Bauch | Nathan@yesenia.net | McKenziehaven | Romaguera-Jacobson
The .get('key', default) pattern is defensive JSON parsing — it returns the default value if the key is missing rather than raising a KeyError. For nested structures like address.city, chain the .get() calls: user.get('address', {}).get('city', 'Unknown'). If 'address' is missing, the inner .get() runs on an empty dict and safely returns 'Unknown' instead of crashing.
Navigating Nested JSON
Real-world API responses are often deeply nested. Here’s how to extract data from a complex nested structure safely:
# nested_json.py
import json
# Simulate a complex API response
response_text = '''
{
  "status": "success",
  "data": {
    "post": {
      "id": 42,
      "title": "Understanding Python JSON",
      "author": {"id": 7, "name": "Sam Dev"},
      "tags": ["python", "json", "tutorial"],
      "stats": {"views": 1250, "likes": 87, "comments": 14}
    }
  }
}
'''
data = json.loads(response_text)
# Defensive nested access
post = data.get('data', {}).get('post', {})
title = post.get('title', 'Unknown')
author_name = post.get('author', {}).get('name', 'Unknown')
views = post.get('stats', {}).get('views', 0)
tags = post.get('tags', [])
print(f'Title: {title}')
print(f'Author: {author_name}')
print(f'Views: {views:,}')
print(f'Tags: {", ".join(tags)}')
Output:
Title: Understanding Python JSON
Author: Sam Dev
Views: 1,250
Tags: python, json, tutorial
The chained .get() approach is much safer than writing data['data']['post']['title'] — any missing key in the chain would raise a KeyError and crash your script. With .get(), you control the default at every level.
Custom Serialization for Python Objects
The json module can’t serialize Python objects like datetime by default — they’re not JSON-native types. You have two options: use a custom encoder class or use the default parameter of json.dumps().
# custom_json.py
import json
from datetime import datetime, date
# Option 1: default function for simple cases
def json_default(obj):
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')

data = {
    'event': 'User signup',
    'timestamp': datetime(2026, 4, 16, 9, 30, 0),
    'date_only': date(2026, 4, 16),
    'user_id': 123
}
result = json.dumps(data, default=json_default, indent=2)
print(result)
Output:
{
  "event": "User signup",
  "timestamp": "2026-04-16T09:30:00",
  "date_only": "2026-04-16",
  "user_id": 123
}
The default function is called whenever json.dumps() encounters an object it can’t serialize natively. Return a JSON-serializable value (a string, number, list, or dict), and json.dumps() will use it in place of the original object. ISO 8601 (2026-04-16T09:30:00) is the widely accepted standard format for datetime strings.
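The other option mentioned above — a custom encoder class — works by subclassing json.JSONEncoder and overriding its default() method. A minimal sketch:

```python
import json
from datetime import datetime, date

class DateTimeEncoder(json.JSONEncoder):
    """Encoder that falls back to ISO 8601 strings for dates and datetimes."""
    def default(self, obj):
        if isinstance(obj, (datetime, date)):
            return obj.isoformat()
        return super().default(obj)  # raises TypeError for anything else

data = {'event': 'login', 'at': datetime(2026, 4, 16, 9, 30)}
print(json.dumps(data, cls=DateTimeEncoder))
# {"event": "login", "at": "2026-04-16T09:30:00"}
```

Pass the class via cls= instead of a function via default= — handy when you want to reuse the same encoder across many dumps() calls.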
Handling JSON Errors
JSON parsing fails when the input is malformed. Always wrap json.loads() in a try/except when dealing with data from external sources.
# json_errors.py
import json
def safe_parse(json_string):
    """Parse JSON safely, returning None on failure."""
    try:
        return json.loads(json_string)
    except json.JSONDecodeError as e:
        print(f'JSON parse error at line {e.lineno}, col {e.colno}: {e.msg}')
        print(f'Bad input: {json_string[:100]}')
        return None
# Valid JSON
result = safe_parse('{"name": "Alice", "age": 30}')
print('Valid:', result)
# Invalid JSON (missing closing brace)
result2 = safe_parse('{"name": "Bob"')
print('Invalid:', result2)
# Invalid JSON (trailing comma -- not valid in JSON)
result3 = safe_parse('{"key": "value",}')
print('Trailing comma:', result3)
Output:
Valid: {'name': 'Alice', 'age': 30}
JSON parse error at line 1, col 15: Expecting ',' delimiter
Bad input: {"name": "Bob"
Invalid: None
JSON parse error at line 1, col 17: Expecting property name enclosed in double quotes
Bad input: {"key": "value",}
Trailing comma: None
json.JSONDecodeError is a subclass of ValueError and carries the line number, column, and a descriptive message about what went wrong. Always check for this error when parsing API responses, user-provided input, or files from external sources — any of these can contain malformed JSON.
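Because of that subclass relationship, older code that catches ValueError keeps working — a quick check:

```python
import json

print(issubclass(json.JSONDecodeError, ValueError))  # True

try:
    json.loads('not json')
except ValueError as e:  # also catches json.JSONDecodeError
    print(type(e).__name__)  # JSONDecodeError
```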
Real-Life Example: JSON Config Manager
Here’s a complete configuration manager that reads a JSON config file, applies defaults for missing keys, validates required fields, and writes updated config back to disk.
# config_manager.py
import json
import os
DEFAULTS = {
    "debug": False,
    "log_level": "INFO",
    "database": {
        "host": "localhost",
        "port": 5432,
        "pool_size": 5
    },
    "cache": {
        "enabled": True,
        "ttl_seconds": 300
    }
}

REQUIRED_KEYS = ["database.host", "database.port"]

def deep_merge(base, override):
    """Merge override into base dict recursively."""
    result = base.copy()
    for key, val in override.items():
        if key in result and isinstance(result[key], dict) and isinstance(val, dict):
            result[key] = deep_merge(result[key], val)
        else:
            result[key] = val
    return result

def get_nested(d, dotted_key, default=None):
    """Access nested dict value using dot notation."""
    keys = dotted_key.split('.')
    for key in keys:
        if not isinstance(d, dict) or key not in d:
            return default
        d = d[key]
    return d

def load_config(config_path):
    """Load config from file, merging with defaults."""
    if os.path.exists(config_path):
        try:
            with open(config_path, 'r', encoding='utf-8') as f:
                user_config = json.load(f)
        except json.JSONDecodeError as e:
            print(f'Error reading config: {e}')
            user_config = {}
    else:
        print(f'Config file not found at {config_path}, using defaults')
        user_config = {}

    config = deep_merge(DEFAULTS, user_config)

    # Validate required keys
    missing = [k for k in REQUIRED_KEYS if get_nested(config, k) is None]
    if missing:
        raise ValueError(f'Missing required config keys: {missing}')
    return config

def save_config(config, config_path):
    """Write config to JSON file."""
    with open(config_path, 'w', encoding='utf-8') as f:
        json.dump(config, f, indent=2, sort_keys=True)
    print(f'Config saved to {config_path}')

# Demo
config = load_config('app_config.json')
config['debug'] = True
config['database']['pool_size'] = 10
save_config(config, 'app_config.json')

print(f"Debug mode: {config['debug']}")
print(f"DB pool size: {config['database']['pool_size']}")
print(f"Cache TTL: {config['cache']['ttl_seconds']}s")
Output:
Config file not found at app_config.json, using defaults
Config saved to app_config.json
Debug mode: True
DB pool size: 10
Cache TTL: 300s
The deep_merge() function recursively merges user settings into defaults, so users only need to specify the keys they want to override. The dot-notation accessor get_nested() keeps validation and access of nested keys clean and readable. This pattern shows up in many production applications that load settings from a JSON config file.
Frequently Asked Questions
What is the difference between json.loads and json.load?
json.loads() parses a JSON string (the “s” = string). json.load() reads from a file object. Similarly, json.dumps() serializes to a string and json.dump() writes to a file. This naming convention is consistent across Python’s standard library (e.g., pickle.loads/pickle.load follows the same pattern).
How do I pretty-print JSON?
Use json.dumps(data, indent=2) for 2-space indentation or indent=4 for 4-space. Add sort_keys=True to sort keys alphabetically. From the command line, you can pretty-print a JSON file with: python -m json.tool myfile.json. This is built into Python and works on any valid JSON file.
How does Python handle Unicode in JSON?
By default, json.dumps() escapes non-ASCII characters as \uXXXX escape sequences. To output them directly as Unicode (which is valid JSON and more readable), use json.dumps(data, ensure_ascii=False). Always open JSON files with encoding='utf-8' to handle any Unicode content correctly.
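A short before/after sketch of ensure_ascii:

```python
import json

data = {"city": "Zürich"}

print(json.dumps(data))                      # {"city": "Z\u00fcrich"}
print(json.dumps(data, ensure_ascii=False))  # {"city": "Zürich"}
```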
Why can’t I serialize datetime objects?
The JSON spec only defines six data types: object, array, string, number, boolean, and null. Python’s datetime doesn’t map to any of these, so the json module raises TypeError. The idiomatic solution is to convert to ISO 8601 strings using dt.isoformat(). Pass a default function to json.dumps() that handles these conversions, as shown in the custom serialization section above.
How do I handle very large JSON files efficiently?
For JSON files too large to load into memory at once, use the ijson library (install via pip install ijson) for streaming incremental JSON parsing. It parses the file as it reads, yielding items one at a time. The standard json module always loads the entire file into memory — fine for files up to hundreds of MB, but not for multi-GB JSON datasets.
Conclusion
Python’s json module makes JSON handling simple and reliable. We covered json.loads()/json.load() for parsing, json.dumps()/json.dump() for serialization, pretty-printing with indent and sort_keys, defensive nested access with chained .get(), parsing real API responses, custom serialization for datetime objects, and robust error handling with json.JSONDecodeError. JSON fluency is an essential Python skill — you’ll use it in almost every project that touches the internet or stores configuration.
Try extending the config manager to support environment variable overrides (keys from os.environ take precedence over the file) or to validate values against a schema using jsonschema (available via pip). Both are common patterns in production-grade config management.
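As a starting point for the environment-variable idea, here is a minimal sketch. The MYAPP_ prefix and the double-underscore nesting convention (MYAPP_DATABASE__PORT → database.port) are illustrative choices, not part of the config manager above:

```python
import json
import os

def apply_env_overrides(config, prefix='MYAPP_'):
    """Overlay MYAPP_* environment variables onto a config dict.

    Double underscores express nesting, e.g. MYAPP_DATABASE__PORT
    overrides config['database']['port'].
    """
    for env_key, raw in os.environ.items():
        if not env_key.startswith(prefix):
            continue
        path = env_key[len(prefix):].lower().split('__')
        target = config
        for part in path[:-1]:
            target = target.setdefault(part, {})
        # Parse the value as JSON so "true" and "6543" get real types;
        # fall back to the raw string for anything that isn't valid JSON.
        try:
            target[path[-1]] = json.loads(raw)
        except json.JSONDecodeError:
            target[path[-1]] = raw
    return config

os.environ['MYAPP_DEBUG'] = 'true'
os.environ['MYAPP_DATABASE__PORT'] = '6543'
config = apply_env_overrides({'debug': False, 'database': {'port': 5432}})
print(config)  # {'debug': True, 'database': {'port': 6543}}
```

Parsing each value with json.loads() means 'true' becomes True and '6543' becomes 6543, which keeps types consistent with the JSON file the overrides are layered on top of.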
For the full API reference and additional encoder/decoder customization options, see the official json module documentation.