Intermediate
Putting parameters in configuration files takes some extra effort at the start, but can save you a lot of time and heartache later. We are all tempted to hardcode parameters directly into our code, since that saves precious time while writing, whereas doing it properly takes extra effort. Some of us at least create constants or store parameters in a variable, while others store them in a class variable to keep things even cleaner. Arguably the best option is to store them in a configuration file. In this article you’ll learn the steps required to use configuration files in Python 3, following the official Python 3 documentation.
ConfigParser is the class used to implement configuration files in Python 3. The main reason for using these files is to write Python programs that end users can modify easily. This article covers the complete implementation of configuration files in three main parts: setup, file format and the basic API.
Introduction to Python 3 Configuration Files
Configuration files can play a vital role in any program and its management. One popular approach to separating code from configuration is to store configuration in YAML, JSON or INI files rather than in .py format. One reason .py files are not used is reloading: you would need to restart the whole program if you stored your config in a Python .py file. Also, the end user could modify the code at will if it were in .py format. Configuration files make the program easier to modify or adjust. The point of storing data in a configuration file is separation of concerns: the programmer can focus on keeping the code as clean as possible, while the user only needs to touch the configuration file.
Setup of Python 3 ConfigParser
The class used to create configuration files is ConfigParser, which lives in the configparser module. This is part of the standard Python 3 library, so there is no need for any pip installation. To use it, just import it (note that in Python 2 the module was named ConfigParser; it was renamed to lowercase configparser in Python 3):
import configparser
File Format of configuration file
One convention for the file format is to use the extension .ini (short for initialization), but you can choose an extension based on your own or your client's preferences. A configuration file has several parts.
- A configuration file consists of one or more sections.
- Section names are written between square-bracket delimiters: [section name].
- The concept is similar to a mapping: each entry is a key-value pair, where the name of the configuration item is the "key" and the actual value of the configuration item is the "value".
- Keys and values are separated by either the assignment operator (=) or the colon operator (:).
- You can even put in a comment using the # or ; prefix.
Example:
[default]
host = 192.168.1.1
port = 22
username = admin
password = admin
[database]
#database related configuration files
port = 22
forwardx11 = no
name = db_test
In the above configuration file example we have two sections: [default] and [database]. Each section has its own key-value entries, such as username = admin and name = db_test. Every key-value pair belongs to a section, which makes it easy to organise your configuration. Finally, the line prefixed with # is a comment.
Reading the configuration file from python code
Now we will look at how to read from the config file. As mentioned earlier, ConfigParser is the class used to create configuration files. First, a ConfigParser object has to be initialized: config = configparser.ConfigParser(). The following sections walk through the main functions:
Initialization of ConfigParser
You can initialize the parser with the following syntax. Here the variable "config" will hold all the values:
config = configparser.ConfigParser()
Write to a Configuration file with ConfigParser
Although you normally edit a configuration file by hand in a text editor, there are times when you want to write to a config file programmatically. For example, you might generate a default config file that a user can then edit as a starting point. You may also want to override an erroneous config entry (after confirming with the user).
Once the object is initialized, we can write to it. There are two ways to build a section before writing it to the config file. We are going to use the example from the file format section above. Let's initialize the default section using a dictionary.
Example:
config['default'] = {
"host" : "192.168.1.1",
"port" : "22",
"username" : "username",
"password" : "password"
}
Here, "default" is the name of the section (the part that appears between the square brackets "[" and "]" in the actual configuration file) and the curly braces mark the start and end of a dictionary. Inside the dictionary are key-value pairs, e.g. "host" is the key and "192.168.1.1" is the value, separated by a colon ":".
Now, let's initialize the database section using an empty dictionary and add the key-value pairs line by line.
Example:
config['database'] = {}
config['database']['port'] = "22"
config['database']['forwardx11'] = "no"
config['database']['name'] = "db_test"
Here, "database" is the name of the section and the empty curly braces create an empty dictionary. Key-value pairs are then added one per line, e.g. "port" is the key and "22" is the value, assigned with the "=" operator. This method provides a lot more flexibility.
Here’s the full code so far:
import configparser
config = configparser.ConfigParser()
config['default'] = {
"host" : "192.168.1.1",
"port" : "22",
"username" : "username",
"password" : "password"
}
config['database'] = {}
config['database']['port'] = "22"
config['database']['forwardx11'] = "no"
config['database']['name'] = "db_test"
with open('test.ini', 'w') as configfile:
    config.write(configfile)
After initializing the sections in config, you can write them out to a config file:
with open('test.ini', 'w') as configfile:
    config.write(configfile)
Now you will see that a file named test.ini has been created.
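If you open test.ini, its contents should look like this (ConfigParser writes "key = value" lines by default, with section and key order following the dictionaries above):

```ini
[default]
host = 192.168.1.1
port = 22
username = username
password = password

[database]
port = 22
forwardx11 = no
name = db_test
```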
Read config from the config file using ConfigParser
The next step is to read the file you have just created.
- The config file can be read using the read() method: config.read('test.ini'). This reads the test.ini file you just created.
- If you want to list the sections available in the configuration file, use the sections() method: config.sections().
- Next is getting the value of a key stored in a section: config['database']['name']
This will give you "db_test", the value of the key called "name" stored in the database section.
The following code will print out all the values stored against the keys in the default section using a for loop.
for key in config['default']:
print(config['default'][key])
Output:
192.168.1.1
22
username
password
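Putting the reading steps together, here is a minimal self-contained sketch. It recreates the example file first so it can run on its own; in a real program you would only do the read:

```python
import configparser

# Recreate the example file so this sketch runs standalone
with open('test.ini', 'w') as f:
    f.write("[default]\n"
            "host = 192.168.1.1\n"
            "port = 22\n"
            "username = username\n"
            "password = password\n"
            "\n"
            "[database]\n"
            "port = 22\n"
            "forwardx11 = no\n"
            "name = db_test\n")

config = configparser.ConfigParser()
config.read('test.ini')                # parse the file from disk

print(config.sections())               # ['default', 'database']
print(config['database']['name'])      # db_test

# Iterate over every key in the default section
for key in config['default']:
    print(key, '=', config['default'][key])
```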
Changing the datatype of the configuration value from ConfigParser
The values held by a ConfigParser object are strings by default. This is fine for most situations, but suppose you want a true/false value instead, or a number to do maths with; a string will not do. We can convert the value of a key into another type such as integer, float or boolean, either manually or by using the getter methods. The best and preferred way is the getter methods.
There are three getter methods:
- getint()
- getfloat()
- getboolean()
Example: config['default'].getint('port')
getint() will convert the value of the "port" key in the "default" section into an integer. If you call type() on the returned value, it will now show <class 'int'>.
There is another way of doing it:
Example: config.getboolean('database', 'forwardx11')
Here, the getboolean() method is invoked on the config object itself, taking two arguments: the first is the name of the section and the second is the key whose value will be converted.
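Here is a short self-contained sketch of the getters, building the example config in memory with read_string() instead of reading a file:

```python
import configparser

config = configparser.ConfigParser()
config.read_string("""
[default]
port = 22
timeout = 1.5

[database]
forwardx11 = no
""")

# Plain indexing always returns a string
print(type(config['default']['port']))              # <class 'str'>

# The getters parse the string into the requested type
print(config['default'].getint('port') + 1)         # 23
print(config['default'].getfloat('timeout'))        # 1.5
print(config.getboolean('database', 'forwardx11'))  # False
```

Note that getboolean() understands yes/no, on/off, true/false and 1/0 in the file, which is why forwardx11 = no parses cleanly to False.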
What to do if a value is not available from a configfile
A fallback result can also be supplied. The fallback is the value returned when the key or section you ask for is not available.
Example: config.get('default', 'database', fallback='not_database')
In this case, not_database will be returned if the "database" key is not available in the default section, or if the default section itself is not found.
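The following sketch shows fallback behaviour end to end, again using an in-memory config so it stands alone:

```python
import configparser

config = configparser.ConfigParser()
config.read_string("[default]\nhost = 192.168.1.1\n")

# Key exists: the real value wins, the fallback is ignored
print(config.get('default', 'host', fallback='0.0.0.0'))

# Key missing: the fallback is returned instead of raising NoOptionError
print(config.get('default', 'database', fallback='not_database'))

# Section missing: the fallback is returned instead of raising NoSectionError
print(config.get('backup', 'host', fallback='no_such_section'))
```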
Conclusion
We learned about the setup, i.e. importing configparser to create configuration files. Next came the file format, covering the basic syntax of a configuration file: it consists of sections and key-value pairs.
We played with the data types of keys in the default and database sections, converting them with the getter methods. Last but not least, we covered the basic API: write, read and fallback.
Using configuration files is not difficult and can save a lot of time. So on your next project, take the extra few minutes to create a configuration file instead of hardcoding.
Full Code: ConfigParser Example Code
import configparser
config = configparser.ConfigParser()
#Set up default item for hosts using dictionary
config['default'] = {"host" : "192.168.1.1",
"port" : "22",
"username" : "username",
"password" : "password" }
#Set up database config items one key at a time
config['database'] = {}
config['database']['port'] = "22"
config['database']['forwardx11'] = "no"
config['database']['name'] = "db_test"
#Write default file
with open('test.ini', 'w') as configfile:
config.write(configfile)
#Open the file again to try to read it
config.read('test.ini')
#Print the sections
print(config.sections())
print( config['database']['name'] )
#Print each key pair
for key in config['default']:
print(config['default'][key])
#print the type of integer value
print(type(config['default'].getint('port')))
print( config.getboolean('database', 'forwardx11') )
#Print default value
print( config.get('default', 'databaseabc', fallback='not_database') )
Output:
['default', 'database']
db_test
192.168.1.1
22
username
password
<class 'int'>
False
not_database
Reference
https://docs.python.org/3/library/configparser.html
How To Use Python Pydash for Functional Programming Utilities
Intermediate
If you’ve ever written the same list-filtering loop for the hundredth time, or stared at a nested dictionary wondering how to safely pluck a value three levels deep, you already know the problem Pydash solves. Python’s standard library is powerful, but working with collections and data pipelines often produces verbose, repetitive code that buries your intent under implementation details. Pydash is a utility library — inspired by Lodash from JavaScript — that gives you a clean, consistent set of functions for transforming data the way you think about it, not the way Python’s builtins happen to expose it.
Pydash is a pure Python library with no mandatory dependencies, and installing it takes one command: pip install pydash. It works with any Python 3.7+ environment. The functions cover six main areas: lists, dictionaries, strings, numbers, functions (higher-order utilities), and chaining. Most functions are forgiving by design — they return sensible defaults instead of raising exceptions when data is missing or the wrong shape.
In this article, we cover the most useful Pydash utilities for everyday Python work: deep key access with get and set_, list operations with chunk, flatten, and group_by, string utilities, functional tools like curry and partial, and Pydash’s method chaining API. By the end you’ll have a working data pipeline that processes a collection of records using only Pydash functions — no manual loops required.
Working with Pydash: Quick Example
To see how Pydash changes the shape of everyday code, here is a complete example that deep-reads nested dictionary keys, filters a list, and groups results — all without writing a single for-loop or try/except block:
# quick_example.py
import pydash as _
records = [
{"user": {"name": "Alice", "role": "admin"}, "score": 91},
{"user": {"name": "Bob", "role": "member"}, "score": 74},
{"user": {"name": "Carol", "role": "admin"}, "score": 88},
{"user": {"name": "Dan", "role": "member"}, "score": 55},
]
# Deep-get a nested key with a fallback default
first_name = _.get(records[0], "user.name", "Unknown")
print("First user:", first_name)
# Filter records where score >= 80
high_scorers = _.filter_(records, lambda r: r["score"] >= 80)
print("High scorers:", _.map_(high_scorers, "user.name"))
# Group all records by role
by_role = _.group_by(records, "user.role")
print("Admins:", [r["user"]["name"] for r in by_role["admin"]])
Output:
First user: Alice
High scorers: ['Alice', 'Carol']
Admins: ['Alice', 'Carol']
Notice that _.map_(high_scorers, "user.name") uses a path string instead of a lambda — Pydash accepts dot-notation paths wherever it accepts iteratee functions. That single convention eliminates a large category of boilerplate lambdas. The trailing underscore on filter_ and map_ avoids shadowing Python’s built-in names.
The sections below go deeper into each function group, with realistic examples you can run immediately.
What Is Pydash and Why Use It?
Pydash is a port of Lodash — one of JavaScript’s most-downloaded utility libraries — to Python. Where Python’s standard library organizes utilities by data type (itertools, functools, collections, str methods), Pydash organizes everything under a single consistent namespace: import pydash as _ and every utility is one call away.
The library solves a specific problem: Python’s builtins are not composable at the data level. You cannot safely read data["user"]["address"]["city"] without guarding every bracket with a try/except or a chain of .get() calls. Pydash’s _.get(data, "user.address.city", "Unknown") does the same thing in one line and never raises an exception. This matters most when you’re processing API responses, configuration files, or any JSON-shaped data where fields may or may not exist.
Here is a quick comparison of what Pydash replaces:
| Task | Vanilla Python | Pydash |
|---|---|---|
| Safe nested key access | data.get("a", {}).get("b", default) | _.get(data, "a.b", default) |
| Flatten nested list | list(itertools.chain.from_iterable(...)) | _.flatten(nested) |
| Group list by field | Manual defaultdict loop | _.group_by(items, "field") |
| Split list into pages | Slice arithmetic in a loop | _.chunk(items, size) |
| Unique by field | Seen-set + loop | _.uniq_by(items, "id") |
| Partial application | functools.partial(fn, ...) | _.partial(fn, ...) or _.curry(fn) |
Pydash is not a replacement for pandas or numpy — it doesn’t do vectorized math or DataFrames. It’s the missing middle ground between raw Python and a full data science stack: a clean toolbox for transforming ordinary Python dicts and lists.
Installing Pydash
Install Pydash from PyPI. No other dependencies are required:
# install_pydash.sh
pip install pydash
Output:
Successfully installed pydash-8.0.3
Throughout this article we import Pydash as _, which mirrors the Lodash convention and makes function calls read naturally. If the underscore alias conflicts with something in your codebase, use import pydash as pd or import functions individually: from pydash import get, flatten, group_by.
Dictionary Utilities: get, set_, has, omit, pick
Safe Deep Access with get and set_
The two most-used Pydash functions are _.get() and _.set_(). They read and write nested keys using dot-notation paths without raising exceptions on missing keys. This is invaluable when consuming API responses where any field might be absent:
# dict_get_set.py
import pydash as _
user = {
"profile": {
"name": "Alice",
"address": {
"city": "Melbourne",
"postcode": "3000"
}
},
"scores": [91, 85, 78]
}
# Safe nested read — returns default if path is missing
city = _.get(user, "profile.address.city", "Unknown")
country = _.get(user, "profile.address.country", "Australia") # key missing
print("City:", city)
print("Country (default):", country)
# Array index access in paths
first_score = _.get(user, "scores[0]", 0)
print("First score:", first_score)
# Deep write — creates intermediate keys if needed
_.set_(user, "profile.settings.theme", "dark")
print("Theme set:", _.get(user, "profile.settings.theme"))
Output:
City: Melbourne
Country (default): Australia
First score: 91
Theme set: dark
_.get() never raises a KeyError or TypeError — if any segment of the path is missing or None, it returns your default value. _.set_() mutates the original dict in place and creates intermediate dicts automatically, so you never need to pre-initialize nested structures.
Selecting and Filtering Keys: pick, omit, has
When you need a subset of a dictionary’s keys — for serialization, logging, or passing to an API — _.pick() and _.omit() do the job cleanly without dictionary comprehensions:
# dict_pick_omit.py
import pydash as _
record = {
"id": 42,
"name": "Alice",
"email": "alice@example.com",
"password_hash": "abc123",
"created_at": "2024-01-15",
"internal_notes": "VIP customer"
}
# Keep only safe fields for API response
public = _.pick(record, ["id", "name", "email", "created_at"])
print("Public record:", public)
# Remove sensitive fields before logging
loggable = _.omit(record, ["password_hash", "internal_notes"])
print("Loggable record:", loggable)
# Check if a nested key exists
print("Has email?", _.has(record, "email"))
print("Has address?", _.has(record, "address.city")) # nested path
Output:
Public record: {'id': 42, 'name': 'Alice', 'email': 'alice@example.com', 'created_at': '2024-01-15'}
Loggable record: {'id': 42, 'name': 'Alice', 'email': 'alice@example.com', 'created_at': '2024-01-15'}
Has email? True
Has address? False
Both _.pick() and _.omit() return new dicts — the original is untouched. _.has() accepts dot-notation paths just like _.get(), so you can check for deeply nested keys before trying to access them.
List Utilities: chunk, flatten, group_by, uniq_by, zip_
Splitting Lists with chunk
When sending items to an API in batches, paginating results, or splitting data for parallel processing, you need to divide a list into fixed-size groups. _.chunk() handles this in one call:
# list_chunk.py
import pydash as _
item_ids = [101, 102, 103, 104, 105, 106, 107, 108, 109, 110]
# Split into batches of 3 for API calls
batches = _.chunk(item_ids, 3)
print("Batches:", batches)
# Simulate batch API calls
for i, batch in enumerate(batches, 1):
print(f" Batch {i}: processing IDs {batch}")
Output:
Batches: [[101, 102, 103], [104, 105, 106], [107, 108, 109], [110]]
Batch 1: processing IDs [101, 102, 103]
Batch 2: processing IDs [104, 105, 106]
Batch 3: processing IDs [107, 108, 109]
Batch 4: processing IDs [110]
The last batch contains whatever items remain — Pydash never drops items to make batches uniform. No slice arithmetic, no off-by-one errors.
Flattening Nested Lists
API responses frequently return nested arrays — list of lists of items, or lists of dicts containing lists. Pydash provides three levels of flattening: _.flatten() for one level deep, _.flatten_deep() for all levels, and _.flatten_depth(n) for a specific depth:
# list_flatten.py
import pydash as _
# One level of nesting (common from paginated API results)
pages = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
all_items = _.flatten(pages)
print("Flatten one level:", all_items)
# Deeply nested structure
nested = [1, [2, [3, [4, [5]]]]]
print("Flatten deep:", _.flatten_deep(nested))
print("Flatten 2 levels:", _.flatten_depth(nested, 2))
Output:
Flatten one level: [1, 2, 3, 4, 5, 6, 7, 8, 9]
Flatten deep: [1, 2, 3, 4, 5]
Flatten 2 levels: [1, 2, 3, [4, [5]]]
Grouping Records with group_by
Grouping a list of dicts by a shared field is one of the most common data transformation tasks. With vanilla Python you build a defaultdict and a loop. With Pydash it’s a single call, and it accepts both a field name string and a lambda for computed groupings:
# list_group_by.py
import pydash as _
orders = [
{"id": 1, "status": "shipped", "amount": 120.00},
{"id": 2, "status": "pending", "amount": 45.50},
{"id": 3, "status": "shipped", "amount": 89.99},
{"id": 4, "status": "cancelled", "amount": 30.00},
{"id": 5, "status": "pending", "amount": 210.75},
]
# Group by field name string
by_status = _.group_by(orders, "status")
for status, items in by_status.items():
total = sum(o["amount"] for o in items)
print(f" {status}: {len(items)} orders, ${total:.2f} total")
# Group by computed value (lambda)
by_size = _.group_by(orders, lambda o: "large" if o["amount"] > 100 else "small")
print("\nLarge orders:", len(by_size.get("large", [])))
print("Small orders:", len(by_size.get("small", [])))
Output:
shipped: 2 orders, $209.99 total
pending: 2 orders, $256.25 total
cancelled: 1 orders, $30.00 total
Large orders: 2
Small orders: 3
Deduplication with uniq_by
When merging datasets or deduplicating records from multiple sources, _.uniq_by() keeps the first occurrence of each unique key value — no seen-set bookkeeping required:
# list_uniq.py
import pydash as _
# Raw records with duplicates (e.g., merged from two data sources)
contacts = [
{"id": 1, "name": "Alice", "source": "CRM"},
{"id": 2, "name": "Bob", "source": "CRM"},
{"id": 1, "name": "Alice", "source": "CSV"}, # duplicate ID 1
{"id": 3, "name": "Carol", "source": "CSV"},
]
unique_contacts = _.uniq_by(contacts, "id")
print("Unique contacts:", [c["name"] for c in unique_contacts])
print("Sources kept:", [c["source"] for c in unique_contacts])
Output:
Unique contacts: ['Alice', 'Bob', 'Carol']
Sources kept: ['CRM', 'CRM', 'CSV']
The first occurrence wins — so when merging CRM data over CSV data, put the preferred source first in the input list before calling _.uniq_by().
String Utilities: camel_case, snake_case, truncate, words
Pydash includes a full set of string case converters that are useful when normalizing data from APIs (which often use camelCase) into Python conventions (snake_case), or formatting output for display:
# string_utils.py
import pydash as _
# Case conversion — common when consuming REST APIs
api_key = "getUserProfileData"
print("Snake case:", _.snake_case(api_key)) # get_user_profile_data
print("Kebab case:", _.kebab_case(api_key)) # get-user-profile-data
print("Title case:", _.start_case(api_key)) # Get User Profile Data
# Reverse: snake_case to camelCase for sending back to API
field_name = "user_created_at"
print("Camel case:", _.camel_case(field_name)) # userCreatedAt
# String truncation for display
long_text = "Python is a versatile programming language used in web development, data science, and automation."
print("Truncated:", _.truncate(long_text, 60))
# Split into words (handles camelCase and snake_case)
print("Words:", _.words("getUserData")) # ['get', 'User', 'Data']
print("Words:", _.words("get_user_data")) # ['get', 'user', 'data']
# Pad strings for tabular output
print(_.pad("OK", 10)) # ' OK '
print(_.pad_end("Loading", 12, ".")) # 'Loading.....'
Output:
Snake case: get_user_profile_data
Kebab case: get-user-profile-data
Title case: Get User Profile Data
Camel case: userCreatedAt
Truncated: Python is a versatile programming language used in...
Words: ['get', 'User', 'Data']
Words: ['get', 'user', 'data']
OK
Loading.....
These functions are particularly valuable when writing API adapters that translate between external naming conventions and your internal Python code. Calling _.snake_case() on every key in an API response dict is faster to read and less error-prone than a regex substitution.
Functional Utilities: curry, partial, flow
Currying Functions
Currying transforms a multi-argument function into a chain of single-argument functions. This is useful for creating specialized functions from general ones without repeating arguments everywhere:
# functional_curry.py
import pydash as _
# A general function with multiple parameters
def multiply(a, b):
return a * b
# Curry it — now each call takes one argument at a time
curried_multiply = _.curry(multiply)
double = curried_multiply(2) # fix the first argument
triple = curried_multiply(3)
print("Double 5:", double(5)) # 10
print("Triple 5:", triple(5)) # 15
# Practical use: apply a tax rate to a list of prices
add_tax = _.curry(lambda rate, price: round(price * (1 + rate), 2))
add_gst = add_tax(0.10) # 10% GST
prices = [19.99, 49.99, 9.95, 149.00]
with_tax = list(map(add_gst, prices))
print("Prices with GST:", with_tax)
Output:
Double 5: 10
Triple 5: 15
Prices with GST: [21.99, 54.99, 10.94, 163.9]
Building Pipelines with flow
_.flow() composes a series of single-argument functions into a pipeline — the output of each function becomes the input of the next. This is the cleanest way to express multi-step data transformations without deeply nested function calls:
# functional_flow.py
import pydash as _
# Define individual transformation steps
def normalize_text(text):
return text.strip().lower()
def remove_punctuation(text):
return ''.join(c for c in text if c.isalnum() or c.isspace())
def split_words(text):
return text.split()
def count_words(words):
return len(words)
# Compose into a pipeline
word_count_pipeline = _.flow(
normalize_text,
remove_punctuation,
split_words,
count_words
)
samples = [
" Hello, World! This is Python. ",
"Pydash makes functional programming easy!",
" One. ",
]
for sample in samples:
count = word_count_pipeline(sample)
print(f"'{sample.strip()[:40]}' -> {count} words")
Output:
'Hello, World! This is Python.' -> 5 words
'Pydash makes functional programming easy!' -> 5 words
'One.' -> 1 words
_.flow() makes the transformation sequence explicit and readable. Adding a new step is one line — insert the function anywhere in the chain. Compare this to the equivalent nested call: count_words(split_words(remove_punctuation(normalize_text(text)))), which you have to read right-to-left to understand the execution order.
Method Chaining with _.chain()
Pydash’s chaining API lets you apply multiple operations to a collection in sequence without intermediate variables or nested calls. The chain is lazy — nothing executes until you call .value():
# chaining.py
import pydash as _
employees = [
{"name": "Alice", "dept": "Engineering", "salary": 95000, "years": 4},
{"name": "Bob", "dept": "Marketing", "salary": 72000, "years": 2},
{"name": "Carol", "dept": "Engineering", "salary": 110000, "years": 7},
{"name": "Dan", "dept": "Marketing", "salary": 68000, "years": 1},
{"name": "Eve", "dept": "Engineering", "salary": 88000, "years": 3},
{"name": "Frank", "dept": "HR", "salary": 65000, "years": 5},
]
# Chain: filter engineers -> sort by salary desc -> take top 2 -> extract names
top_engineers = (
_.chain(employees)
.filter_(lambda e: e["dept"] == "Engineering")
.sort_by("salary", reverse=True)
.take(2)
.map_("name")
.value()
)
print("Top 2 engineers by salary:", top_engineers)
# Chain: group by dept -> map to dept summary stats
dept_summary = (
_.chain(employees)
.group_by("dept")
.map_values(lambda members: {
"count": len(members),
"avg_salary": round(sum(m["salary"] for m in members) / len(members))
})
.value()
)
for dept, stats in dept_summary.items():
print(f" {dept}: {stats['count']} people, avg ${stats['avg_salary']:,}")
Output:
Top 2 engineers by salary: ['Carol', 'Alice']
Engineering: 3 people, avg $97,667
Marketing: 2 people, avg $70,000
HR: 1 people, avg $65,000
Each method in the chain wraps the previous result. The chain object accumulates operations without executing them — execution happens only when .value() is called. This means you can build reusable chain templates and conditionally add operations before calling .value().
Real-Life Example: Employee Report Pipeline
This project combines everything from the article into a self-contained pipeline that ingests raw employee records, cleans and validates them, computes department statistics, and produces a formatted summary report — all using Pydash functions.
# employee_report.py
import pydash as _
RAW_DATA = [
{"id": 1, "full_name": " alice chen ", "department": "engineering", "annual_salary": 95000, "tenure_years": 4, "active": True},
{"id": 2, "full_name": "BOB SMITH", "department": "marketing", "annual_salary": 72000, "tenure_years": 2, "active": True},
{"id": 3, "full_name": "Carol Ng", "department": "engineering", "annual_salary": 110000,"tenure_years": 7, "active": True},
{"id": 4, "full_name": "dan jones", "department": "marketing", "annual_salary": 68000, "tenure_years": 1, "active": False},
{"id": 5, "full_name": "Eve Rodrigo", "department": "engineering", "annual_salary": 88000, "tenure_years": 3, "active": True},
{"id": 6, "full_name": " FRANK LEE ", "department": "hr", "annual_salary": 65000, "tenure_years": 5, "active": True},
{"id": 7, "full_name": "Grace Kim", "department": "engineering", "annual_salary": 102000,"tenure_years": 6, "active": True},
{"id": 8, "full_name": "henry park", "department": "hr", "annual_salary": 61000, "tenure_years": 2, "active": False},
]
# Step 1: Clean and normalize records
def normalize_record(rec):
return _.assign({}, rec, {
"full_name": _.start_case(rec["full_name"].strip().lower()),
"department": _.start_case(rec["department"]),
})
# Step 2: Build the report pipeline
report = (
_.chain(RAW_DATA)
# Normalize names and departments
.map_(normalize_record)
# Active employees only
.filter_(lambda e: e["active"])
# Group by department
.group_by("department")
# Compute stats per department
.map_values(lambda members: {
"headcount": len(members),
"avg_salary": round(sum(m["annual_salary"] for m in members) / len(members)),
"max_salary": max(m["annual_salary"] for m in members),
"avg_tenure": round(sum(m["tenure_years"] for m in members) / len(members), 1),
"top_earner": _.max_by(members, "annual_salary")["full_name"],
})
.value()
)
# Step 3: Print the report
print("=" * 54)
print("EMPLOYEE REPORT — ACTIVE STAFF BY DEPARTMENT")
print("=" * 54)
for dept, stats in sorted(report.items()):
print(f"\n {dept}")
print(f" Headcount : {stats['headcount']}")
print(f" Avg Salary : ${stats['avg_salary']:,}")
print(f" Max Salary : ${stats['max_salary']:,}")
print(f" Avg Tenure : {stats['avg_tenure']} years")
print(f" Top Earner : {stats['top_earner']}")
# Step 4: Global stats
all_active = _.filter_(_.map_(RAW_DATA, normalize_record), lambda e: e["active"])
print(f"\n{'=' * 54}")
print(f" Total active employees : {len(all_active)}")
print(f" Company avg salary : ${round(_.mean(_.map_(all_active, 'annual_salary'))):,}")
print(f" Highest paid overall : {_.max_by(all_active, 'annual_salary')['full_name']}")
print("=" * 54)
Output:
======================================================
EMPLOYEE REPORT — ACTIVE STAFF BY DEPARTMENT
======================================================
Engineering
Headcount : 4
Avg Salary : $98,750
Max Salary : $110,000
Avg Tenure : 5.0 years
Top Earner : Carol Ng
Hr
Headcount : 1
Avg Salary : $65,000
Max Salary : $65,000
Avg Tenure : 5.0 years
Top Earner : Frank Lee
Marketing
Headcount : 1
Avg Salary : $72,000
Max Salary : $72,000
Avg Tenure : 2.0 years
Top Earner : Bob Smith
======================================================
Total active employees : 6
Company avg salary : $88,667
Highest paid overall : Carol Ng
======================================================
The pipeline reads as a clear sequence of intentions: normalize, filter, group, aggregate. To add a new transformation — say, flagging departments with average tenure under 2 years — you add one .map_values() step to the chain. No refactoring, no new loop variables, no off-by-one concerns.
Frequently Asked Questions
How does Pydash compare to toolz or cytoolz?
Toolz and cytoolz (the Cython-accelerated version) focus on purely functional composition — they’re excellent for pipeline-heavy code with large datasets and prioritize performance. Pydash covers a broader surface area including string utilities, nested dict access, and a chaining API, and is more beginner-friendly because its function names mirror Lodash. For data pipelines that process millions of records, toolz may be faster; for everyday JSON and dict manipulation, Pydash’s ergonomics usually win.
What path string formats does _.get() support?
Pydash’s _.get() accepts dot-notation for nested dicts ("user.address.city"), bracket notation for list indices ("scores[0]"), and combinations of both ("users[2].profile.name"). If a key itself contains a dot (rare but possible), you can pass a list of key segments instead: _.get(data, ["key.with.dot", "nested"]). This covers virtually all real-world JSON structures.
Does Pydash mutate the original data?
Most Pydash functions return new objects and do not mutate their inputs — _.filter_(), _.map_(), _.pick(), _.omit(), and so on. The exceptions are functions that explicitly write: _.set_() and _.assign() mutate their first argument by design. If you need immutable behavior from these, pass a copy: _.set_(dict(original), "key", value). The trailing underscore on function names does not indicate mutation — it’s just used to avoid shadowing Python builtins like filter and map.
Is Pydash performant enough for production?
Pydash is pure Python, so it will be slower than numpy, pandas, or C-extension libraries for large-scale data processing. For typical web application work — processing API responses, transforming configuration data, building report summaries — performance is more than adequate. The library’s functions are implemented straightforwardly and don’t introduce significant overhead over vanilla Python. If you’re processing millions of records in a tight loop, profile first; for everything else, developer productivity gains from clean Pydash code usually outweigh the marginal speed difference.
Is the _.chain() API truly lazy?
Yes — _.chain() creates a wrapper that accumulates operations without executing them. No iteration happens until you call .value(). This means you can build a chain object, conditionally add steps based on runtime conditions, and then execute it — only one pass through the data occurs at the end. It also means a bug in a late chain step won’t be revealed until .value() is called, so testing individual chain steps in isolation during development is good practice.
Conclusion
Pydash fills a genuine gap in Python’s utility landscape. We covered the most useful parts of the library: safe nested dict access with _.get() and _.set_(), targeted key selection with _.pick() and _.omit(), list operations including _.chunk(), _.flatten(), _.group_by(), and _.uniq_by(), string case converters, functional tools like _.curry() and _.flow(), and the full method chaining API with _.chain().
The best way to extend the real-life example is to connect it to a real data source — a CSV export, a REST API response, or a database query result. Try replacing the RAW_DATA list with records from a requests.get() call to jsonplaceholder.typicode.com/users and applying the same pipeline to normalize and summarize that data. The chain won’t change; only the source does.
For the full API reference, including the 150+ functions not covered here, see the official Pydash documentation.
Frequently Asked Questions
What is ConfigParser used for in Python?
ConfigParser is a built-in Python module for reading and writing configuration files in INI format. It handles settings organized into sections with key-value pairs, making it easy to store and retrieve application configuration without hardcoding values.
What format does ConfigParser use?
ConfigParser uses the INI file format with sections in square brackets ([section]), followed by key-value pairs using = or : as delimiters. Comments start with # or ;. An optional [DEFAULT] section provides fallback values that all other sections inherit.
How do I read a config file with ConfigParser?
Create a ConfigParser() instance, call config.read('filename.ini'), then access values with config['section']['key'] or config.get('section', 'key'). Use getint(), getfloat(), or getboolean() for type conversion.
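A minimal, self-contained sketch of those steps (the file name and settings are made up; the script writes the INI file first so it can be run as-is):

```python
import configparser

# Write a small INI file so the example is self-contained.
ini_text = """
[DEFAULT]
timeout = 30

[database]
host = localhost
port = 5432
debug = true
"""
with open("settings.ini", "w") as f:
    f.write(ini_text)

config = configparser.ConfigParser()
config.read("settings.ini")

print(config["database"]["host"])               # localhost (raw string)
print(config.getint("database", "port"))        # 5432 as int
print(config.getboolean("database", "debug"))   # True as bool
print(config.getint("database", "timeout"))     # 30, inherited from [DEFAULT]
```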
Can ConfigParser handle nested sections?
No, ConfigParser does not support nested sections natively. For nested configuration structures, consider using TOML (tomllib in Python 3.11+), YAML (PyYAML), or JSON configuration files instead.
What is the difference between ConfigParser and JSON for configuration?
ConfigParser uses human-friendly INI format with sections and is ideal for simple settings. JSON supports nested structures and lists but lacks comments. ConfigParser has built-in type conversion methods and a DEFAULT section for fallback values, while JSON requires manual type handling.