How To Generate Random Numbers In Python

Last Updated: June 01, 2026

Table of Contents

Generating the same random number each time and why this matters
Python Random Number Between 1 and 10
Python Generate Random Numbers From A Range
Generate Random String Of Length n in Python
Random Choice Without Replacement In Python
Generate Date Between Two Dates in Python
Generate Random Temporary Filename in Python
Conclusion
Subscribe
Related Articles
Frequently Asked Questions

Generating random numbers in Python is straightforward and usually takes only a few lines. There are many variations you may need, such as a fixed number of decimal places or a random number between a start and end value. We’ll go through many useful examples in this article.

The most basic way to generate random numbers in Python is with the random library:

import random

num = random.random()

print( f"Random number between 0.0 and 1.0 ={num}\n")

Output as follows:

You’ll see that each time it is run it has a new random number.

Written by Pubs

Python developer and educator with 15+ years building production systems across data engineering, web APIs, and AI tooling. Founder of Python How To Program — 270+ in-depth tutorials covering the modern Python stack.


Generating the same random number each time and why this matters

Sometimes you want random numbers that you can reproduce exactly on every run. This may sound counterintuitive, since the whole point of random numbers is that they are random. A common scenario is testing: if you find some unusual behaviour, you’ll need the same input to replicate it. You can do this in Python with the seed() function from the random library.

You can think of the seed as a key that determines the whole series of random numbers that follows. Use the same seed and you’ll get the same series; use a different seed and you’ll get a different one.

See the following example code, which sets a seed and then prints four random numbers between 0.0 and 1.0:

import random

random.seed(1)

for i in range(1,5):
    num = random.random()
    print( f"Random number between 0.0 and 1.0 ={num}\n")

Output as follows:

No matter how many times it is run, since the seed is the same each time, it generates the same numbers.
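If you’d rather not reseed the shared module-level generator, a separate random.Random instance can carry its own seed. The following is a minimal sketch of that idea; the helper name make_sequence is made up for illustration and is not part of the random library.

```python
import random

def make_sequence(seed, count=4):
    """Return `count` floats from a private generator seeded with `seed`."""
    rng = random.Random(seed)  # independent of the module-level random functions
    return [rng.random() for _ in range(count)]

first = make_sequence(1)
second = make_sequence(1)
other = make_sequence(2)

print(f"Same seed gives the same series: {first == second}")
print(f"Different seed gives the same series: {first == other}")
```

Because each call builds its own generator, code elsewhere that calls random.random() does not disturb the sequence, which can make tests easier to keep repeatable.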

Python Random Number Between 1 and 10

Now that we know how to generate random numbers, how do you generate one between two given numbers? This is easily done with either randint() for whole numbers or uniform() for decimal numbers. Note that randint() includes both end points, so randint(1,10) can return 1 or 10.

import random

num_int = random.randint(1,10)
print( f"Random whole number between 1 and 10 ={num_int}\n")

num_uni = random.uniform(1,10)
print( f"Random decimal number between 1 and 10 ={num_uni}\n")
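Two common variations are limiting the number of decimal places and picking from evenly spaced values. The sketch below uses the built-in round() and random.randrange(); the specific ranges (two decimal places, multiples of 5 up to 100) are arbitrary choices for illustration.

```python
import random

# Whole number from 1 to 10 inclusive (same as the randint call above)
whole = random.randint(1, 10)

# Decimal number between 1 and 10, rounded to 2 decimal places
two_places = round(random.uniform(1, 10), 2)

# Multiple of 5 from 0 to 100: randrange excludes the stop value, hence 101
stepped = random.randrange(0, 101, 5)

print(f"Whole number = {whole}")
print(f"Two decimal places = {two_places}")
print(f"Multiple of 5 = {stepped}")
```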

Python Generate Random Numbers From A Range

Suppose you need to pick a random item from a collection of data, whether that be numbers, names or even a pack of cards. One way is to choose an index randomly and use it to select the element from a list. For example, if you had a list of 5 items, you can randomly choose an index from 0 to 4 (where 0 is the index of the first item).

There is a shorter way in Python, which is to use the random.choice() function. If you pass it a list, it will randomly return one of the elements.

Here’s an example that randomly selects a name from a list, first using the index (to show you how it works), and then using the much more concise random.choice() library function:

import random

###### Selecting a name from a list
names_list = [ "Judy", "Harry", "Sarah", "Tom", "Gloria"]

rand_index = random.randint( 0, len(names_list)-1 )

print( f"Randomly selected person 1 is = { names_list[ rand_index] }\n")
print( f"Randomly selected person 2 is = { random.choice( names_list) }\n")

And the output is different each time:
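If you need several picks at once, or want some items to come up more often than others, random.choices() may be a better fit than calling random.choice() in a loop. This is a small sketch under assumed requirements; the weights are invented purely for illustration.

```python
import random

names_list = ["Judy", "Harry", "Sarah", "Tom", "Gloria"]

# Three independent picks; the same name can appear more than once
picks = random.choices(names_list, k=3)

# Weighted picks: the weights below are made up for illustration,
# making "Judy" five times as likely as each of the other names
weights = [5, 1, 1, 1, 1]
weighted_picks = random.choices(names_list, weights=weights, k=3)

print(f"Three picks = {picks}")
print(f"Three weighted picks = {weighted_picks}")
```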

Generate Random String Of Length n in Python

If you want to generate a string of a specific length (e.g. a password), the random and string libraries can be combined into a simple password generator as follows:

import random, string

###### Create a random password
def generate_password( pass_len=10):
    password = "" 

    for i in range(1,pass_len+1):
        password = password + random.choice( string.ascii_letters + string.punctuation )
    return password

print( f"Password generated = [{ generate_password(10) }] ")

This will output a new password each time between square brackets:

If there are specific characters you want to include or exclude, you can simply replace string.punctuation with your own string of characters to be included.
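One caveat: the Python documentation warns that the random module should not be used for security purposes and points to the secrets module instead. Below is a hedged variation on the generator above that swaps random.choice() for secrets.choice(); the function name generate_secure_password and its alphabet parameter are illustrative names, not part of either library.

```python
import secrets
import string

def generate_secure_password(pass_len=10, alphabet=string.ascii_letters + string.punctuation):
    """Build a password by drawing each character with secrets.choice()."""
    return "".join(secrets.choice(alphabet) for _ in range(pass_len))

print(f"Password generated = [{generate_secure_password(10)}]")

# Restrict the character set by passing your own alphabet
letters_and_digits = string.ascii_letters + string.digits
print(f"Letters and digits only = [{generate_secure_password(12, letters_and_digits)}]")
```

Joining the characters in one go also avoids rebuilding the string on every loop iteration.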

Random Choice Without Replacement In Python

Suppose you wanted to randomly select items from a list without repeating any items. For example, you have a list of students and you have to select them in a random order to go first in a specific activity. In many programming languages you may need to generate a random list and remember the previously selected items to prevent any repeated selections. In the random library, there is a function called random.sample() that will do all that for you:

import random

#### Select unique random elements
students = ["John", "Tom", "Paul", "Sarah", "July", "Rachel"]

random_order = random.sample( students, 6)
print(random_order)

This will generate a unique list without repeating any selections:
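random.sample() does not have to return the whole list, and random.shuffle() is an alternative when you want a full random ordering. A short sketch of both, reusing the student list from above:

```python
import random

students = ["John", "Tom", "Paul", "Sarah", "July", "Rachel"]

# Pick 3 distinct students; the original list is left untouched
first_three = random.sample(students, 3)

# Shuffle a copy in place to get a full random ordering
running_order = students.copy()
random.shuffle(running_order)

print(f"First three = {first_three}")
print(f"Running order = {running_order}")
```

Note that random.shuffle() changes the list it is given and returns None, which is why the example shuffles a copy.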


Generate Date Between Two Dates in Python

To generate a random date between two dates, first work out the number of days between them. Then pick a random number of days in that range with random.randint() and add it to the start date:

import random, datetime  

#### Select a random date between two dates: 
d1 = datetime.date( 2013,  2, 26 )
d2 = datetime.date( 2015, 12, 15 )
diff = d2 - d1 
new_date_days = random.randint( 0, diff.days )

print( f"Random date is {   d1 + datetime.timedelta( days=new_date_days ) }")

The output would be as follows:
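The same steps can be wrapped in a reusable function. This is a minimal sketch; the name random_date and the decision to raise ValueError when the dates are in the wrong order are assumptions made for the example.

```python
import datetime
import random

def random_date(start, end):
    """Return a random date between start and end, both inclusive."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    span_days = (end - start).days
    return start + datetime.timedelta(days=random.randint(0, span_days))

d1 = datetime.date(2013, 2, 26)
d2 = datetime.date(2015, 12, 15)
picked = random_date(d1, d2)
print(f"Random date is {picked}")
```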

Generate Random Temporary Filename in Python

A common need is to generate a random filename, often for temporary storage such as a log file or a cache file. This can be done with string generation similar to the password example above. The first character is chosen from letters only, and the remaining characters are chosen from letters and digits.

import random, string

def generate_random_filename( filename_len=10):
    filename = "" 
    filename = filename + random.choice( string.ascii_lowercase  )

    for i in range(2, filename_len+1):
        filename = filename + random.choice( string.ascii_lowercase + string.digits )
    return filename

print( f"Random filename = [{ generate_random_filename( 10) }.txt]")

Output as follows:

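There is in fact a module in the Python standard library, tempfile, that makes this even simpler. Note that NamedTemporaryFile() does more than generate a name: it creates the file, and by default deletes it again when the file object is closed: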

import tempfile

filename = tempfile.NamedTemporaryFile( prefix="temp_" , suffix =".txt" )

print( f" Temporary filename is [{ filename.name }] ")

Output of the temporary filename generator is:
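Because the file is removed when it is closed, it is usually convenient to use NamedTemporaryFile() in a with block. The sketch below relies on the default delete=True behaviour and checks for the file before and after the block; the variable names are illustrative only.

```python
import os
import tempfile

# The file exists only inside the with block and is deleted when it closes
with tempfile.NamedTemporaryFile(mode="w", prefix="temp_", suffix=".txt") as tmp:
    tmp.write("scratch data")
    tmp.flush()
    temp_path = tmp.name
    existed_while_open = os.path.exists(temp_path)

print(f"Temporary filename was [{temp_path}]")
print(f"Existed while open: {existed_while_open}")
print(f"Exists after close: {os.path.exists(temp_path)}")
```

If the file needs to outlive the block, passing delete=False is one option, in which case removing it later becomes your responsibility.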

Conclusion

The random library has many uses, from generating numbers to building strings of a given length for password generation. Some of these use cases have specialised libraries, which you can find on pypi.org, because there can be nuances (e.g. for passwords, you may not want a repeated sequence, which can happen through random luck). However, many can be handled with a few simple lines of code as demonstrated above. Send comments below or email me to ask further questions.


How To Use Narwhals for DataFrame-Agnostic Python Code

by Pubs | Jul 26, 2026 | Data Processing

Intermediate

You write a data processing function in pandas. It works great. Then a teammate switches the project to polars for performance reasons, and suddenly half your pipeline is broken. Or you maintain an open-source library that accepts a DataFrame as input — except now you need to support pandas, polars, and maybe cuDF for GPU users, which means three different code paths for what is essentially the same logic. This is the DataFrame fragmentation problem, and it gets more painful the more libraries you support.

Narwhals is a lightweight compatibility layer that lets you write DataFrame code once and run it on pandas, polars, modin, cuDF, and any other compliant backend. Instead of writing df.rename(columns={"old": "new"}) for pandas and df.rename({"old": "new"}) for polars, you write the Narwhals version once and it dispatches to the correct backend automatically. The library has zero mandatory dependencies — if the user passes in a pandas DataFrame, Narwhals uses pandas; if they pass in a polars DataFrame, it uses polars. Your code never needs to know which one it received.

This article covers everything you need to start writing DataFrame-agnostic Python code with Narwhals. You will learn how to install it, wrap inputs with narwhals.from_native(), use the Narwhals expression API for filtering, grouping, and aggregation, write backend-agnostic library functions, and handle the conversion back to native DataFrames. By the end you will have a working data pipeline that runs identically on pandas and polars without a single if isinstance check.

Narwhals DataFrame-Agnostic Code: Quick Example

Here is the shortest possible demonstration of the core idea. The function below accepts any supported DataFrame, filters rows, and returns a result — without knowing or caring whether the caller passed in pandas or polars.

# quick_narwhals.py
import narwhals as nw
import pandas as pd
import polars as pl

def get_high_earners(df_native, threshold=70000):
    df = nw.from_native(df_native)
    result = df.filter(nw.col("salary") > threshold)
    return nw.to_native(result)

# Works with pandas
pandas_df = pd.DataFrame({"name": ["Alice", "Bob", "Carol"], "salary": [90000, 55000, 80000]})
print(get_high_earners(pandas_df))

# Works with polars -- same function, no changes
polars_df = pl.DataFrame({"name": ["Alice", "Bob", "Carol"], "salary": [90000, 55000, 80000]})
print(get_high_earners(polars_df))

Output (pandas call):

    name  salary
0  Alice   90000
2  Carol   80000

Output (polars call):

shape: (2, 2)
+-------+--------+
| name  | salary |
| str   | i64    |
+=======+========+
| Alice | 90000  |
| Carol | 80000  |
+-------+--------+

The pattern is always the same three steps: wrap the native DataFrame with nw.from_native(), apply your transformations using Narwhals expressions, then call nw.to_native() to hand back a DataFrame in whichever format the caller originally provided. The function has no idea what backend it is working with — and it does not need to.

The sections below cover the full expression API, how to use Narwhals inside library functions, groupby and aggregation, schema inspection, and a real-world pipeline that processes sales data from either backend. Read on for the complete picture.

What Is Narwhals and Why Use It?

Narwhals is a thin compatibility layer for the Python DataFrame ecosystem. Think of it as a universal remote control — different devices (pandas, polars, modin), one set of buttons. Under the hood it translates each Narwhals expression into the equivalent native call on whichever backend is in use. When you write nw.col("price").mean(), Narwhals emits df["price"].mean() for pandas and pl.col("price").mean() for polars. The translation is handled for you.

The primary use case is writing libraries and utilities that accept a DataFrame from a caller you do not control. If you write a data validation function, a feature-engineering helper, or a report generator, you probably do not want to force all your users onto a single DataFrame library. Narwhals lets you accept whatever they have and return the same type back.

Feature	Narwhals	pandas only	polars only
Works with pandas	Yes	Yes	No
Works with polars	Yes	No	Yes
Works with modin/cuDF	Yes	Partial	No
Unified expression API	Yes	No	No
Zero mandatory deps	Yes	No	No
Returns caller’s type	Yes	N/A	N/A

Narwhals is not a replacement for polars or pandas — it is a wrapper you use at the boundaries of your code where the input DataFrame type is unknown. Your internal data-science scripts where you know the type should still use pandas or polars directly. Narwhals earns its place in shared utilities, open-source libraries, and pipelines that need to support multiple backends without duplication.


Installing Narwhals

Narwhals is on PyPI. It has no mandatory runtime dependencies — the only packages it imports are the ones your caller already has installed.

# Install narwhals
pip install narwhals

# Install your backends of choice (narwhals works with whichever you have)
pip install pandas polars

Output:

Successfully installed narwhals-1.x.x

Verify the install and check which backends are detectable:

# check_narwhals.py
import narwhals as nw
print("Narwhals version:", nw.__version__)

# Check available backends (find_spec lives in the importlib.util submodule,
# which must be imported explicitly)
import importlib.util
for backend in ["pandas", "polars", "modin.pandas", "cudf"]:
    available = importlib.util.find_spec(backend.split(".")[0]) is not None
    print(f"  {backend}: {'available' if available else 'not installed'}")

Output:

Narwhals version: 1.x.x
  pandas: available
  polars: available
  modin.pandas: not installed
  cudf: not installed

You need at least one DataFrame backend installed. Narwhals itself is lightweight to import, so it adds very little to the import time of packages that depend on it.

Wrapping and Unwrapping DataFrames

Every Narwhals operation starts with nw.from_native() and usually ends with nw.to_native(). Understanding these two functions is the foundation of the entire library.

Converting In: nw.from_native()

nw.from_native() wraps any supported native DataFrame or Series in a Narwhals proxy object. The proxy exposes a consistent API regardless of what is underneath. You can pass eager_only=True to restrict the function to eager DataFrames (pandas, polars eager) and get better type hints.

# wrapping.py
import narwhals as nw
import pandas as pd
import polars as pl

# Wrap a pandas DataFrame
pdf = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
ndf_pd = nw.from_native(pdf, eager_only=True)
print(type(ndf_pd))        # narwhals DataFrame
print(ndf_pd.schema)       # {'x': Int64, 'y': Int64}

# Wrap a polars DataFrame -- same API
plf = pl.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
ndf_pl = nw.from_native(plf, eager_only=True)
print(type(ndf_pl))        # narwhals DataFrame
print(ndf_pl.schema)       # {'x': Int64, 'y': Int64}

Output:

<class 'narwhals.dataframe.DataFrame'>
{'x': Int64, 'y': Int64}
<class 'narwhals.dataframe.DataFrame'>
{'x': Int64, 'y': Int64}

Both DataFrames produce the same Narwhals type with the same schema representation. From this point on, any operation you perform on ndf_pd and ndf_pl uses the same Narwhals method calls — no if isinstance branching needed.

Converting Out: nw.to_native()

nw.to_native() unwraps the Narwhals proxy and returns the underlying native DataFrame. The returned type matches whatever was passed in originally — if you wrapped a pandas DataFrame, you get a pandas DataFrame back; if you wrapped polars, you get polars back. This is how you preserve the caller’s type through a transformation pipeline.

# unwrapping.py
import narwhals as nw
import pandas as pd
import polars as pl

def double_x(df_native):
    df = nw.from_native(df_native, eager_only=True)
    result = df.with_columns((nw.col("x") * 2).alias("x_doubled"))
    return nw.to_native(result)

pdf = pd.DataFrame({"x": [1, 2, 3]})
plf = pl.DataFrame({"x": [1, 2, 3]})

pd_result = double_x(pdf)
pl_result = double_x(plf)

print(type(pd_result))   # pandas DataFrame
print(type(pl_result))   # polars DataFrame
print(pd_result)
print(pl_result)

Output:

<class 'pandas.core.frame.DataFrame'>
<class 'polars.dataframe.frame.DataFrame'>
   x  x_doubled
0  1          2
1  2          4
2  3          6
shape: (3, 2)
+---------+-----------+
| x       | x_doubled |
| i64     | i64       |
+=========+===========+
| 1       | 2         |
| 2       | 4         |
| 3       | 6         |
+---------+-----------+

The function returns whatever type it received. Callers do not need to convert their data before calling your function, and they do not need to convert the result afterward — it comes back in the form they expected.


Using the Narwhals Expression API

The Narwhals expression system works like polars expressions: you build a lazy description of a computation using nw.col(), and the DataFrame executes it when you call a method like filter(), select(), or with_columns(). This is more composable than pandas’ method chains and translates cleanly to both backends.

Filtering and Selecting Columns

Use df.filter() to keep rows matching a condition, and df.select() to choose and rename columns. Both accept Narwhals expressions built with nw.col().

# filter_select.py
import narwhals as nw
import pandas as pd

data = {
    "product": ["Laptop", "Monitor", "Keyboard", "Mouse", "Webcam"],
    "price": [1200, 450, 80, 35, 120],
    "units_sold": [15, 30, 200, 350, 80],
}
df = nw.from_native(pd.DataFrame(data), eager_only=True)

# Filter: only products that sold more than 50 units
popular = df.filter(nw.col("units_sold") > 50)
print("Popular products:")
print(nw.to_native(popular))

# Select: build a revenue column and return two columns
revenue = df.select(
    nw.col("product"),
    (nw.col("price") * nw.col("units_sold")).alias("revenue"),
)
print("\nRevenue by product:")
print(nw.to_native(revenue))

Output:

Popular products:
    product  price  units_sold
2  Keyboard     80         200
3     Mouse     35         350
4    Webcam    120          80

Revenue by product:
    product  revenue
0    Laptop    18000
1   Monitor    13500
2  Keyboard    16000
3     Mouse    12250
4    Webcam     9600

Notice that the filter and select expressions read identically regardless of backend. The same code works on a polars DataFrame with no changes. Narwhals translates nw.col("units_sold") > 50 to the correct native expression at call time.

Adding and Transforming Columns

df.with_columns() adds new columns or replaces existing ones without dropping the rest of the DataFrame. It accepts one or more Narwhals expressions and is the equivalent of pandas’ df.assign() or polars’ df.with_columns().

# with_columns.py
import narwhals as nw
import polars as pl

df = nw.from_native(
    pl.DataFrame({
        "name": ["Alice", "Bob", "Carol", "Dave"],
        "score": [82, 67, 91, 74],
        "attempts": [3, 5, 2, 4],
    }),
    eager_only=True,
)

enriched = df.with_columns(
    (nw.col("score") / nw.col("attempts")).round(1).alias("score_per_attempt"),
    (nw.col("score") >= 80).alias("passed"),
)
print(nw.to_native(enriched))

Output:

shape: (4, 5)
+-------+-------+----------+-------------------+--------+
| name  | score | attempts | score_per_attempt | passed |
| str   | i64   | i64      | f64               | bool   |
+=======+=======+==========+===================+========+
| Alice |    82 |        3 |              27.3 | true   |
| Bob   |    67 |        5 |              13.4 | false  |
| Carol |    91 |        2 |              45.5 | true   |
| Dave  |    74 |        4 |              18.5 | false  |
+-------+-------+----------+-------------------+--------+

The .round(1) call chains directly onto the expression — Narwhals expression chaining works the same way in both backends. You can chain as many transformations as you need before passing the expression to with_columns().

GroupBy and Aggregation

GroupBy operations are where pandas and polars diverge most sharply in their native APIs. Narwhals unifies them with a group_by().agg() pattern that works identically on both backends and produces the same result shape.

# groupby_agg.py
import narwhals as nw
import pandas as pd

sales_data = {
    "region": ["North", "South", "North", "East", "South", "East", "North"],
    "category": ["Electronics", "Clothing", "Electronics", "Clothing", "Electronics", "Electronics", "Clothing"],
    "revenue": [4200, 1800, 3100, 2400, 2900, 3600, 1500],
    "units": [14, 60, 10, 80, 29, 36, 50],
}

def regional_summary(df_native):
    df = nw.from_native(df_native, eager_only=True)
    summary = (
        df.group_by("region")
        .agg(
            nw.col("revenue").sum().alias("total_revenue"),
            nw.col("units").sum().alias("total_units"),
            nw.col("revenue").mean().round(0).alias("avg_deal_size"),
        )
        .sort("total_revenue", descending=True)
    )
    return nw.to_native(summary)

# Test with pandas
result = regional_summary(pd.DataFrame(sales_data))
print(result)

Output:

  region  total_revenue  total_units  avg_deal_size
0  North           8800           74         2933.0
1   East           6000          116         3000.0
2  South           4700           89         2350.0

The same function called with a polars DataFrame produces identical data (though polars formats the output differently). The key insight is that group_by().agg() is a Narwhals pattern — your code never touches df.groupby() (pandas) or df.group_by() (polars) directly, so there is no divergence to manage.


Schema Inspection and Type Handling

Narwhals exposes a unified schema that normalizes type names across backends. This is useful when you need to validate that a DataFrame has the expected columns and types before processing it.

# schema_check.py
import narwhals as nw
import pandas as pd
import polars as pl

def validate_and_describe(df_native):
    df = nw.from_native(df_native, eager_only=True)
    schema = df.schema
    print("Schema:", schema)
    print("Columns:", df.columns)
    print("Shape:", df.shape)

    # Check for required columns
    required = {"name", "age", "salary"}
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    # Narwhals dtype comparison works across backends
    for col, dtype in schema.items():
        print(f"  {col}: {dtype} (numeric={dtype.is_numeric()})")

pandas_df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [30, 25], "salary": [75000, 62000]})
polars_df = pl.DataFrame({"name": ["Alice", "Bob"], "age": [30, 25], "salary": [75000, 62000]})

print("--- Pandas ---")
validate_and_describe(pandas_df)
print("\n--- Polars ---")
validate_and_describe(polars_df)

Output:

--- Pandas ---
Schema: {'name': String, 'age': Int64, 'salary': Int64}
Columns: ['name', 'age', 'salary']
Shape: (2, 3)
  name: String (numeric=False)
  age: Int64 (numeric=True)
  salary: Int64 (numeric=True)

--- Polars ---
Schema: {'name': String, 'age': Int64, 'salary': Int64}
Columns: ['name', 'age', 'salary']
Shape: (2, 3)
  name: String (numeric=False)
  age: Int64 (numeric=True)
  salary: Int64 (numeric=True)

The schemas are identical even though the underlying backends store and display types differently. dtype.is_numeric(), dtype.is_temporal(), and similar methods work consistently, so you can write type-based validation logic that runs on any backend without a lookup table of backend-specific type names.

Writing Backend-Agnostic Library Functions

The most powerful use of Narwhals is writing utility functions that you can publish in a library or share across a team, where callers may use different DataFrame libraries. The decorator @nw.narwhalify handles the wrap/unwrap automatically — no need to call from_native and to_native manually.

# library_utils.py
import narwhals as nw

@nw.narwhalify
def normalize_scores(df, score_col="score"):
    """Normalize a score column to 0-1 range. Works on any Narwhals-supported DataFrame."""
    col_min = df[score_col].min()
    col_max = df[score_col].max()
    return df.with_columns(
        ((nw.col(score_col) - col_min) / (col_max - col_min)).alias(f"{score_col}_normalized")
    )

@nw.narwhalify
def drop_nulls_and_report(df):
    """Drop null rows and print how many were removed."""
    original_len = len(df)
    cleaned = df.drop_nulls()
    dropped = original_len - len(cleaned)
    if dropped > 0:
        print(f"Dropped {dropped} rows with null values ({dropped/original_len:.1%} of data)")
    return cleaned

# Test with both backends
import pandas as pd
import polars as pl

pdf = pd.DataFrame({"student": ["Alice", "Bob", "Carol"], "score": [78, 95, 61]})
plf = pl.DataFrame({"student": ["Alice", "Bob", "Carol"], "score": [78, 95, 61]})

print("Pandas result:")
print(normalize_scores(pdf))

print("\nPolars result:")
print(normalize_scores(plf))

Output:

Pandas result:
  student  score  score_normalized
0   Alice     78          0.500000
1     Bob     95          1.000000
2   Carol     61          0.000000

Polars result:
shape: (3, 3)
+---------+-------+------------------+
| student | score | score_normalized |
| str     | i64   | f64              |
+=========+=======+==================+
| Alice   |    78 |             0.5  |
| Bob     |    95 |             1.0  |
| Carol   |    61 |             0.0  |
+---------+-------+------------------+

The @nw.narwhalify decorator wraps all DataFrame and Series arguments automatically when the function is called, then unwraps the return value back to the caller’s native type. This is the pattern to use when publishing functions in a shared library — it is the most Pythonic way to expose a Narwhals-powered API to callers who may not know or care that Narwhals is involved.


Real-Life Example: A Backend-Agnostic Sales Report Generator

This project builds a complete sales report generator that accepts any supported DataFrame, computes revenue metrics by region and category, flags underperformers, and returns a formatted summary. It is designed to be dropped into any project as a standalone utility.

# sales_report.py
import narwhals as nw

@nw.narwhalify
def generate_sales_report(df, revenue_col="revenue", group_col="region", threshold_pct=0.8):
    """
    Generate a sales performance report grouped by region.
    Works with any Narwhals-supported DataFrame (pandas, polars, modin, etc.).

    Args:
        df: Any supported DataFrame with at minimum 'revenue' and 'region' columns.
        revenue_col: Name of the revenue column.
        group_col: Column to group by (default: 'region').
        threshold_pct: Groups below this fraction of the mean are flagged as underperformers.

    Returns:
        DataFrame with group totals, averages, deal counts, and performance flags.
        Return type matches the input type.
    """
    # Step 1: Group and aggregate
    summary = (
        df.group_by(group_col)
        .agg(
            nw.col(revenue_col).sum().alias("total_revenue"),
            nw.col(revenue_col).mean().round(0).alias("avg_deal"),
            nw.len().alias("deal_count"),
        )
        .sort("total_revenue", descending=True)
    )

    # Step 2: Compute overall mean for flagging
    mean_rev = summary["total_revenue"].mean()

    # Step 3: Add performance flag
    summary = summary.with_columns(
        (nw.col("total_revenue") < mean_rev * threshold_pct).alias("underperforming")
    )

    return summary


# --- Demo with pandas ---
import pandas as pd
import polars as pl

sample_data = {
    "region": ["North", "South", "North", "East", "South", "East", "West", "North", "West"],
    "category": ["SaaS", "Hardware", "SaaS", "Services", "Hardware", "SaaS", "Services", "Hardware", "SaaS"],
    "revenue": [12000, 8500, 9800, 15000, 7200, 11000, 4300, 6100, 9700],
    "sales_rep": ["Ana", "Ben", "Ana", "Cara", "Ben", "Cara", "Dan", "Ana", "Dan"],
}

print("=== Pandas Backend ===")
pandas_report = generate_sales_report(pd.DataFrame(sample_data))
print(pandas_report)
print(f"Return type: {type(pandas_report).__name__}\n")

print("=== Polars Backend ===")
polars_report = generate_sales_report(pl.DataFrame(sample_data))
print(polars_report)
print(f"Return type: {type(polars_report).__name__}")

Output:

=== Pandas Backend ===
  region  total_revenue  avg_deal  deal_count  underperforming
0  North          27900    9300.0           3            False
1   East          26000   13000.0           2            False
2  South          15700    7850.0           2             True
3   West          14000    7000.0           2             True
Return type: DataFrame

=== Polars Backend ===
shape: (4, 5)
+--------+---------------+----------+------------+-----------------+
| region | total_revenue | avg_deal | deal_count | underperforming |
| str    | i64           | f64      | u32        | bool            |
+========+===============+==========+============+=================+
| North  |         27900 |   9300.0 |          3 | false           |
| East   |         26000 |  13000.0 |          2 | false           |
| South  |         15700 |   7850.0 |          2 | true            |
| West   |         14000 |   7000.0 |          2 | true            |
+--------+---------------+----------+------------+-----------------+
Return type: DataFrame

Both backends produce the same data. The function is self-contained -- drop it into any project and it works regardless of which DataFrame library the project uses. To extend this project, add a category groupby dimension, compute month-over-month growth by joining with historical data, or build an HTML report using the aggregated summary. Because the return type matches the input, the result slots naturally into any downstream pipeline the caller already has.

Frequently Asked Questions

Does Narwhals support polars LazyFrame?

Yes. nw.from_native() wraps polars LazyFrame into a Narwhals LazyFrame, and the expression API works the same way. The main difference is that you cannot inspect rows or compute values until you call .collect() -- just like native polars lazy mode. When you call nw.to_native() on a Narwhals LazyFrame, you get a polars LazyFrame back (not a collected DataFrame). If you need the result immediately, call nw.to_native(result.collect()). Use eager_only=True in from_native() to raise an error if a LazyFrame is passed, which is useful for functions that need immediate results.

What do I do if Narwhals doesn't support an operation I need?

Narwhals covers the most common DataFrame operations (filter, select, with_columns, group_by, sort, join, drop_nulls, rename, schema inspection), but it does not wrap every method of every backend. If you need a backend-specific operation, you can always call nw.to_native(df) to drop back to the native DataFrame and use the native API directly. The typical pattern is: do as much as possible in Narwhals, then escape to native only for the specific operation that Narwhals does not cover. You can always wrap the result again with nw.from_native() to continue with the unified API afterward.

Does Narwhals add overhead?

The overhead is minimal -- Narwhals is a thin dispatch layer that translates method calls, not a data-processing engine. Each Narwhals expression compiles to a native expression at call time, and the underlying backend does the actual computation. For large datasets the dominant cost is the backend computation, not the Narwhals translation. Benchmarks from the Narwhals project show overhead in the microseconds range for typical operations. If you are processing tens of millions of rows, use polars directly where performance is the primary concern and Narwhals only at the interoperability boundaries.

Can I use Narwhals with Series, not just DataFrames?

Yes. nw.from_native(series, series_only=True) wraps a pandas or polars Series into a Narwhals Series with a unified API. You can use arithmetic, string methods (.str.to_lowercase(), .str.starts_with()), and datetime accessors (.dt.year(), .dt.month()) the same way across backends. The @nw.narwhalify decorator also handles Series arguments automatically when it detects them. This is useful for column-level utility functions that only need to transform a single column.

How does join work in Narwhals?

Narwhals supports df.join(other, on="key", how="inner"), and the syntax mirrors polars: pass on for same-name keys, or left_on / right_on for different-named keys. The how parameter accepts "inner", "left", and "anti". Recent releases also document "semi", "cross", and "full", so check the API reference for the version you have installed. If a join type is not available for your backend, escape to native as described in the FAQ above on unsupported operations.

When should I NOT use Narwhals?

Avoid Narwhals in three scenarios: (1) single-backend applications where you fully control the input type -- just use pandas or polars directly; (2) performance-critical inner loops where even microsecond overhead compounds -- use native polars there; (3) operations that rely heavily on pandas-specific features like MultiIndex or in-place mutation -- Narwhals does not expose these. Narwhals is a tool for interoperability at function boundaries, not a replacement for mastering the individual backends.

Conclusion

Narwhals removes the choice between "support pandas" and "support polars" by making it unnecessary. The pattern is simple: nw.from_native() at the entry point of your function, Narwhals expressions for all your logic, and nw.to_native() at the exit. Or use @nw.narwhalify to handle the wrapping automatically. The unified schema API, expression system, and group_by().agg() pattern cover the vast majority of data transformation work you need to do at library boundaries.

The best next step is to take an existing utility function in your codebase that accepts a DataFrame and add Narwhals support to it. Pick a function with a clear input and output, wrap it with @nw.narwhalify, replace pandas-specific calls with Narwhals expressions, and run your tests against both backends. The migration is usually straightforward for filter, select, group_by, and with_columns operations. For the official documentation, the API reference, and the list of fully supported operations across backends, see narwhals-dev.github.io/narwhals/. The project is actively maintained and expanding its operation coverage with each release.

Further Reading: For more details, see the Python random module documentation.

Frequently Asked Questions

How do I generate a random number in Python?

Use random.randint(a, b) for an integer between a and b (both endpoints included), or random.random() for a float in the range 0.0 up to, but not including, 1.0. Example: import random; num = random.randint(1, 100).
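As a minimal sketch, the two calls look like this side by side. Only the standard library is used, and the variable names are illustrative.

```python
import random

# randint includes both endpoints, so this can return 1 or 100.
whole = random.randint(1, 100)

# random() returns a float that is at least 0.0 and strictly less than 1.0.
fraction = random.random()

print(f"Integer between 1 and 100: {whole}")
print(f"Float between 0.0 and 1.0: {fraction}")
```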

What is the difference between random and secrets?

The random module is for simulations and games but NOT for security. The secrets module provides cryptographically secure randomness for passwords, tokens, and security-sensitive applications.
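The sketch below shows a few secrets calls for security-sensitive values. The names reset_token, pin_digit, and password are hypothetical, as is the choice of a 12-character password; adjust lengths and alphabets to your own requirements.

```python
import secrets
import string

# URL-safe text token built from 16 random bytes
# (hypothetical use: a password-reset link).
reset_token = secrets.token_urlsafe(16)

# Random integer from 0 to 9; randbelow excludes the upper bound.
pin_digit = secrets.randbelow(10)

# 12-character password drawn from ASCII letters and digits.
alphabet = string.ascii_letters + string.digits
password = "".join(secrets.choice(alphabet) for _ in range(12))

print(reset_token)
print(pin_digit)
print(password)
```

Unlike the random module, secrets offers no way to set a seed, so these values cannot be reproduced on a later run.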

How do I generate a random list of numbers?

Use [random.randint(1, 100) for _ in range(10)] for random integers. For unique numbers, use random.sample(range(1, 101), 10). For float arrays, use numpy.random.rand(10).
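Here is a short sketch of the two standard-library approaches, with illustrative variable names. The NumPy call is left out so the example runs without third-party packages.

```python
import random

# Ten integers between 1 and 100 inclusive; repeats are possible.
with_repeats = [random.randint(1, 100) for _ in range(10)]

# Ten distinct integers from the same range; sample never picks a value twice.
unique = random.sample(range(1, 101), 10)

print(with_repeats)
print(unique)
```

random.sample raises ValueError if you ask for more items than the range contains, so keep the sample size at or below the population size.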

How do I set a random seed?

Call random.seed(42) before generating numbers. The same seed always produces the same sequence, useful for testing and reproducible experiments.
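The sketch below checks this by seeding twice and comparing the results. The seed value 42 and the list length are arbitrary choices. It also uses random.Random(42), a separate generator object, which is one way to get reproducible numbers without touching the module-level state that other code may share.

```python
import random

random.seed(42)
first_run = [random.randint(1, 100) for _ in range(5)]

# Re-seeding with the same value restarts the same sequence.
random.seed(42)
second_run = [random.randint(1, 100) for _ in range(5)]

# An independent generator with its own state, seeded the same way.
rng = random.Random(42)
isolated = [rng.randint(1, 100) for _ in range(5)]

print(first_run == second_run)  # True
print(first_run == isolated)    # True
```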

Can I generate numbers following a specific distribution?

Yes. Use random.gauss(mu, sigma) for a normal distribution and random.uniform(a, b) for a uniform one. NumPy offers numpy.random.normal(), numpy.random.poisson(), numpy.random.binomial(), and many more.
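As a rough illustration using only the standard library, the sketch below draws from a normal and a uniform distribution and summarizes the results. The mean of 50, standard deviation of 5, range of 0 to 10, and sample size are all arbitrary values chosen for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs print the same values

# 10,000 draws from a normal distribution with mean 50 and standard deviation 5.
normal_draws = [random.gauss(50, 5) for _ in range(10_000)]

# 10,000 draws from a uniform distribution between 0 and 10.
uniform_draws = [random.uniform(0, 10) for _ in range(10_000)]

print(f"normal mean:  {statistics.mean(normal_draws):.2f}")
print(f"normal stdev: {statistics.stdev(normal_draws):.2f}")
print(f"uniform min/max: {min(uniform_draws):.2f} / {max(uniform_draws):.2f}")
```

With this many draws, the sample mean and standard deviation should land close to the requested 50 and 5, which is a quick sanity check that the parameters are in the order you expect.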
