
Raw numbers in a spreadsheet rarely tell a compelling story. A well-crafted chart, on the other hand, can reveal patterns in seconds that would take minutes of scanning rows and columns. Python’s Seaborn library sits on top of Matplotlib and turns many statistical visualizations into short function calls with attractive default styling. Whether you need a quick histogram, a correlation heatmap, or a multi-faceted regression plot, Seaborn handles the heavy lifting so you can focus on understanding your data.

In this tutorial, you will learn how to install Seaborn, create common chart types including scatter plots, bar charts, histograms, and heatmaps, customise styles and colour palettes, work with real-world datasets, build multi-plot grids with FacetGrid, and export publication-ready figures. By the end, you will have the tools to turn any pandas DataFrame into a visual story.

Data Visualizations with Seaborn: Quick Example

Let us start with a scatter plot that reveals the relationship between two variables in one of Seaborn's example datasets. This gets you from zero to a polished chart in a handful of lines.

# quick_seaborn.py
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")
sns.scatterplot(data=tips, x="total_bill", y="tip", hue="time")
plt.title("Tips by Total Bill")
plt.tight_layout()
plt.savefig("tips_scatter.png", dpi=150)
plt.show()

What this does step by step:

sns.load_dataset("tips") loads one of Seaborn's example datasets, a DataFrame of restaurant tipping data. The data is downloaded from Seaborn's online data repository on first use and cached locally, so the first call needs an internet connection. sns.scatterplot creates a scatter plot with total_bill on the x-axis and tip on the y-axis, automatically colouring points by time (Lunch vs Dinner). The plt.tight_layout() call prevents labels from being clipped, and savefig exports the chart at 150 DPI. You should see a clear upward trend: larger bills tend to produce larger tips, with dinner and lunch points distinguished by colour.

Installing Seaborn and Its Dependencies

Seaborn requires matplotlib, pandas, and numpy, but pip handles all of them automatically.

# Install seaborn (pulls matplotlib, pandas, numpy)
pip install seaborn

# Verify the installation
python -c "import seaborn as sns; print(sns.__version__)"

If you are using Jupyter notebooks, charts render inline once Seaborn is installed. For scripts, call plt.show() to display charts in a window, or use plt.savefig() to write them directly to files. Seaborn 0.12 added a second, declarative interface in the seaborn.objects module, but the classic function-based API used here remains fully supported and is still the approach you will most often find in tutorials and production code.

Essential Chart Types in Seaborn

Seaborn organises its plotting functions into categories based on the type of relationship you want to show. Understanding which function to use for your data is half the battle.

Scatter Plots and Line Plots for Relationships

Use sns.scatterplot when you want to see how two continuous variables relate. Use sns.lineplot when your x-axis represents a sequence like time.

import seaborn as sns
import matplotlib.pyplot as plt

# Scatter plot with size encoding
tips = sns.load_dataset("tips")
sns.scatterplot(
    data=tips,
    x="total_bill",
    y="tip",
    hue="day",
    size="size",
    sizes=(20, 200),
    alpha=0.7
)
plt.title("Tip Amount vs Total Bill by Day")
plt.show()

# Line plot for time series
fmri = sns.load_dataset("fmri")
sns.lineplot(
    data=fmri,
    x="timepoint",
    y="signal",
    hue="region",
    style="event",
    errorbar="sd"
)
plt.title("fMRI Signal Over Time")
plt.show()

The hue parameter assigns colours by category, size maps a numeric column to marker size (with sizes=(20, 200) setting the smallest and largest markers), and alpha controls transparency so overlapping points remain visible. For line plots, errorbar="sd" adds a shaded band spanning one standard deviation either side of the mean, giving you a sense of how much the data varies at each time point.

Bar Charts and Count Plots for Categories

When one of your axes represents a category rather than a continuous number, bar charts are the right choice.

# Average tip by day
sns.barplot(data=tips, x="day", y="tip", hue="sex", errorbar="sd")
plt.title("Average Tip by Day and Gender")
plt.show()

# Count occurrences in each category
sns.countplot(data=tips, x="day", hue="smoker", palette="Set2")
plt.title("Visits per Day by Smoker Status")
plt.show()

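sns.barplot computes the mean of y within each category and adds error bars: a 95% confidence interval by default, or the standard deviation when you pass errorbar="sd" as above. sns.countplot simply tallies how many rows fall into each category, which is perfect for understanding the distribution of categorical variables in your dataset.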

Histograms and Distribution Plots

Understanding how a single variable is distributed is fundamental to any analysis. Seaborn gives you several options.

# Histogram with KDE overlay
sns.histplot(data=tips, x="total_bill", bins=25, kde=True, color="steelblue")
plt.title("Distribution of Total Bills")
plt.show()

# KDE plot comparing groups
sns.kdeplot(data=tips, x="total_bill", hue="time", fill=True, alpha=0.5)
plt.title("Bill Distribution: Lunch vs Dinner")
plt.show()

# Box plot for comparing distributions
sns.boxplot(data=tips, x="day", y="total_bill", hue="smoker", palette="coolwarm")
plt.title("Bill Amounts by Day and Smoker Status")
plt.show()

Setting kde=True overlays a kernel density estimate curve on the histogram, smoothing out the bars into a continuous shape. sns.kdeplot with fill=True creates shaded density curves, making it easy to compare two groups visually. Box plots show the median, quartiles, and outliers in a compact format that works well when you have multiple categories to compare side by side.
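As a rough guide to reading a box plot, the sketch below computes the same summary numbers by hand. The bill amounts are invented for illustration, and the 1.5 × IQR fence is the default whisker rule (whis=1.5); a different whis value would change which points count as outliers.

```python
import pandas as pd

# Hypothetical bill amounts with one unusually large value
bills = pd.Series([10, 12, 13, 15, 16, 18, 20, 22, 60], name="total_bill")

# The box spans the first to third quartile, with a line at the median
q1, median, q3 = bills.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Default whisker rule: points more than 1.5 * IQR beyond the box are outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = bills[(bills < lower_fence) | (bills > upper_fence)]

print(f"median={median}, box={q1}..{q3}, outliers={outliers.tolist()}")
```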

Building Heatmaps for Correlation Analysis

Heatmaps turn a matrix of numbers into a colour-coded grid, making it easy to spot strong correlations at a glance. They are one of Seaborn’s most popular features.

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Load a dataset and compute correlation matrix
penguins = sns.load_dataset("penguins").dropna()
numeric_cols = penguins.select_dtypes(include=[np.number])
corr_matrix = numeric_cols.corr()

# Create an annotated heatmap
plt.figure(figsize=(8, 6))
sns.heatmap(
    corr_matrix,
    annot=True,
    fmt=".2f",
    cmap="RdBu_r",
    center=0,
    square=True,
    linewidths=0.5,
    vmin=-1,
    vmax=1
)
plt.title("Penguin Measurements Correlation Matrix")
plt.tight_layout()
plt.show()

The annot=True parameter prints the correlation value inside each cell, while fmt=".2f" rounds it to two decimal places. cmap="RdBu_r" uses a red-blue diverging colour scheme where red means strong positive correlation and blue means strong negative. Setting center=0 ensures that zero correlation appears as white, making the pattern immediately interpretable.
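To see how the colour scale maps to the sign of a correlation, it can help to try the same call on data whose relationships you already know. This sketch uses synthetic columns (an assumption for illustration): b rises with a, and c falls as a rises, so the a/b cell should be red and the a/c cell blue.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical measurements: "b" rises with "a", "c" falls as "a" rises
rng = np.random.default_rng(1)
a = rng.normal(size=200)
frame = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.3, size=200),
    "c": -a + rng.normal(scale=0.3, size=200),
})
corr = frame.corr()
print(corr.round(2))

# Same settings as the penguin heatmap, on a 3x3 matrix
fig, ax = plt.subplots(figsize=(4, 3))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="RdBu_r",
            center=0, vmin=-1, vmax=1, ax=ax)
# Call plt.show() here to view the chart
```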

Customising Styles and Colour Palettes

Seaborn comes with five built-in themes and a flexible palette system that lets you match your charts to any brand or presentation style.

# Set a global theme
sns.set_theme(style="whitegrid", font_scale=1.2)

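# Compare available styles
# A style only applies to axes created while it is active,
# so each subplot is added inside its own axes_style context
styles = ["darkgrid", "whitegrid", "dark", "white", "ticks"]
fig = plt.figure(figsize=(20, 4))

for i, style in enumerate(styles, start=1):
    with sns.axes_style(style):
        ax = fig.add_subplot(1, len(styles), i)
    # hue mirrors x so the palette colours each bar (legend=False needs Seaborn 0.13+)
    sns.barplot(data=tips, x="day", y="tip", hue="day",
                palette="viridis", legend=False, ax=ax)
    ax.set_title(style)

plt.tight_layout()
plt.show()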

The five styles are darkgrid, whitegrid, dark, white, and ticks. For colour palettes, you can use named palettes like "viridis", "Set2", or "coolwarm", or create your own with sns.color_palette("husl", 8) for 8 evenly spaced hues. The font_scale parameter is especially useful when preparing charts for presentations where you need larger text.

# Custom colour palette
custom_palette = sns.color_palette(["#2ecc71", "#e74c3c", "#3498db", "#f39c12"])
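# hue mirrors x so each day gets its own colour (legend=False needs Seaborn 0.13+)
sns.barplot(data=tips, x="day", y="tip", hue="day", palette=custom_palette, legend=False)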
plt.title("Tips by Day (Custom Colours)")
plt.show()
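If you are curious what a palette actually contains, this short sketch inspects the objects returned by sns.color_palette. It only prints values and draws nothing.

```python
import seaborn as sns

# Eight evenly spaced hues, returned as a list of RGB tuples
husl = sns.color_palette("husl", 8)
print(len(husl), husl[0])

# A palette built from hex codes converts back to the same codes
custom = sns.color_palette(["#2ecc71", "#e74c3c", "#3498db", "#f39c12"])
hex_codes = list(custom.as_hex())
print(hex_codes)
```

Because a palette behaves like a plain list of colours, you can slice it or index into it when a chart needs only some of the colours.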

Multi-Plot Grids with FacetGrid

When you want to see how a pattern changes across different categories, FacetGrid creates a grid of small multiples, each showing the same chart type for a different subset of your data.

# FacetGrid: histogram for each day
g = sns.FacetGrid(tips, col="day", col_wrap=2, height=4)
g.map_dataframe(sns.histplot, x="total_bill", kde=True, color="steelblue")
g.set_titles("{col_name}")
g.set_axis_labels("Total Bill ($)", "Count")
g.tight_layout()
plt.show()

# pairplot: scatter matrix for numeric columns (returns a PairGrid)
penguins = sns.load_dataset("penguins").dropna()
g = sns.pairplot(
    penguins,
    hue="species",
    diag_kind="kde",
    plot_kws={"alpha": 0.6, "s": 40}
)
g.figure.suptitle("Penguin Species Comparison", y=1.02)
plt.show()

FacetGrid takes a DataFrame and a column to split on (col for columns, row for rows). The col_wrap parameter controls how many plots fit in each row before wrapping. pairplot is a convenience function that creates a scatter matrix showing every numeric column against every other, with distribution plots on the diagonal. It is one of the fastest ways to explore a new dataset.
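The example above splits on one column with col_wrap. As a sketch of splitting on two columns at once, the following uses row and col together on a synthetic survey table; the table, its column names, and the site and season labels are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical survey data: 2 sites x 3 seasons, 20 observations each
rng = np.random.default_rng(2)
n = 20
survey = pd.DataFrame({
    "site": np.repeat(["north", "south"], 3 * n),
    "season": np.tile(np.repeat(["spring", "summer", "autumn"], n), 2),
    "x": rng.uniform(0, 10, size=6 * n),
})
survey["y"] = 2 * survey["x"] + rng.normal(size=len(survey))

# row= and col= together give one panel per (site, season) pair
g = sns.FacetGrid(survey, row="site", col="season", height=2.5)
g.map_dataframe(sns.scatterplot, x="x", y="y")
g.set_titles("{row_name} | {col_name}")
print(g.axes.shape)  # one row of panels per site, one column per season
```

Note that col_wrap only makes sense with a single faceting column, so it is left out when row is also set.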

Real-World Example: Analysing Airline Passenger Trends

Let us put everything together with a real-world scenario. Suppose you have a dataset of flight information and want to understand seasonal passenger patterns.

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Load the flights dataset (monthly passenger counts 1949-1960)
flights = sns.load_dataset("flights")

# Pivot for heatmap
flights_pivot = flights.pivot(index="month", columns="year", values="passengers")

# Create a comprehensive dashboard
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# 1. Heatmap of passengers by month and year
sns.heatmap(flights_pivot, annot=True, fmt="d", cmap="YlOrRd",
            ax=axes[0, 0], cbar_kws={"label": "Passengers"})
axes[0, 0].set_title("Monthly Passengers (1949-1960)")

# 2. Line plot showing yearly trends
sns.lineplot(data=flights, x="year", y="passengers", hue="month",
             palette="tab20", ax=axes[0, 1], legend=False)
axes[0, 1].set_title("Passenger Trends by Month")

# 3. Box plot of monthly distributions
sns.boxplot(data=flights, x="month", y="passengers", hue="month",
            palette="coolwarm", legend=False, ax=axes[1, 0])
axes[1, 0].set_title("Monthly Passenger Distribution")
axes[1, 0].tick_params(axis="x", rotation=45)

# 4. Bar plot of yearly totals
yearly = flights.groupby("year")["passengers"].sum().reset_index()
sns.barplot(data=yearly, x="year", y="passengers", hue="year",
            palette="Blues_d", legend=False, ax=axes[1, 1])
axes[1, 1].set_title("Total Passengers per Year")
axes[1, 1].tick_params(axis="x", rotation=45)

plt.suptitle("Flight Passenger Analysis Dashboard", fontsize=16, y=1.02)
plt.tight_layout()
plt.savefig("flight_dashboard.png", dpi=150, bbox_inches="tight")
plt.show()

This dashboard combines four chart types into a single figure. The heatmap reveals that summer months consistently have the highest passenger counts. The line plot shows a clear upward trend across all months over the years. The box plot highlights that July has the widest range of values, while the bar chart confirms that total yearly passengers grew steadily from 1949 to 1960. Building multi-panel dashboards like this is one of the most practical skills you can develop with Seaborn.
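The pivot step in the dashboard is worth a closer look, because sns.heatmap expects a two-dimensional table while the other three panels read the long-form data directly. Here is a minimal sketch of that reshaping on a few illustrative rows laid out like the flights data (the rows are a hand-typed sample, so treat the values as placeholders).

```python
import pandas as pd

# Illustrative long-form records: one row per (month, year) observation
long_form = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["Jan", "Jul", "Jan", "Jul"],
    "passengers": [112, 148, 115, 170],
})

# pivot spreads the years across columns and keeps one row per month,
# which is the 2-D layout a heatmap needs
wide_form = long_form.pivot(index="month", columns="year", values="passengers")
print(wide_form)
```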

Saving and Exporting Publication-Ready Figures

Creating a great chart is only half the job. You also need to export it at the right resolution and format for your audience.

# Build the chart once, then export it in several formats
fig, ax = plt.subplots(figsize=(10, 6))
sns.histplot(data=tips, x="total_bill", kde=True, ax=ax)
ax.set_title("Distribution of Total Bills")

# PNG for web and presentations
plt.savefig("chart.png", dpi=300, bbox_inches="tight", facecolor="white")

# SVG for scalable vector graphics (papers, reports)
plt.savefig("chart.svg", format="svg", bbox_inches="tight")

# PDF for LaTeX documents
plt.savefig("chart.pdf", format="pdf", bbox_inches="tight")

plt.close()
print("Charts saved successfully")

Use dpi=300 for print quality and dpi=150 for web use. The bbox_inches="tight" parameter trims whitespace around the chart. SVG format is ideal for reports because it scales without pixelation. Always call plt.close() after saving to free memory, especially when generating many charts in a loop.

Frequently Asked Questions

What is the difference between Seaborn and Matplotlib?

Matplotlib is the foundation that handles the actual drawing of axes, lines, and shapes. Seaborn sits on top of matplotlib and provides a higher-level interface with better default styles, built-in statistical aggregation, and simpler syntax for common chart types. You can mix both in the same script: axes-level functions such as sns.scatterplot return a matplotlib Axes object, and figure-level functions such as sns.catplot return a grid object (for example a FacetGrid) that exposes the underlying figure and axes.

Can I use Seaborn with data that is not in a pandas DataFrame?

Yes, most Seaborn functions accept numpy arrays, Python lists, or dictionaries in addition to DataFrames. However, the DataFrame interface is the most powerful because it allows you to reference column names directly for parameters like hue, size, and style. Converting your data to a DataFrame first is almost always worth the extra line of code.

How do I change the figure size in Seaborn?

For axes-level functions like sns.scatterplot, create the figure first with plt.figure(figsize=(10, 6)) or fig, ax = plt.subplots(figsize=(10, 6)) and pass the ax parameter. For figure-level functions like sns.catplot, use the height and aspect parameters directly.
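Both approaches can be sketched side by side. The scores table is made up for illustration; the printed sizes are in inches.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical scores table
scores = pd.DataFrame({
    "group": ["a", "a", "b", "b", "c", "c"],
    "score": [3, 5, 4, 6, 7, 9],
})

# Axes-level: size the figure yourself, then hand the Axes to Seaborn
fig, ax = plt.subplots(figsize=(10, 6))
sns.barplot(data=scores, x="group", y="score", ax=ax)
print(fig.get_size_inches())

# Figure-level: Seaborn creates the figure; each facet is `height` inches
# tall and `height * aspect` inches wide
g = sns.catplot(data=scores, x="group", y="score", kind="bar",
                height=4, aspect=1.5)
width, height = g.figure.get_size_inches()
print(width, height)
```

With several facets, the total figure size grows with the number of rows and columns, since height and aspect describe a single facet.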

Why do my Seaborn charts look different from the examples online?

Seaborn's defaults and function names have shifted between releases. Since version 0.8, importing Seaborn no longer applies its theme automatically, so run sns.set_theme() at the start of your script to get the Seaborn look, and check your version with sns.__version__. Also note that some older tutorials use functions such as distplot, which was deprecated in version 0.11 in favour of histplot, kdeplot, and displot.

How do I add labels and annotations to Seaborn plots?

Since Seaborn's axes-level functions return matplotlib Axes objects, you can use all matplotlib annotation functions. Call ax.set_xlabel(), ax.set_ylabel(), and ax.set_title() for basic labels. For annotations pointing to specific data points, use ax.annotate("text", xy=(x, y), xytext=(x2, y2), arrowprops=dict(arrowstyle="->")).
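Put together, labelling and annotating a chart might look like the sketch below. The meals table and the label offsets are assumptions chosen for illustration; in practice you would pick xytext so the label sits in an empty part of the plot.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical bills and tips
meals = pd.DataFrame({
    "total_bill": [12.0, 18.5, 24.0, 31.0, 48.0],
    "tip": [2.0, 3.0, 3.5, 5.0, 9.0],
})

fig, ax = plt.subplots(figsize=(6, 4))
sns.scatterplot(data=meals, x="total_bill", y="tip", ax=ax)
ax.set_xlabel("Total bill ($)")
ax.set_ylabel("Tip ($)")
ax.set_title("Tips by Total Bill")

# Point an arrow at the row with the largest tip
top = meals.loc[meals["tip"].idxmax()]
ax.annotate(
    "largest tip",
    xy=(top["total_bill"], top["tip"]),
    xytext=(top["total_bill"] - 20, top["tip"] - 1),
    arrowprops=dict(arrowstyle="->"),
)
# Call plt.show() here to view the chart
```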

Wrapping Up

Seaborn transforms the often tedious process of data visualization into something fast and enjoyable. You have learned how to create scatter plots, bar charts, histograms, heatmaps, and multi-plot grids, all with Seaborn’s clean one-line syntax. The key to mastering Seaborn is understanding which chart type matches your data: use scatter plots for two continuous variables, bar charts for categorical comparisons, histograms and KDE plots for distributions, and heatmaps for correlation matrices. Combined with FacetGrid for multi-panel layouts and Seaborn’s built-in themes for consistent styling, you now have a complete toolkit for turning raw data into compelling visual stories. Start with the built-in datasets to practice, then apply these patterns to your own data.
