
Every time a user creates an account on your web app, their password needs to be stored safely; storing it as plain text is a fast track to a security breach. Cryptographic hashing converts sensitive data into a fixed-length fingerprint that cannot feasibly be reversed. Python’s built-in hashlib module gives you SHA-256, SHA-512, MD5, and a whole family of hash algorithms without installing anything extra. Whether you are verifying file integrity, storing passwords safely, or building an HMAC signature for an API, hashlib has you covered.

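The hashlib module ships with every Python 3 installation, so no pip install is needed. The hmac module, used below for signatures and timing-safe comparisons, is also built in. For password hashing in production you will want a dedicated library such as the third-party bcrypt or argon2-cffi. For the examples below, all you need is a Python 3.9+ interpreter: the code uses the walrus operator (3.8+) and the built-in generic annotation tuple[bytes, str] (3.9+).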

This article covers: hashing strings with SHA-256, comparing digests safely with hmac.compare_digest, verifying file integrity, creating HMAC signatures, understanding the difference between hashing and encryption, and a real-life file checksum verifier you can use in your own scripts.

Hashing a String with SHA-256: Quick Example

The fastest way to understand hashlib is to hash a password string and print its hex digest. Here is a self-contained example you can run right now:

# hashlib_quick.py
import hashlib

password = "my_super_secret_password"
digest = hashlib.sha256(password.encode()).hexdigest()
print("SHA-256 digest:", digest)
print("Length:", len(digest), "characters")

Output (the digest is shown as a placeholder; your run prints the actual 64 hexadecimal characters):

SHA-256 digest: <64 hexadecimal characters>
Length: 64 characters

We call .encode() to convert the string to bytes first — hashlib always works on bytes, not strings. The result is a 64-character hexadecimal string (256 bits). The same input will always produce the same digest; change even one character in the input and the entire digest changes completely.
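The two properties described above (bytes-only input and a digest that changes completely on a one-character edit) can be checked with a short sketch. The helper name sha256_hex and the second password are illustrative assumptions, not part of the original example.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Encode text as UTF-8 and return its SHA-256 hex digest (hypothetical helper)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

a = sha256_hex("my_super_secret_password")
b = sha256_hex("my_super_secret_passworD")  # only the last character differs

# Count how many of the 64 hex positions differ between the two digests.
differing = sum(1 for x, y in zip(a, b) if x != y)
print("Digests equal:", a == b)
print("Hex positions that differ:", differing, "of", len(a))

# Passing a str directly is rejected: hashlib accepts only bytes-like input.
try:
    hashlib.sha256("not bytes")
except TypeError as exc:
    print("TypeError:", exc)
```

Expect most of the 64 positions to differ even though the inputs differ by a single character.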

The sections below show you how to use salts, verify file integrity, and build HMAC signatures for API authentication.

What Is Cryptographic Hashing and Why Use It?

A cryptographic hash function takes any input (a password, a file, a JSON payload) and produces a fixed-length output called a digest. Unlike encryption, hashing is a one-way operation — you cannot reverse a SHA-256 digest back to the original input. This makes it ideal for storing passwords: you store the hash, not the password, and verify by hashing the login attempt and comparing digests.

Algorithm | Output size | Speed | Recommended use
MD5 | 128-bit / 32 hex chars | Fastest | File checksums only (NOT passwords)
SHA-1 | 160-bit / 40 hex chars | Fast | Legacy systems only (deprecated)
SHA-256 | 256-bit / 64 hex chars | Good | File integrity, API signatures, tokens
SHA-512 | 512-bit / 128 hex chars | Moderate | High-security hashing
SHA3-256 | 256-bit / 64 hex chars | Moderate | Modern alternative to SHA-256

The key rule: use SHA-256 or SHA-512 for general hashing; use bcrypt or argon2 for passwords (they are intentionally slow to resist brute-force attacks); use MD5 only for non-security checksums where speed matters more than collision resistance.
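If you want to confirm the output sizes in the table on your own machine, a small sketch like the following reads them from the hash objects themselves. The sizes dictionary is just an illustrative container, and the try/except is an assumption to cope with restricted builds that disable older algorithms.

```python
import hashlib

sizes = {}
for name in ("md5", "sha1", "sha256", "sha512", "sha3_256"):
    try:
        hasher = hashlib.new(name, b"example")
    except ValueError:
        # Some restricted (for example FIPS-mode) builds disable MD5 or SHA-1.
        continue
    bits = hasher.digest_size * 8          # digest_size is reported in bytes
    hex_chars = len(hasher.hexdigest())    # two hex characters per byte
    sizes[name] = (bits, hex_chars)
    print(f"{name:<9} {bits:>3} bits  {hex_chars:>3} hex chars")
```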

Hashing Passwords with a Salt

A salt is a random string added to the password before hashing. This prevents two users with the same password from having the same digest in your database, and defeats precomputed “rainbow table” attacks. Python’s os.urandom() generates cryptographically secure random bytes for use as a salt.

# hashlib_salt.py
import hashlib
import os

def hash_password(password: str) -> tuple[bytes, str]:
    """Return (salt_bytes, hex_digest)."""
    salt = os.urandom(16)  # 16 random bytes
    digest = hashlib.sha256(salt + password.encode()).hexdigest()
    return salt, digest

def verify_password(password: str, salt: bytes, stored_digest: str) -> bool:
    """Return True if the password matches the stored digest."""
    check = hashlib.sha256(salt + password.encode()).hexdigest()
    return check == stored_digest

# Simulate registration
salt, hashed = hash_password("hunter2")
print("Salt (hex):", salt.hex())
print("Hash:      ", hashed)

# Simulate login
correct = verify_password("hunter2", salt, hashed)
wrong   = verify_password("wrongpass", salt, hashed)
print("Correct password:", correct)
print("Wrong password:  ", wrong)

Output (the salt is random, so the salt and hash differ on every run; the values shown are illustrative):

Salt (hex): a3f8b72c1d94e05f6a2b8c3d7e1f4902
Hash:       4e7d3b9a2f1c08e5a6b4d7f2c9e1a3b8d5f7c2e4a1b9d6f3c8e2a5b7d9f1c4e
Correct password: True
Wrong password:   False

Store both the salt and the digest in your database alongside the user record. During login, retrieve the salt for that user, re-hash the submitted password, and compare. Never store the plain password, and never compare digests with == in a real application — use hmac.compare_digest (shown next) to prevent timing attacks.
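A single round of salted SHA-256 is still fast to brute-force. If you need to stay within the standard library, one option is hashlib.pbkdf2_hmac, which repeats the hash many times to slow attackers down. The following is a minimal sketch, not a vetted implementation: the function names and the iteration count are assumptions, and the count should be tuned for your hardware and security policy.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; raise it as far as your latency budget allows

def hash_password_pbkdf2(password: str, iterations: int = ITERATIONS) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, derived

def verify_password_pbkdf2(password: str, salt: bytes, stored: bytes,
                           iterations: int = ITERATIONS) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, stored)

salt, stored = hash_password_pbkdf2("hunter2")
print("Correct password:", verify_password_pbkdf2("hunter2", salt, stored))
print("Wrong password:  ", verify_password_pbkdf2("wrongpass", salt, stored))
```

Store the iteration count alongside the salt and derived key so you can raise it later without invalidating existing records.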

Safe Digest Comparison with hmac.compare_digest

Comparing two strings with == is vulnerable to timing attacks: the comparison can stop as soon as it finds the first mismatched character, so the time it takes leaks information about how much of the string matched. The hmac.compare_digest function is designed to take time that does not depend on where the mismatch occurs.

# hashlib_compare.py
import hashlib
import hmac

stored_hash = hashlib.sha256(b"correct_password").hexdigest()
submitted   = hashlib.sha256(b"correct_password").hexdigest()
wrong       = hashlib.sha256(b"wrong_password").hexdigest()

print("Secure match:    ", hmac.compare_digest(stored_hash, submitted))
print("Secure mismatch: ", hmac.compare_digest(stored_hash, wrong))

Output:

Secure match:     True
Secure mismatch:  False

This is a small change that meaningfully improves your app’s security posture. Any time you compare security-sensitive strings — tokens, digests, HMAC signatures — use hmac.compare_digest instead of ==.
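One practical detail: hmac.compare_digest accepts either two bytes-like objects or two ASCII-only strings, and raises TypeError for a string containing non-ASCII characters. When the submitted value comes from a user, encoding both sides to bytes first avoids that failure. A minimal sketch, where expected_token and token_matches are hypothetical names:

```python
import hmac
import secrets

# Hypothetical API token that the server issued earlier.
expected_token = secrets.token_hex(16)

def token_matches(submitted: str) -> bool:
    """Compare tokens in constant time; bytes avoid TypeError on non-ASCII input."""
    return hmac.compare_digest(expected_token.encode(), submitted.encode())

print("Valid token:    ", token_matches(expected_token))
print("Wrong token:    ", token_matches("not-the-token"))
print("Non-ASCII input:", token_matches("pässword"))
```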

Verifying File Integrity with SHA-256

Downloaded a large file and want to verify that it has not been corrupted or tampered with? Hash the file locally and compare it against the checksum published by the developer. The key trick is to read the file in chunks so you do not load the entire file into memory at once.

# hashlib_file.py
import hashlib
import hmac

def sha256_file(filepath: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file."""
    hasher = hashlib.sha256()
    with open(filepath, "rb") as f:
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()

# Create a test file
with open("sample.txt", "wb") as f:
    f.write(b"Hello, file integrity checking!\n" * 1000)

digest = sha256_file("sample.txt")
print("SHA-256:", digest)

# Simulate tamper detection
expected = digest  # Stored at download time
actual   = sha256_file("sample.txt")
if hmac.compare_digest(expected, actual):
    print("File OK: digests match")
else:
    print("ALERT: file has been modified!")

Output (the digest value is illustrative):

SHA-256: 3e7d9f2c1a4b8e5d7f3c9a2b4e6d8f1c3e5a7b9d2f4c6e8a1b3d5f7c9e2a4b6
File OK: digests match

The hasher.update(chunk) pattern is the right approach for large files. You create a hasher object, feed it data incrementally, and call .hexdigest() at the end. This works for files of any size without blowing up your RAM.
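On Python 3.11 and later, the standard library also offers hashlib.file_digest, which performs the chunked read for you. The sketch below uses it when available and falls back to the manual loop otherwise; the function name sha256_file_auto is an assumption for illustration.

```python
import hashlib
import sys

def sha256_file_auto(filepath: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, using file_digest when available."""
    with open(filepath, "rb") as f:
        if sys.version_info >= (3, 11):
            # file_digest reads the file object in chunks internally.
            return hashlib.file_digest(f, "sha256").hexdigest()
        hasher = hashlib.sha256()
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
        return hasher.hexdigest()
```

Both branches produce the same digest, so callers do not need to know which one ran.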

Creating HMAC Signatures for APIs

HMAC (Hash-based Message Authentication Code) combines a secret key with your data before hashing. This lets a server verify that a request came from a trusted client who knows the secret key — even if the message is sent over a public channel. GitHub webhooks, Stripe webhooks, and AWS request signing all use HMAC-SHA256.

# hashlib_hmac.py
import hashlib
import hmac

SECRET_KEY = b"super_secret_webhook_key"
payload    = b'{"event": "payment.completed", "amount": 49.99}'

# Sender: create the signature
signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
print("Signature:", signature)

# Receiver: verify the signature
def verify_webhook(secret: bytes, body: bytes, received_sig: str) -> bool:
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received_sig)

print("Valid?  ", verify_webhook(SECRET_KEY, payload, signature))
print("Tampered?", verify_webhook(SECRET_KEY, b'{"event":"fake"}', signature))

Output (the signature value is illustrative):

Signature: a7f3d9b2c1e4f8a5b6d3c9e2a1b4d7f3c8e5a2b9d6f1c4e7a3b8d5f2c9e1a6b
Valid?   True
Tampered? False

Notice that hmac.new (not hashlib) is used here. The hmac module’s new() function takes the secret key, message, and digest algorithm as arguments. The resulting MAC changes entirely if either the key or the message is altered, making it computationally infeasible to forge without the secret key.
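To see the key and message sensitivity directly, the following sketch computes a MAC, then recomputes it with a different key and with a slightly altered message. It also shows hmac.digest, a one-shot function available since Python 3.7. The key and message values are made up for illustration.

```python
import hashlib
import hmac

key = b"demo_key_for_illustration"          # hypothetical secret
message = b'{"event": "payment.completed"}'  # hypothetical payload

mac = hmac.new(key, message, hashlib.sha256).digest()

# One-shot equivalent: same key, message, and algorithm give the same bytes.
one_shot = hmac.digest(key, message, "sha256")

# Changing either the key or the message produces an unrelated MAC.
mac_other_key = hmac.new(b"different_key", message, hashlib.sha256).digest()
mac_other_msg = hmac.new(key, message + b" ", hashlib.sha256).digest()

print("One-shot matches:    ", hmac.compare_digest(mac, one_shot))
print("Different key match: ", hmac.compare_digest(mac, mac_other_key))
print("Altered message match:", hmac.compare_digest(mac, mac_other_msg))
```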

Listing Available Algorithms

You can see every hash algorithm available on your system with hashlib.algorithms_available. The guaranteed set (present on all Python platforms) is in hashlib.algorithms_guaranteed.

# hashlib_algorithms.py
import hashlib

print("Guaranteed algorithms:")
for algo in sorted(hashlib.algorithms_guaranteed):
    print(" ", algo)

Output (partial):

Guaranteed algorithms:
  blake2b
  blake2s
  md5
  sha1
  sha224
  sha256
  sha384
  sha3_256
  sha3_512
  sha512

For new projects, prefer sha256, sha512, or sha3_256. BLAKE2 (blake2b, blake2s) is an excellent modern alternative: it is often faster than SHA-2 in software (the exact difference depends on your CPU and Python build) while offering comparable security for most use cases.
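BLAKE2 in hashlib has two extras that the SHA-2 constructors lack: a configurable digest_size and an optional key for keyed hashing. A brief sketch, with made-up data and a hypothetical key:

```python
import hashlib

data = b"Hello, BLAKE2!"

# Default blake2b output is 64 bytes, that is 128 hex characters.
default_digest = hashlib.blake2b(data).hexdigest()

# digest_size shortens the output; 32 bytes matches SHA-256's length.
short_digest = hashlib.blake2b(data, digest_size=32).hexdigest()

# key turns BLAKE2 into a keyed hash (the key here is purely illustrative).
keyed_digest = hashlib.blake2b(
    data, key=b"hypothetical-shared-key", digest_size=32
).hexdigest()

print("Default length:", len(default_digest))
print("Short length:  ", len(short_digest))
print("Keyed differs: ", keyed_digest != short_digest)
```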

Real-Life Example: File Checksum Verifier CLI

This script ties together the file-hashing patterns covered above: it lets you hash any file on the command line with a chosen algorithm, compare it against an expected checksum using a timing-safe comparison, and print a clear pass/fail result.

# checksum_verifier.py
import hashlib
import hmac
import sys
import os

ALGORITHMS = {"md5", "sha1", "sha256", "sha512"}

def compute_checksum(filepath: str, algorithm: str = "sha256") -> str:
    if algorithm not in ALGORITHMS:
        raise ValueError(f"Unknown algorithm: {algorithm}. Use: {ALGORITHMS}")
    hasher = hashlib.new(algorithm)
    with open(filepath, "rb") as f:
        while chunk := f.read(65536):
            hasher.update(chunk)
    return hasher.hexdigest()

def verify_checksum(filepath: str, expected: str, algorithm: str = "sha256") -> bool:
    actual = compute_checksum(filepath, algorithm)
    return hmac.compare_digest(actual.lower(), expected.lower())

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python checksum_verifier.py <filepath> [expected_hash] [algorithm]")
        sys.exit(1)

    filepath  = sys.argv[1]
    expected  = sys.argv[2] if len(sys.argv) > 2 else None
    algorithm = sys.argv[3] if len(sys.argv) > 3 else "sha256"

    if not os.path.exists(filepath):
        print(f"Error: file not found: {filepath}")
        sys.exit(1)

    digest = compute_checksum(filepath, algorithm)
    print(f"{algorithm.upper()}: {digest}")

    if expected:
        ok = verify_checksum(filepath, expected, algorithm)
        status = "PASS" if ok else "FAIL"
        print(f"Verification: {status}")
        sys.exit(0 if ok else 1)

Output (example run; digest values are illustrative):

$ python checksum_verifier.py setup.py
SHA256: 4e7d3b9a2f1c08e5a6b4d7f2c9e1a3b8d5f7c2e4a1b9d6f3c8e2a5b7d9f1c4e

$ python checksum_verifier.py setup.py 4e7d3b9a... sha256
SHA256: 4e7d3b9a2f1c08e5a6b4d7f2c9e1a3b8d5f7c2e4a1b9d6f3c8e2a5b7d9f1c4e
Verification: PASS

This script uses hashlib.new(algorithm) instead of hashlib.sha256() directly — the .new() constructor accepts any algorithm name as a string, which makes the code reusable for any algorithm. The hmac.compare_digest call on the final comparison protects against timing attacks even in a CLI tool.
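Instead of a hard-coded set of names, a variation could validate against hashlib.algorithms_guaranteed before calling hashlib.new. The sketch below assumes a helper named make_hasher and rejects the SHAKE algorithms, because their hexdigest() needs an explicit length argument.

```python
import hashlib

def make_hasher(algorithm: str):
    """Return a new hash object for a guaranteed, fixed-length algorithm."""
    name = algorithm.lower()
    if name not in hashlib.algorithms_guaranteed:
        raise ValueError(f"Unsupported algorithm: {algorithm!r}")
    if name.startswith("shake_"):
        # SHAKE digests are variable-length: hexdigest() requires a length.
        raise ValueError(f"{name} needs an explicit output length")
    return hashlib.new(name)

hasher = make_hasher("SHA256")
hasher.update(b"example data")
print(hasher.name, hasher.hexdigest())
```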

Frequently Asked Questions

Is MD5 safe to use in 2026?

MD5 is considered cryptographically broken for security purposes — collisions (two different inputs producing the same hash) can be computed quickly. It is still fine for non-security use cases like detecting accidental file corruption or generating cache keys where speed matters more than collision resistance. Never use MD5 for password hashing or anything security-critical.

When should I use hashlib vs bcrypt?

SHA-256 and SHA-512 are fast by design: they can hash millions of values per second. That speed is exactly what you do NOT want for password storage, because it also lets attackers brute-force billions of guesses per second. bcrypt, scrypt, and argon2 are intentionally slow and include built-in salting. Use hashlib’s fast hashes for file checksums, API signatures, and data fingerprinting. Use bcrypt or argon2 for passwords; if you must stay in the standard library, hashlib also provides the deliberately slow key-derivation functions pbkdf2_hmac and scrypt.

Where do I store the salt?

Store the salt in the same database row as the hashed password. The salt is not secret — its purpose is to make each hash unique even when two users share the same password. Attackers who steal your database still cannot use precomputed rainbow tables because they would need a separate table for each unique salt. Store the salt as a hex string or raw bytes in a dedicated column.
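Converting the salt between raw bytes and a hex string is lossless, so either storage format works. A tiny sketch of the round trip, where the record dictionary stands in for a hypothetical database row:

```python
import os

salt = os.urandom(16)

# Hypothetical database row: the salt is stored as a 32-character hex string.
record = {"username": "alice", "salt_hex": salt.hex()}

# At login time, convert the stored hex back to the original bytes.
restored = bytes.fromhex(record["salt_hex"])
print("Round trip OK:", restored == salt)
```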

What is the difference between SHA-256 and SHA3-256?

SHA-256 belongs to the SHA-2 family, designed by the NSA and first published by NIST in 2001. SHA3-256 belongs to the SHA-3 family (based on the Keccak algorithm) standardized by NIST in 2015. Both produce 256-bit digests, but they use entirely different internal constructions. SHA3-256 was designed as a backup in case weaknesses were found in SHA-2. In practice, SHA-256 remains secure and is more widely supported; SHA3-256 is a solid choice for new systems where you want the most modern standard.

Why use hasher.update() instead of hashing everything at once?

The hashlib.sha256(data).hexdigest() shorthand is convenient for small strings, but for files or streams you should always use the hasher = hashlib.sha256(); hasher.update(chunk) pattern. Reading in 64KB chunks keeps memory usage flat regardless of file size — you can hash a 100GB file with the same memory footprint as a 1KB file. The final digest is identical either way.
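The claim that both approaches give the same digest is easy to check in memory. This sketch hashes the same bytes once in a single call and once in 4 KB slices; the slice size is an arbitrary choice for the demonstration.

```python
import hashlib

data = b"Hello, file integrity checking!\n" * 1000

# One-shot: the whole input is passed to the constructor.
one_shot = hashlib.sha256(data).hexdigest()

# Incremental: feed the same bytes in 4 KB slices.
hasher = hashlib.sha256()
for start in range(0, len(data), 4096):
    hasher.update(data[start:start + 4096])
chunked = hasher.hexdigest()

print("Digests identical:", one_shot == chunked)
```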

Conclusion

Python’s hashlib module gives you production-grade cryptographic hashing with zero dependencies. You have covered the main tools: SHA-256 and SHA-512 for secure digests, salting with os.urandom for password safety, hmac.compare_digest for timing-safe comparisons, chunked file hashing for large files, and HMAC signatures for API authentication. These patterns appear in almost every serious Python backend project.

A good next step is to extend the checksum verifier to accept a directory argument and hash every file in a folder, printing a manifest you can use to detect future changes. You could also integrate bcrypt or argon2-cffi for the password hashing use case to get the intentional slowness that makes brute-force impractical.
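As a starting point for the directory idea, here is one possible sketch. It builds a mapping from relative path to SHA-256 digest and reports which entries differ between two manifests. The function names (sha256_path, build_manifest, changed_files) and the temporary demo files are all assumptions for illustration.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_path(path: Path, chunk_size: int = 65536) -> str:
    """Hash one file in chunks and return its SHA-256 hex digest."""
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()

def build_manifest(root: str) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    root_path = Path(root)
    return {
        path.relative_to(root_path).as_posix(): sha256_path(path)
        for path in sorted(root_path.rglob("*"))
        if path.is_file()
    }

def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Return paths that were added, removed, or modified between manifests."""
    return sorted(n for n in old.keys() | new.keys() if old.get(n) != new.get(n))

# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.txt").write_bytes(b"alpha")
    (Path(tmp) / "b.txt").write_bytes(b"beta")
    before = build_manifest(tmp)
    (Path(tmp) / "b.txt").write_bytes(b"beta, modified")
    after = build_manifest(tmp)

print("Changed:", changed_files(before, after))
```

Saving the manifest to disk (for example as JSON) would let you compare against it on a later run.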

For the full algorithm reference, see the official hashlib documentation and the hmac module docs.