How to Verify File Integrity with MD5 and SHA-256 Checksums

Verify any download with MD5 or SHA-256 in your terminal, on Windows, or fully in your browser — and learn when MD5 is dangerously weak.

You downloaded an installer, an ISO, or a release tarball. The publisher posts a SHA256SUMS file next to it. Now what? Most guides tell you to "compare the hashes" and stop there, leaving out which command to run on which OS, what a mismatch actually means, and when MD5 is fine versus dangerously broken. This post walks through the entire workflow — generate, verify, and interpret — across macOS, Linux, Windows, and the browser.

What a checksum actually proves

A checksum is a fixed-length fingerprint of a file. Pass the same bytes in, get the same digest out. Change one bit anywhere — a flipped pixel, a stray newline, a corrupted block — and the digest changes completely. That property is what makes it useful for integrity verification.

How a hash function turns a file into a fingerprint

A cryptographic hash function reads the file in fixed-size blocks, mixes each block into an internal state, and outputs a digest of fixed length. SHA-256 always produces 256 bits (64 hex characters), regardless of whether the input is a 4-byte text file or a 50 GB Blu-ray rip. The function is deterministic, one-way, and avalanche-sensitive: a one-bit change in the input flips roughly half the output bits.

$ printf '%s' "hello" | shasum -a 256
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824  -

$ printf '%s' "Hello" | shasum -a 256
185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969  -

One byte changed (h → H), and the digest is unrecognizable. That's the property a checksum exploits.
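You can measure the avalanche effect directly. Here is a minimal Python sketch, using only the standard library's hashlib, that counts how many of the 256 output bits differ between the two digests above. The helper name bit_difference is made up for this example.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between the SHA-256 digests of a and b."""
    da = hashlib.sha256(a).digest()
    db = hashlib.sha256(b).digest()
    # XOR each pair of bytes; set bits in the result mark positions that differ.
    return sum(bin(x ^ y).count("1") for x, y in zip(da, db))


if __name__ == "__main__":
    print(hashlib.sha256(b"hello").hexdigest())
    print(hashlib.sha256(b"Hello").hexdigest())
    print(bit_difference(b"hello", b"Hello"), "of 256 bits differ")
```

For a well-behaved hash you should see a count somewhere near 128, which is what "roughly half the output bits" means in practice.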

What checksums catch — and what they don't

A matching checksum proves that the file you have is byte-for-byte identical to the one the publisher hashed, which rules out accidental corruption in transit and silent disk errors at rest. It does not prove the publisher is who they claim to be. For that you need a digital signature, where the publisher signs the file's digest with their private key and you verify the signature with their public key. The iKit Hash Generator makes the integrity step easy; authenticity is a separate problem solved by GPG, code signing, or HTTPS with a trusted certificate authority.

A worked example

When the Linux Mint website was breached in February 2016, attackers pointed the download link at a backdoored ISO. Anyone who compared their download against the project's published checksum could catch the substitution; anyone who skipped the step risked installing a backdoored OS. Checksums are cheap insurance, and the cost of skipping them is asymmetric: small effort, large potential damage.

Generate a checksum on every operating system

You don't need to install anything. Every major OS has shipped with these tools for more than a decade.

macOS and Linux

macOS ships shasum and md5; most Linux distributions ship sha256sum and md5sum from GNU coreutils, and usually shasum as well:

# SHA-256 of a single file
shasum -a 256 ubuntu-24.04.iso
# 9d6c9ad...e3f2  ubuntu-24.04.iso

# MD5
md5 ubuntu-24.04.iso          # macOS
md5sum ubuntu-24.04.iso       # Linux

# OpenSSL works too, but it computes one digest per invocation
openssl dgst -sha256 ubuntu-24.04.iso
openssl dgst -md5 ubuntu-24.04.iso

The -a flag selects the algorithm: 1 for SHA-1 (deprecated), 256 for SHA-256, 512 for SHA-512. On Linux, sha256sum is the GNU coreutils equivalent: it produces the same digest in the same "hash, two spaces, filename" layout.
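If you want several digests of the same file while reading it only once, a short script can do it. The following is a sketch in Python, assuming the standard hashlib module; hash_file and its parameters are illustrative names, not part of any tool mentioned here.

```python
import hashlib


def hash_file(path, algorithms=("sha256", "md5"), chunk_size=1024 * 1024):
    """Return {algorithm: hex digest} for one file, reading it only once."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large ISO never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

Calling hash_file("ubuntu-24.04.iso")["sha256"] should give the same hex string as shasum -a 256 on the same file, and the script runs unchanged on macOS, Linux, and Windows.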

Windows

Windows ships CertUtil (since Windows 7) and PowerShell's Get-FileHash (since PowerShell 4). Both work without admin rights or third-party downloads:

# PowerShell — recommended
Get-FileHash -Algorithm SHA256 .\setup.exe
Get-FileHash -Algorithm MD5    .\setup.exe

# CertUtil — works on every Windows since 7
certutil -hashfile setup.exe SHA256
certutil -hashfile setup.exe MD5

CertUtil wraps the digest in a header line and a status line, and older Windows versions print it with spaces between bytes, so extract the digest line and strip whitespace before comparing. PowerShell's output is cleaner: an object with Algorithm, Hash, and Path properties that you can compare with -eq or pipe into Compare-Object.
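If you capture CertUtil's output in a script, the cleanup step might look like the Python sketch below. The sample text is illustrative only (the exact wording of the header and status lines varies between Windows versions), and extract_digest is a hypothetical helper name.

```python
import re

# Illustrative sample only: real CertUtil output differs between
# Windows versions in wording and in whether bytes are space-separated.
SAMPLE_OUTPUT = """SHA256 hash of setup.exe:
2c f2 4d ba 5f b0 a3 0e 26 e8 3b 2a c5 b9 e2 9e 1b 16 1e 5c 1f a7 42 5e 73 04 33 62 93 8b 98 24
CertUtil: -hashfile command completed successfully.
"""


def extract_digest(certutil_output: str, hex_length: int = 64) -> str:
    """Return the lowercase hex digest found in CertUtil-style output."""
    for line in certutil_output.splitlines():
        # Drop all whitespace, then keep the line only if it is pure hex
        # of the expected length (64 for SHA-256, 32 for MD5).
        candidate = re.sub(r"\s+", "", line).lower()
        if len(candidate) == hex_length and re.fullmatch(r"[0-9a-f]+", candidate):
            return candidate
    raise ValueError("no digest line found")
```

Matching on "pure hex of the expected length" rather than on a line number keeps the helper working whether or not the bytes are space-separated.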

In the browser without uploading

The Web Crypto API (SubtleCrypto.digest) hashes files entirely on the client. No server touches the bytes. Here's the minimal working version:

async function sha256(file) {
  const buffer = await file.arrayBuffer();
  const digest = await crypto.subtle.digest("SHA-256", buffer);
  return Array.from(new Uint8Array(digest))
    .map(b => b.toString(16).padStart(2, "0"))
    .join("");
}

const input = document.querySelector("input[type=file]");
input.addEventListener("change", async () => {
  const hash = await sha256(input.files[0]);
  console.log(hash);
});

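crypto.subtle.digest needs the whole input in one buffer and has no incremental API, so for files larger than a few hundred megabytes, read the file in chunks (for example with file.stream()) and feed them to an incremental hash implementation in JavaScript or WebAssembly. The free iKit Hash Generator handles streaming and supports MD5, SHA-1, SHA-256, SHA-512, and CRC32, all in the browser with nothing uploaded.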

Verify a downloaded file step by step

Generating a hash is half the job. The other half is comparing it correctly to the official one.

Step 1: Find the publisher's checksum

Reputable projects publish their checksums in one of three ways: a SHA256SUMS text file alongside the download, a hash printed on the release page (always over HTTPS), or a detached GPG signature (.sig or .asc) that covers the checksum file. Avoid relying on a checksum that is hosted only on the same mirror as the file: if the mirror is compromised, both the file and its checksum can be swapped together. Always pull the checksum from the canonical, HTTPS-protected origin.

Step 2: Compare byte-for-byte, not by eyeball

A SHA-256 digest is 64 hex characters. Eyeballing them invites mistakes. Use a tool that compares the strings programmatically:

# macOS / Linux — exit code 0 means match
echo "9d6c9ad...e3f2  ubuntu-24.04.iso" | shasum -a 256 -c

# Windows PowerShell
$expected = "9d6c9ad...e3f2"
(Get-FileHash -Algorithm SHA256 .\ubuntu-24.04.iso).Hash -eq $expected

The -c flag of shasum reads a SHA256SUMS file and verifies every entry, exiting non-zero if anything fails to match. This is what you want in CI pipelines and release scripts.
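On a machine without shasum, the same check can be approximated in a few lines of Python. This is a sketch, not a drop-in replacement: it assumes the common "digest, two spaces, filename" layout (with an optional leading * on the name), resolves names relative to the checksum file, and does not handle filenames with leading or trailing spaces. The function names are made up for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks and return the lowercase hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_sums(sums_file: str) -> list[str]:
    """Return the names of listed files that are missing or do not match."""
    sums_path = Path(sums_file)
    failures = []
    for line in sums_path.read_text().splitlines():
        if not line.strip():
            continue
        # Split "<hex digest>  <name>"; a leading "*" on the name marks
        # binary mode in the GNU format and is not part of the filename.
        expected, _, name = line.partition(" ")
        name = name.strip().lstrip("*")
        target = sums_path.parent / name
        if not target.is_file() or sha256_of(target) != expected.lower():
            failures.append(name)
    return failures
```

In a release script you would typically call verify_sums("SHA256SUMS") and exit non-zero if the returned list is not empty, mirroring what shasum -c does.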

Step 3: Interpret a mismatch

If the hashes differ, the file was changed somewhere between the publisher and your disk. Common causes, in order of likelihood:

  • The download was truncated. Re-download and try again.
  • A man-in-the-middle on a coffee-shop Wi-Fi rewrote the bytes (rare with HTTPS, more common with mirror redirects).
  • The publisher updated the file but forgot to update the checksum.
  • The file was actually tampered with.

Re-download once. If the second hash still mismatches, do not run the file until you've confirmed the official hash from a second trusted channel, for example the project's GitHub releases page, their Mastodon or X announcement, or a signed SHA256SUMS.gpg.

MD5 vs SHA-1 vs SHA-256 vs BLAKE3

Not all hashes are equal, and the popular advice "use SHA-256 for everything" hides nuance. Here's what to actually pick:

Algorithm | Output size | Status                                  | Speed (1 GB) | Best for
MD5       | 128 bits    | Broken for collisions (since 2004)      | 2.1 s        | Cache keys, dedup, error checking
SHA-1     | 160 bits    | Broken for collisions (SHAttered, 2017) | 3.4 s        | Legacy systems only; avoid
SHA-256   | 256 bits    | Secure, NIST-recommended                | 4.0 s        | Downloads, signatures, blockchains
SHA-512   | 512 bits    | Secure, faster than SHA-256 on 64-bit   | 2.7 s        | Servers and 64-bit Linux ISOs
BLAKE3    | 256 bits    | Secure, modern parallel design          | 0.4 s        | Bulk hashing, content-addressed storage

Speeds are approximate, measured on a 2024 M3 MacBook Pro. BLAKE3 is the fastest secure choice on modern multicore hardware, but it isn't yet ubiquitous in package distributions, so most projects still publish SHA-256.
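Numbers like these depend heavily on the CPU and the library build, so it's worth measuring on your own machine. The sketch below times Python's hashlib implementations on an in-memory buffer; it is not the method used for the table above, and it leaves out BLAKE3 because that isn't in Python's standard library. The function name time_hashes is illustrative.

```python
import hashlib
import time


def time_hashes(size_mb: int = 64, algorithms=("md5", "sha1", "sha256", "sha512")):
    """Return {algorithm: seconds} to hash size_mb MiB of zero bytes in memory."""
    chunk = bytes(1024 * 1024)  # 1 MiB of zeros, reused for every update
    timings = {}
    for name in algorithms:
        h = hashlib.new(name)
        start = time.perf_counter()
        for _ in range(size_mb):
            h.update(chunk)
        h.digest()
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    for name, seconds in time_hashes().items():
        print(f"{name:8s} {seconds:.3f} s")
```

Hashing from memory deliberately excludes disk speed, which is often the real bottleneck when you verify a large download.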

Why MD5 is broken — and when it's still fine

MD5 was designed in 1991 and published as RFC 1321 in 1992. In 2004, researchers found a collision: two different 128-byte inputs producing the same MD5 digest. In 2008, researchers used a chosen-prefix collision attack to forge a rogue CA certificate that browsers trusted (Stevens et al., 2009). Today, generating a basic MD5 collision takes seconds on an ordinary laptop.

That said, MD5 is still useful where accidental corruption is the threat:

  • ETag headers and HTTP cache validation.
  • Deduplicating files in a backup tool.
  • Detecting bit rot during a tape restore.
  • Quick sanity checks on internal artifacts.

It is not safe when an attacker controls the input — never use MD5 to verify code, installers, or anything where someone could craft a malicious file with a colliding hash.
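As an example of the "still fine" category, here is a rough Python sketch of MD5-based deduplication. It assumes nobody is deliberately crafting colliding files in the directory being scanned; find_duplicates is a hypothetical helper, not part of any backup tool.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: str) -> list[list[Path]]:
    """Group files under root whose contents share the same MD5 digest."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # usedforsecurity=False (Python 3.9+) signals that MD5 is only
            # a content fingerprint here, not a security check.
            h = hashlib.md5(usedforsecurity=False)
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1024 * 1024), b""):
                    h.update(chunk)
            groups[h.hexdigest()].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

If the files could come from an untrusted source, swapping hashlib.md5 for hashlib.sha256 is a one-line change and removes the collision concern.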

SHA-1 is also dead — even Git is moving on

SHA-1 fell to a practical collision in 2017 (the SHAttered attack by CWI Amsterdam and Google) and to a chosen-prefix collision in 2020 (the "SHA-1 is a Shambles" paper). Major browsers stopped accepting SHA-1 certificates in 2017. Git is in the middle of a multi-year migration to SHA-256 object IDs. If you control the system, switch.

SHA-256 is the modern default

SHA-256 is part of the SHA-2 family standardized in NIST FIPS 180-4. No practical attack exists against it as of 2026; the best published attacks only break reduced-round variants, not the full 64-round function. Hardware acceleration (Intel SHA Extensions, ARMv8 Crypto Extensions) makes it fast enough for almost any workload. When in doubt, use SHA-256.

Common mistakes that break checksum verification

Even when the cryptography is sound, the workflow has sharp edges.

Hashing the wrong bytes

The most common cause of a mismatch is hashing a slightly different file. Watch out for:

  • Editors that add a trailing newline on save.
  • Downloads delivered with Content-Encoding: gzip and decompressed transparently by your browser.
  • Line-ending conversions on Windows (\n → \r\n).
  • BOM markers prepended by Notepad to UTF-8 text files.

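If you suspect transparent decompression, fetch the file with a command-line client instead. curl leaves the response body undecoded unless you pass --compressed (and --raw disables HTTP decoding entirely), and wget --compression=none tells wget not to negotiate compression.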
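To see how small these changes are and how completely they alter the digest, here is a short Python demonstration. The sample text and labels are made up for illustration.

```python
import hashlib

original = "hello\n".encode("utf-8")

# Each variant differs from the original by only a byte or three.
variants = {
    "original": original,
    "extra trailing newline": original + b"\n",
    "CRLF line endings": original.replace(b"\n", b"\r\n"),
    "UTF-8 BOM prepended": b"\xef\xbb\xbf" + original,
}

digests = {label: hashlib.sha256(data).hexdigest() for label, data in variants.items()}
for label, digest in digests.items():
    # Print only a prefix of each digest; that is enough to see they differ.
    print(f"{label:24s} {digest[:16]}")
```

All four digests come out different, which is why a text file that "looks the same" in an editor can still fail verification.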

Comparing the wrong format

Some tools output uppercase hex (A4F2…), others lowercase (a4f2…). They're equivalent but won't match a strict string comparison. Lowercase is the usual convention for SHA256SUMS files, so normalize before comparing:

expected=$(echo "$1" | tr '[:upper:]' '[:lower:]')
actual=$(shasum -a 256 "$2" | cut -d' ' -f1)
[ "$expected" = "$actual" ] && echo "ok" || echo "MISMATCH"
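The same normalization in Python might look like the sketch below. It also strips embedded whitespace, which helps with tools that print space-separated bytes, and rejects anything that isn't hex. Both function names are illustrative.

```python
import string


def normalize_digest(text: str) -> str:
    """Lowercase a hex digest and remove all whitespace, or raise ValueError."""
    cleaned = "".join(text.split()).lower()
    if not cleaned or any(c not in string.hexdigits for c in cleaned):
        raise ValueError(f"not a hex digest: {text!r}")
    return cleaned


def digests_match(expected: str, actual: str) -> bool:
    """Compare two hex digests, ignoring case and whitespace differences."""
    return normalize_digest(expected) == normalize_digest(actual)
```

Rejecting non-hex input up front catches a common copy-paste slip, such as pasting the filename along with the hash.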

Trusting the checksum file blindly

A SHA256SUMS file delivered over plain HTTP from the same compromised mirror as the binary is worthless — an attacker replaces both. Always pull the checksum file over HTTPS from the project's primary domain, or verify a SHA256SUMS.asc GPG signature against a key you've cross-checked. If you're storing files in a JSON manifest with their hashes, validate the structure of the manifest with a tool like the iKit JSON Decoder before trusting the values inside.

Using the wrong algorithm for the wrong job

Don't use a fast hash like MD5 to store passwords (that's what bcrypt and Argon2 are for), and don't use bcrypt to verify a download: it's deliberately slow and salts each hash randomly, so the same input never yields the same output twice. For randomly generated secrets like API keys and reset tokens, use a CSPRNG-backed tool like the iKit Password Generator, and never derive them from a hash function applied to predictable input.
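If you generate such secrets in code rather than with a web tool, Python's standard secrets module draws from the operating system's CSPRNG. A minimal sketch, with new_api_key as a hypothetical helper name:

```python
import secrets


def new_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    print(new_api_key())
```

Thirty-two random bytes give 256 bits of entropy, which no amount of hashing can add to a predictable input such as a username or timestamp.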
