Base64 is the duct tape of the internet — the universal way to move binary data through text-only channels. Email attachments since the early 1990s, JWT tokens, data URIs in CSS, HTTP Basic Auth headers, certificates in PEM format. Whenever a system that handles bytes meets a protocol that expects printable characters, Base64 is the layer between them.

What Base64 actually does

Base64 takes a stream of arbitrary bytes and re-encodes it using a 64-character alphabet that survives transport through ASCII-only systems. Three input bytes (24 bits) become four output characters (4 × 6 bits = 24 bits). When the input length isn't a multiple of three, the encoder pads with one or two = characters so the output length is always a multiple of four. The cost is fixed: the encoded form is roughly 33% larger than the raw bytes.
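
To make the 3-bytes-to-4-characters grouping concrete, here is a minimal hand-rolled encoder. It is an illustrative sketch only: the function name encodeBase64 is made up for this example, and real code should call the platform's built-in encoder rather than reimplement it.

```javascript
// Illustrative only: shows how three bytes become four 6-bit indices.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function encodeBase64(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i; // 1, 2, or at least 3 bytes left
    // Pack up to three bytes into one 24-bit group; missing bytes count as 0.
    const group =
      (bytes[i] << 16) | ((bytes[i + 1] ?? 0) << 8) | (bytes[i + 2] ?? 0);
    // Slice the group into four 6-bit indices into the alphabet.
    out += ALPHABET[(group >> 18) & 63];
    out += ALPHABET[(group >> 12) & 63];
    // Positions that carry no input bits are filled with "=" padding.
    out += remaining > 1 ? ALPHABET[(group >> 6) & 63] : "=";
    out += remaining > 2 ? ALPHABET[group & 63] : "=";
  }
  return out;
}

const encoder = new TextEncoder();
console.log(encodeBase64(encoder.encode("Man"))); // "TWFu"
console.log(encodeBase64(encoder.encode("Ma")));  // "TWE="
console.log(encodeBase64(encoder.encode("M")));   // "TQ=="
```

The three sample inputs cover the three cases: a full group with no padding, two leftover bytes with one =, and one leftover byte with two.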

This is not encryption, not compression, and not a hash. Base64 is fully reversible by anyone who reads it. Treat it as a transport encoding, never as a security mechanism. The number of "I Base64-encoded the password" security bugs in production is genuinely startling.

A brief history

The encoding predates the modern internet. RFC 989 (1987) and RFC 1421 (1993) defined Privacy-Enhanced Mail (PEM), which used a 64-character ASCII subset to embed RSA-signed messages and certificates in email bodies. RFC 1521 (1993, MIME) generalized the encoding for all email attachments, which is where the line-wrap-at-76-chars convention comes from. Today the canonical reference is RFC 4648 (2006), which unified the family: §4 covers standard Base64, §5 covers the URL-safe variant (Base64URL), and §6–§8 cover Base32, Base32hex, and Base16/hex.

Padding: keep it or strip it

The = trailing characters serve one purpose: they let the decoder know how many bytes the original input had when the count wasn't divisible by three. Modern APIs increasingly strip the padding because the length is already implicit and the = is awkward in URLs (it has its own meaning in query strings) and JSON (no semantic baggage, but it's noise).

The toggle in this tool exists because both conventions are alive in production. RFC 7519 (JSON Web Tokens) and RFC 8037 (OKP JWKs) require unpadded Base64URL. RFC 4648 §4 standard Base64 keeps the padding by default. When you encode something to drop into a URL parameter, use unpadded. When you encode something to drop into an HTTP Authorization: Basic header (RFC 7617), keep the padding.
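
Because the padding can be recomputed from the length alone, stripping and restoring it is mechanical. The sketch below assumes two hypothetical helpers, stripPadding and restorePadding, which are not part of any library and are not necessarily how this tool implements its toggle.

```javascript
// Hypothetical helpers; the names are illustrative, not a real API.
function stripPadding(b64) {
  return b64.replace(/=+$/, "");
}

function restorePadding(b64) {
  const rem = b64.length % 4;
  // A remainder of 1 can never occur: one character carries only 6 bits,
  // which is not enough for a whole byte.
  if (rem === 1) throw new Error("invalid Base64 length");
  return rem === 0 ? b64 : b64 + "=".repeat(4 - rem);
}

const padded = btoa("hi");           // "aGk="
const bare = stripPadding(padded);   // "aGk"
console.log(restorePadding(bare) === padded); // true
```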

Standard vs URL-safe — same bytes, different alphabet

Standard Base64 uses A-Z a-z 0-9 + /. The + and / characters are not safe in URLs: + is read as a space when a server decodes a query string as form data (application/x-www-form-urlencoded), and / is the path separator. So RFC 4648 §5 defines the URL-safe variant: - replaces + and _ replaces /. The underlying bytes are identical; only the encoding alphabet changes.

The decoder in this tool auto-detects which alphabet you pasted. If it sees a - or _, it treats the input as URL-safe and translates those characters back to + and / before calling atob. If it sees a + or /, it treats the input as standard. If it sees neither (the input is just letters and digits), it defaults to standard, which is safe because both alphabets agree on [A-Za-z0-9].
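
The detection logic can be sketched roughly as follows. The function names are hypothetical, and rejecting input that mixes both alphabets is an assumption made for this example; the text above does not say how the tool handles that case.

```javascript
// Sketch of alphabet detection; names and the mixed-input policy are assumptions.
function toStandardAlphabet(input) {
  const looksUrlSafe = /[-_]/.test(input);
  const looksStandard = /[+/]/.test(input);
  if (looksUrlSafe && looksStandard) {
    throw new Error("input mixes standard and URL-safe alphabets");
  }
  // Letters and digits are shared, so only - and _ need translating.
  return looksUrlSafe
    ? input.replace(/-/g, "+").replace(/_/g, "/")
    : input;
}

function decodeToBytes(input) {
  const binary = atob(toStandardAlphabet(input));
  // atob returns one character per byte; map each back to its numeric value.
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

console.log(decodeToBytes("-_8=")); // bytes 251, 255 (URL-safe input)
console.log(decodeToBytes("+/8=")); // bytes 251, 255 (standard input)
```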

Where you'll actually meet Base64

HTTP Basic Auth (RFC 7617): the Authorization header is Basic ${base64(username:password)}. The Base64 here is for transport, not secrecy — anyone who sees the header can decode it. Use HTTPS.
JSON Web Tokens (RFC 7519): three Base64URL-encoded segments separated by dots. The first two segments are JSON (header + payload) that you can decode without a key. Only verifying the signature requires the key.
Data URIs (RFC 2397): data:image/png;base64,iVBORw0KGgoAAA… embeds a file directly in HTML or CSS. Useful for small icons, terrible for large images: the 33% size overhead bloats the containing document, and the embedded file cannot be cached separately.
PEM certificates (RFC 7468): the -----BEGIN CERTIFICATE----- blocks you see in TLS configs are Base64-wrapped DER bytes with line breaks every 64 chars.
Email attachments (MIME, RFC 2045 §6.8): binary attachments in email have traveled as Base64-encoded bytes since the 1990s, line-wrapped at 76 characters.
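
Two of the cases above, Basic Auth and JWTs, are easy to demonstrate in a few lines. In this sketch the credentials, the token contents, and the helper names base64UrlEncode and base64UrlDecode are all made up for illustration; the token's third segment is a placeholder, not a real signature.

```javascript
// Hypothetical helpers for unpadded Base64URL over UTF-8 text.
function base64UrlEncode(text) {
  const bytes = new TextEncoder().encode(text);
  const binary = Array.from(bytes, (b) => String.fromCharCode(b)).join("");
  return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

function base64UrlDecode(segment) {
  const b64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// HTTP Basic Auth with made-up credentials: trivially reversible.
const authHeader = "Basic " + btoa("alice:s3cret");
console.log(atob(authHeader.slice("Basic ".length))); // "alice:s3cret"

// A made-up, unsigned token built only to show the three-segment layout.
const token = [
  base64UrlEncode(JSON.stringify({ alg: "HS256", typ: "JWT" })),
  base64UrlEncode(JSON.stringify({ sub: "alice" })),
  "signature-placeholder",
].join(".");

// Header and payload decode without any key.
const [headerSegment, payloadSegment] = token.split(".");
console.log(JSON.parse(base64UrlDecode(headerSegment)).alg);  // "HS256"
console.log(JSON.parse(base64UrlDecode(payloadSegment)).sub); // "alice"
```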

Quick debugging recipes

Most production "Base64 problem" tickets resolve to one of a handful of failure modes. When a decoder rejects the input, check whether the source produced URL-safe characters and the destination expected standard alphabet (or vice versa). When a JWT debugger tells you the signature is invalid but the payload looks correct, the encoded header almost always lost or gained a = in a careless copy-paste; the signature is computed over the exact encoded header and payload text, so a single changed character invalidates it. When a binary file decodes to garbage, you probably ran the bytes through a UTF-8 decoder somewhere along the path: Base64 is not UTF-8, and not all decoded bytes are text. The fatal: true flag on TextDecoder, which this tool uses, catches that case explicitly rather than masking it with replacement characters.
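
The difference between the default TextDecoder and the fatal: true variant is easiest to see with bytes that are not valid UTF-8. This is a small sketch using a made-up two-byte input (0xFF 0xFE); it is not taken from the tool's source.

```javascript
// "//4=" decodes to the bytes 0xFF 0xFE, which are not valid UTF-8.
const invalidBytes = Uint8Array.from(atob("//4="), (ch) => ch.charCodeAt(0));

// Default mode silently substitutes U+FFFD replacement characters.
const lenient = new TextDecoder("utf-8").decode(invalidBytes);
console.log(lenient === "\uFFFD\uFFFD"); // true

// fatal: true surfaces the problem as a TypeError instead.
let strictError = null;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalidBytes);
} catch (err) {
  strictError = err;
}
console.log(strictError instanceof TypeError); // true
```

If the strict decode throws, the payload is binary and should be handled as bytes, for example offered as a file download rather than displayed as text.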

The 33% tax, and what alternatives buy you

Every encoding that turns binary into text pays a size tax. Base64 maps 3 bytes (24 bits) into 4 characters (24 bits of information across 4 × 8-bit code units = 32 bits on the wire), so the floor is exactly 33.33% overhead. Add padding and the worst case becomes 33.33% + up to 2 chars. Add MIME line-wrapping at 76 characters and you also pay 2 chars per 76 for the CRLF. So a 1 MB binary file lands at roughly 1.37 MB Base64-encoded with MIME wrap, or about 1.33 MB without wrap.
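
The arithmetic above can be checked with a short calculation. The function name base64Length and its options are hypothetical, 1 MB is taken as 1,000,000 bytes, and the sketch assumes a CRLF after every 76-character line including the last one.

```javascript
// Hypothetical helper: number of output characters for a given input size.
function base64Length(byteCount, { padded = true, mimeWrap = false } = {}) {
  const chars = padded
    ? 4 * Math.ceil(byteCount / 3)      // always a multiple of four
    : Math.ceil((byteCount * 4) / 3);   // padding characters dropped
  if (!mimeWrap) return chars;
  // Assumption: two bytes (CRLF) follow every line of up to 76 characters.
  const lines = Math.ceil(chars / 76);
  return chars + 2 * lines;
}

console.log(base64Length(1_000_000));                     // 1333336
console.log(base64Length(1_000_000, { mimeWrap: true })); // 1368424
console.log(base64Length(1, { padded: false }));          // 2
```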

Why is Base64 the dominant choice when alternatives exist? Base32 (RFC 4648 §6) is more human-friendly — no case sensitivity, no easily-confused glyphs — but inflates to 60% overhead. Base85/ASCII85 (used in PDF and Git diffs) drops to 25% overhead but uses characters like !, ", and < that don't survive most transport layers. Base122 squeezes further but assumes UTF-8 transport end-to-end. Base64 sits at a Goldilocks point: 33% overhead, an alphabet that fits in 7-bit ASCII, and four decades of universal tooling support. The economics rarely favor anything else.

Compression is the better lever when size matters. Gzipping a 1 MB binary down to 300 KB and then Base64-encoding gets you to 400 KB on the wire, far better than the 1.33 MB you'd get from Base64-only. The order matters: compress, then encode. Encoding first works against gzip: the same byte sequence encodes differently depending on its offset modulo three, so repeated patterns in the original data stop looking repeated and the compressor recovers far less.

UTF-8 and the btoa trap

The browser's built-in btoa() function only accepts strings whose characters all have code points in the range 0–255 (one byte each). Calling btoa('日本語') throws InvalidCharacterError. The correct path is TextEncoder → bytes → binary string → btoa, which is what this tool does internally. If you ever see "InvalidCharacterError" while encoding Base64 in JavaScript, you're calling btoa directly on a string that contains characters above U+00FF. The fix is one line: encode to UTF-8 first.
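
The TextEncoder → bytes → binary string → btoa path looks roughly like this. The helper names encodeUtf8ToBase64 and decodeBase64ToUtf8 are made up for this sketch, and the tool's internal code may differ in detail.

```javascript
// Hypothetical helpers for UTF-8-safe Base64 in the browser (or modern Node).
function encodeUtf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text); // string -> UTF-8 bytes
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b); // one char per byte
  return btoa(binary);
}

function decodeBase64ToUtf8(b64) {
  const bytes = Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));
  // fatal: true rejects byte sequences that are not valid UTF-8.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

const encoded = encodeUtf8ToBase64("日本語");
console.log(encoded);                     // "5pel5pys6Kqe"
console.log(decodeBase64ToUtf8(encoded)); // "日本語"
```

Building the binary string in a loop avoids spreading a large byte array into String.fromCharCode, which can exceed the engine's argument limit on big inputs.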

Privacy: nothing leaves your browser

Server-side Base64 tools can log the input you typed via load balancer logs, request mirrors, application logs, and any analytics middleware that captures POST bodies. That matters when the input is a password (Basic Auth headers), a JWT payload, a draft certificate, or a data URI of a sensitive screenshot. This tool runs the entire encode/decode pipeline locally: the Web platform's btoa and atob functions are part of every modern browser and need no network call. The recent-inputs dropdown is also privacy-preserving: it remembers your input and the chosen mode (encode-padded, encode-unpadded, decode), and recomputes the output on recall. The output itself is never written to disk.
