Binary to Text: How Binary Numbers Represent Characters
Binary to text conversion isn't magic — it's a lookup table. ASCII, Unicode, UTF-8, and how computers turn 0s and 1s into the characters you read every day.
Binary is base-2 arithmetic. Text is a sequence of characters. Converting between them requires a character encoding standard — a table that maps numbers to characters. Understanding that table is what makes “binary to text” conversion go from mysterious to obvious.
Use the Number Base Converter on this site to convert between binary, decimal, hexadecimal, and octal in real time.
Binary basics
A bit is a single binary digit: 0 or 1. A byte is 8 bits. With 8 bits, you can represent 2⁸ = 256 distinct values (0 through 255 in decimal).
| Binary | Decimal | Hex |
|---|---|---|
| 00000000 | 0 | 0x00 |
| 00000001 | 1 | 0x01 |
| 01000001 | 65 | 0x41 |
| 01100001 | 97 | 0x61 |
| 11111111 | 255 | 0xFF |
The decimal value of a binary number is calculated by multiplying each bit by the power of 2 corresponding to its position and summing:
01000001 (binary)
= 0×2⁷ + 1×2⁶ + 0×2⁵ + 0×2⁴ + 0×2³ + 0×2² + 0×2¹ + 1×2⁰
= 0 + 64 + 0 + 0 + 0 + 0 + 0 + 1
= 65 (decimal)
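The positional sum above can be checked with a one-liner (a sketch; Python's built-in `int(s, 2)` does the same thing):

```python
bits = '01000001'

# multiply each bit by its positional power of 2 and sum
value = sum(int(b) * 2**i for i, b in enumerate(reversed(bits)))
print(value)         # 65
print(int(bits, 2))  # 65, using the built-in base-2 parser
```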
Binary, decimal, octal (base-8), and hexadecimal (base-16) are just different notations for the same underlying numbers. The number 65 is always 65 — whether you write it as 01000001 in binary, 65 in decimal, 101 in octal, or 41 in hex.
ASCII: the original binary-to-text table
ASCII (American Standard Code for Information Interchange, 1963) defined the first widely adopted character encoding. It uses 7 bits to represent 128 characters: 33 non-printing control characters and 95 printable characters (uppercase and lowercase letters, digits, punctuation, and the space).
Key ASCII code points:
| Char | Decimal | Binary | Hex |
|---|---|---|---|
| Space | 32 | 00100000 | 0x20 |
| 0 | 48 | 00110000 | 0x30 |
| 9 | 57 | 00111001 | 0x39 |
| A | 65 | 01000001 | 0x41 |
| Z | 90 | 01011010 | 0x5A |
| a | 97 | 01100001 | 0x61 |
| z | 122 | 01111010 | 0x7A |
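The rows in the table above can be regenerated with a short loop (a sketch using Python's `format`):

```python
# print character, code point, binary, and hex for key ASCII code points
for code in (32, 48, 57, 65, 90, 97, 122):
    print(f"{chr(code)!r}  {code}  {format(code, '08b')}  0x{code:02X}")
```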
Notice the pattern: uppercase letters start at 65 (A), lowercase at 97 (a). The difference is exactly 32, which in binary is 00100000 — a single flipped bit (bit 5, the 2⁵ place). This is why XOR-ing an ASCII letter's code with 32 toggles between uppercase and lowercase.
```python
# Toggle case using XOR in Python
ord('A') ^ 32   # 97  (the code point of 'a')
ord('a') ^ 32   # 65  (the code point of 'A')
chr(65 ^ 32)    # 'a'
chr(97 ^ 32)    # 'A'
```
Converting ASCII text to binary
“Hello” in ASCII binary:
| Char | Decimal | Binary |
|---|---|---|
| H | 72 | 01001000 |
| e | 101 | 01100101 |
| l | 108 | 01101100 |
| l | 108 | 01101100 |
| o | 111 | 01101111 |
Full binary representation: 01001000 01100101 01101100 01101100 01101111
Binary-to-text conversion works in reverse: take each 8-bit group, convert to decimal, look up the ASCII table. The Number Base Converter automates this — paste binary and it shows you the decimal equivalent; you can then cross-reference with an ASCII table or use a dedicated ASCII/binary converter.
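The reverse lookup for the "Hello" example can be sketched in a couple of lines of Python:

```python
groups = '01001000 01100101 01101100 01101100 01101111'.split()

# each 8-bit group -> decimal -> character
decoded = ''.join(chr(int(g, 2)) for g in groups)
print(decoded)  # Hello
```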
Extended ASCII and the 128–255 range
ASCII only uses 7 bits (0–127). The 8th bit, which can push values to 128–255, is used differently depending on the code page. IBM Code Page 437 (used in DOS) put box-drawing characters in this range. Windows-1252 (common in Western European software) put accented characters like é, ñ, ü.
This is where encoding problems begin. A file created on a system using Windows-1252 stores é as the single byte 0xE9. On a system using ISO-8859-1, 0xE9 is also é. But to a UTF-8 decoder, a lone 0xE9 is an invalid byte (UTF-8 uses multi-byte sequences for code points above 127), so it shows a replacement character (�). The reverse mistake — reading the UTF-8 bytes 0xC3 0xA9 as Windows-1252 — produces the classic mojibake Ã©.
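Python's codec support makes the mismatch easy to reproduce (a sketch):

```python
raw = 'é'.encode('windows-1252')               # b'\xe9' -- a single byte
print(raw.decode('latin-1'))                   # é: same byte, same meaning
print(raw.decode('utf-8', errors='replace'))   # �: a lone 0xE9 is invalid UTF-8

# reading UTF-8 bytes as Windows-1252 produces classic mojibake
print('é'.encode('utf-8').decode('windows-1252'))  # Ã©
```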
The fix is Unicode.
Unicode: one encoding for all text
Unicode assigns a unique code point to every character in every human writing system — over 140,000 characters as of Unicode 15.1. Code points are written as U+ followed by a hex number:
| Character | Code point | Unicode name |
|---|---|---|
| A | U+0041 | LATIN CAPITAL LETTER A |
| é | U+00E9 | LATIN SMALL LETTER E WITH ACUTE |
| 中 | U+4E2D | CJK UNIFIED IDEOGRAPH-4E2D |
| 🔥 | U+1F525 | FIRE |
| 💩 | U+1F4A9 | PILE OF POO |
Unicode is a standard, not an encoding. It defines what code points exist. The encoding (how to store code points as bytes) is a separate question. The dominant encoding is UTF-8.
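Code points are directly accessible in Python through `ord`, `chr`, and the standard-library `unicodedata` module:

```python
import unicodedata

print(hex(ord('é')))           # 0xe9
print(unicodedata.name('é'))   # LATIN SMALL LETTER E WITH ACUTE
print(chr(0x1F525))            # 🔥
print(unicodedata.name('🔥'))  # FIRE
```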
UTF-8: variable-width Unicode encoding
UTF-8 encodes Unicode code points as 1–4 bytes per character:
| Code point range | UTF-8 byte count | Byte structure |
|---|---|---|
| U+0000–U+007F | 1 byte | 0xxxxxxx |
| U+0080–U+07FF | 2 bytes | 110xxxxx 10xxxxxx |
| U+0800–U+FFFF | 3 bytes | 1110xxxx 10xxxxxx 10xxxxxx |
| U+10000–U+10FFFF | 4 bytes | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |
The first 128 code points (U+0000–U+007F) are identical to ASCII — 1 byte each. This makes UTF-8 backwards-compatible with ASCII. A file of pure ASCII text is valid UTF-8.
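The table translates directly into code. As a sketch, here is a hand-rolled encoder for the four byte-count cases (real code should just call `str.encode('utf-8')`; `utf8_encode` is a name invented for this example):

```python
def utf8_encode(cp: int) -> bytes:
    """Encode a single Unicode code point as UTF-8, following the table above."""
    if cp < 0x80:                               # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                              # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | cp >> 6,
                      0x80 | cp & 0x3F])
    if cp < 0x10000:                            # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | cp >> 12,
                      0x80 | cp >> 6 & 0x3F,
                      0x80 | cp & 0x3F])
    return bytes([0xF0 | cp >> 18,              # 4 bytes: 11110xxx 10xxxxxx ...
                  0x80 | cp >> 12 & 0x3F,
                  0x80 | cp >> 6 & 0x3F,
                  0x80 | cp & 0x3F])

print(utf8_encode(0x41))     # b'A'
print(utf8_encode(0xE9))     # b'\xc3\xa9'
print(utf8_encode(0x1F525))  # b'\xf0\x9f\x94\xa5'
```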
UTF-8 encoding example
The é character (U+00E9, decimal 233) encodes to 2 bytes in UTF-8:
- 233 = 0xE9 — falls in the U+0080–U+07FF range, so 2 bytes are needed
- Binary of 233: 11101001 (padded to the template's 11 bits: 000 1110 1001)
- Template: 110xxxxx 10xxxxxx
- Fill in the bits: 11000011 10101001 → 0xC3 0xA9
So é is stored as two bytes: 0xC3 0xA9.
The emoji 🔥 (U+1F525, decimal 128293):
- Falls in the U+10000–U+10FFFF range — 4 bytes needed
- Binary of 128293: 11111010100100101 (padded to the template's 21 bits: 000 011111 010100 100101)
- Template: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
- Result: 0xF0 0x9F 0x94 0xA5
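Both worked examples are easy to verify from a Python shell:

```python
print('é'.encode('utf-8'))        # b'\xc3\xa9'
print('🔥'.encode('utf-8'))       # b'\xf0\x9f\x94\xa5'
print(len('🔥'.encode('utf-8')))  # 4 bytes for a single character
```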
Binary in programming
In Python
```python
# Integer to binary string:
bin(65)            # '0b1000001'
bin(65)[2:]        # '1000001' (strip the 0b prefix)
format(65, '08b')  # '01000001' (zero-padded to 8 bits)

# Binary string to integer:
int('01000001', 2)  # 65

# Text to binary (ASCII/UTF-8):
text = 'Hi'
binary = ' '.join(format(ord(c), '08b') for c in text)
# '01001000 01101001'

# Binary back to text:
binary_str = '01001000 01101001'
text = ''.join(chr(int(b, 2)) for b in binary_str.split())
# 'Hi'
```
In JavaScript
```javascript
// Decimal to binary:
(65).toString(2)                   // '1000001'
(65).toString(2).padStart(8, '0')  // '01000001'

// Binary string to decimal:
parseInt('01000001', 2)  // 65

// Character to binary:
'A'.charCodeAt(0).toString(2).padStart(8, '0')  // '01000001'

// Text to binary:
'Hi'.split('').map(c => c.charCodeAt(0).toString(2).padStart(8, '0')).join(' ')
// '01001000 01101001'
```
In C
```c
#include <stdio.h>

void print_binary(unsigned char byte) {
    for (int i = 7; i >= 0; i--) {
        printf("%d", (byte >> i) & 1);
    }
}

int main(void) {
    const char *text = "Hi";
    for (int i = 0; text[i]; i++) {
        print_binary(text[i]);
        printf(" ");
    }
    // Output: 01001000 01101001
    return 0;
}
```
Number base conversion: binary, octal, decimal, hex
Binary (base-2), octal (base-8), decimal (base-10), and hexadecimal (base-16) are all used in computing. Here’s how they relate:
| Decimal | Binary | Octal | Hex |
|---|---|---|---|
| 0 | 0000 | 0 | 0 |
| 8 | 1000 | 10 | 8 |
| 10 | 1010 | 12 | A |
| 15 | 1111 | 17 | F |
| 16 | 10000 | 20 | 10 |
| 255 | 11111111 | 377 | FF |
| 256 | 100000000 | 400 | 100 |
Binary to octal: group bits in threes from the right, convert each group.
11001010 → 11 001 010 → 3 1 2 → octal 312
Binary to hex: group bits in fours from the right, convert each group.
11001010 → 1100 1010 → C A → hex 0xCA
Octal to binary: convert each octal digit to 3 bits.
312 → 3=011, 1=001, 2=010 → 011001010
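In Python the same groupings fall out of the built-in formatters (a sketch):

```python
n = int('11001010', 2)       # 202

print(oct(n))                # 0o312 -- matches the 3-bit grouping
print(hex(n))                # 0xca  -- matches the 4-bit grouping
print(format(0o312, '09b'))  # 011001010: each octal digit expands to 3 bits
```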
The Number Base Converter handles all these conversions instantly — paste a binary, decimal, hex, or octal value and see all representations simultaneously.
Why hexadecimal is preferred over binary in practice
Raw binary strings like 11111111 11000000 10101000 00000001 are accurate but hard to read. Engineers prefer hexadecimal because:
- Each hex digit represents exactly 4 bits, so 1 byte = 2 hex characters
- 0xFF 0xC0 0xA8 0x01 (8 hex characters plus spaces) is far more scannable than the 32-bit binary equivalent
- IP addresses, memory addresses, color codes, and error codes are all conventionally shown in hex
The mental mapping: once you memorize 0-9 and A=10, B=11, C=12, D=13, E=14, F=15, you can convert single hex digits to 4-bit binary instantly. A = 1010, F = 1111, 5 = 0101. A hex byte like 0xAF = 1010 1111.
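That digit-by-digit mapping is easy to script (a sketch; `hex_to_binary` is a name invented for this example):

```python
def hex_to_binary(hex_str: str) -> str:
    # expand each hex digit to its 4-bit group
    return ' '.join(format(int(d, 16), '04b') for d in hex_str)

print(hex_to_binary('AF'))        # 1010 1111
print(hex_to_binary('FFC0A801'))  # 1111 1111 1100 0000 1010 1000 0000 0001
```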
Related tools
- Number Base Converter — convert between binary, decimal, hex, and octal
- Hash Generator — compute MD5, SHA-256, and other checksums
- Base64 Encoder/Decoder — encode binary data as ASCII-safe text
Related posts
- Binary Arithmetic — Addition, Subtraction, and Two's Complement — Learn how computers perform binary arithmetic: binary addition with carry, two's…
- Binary to Decimal — Convert Binary Numbers the Right Way — Binary to decimal conversion is foundational to understanding how computers stor…
- Bitmask and Bitwise Operations — Flags, Permissions, and Bit Manipulation — Bitmasks store multiple boolean flags in a single integer using bitwise AND, OR,…
Written by Mian Ali Khalid. Part of the Encoding & Crypto pillar.