Xerobit

Binary to Text: How Binary Numbers Represent Characters

Binary to text conversion isn't magic — it's a lookup table. ASCII, Unicode, UTF-8, and how computers turn 0s and 1s into the characters you read every day.

Mian Ali Khalid · 8 min read
Use the tool
Number Base Converter
Convert between binary, octal, decimal, hexadecimal, and text (UTF-8). Handles arbitrary lengths. Per-byte and per-character views.
Open Number Base Converter →

Binary is base-2 arithmetic. Text is a sequence of characters. Converting between them requires a character encoding standard — a table that maps numbers to characters. Understanding that table is what makes “binary to text” conversion go from mysterious to obvious.

Use the Number Base Converter on this site to convert between binary, decimal, hexadecimal, and octal in real time.

Binary basics

A bit is a single binary digit: 0 or 1. A byte is 8 bits. With 8 bits, you can represent 2⁸ = 256 distinct values (0 through 255 in decimal).

Binary     Decimal   Hex
00000000   0         0x00
00000001   1         0x01
01000001   65        0x41
01100001   97        0x61
11111111   255       0xFF

The decimal value of a binary number is calculated by multiplying each bit by the power of 2 corresponding to its position and summing:

01000001 (binary)
= 0×2⁷ + 1×2⁶ + 0×2⁵ + 0×2⁴ + 0×2³ + 0×2² + 0×2¹ + 1×2⁰
= 0 + 64 + 0 + 0 + 0 + 0 + 0 + 1
= 65 (decimal)
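
The positional expansion above can be checked with a couple of lines of Python (shifting by the place value instead of writing out the powers of 2):

```python
# Evaluate 01000001 by summing bit x place value, as in the expansion above
bits = '01000001'
value = sum(int(b) << (len(bits) - 1 - i) for i, b in enumerate(bits))
print(value)  # 65
```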

Binary, decimal, octal (base-8), and hexadecimal (base-16) are just different notations for the same underlying numbers. The number 65 is always 65 — whether you write it as 01000001 in binary, 65 in decimal, 101 in octal, or 41 in hex.
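
A quick sanity check in Python — the same number 65 printed in each notation with `format`:

```python
n = 65
print(format(n, 'b'))  # 1000001  (binary)
print(format(n, 'o'))  # 101      (octal)
print(format(n, 'x'))  # 41       (hexadecimal)
```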

ASCII: the original binary-to-text table

ASCII (American Standard Code for Information Interchange, 1963) defined the first widely adopted character encoding. It uses 7 bits to represent 128 characters: 33 non-printing control characters and 95 printable characters (uppercase and lowercase letters, digits, punctuation, and the space).

Key ASCII code points:

Char    Decimal   Binary     Hex
Space   32        00100000   0x20
0       48        00110000   0x30
9       57        00111001   0x39
A       65        01000001   0x41
Z       90        01011010   0x5A
a       97        01100001   0x61
z       122       01111010   0x7A

Notice the pattern: uppercase letters start at 65 (A), lowercase at 97 (a). The difference is exactly 32, which in binary is 00100000 — a single set bit (bit 5, with place value 2⁵ = 32). This is why XOR-ing an ASCII letter with 32 toggles between uppercase and lowercase.

# Toggle case using XOR in Python
ord('A') ^ 32  # 97, the code for 'a'
ord('a') ^ 32  # 65, the code for 'A'
chr(65 ^ 32)   # 'a'

Converting ASCII text to binary

“Hello” in ASCII binary:

Char   Decimal   Binary
H      72        01001000
e      101       01100101
l      108       01101100
l      108       01101100
o      111       01101111

Full binary representation: 01001000 01100101 01101100 01101100 01101111

Binary-to-text conversion works in reverse: take each 8-bit group, convert to decimal, look up the ASCII table. The Number Base Converter automates this — paste binary and it shows you the decimal equivalent; you can then cross-reference with an ASCII table or use a dedicated ASCII/binary converter.

Extended ASCII and the 128–255 range

ASCII only uses 7 bits (0–127). The 8th bit, which can push values to 128–255, is used differently depending on the code page. IBM Code Page 437 (used in DOS) put box-drawing characters in this range. Windows-1252 (common in Western European software) put accented characters like é, ñ, ü.

This is where encoding problems begin. A file created on a system using Windows-1252 contains byte 0xE9 for é. On a system using ISO-8859-1, 0xE9 is also é. But on a system expecting UTF-8, 0xE9 alone is an invalid byte (UTF-8 uses multi-byte sequences for code points above 127). Result: garbled text, typically shown as replacement characters (�) or mojibake (Ã©).
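
Python's codecs make this mismatch easy to reproduce — the same bytes decoded under different encodings:

```python
# UTF-8 bytes for 'é' misread as Latin-1 produce classic mojibake
raw = 'é'.encode('utf-8')              # b'\xc3\xa9'
print(raw.decode('latin-1'))           # Ã©
# The single byte 0xE9 is fine in a one-byte code page...
print(b'\xe9'.decode('windows-1252'))  # é
# ...but invalid on its own in UTF-8 (decoding it raises UnicodeDecodeError)
```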

The fix is Unicode.

Unicode: one encoding for all text

Unicode assigns a unique code point to every character in every human writing system — over 140,000 characters as of Unicode 15.1. Code points are written as U+ followed by a hex number:

Character   Code point   Unicode name
A           U+0041       LATIN CAPITAL LETTER A
é           U+00E9       LATIN SMALL LETTER E WITH ACUTE
中          U+4E2D       CJK UNIFIED IDEOGRAPH-4E2D
🔥          U+1F525      FIRE
💩          U+1F4A9      PILE OF POO

Unicode is a standard, not an encoding. It defines what code points exist. The encoding (how to store code points as bytes) is a separate question. The dominant encoding is UTF-8.

UTF-8: variable-width Unicode encoding

UTF-8 encodes Unicode code points as 1–4 bytes per character:

Code point range    UTF-8 byte count   Byte structure
U+0000–U+007F       1 byte             0xxxxxxx
U+0080–U+07FF       2 bytes            110xxxxx 10xxxxxx
U+0800–U+FFFF       3 bytes            1110xxxx 10xxxxxx 10xxxxxx
U+10000–U+10FFFF    4 bytes            11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

The first 128 code points (U+0000–U+007F) are identical to ASCII — 1 byte each. This makes UTF-8 backwards-compatible with ASCII. A file of pure ASCII text is valid UTF-8.
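
This compatibility is easy to verify in Python — encoding ASCII text with either codec yields the identical bytes:

```python
text = 'Hello'
# Pure ASCII text encodes to the same byte sequence under both encodings
print(text.encode('ascii') == text.encode('utf-8'))  # True
print(text.encode('utf-8'))                          # b'Hello'
```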

UTF-8 encoding example

The é character (U+00E9, decimal 233) encodes to 2 bytes in UTF-8:

  1. 233 = 0xE9, which falls in the U+0080–U+07FF range — 2 bytes needed
  2. Binary of 233: 11101001; padded to the template's 11 payload bits: 000 11101001
  3. Template: 110xxxxx 10xxxxxx
  4. Fill in the bits: 11000011 10101001 → 0xC3 0xA9

So é is stored as two bytes: 0xC3 0xA9.
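
One line of Python confirms the two bytes:

```python
# 'é' (U+00E9) encodes to the two UTF-8 bytes derived above
print('é'.encode('utf-8'))  # b'\xc3\xa9'
```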

The emoji 🔥 (U+1F525, decimal 128293):

  1. Falls in the U+10000–U+10FFFF range — 4 bytes needed
  2. Binary of 128293: 11111010100100101; padded to the template's 21 payload bits: 000 011111 010100 100101
  3. Template: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  4. Fill in the bits: 11110000 10011111 10010100 10100101 → 0xF0 0x9F 0x94 0xA5

Binary in programming

In Python

# Integer to binary string:
bin(65)        # '0b1000001'
bin(65)[2:]    # '1000001' (strip the 0b prefix)
format(65, '08b')  # '01000001' (zero-padded to 8 bits)

# Binary string to integer:
int('01000001', 2)  # 65

# Text to binary (ASCII/UTF-8):
text = 'Hi'
binary = ' '.join(format(ord(c), '08b') for c in text)
# '01001000 01101001'

# Binary back to text:
binary_str = '01001000 01101001'
text = ''.join(chr(int(b, 2)) for b in binary_str.split())
# 'Hi'

In JavaScript

// Decimal to binary:
(65).toString(2)        // '1000001'
(65).toString(2).padStart(8, '0')  // '01000001'

// Binary string to decimal:
parseInt('01000001', 2)  // 65

// Character to binary:
'A'.charCodeAt(0).toString(2).padStart(8, '0')  // '01000001'

// Text to binary:
'Hi'.split('').map(c => c.charCodeAt(0).toString(2).padStart(8, '0')).join(' ')
// '01001000 01101001'

In C

#include <stdio.h>

void print_binary(unsigned char byte) {
    for (int i = 7; i >= 0; i--) {
        printf("%d", (byte >> i) & 1);
    }
}

int main() {
    char *text = "Hi";
    for (int i = 0; text[i]; i++) {
        print_binary(text[i]);
        printf(" ");
    }
    // Output: 01001000 01101001
    return 0;
}

Number base conversion: binary, octal, decimal, hex

Binary (base-2), octal (base-8), decimal (base-10), and hexadecimal (base-16) are all used in computing. Here’s how they relate:

Decimal   Binary      Octal   Hex
0         0           0       0
8         1000        10      8
10        1010        12      A
15        1111        17      F
16        10000       20      10
255       11111111    377     FF
256       100000000   400     100

Binary to octal: group bits in threes from the right, convert each group. 11001010 → 11 001 010 → 3 1 2 → octal 312

Binary to hex: group bits in fours from the right, convert each group. 11001010 → 1100 1010 → C A → hex 0xCA

Octal to binary: convert each octal digit to 3 bits. 312 → 3=011, 1=001, 2=010 → 011001010
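
Python's built-in conversions confirm all three walkthroughs refer to the same value:

```python
# 11001010 in binary is 202 decimal, 312 octal, CA hex
n = int('11001010', 2)
print(n)       # 202
print(oct(n))  # 0o312
print(hex(n))  # 0xca
```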

The Number Base Converter handles all these conversions instantly — paste a binary, decimal, hex, or octal value and see all representations simultaneously.

Why hexadecimal is preferred over binary in practice

Raw binary strings like 11111111 11000000 10101000 00000001 are accurate but hard to read. Engineers prefer hexadecimal because:

  • Each hex digit represents exactly 4 bits, so 1 byte = 2 hex characters
  • 0xFF 0xC0 0xA8 0x01 (8 hex digits, plus prefixes and spaces) is far more scannable than the 32-bit binary equivalent
  • IP addresses, memory addresses, color codes, and error codes are all conventionally shown in hex

The mental mapping: once you memorize 0-9 and A=10, B=11, C=12, D=13, E=14, F=15, you can convert single hex digits to 4-bit binary instantly. A = 1010, F = 1111, 5 = 0101. A hex byte like 0xAF = 1010 1111.
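
The digit-to-nibble mapping is one `format` call per hex digit in Python:

```python
# Each hex digit expands to exactly one 4-bit nibble
for d in 'AF5':
    print(d, format(int(d, 16), '04b'))
# A 1010
# F 1111
# 5 0101
```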


Related tool

Number Base Converter

Convert between binary, octal, decimal, hexadecimal, and text (UTF-8). Handles arbitrary lengths. Per-byte and per-character views.

Written by Mian Ali Khalid. Part of the Encoding & Crypto pillar.