Character Counter — What It Counts and Why It Matters
Character count isn't just about Twitter limits. Email subject lines, SMS, SEO meta tags, database columns, and form inputs all have hard character limits that break silently.
A character counter tells you exactly how many characters are in a piece of text. That sounds trivial until you’re debugging why your SMS campaign delivered as two messages, why your email subject got truncated in Gmail, or why a database INSERT failed with a value-too-long error at 3 AM on a Saturday.
Character counting is a precision task. This guide covers what gets counted, where limits come from, and how to use a character counter to avoid the most common truncation bugs.
Use the Word Counter on this site — it counts characters (with and without spaces), words, sentences, and paragraphs simultaneously while you type.
What counts as a character
This is less obvious than it sounds.
ASCII characters (the easy case)
For plain English text, a character is a character: a is 1 character, ! is 1 character, a space is 1 character. ASCII covers 128 code points: the letters, digits, punctuation, and control characters you’d find on a standard US keyboard. In this range, each character is exactly one byte in ASCII and UTF-8, so “character” and “byte” are interchangeable there (UTF-16 still spends 2 bytes on each).
Unicode and multi-byte characters
Most modern text isn’t pure ASCII. Accented characters (é, ñ, ü), emoji (🔥, ✅), CJK characters (中文), and mathematical symbols are encoded as Unicode code points. The number of bytes a character uses depends on the encoding:
| Character | UTF-8 bytes | UTF-16 bytes | Displayed as |
|---|---|---|---|
| a | 1 | 2 | 1 character |
| é | 2 | 2 | 1 character |
| 中 | 3 | 2 | 1 character |
| 🔥 | 4 | 4 | 1 character |
| 👨‍👩‍👧 | 18 | 16 | 1 character (grapheme cluster) |
When a system says “character limit,” it usually means one of:
- Code points (individual Unicode values; what Python’s len() and PostgreSQL count)
- UTF-8 bytes (relevant for database columns and HTTP headers)
- UTF-16 code units (relevant for JavaScript’s .length property)
- Grapheme clusters (what humans perceive as characters)
The family emoji 👨‍👩‍👧 is technically three emoji code points joined by two zero-width joiners (U+200D), five code points in total. JavaScript counts it as 8 characters (str.length === 8), but humans see it as 1. This gap causes bugs.
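The byte and code-unit figures above can be reproduced with a few lines of standard-library Python. This is a minimal sketch; the function name `count_units` is made up for illustration, and grapheme clusters are left out because the standard library does not segment them.

```python
def count_units(text: str) -> dict:
    """Return the length of `text` in three common units."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # Every UTF-16 code unit is 2 bytes; "utf-16-le" adds no byte-order mark.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
    }


# Man, ZWJ, woman, ZWJ, girl: written with escapes so the joiners are visible.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
print(count_units(family))
# {'code_points': 5, 'utf8_bytes': 18, 'utf16_code_units': 8}
```

The three numbers differ for the same visible character, which is why a limit is only meaningful once you know which unit it is expressed in.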
How JavaScript counts characters
'hello'.length // 5 (correct)
'café'.length // 4 (correct)
'🔥'.length // 2: WRONG (one code point, stored as a UTF-16 surrogate pair)
'👨‍👩‍👧'.length // 8: WRONG (three emoji at 2 code units each, plus two joiners)
// Counting code points instead of code units:
[...'🔥'].length // 1
[...'👨‍👩‍👧'].length // 5: still wrong (each emoji and each ZWJ is a separate code point)
// For true grapheme clusters, use Intl.Segmenter:
const seg = new Intl.Segmenter();
[...seg.segment('👨‍👩‍👧')].length // 1 (correct)
If you’re building a character-limited input field, you must decide: does the limit apply to code points, UTF-16 code units, grapheme clusters, or bytes? X (Twitter) uses a weighted count: most Latin-script characters count as 1, while CJK characters and emoji count as 2. SMS counts 7-bit GSM characters or UCS-2 code units, depending on the encoding. Postgres varchar(255) counts characters (code points). Get this wrong and users can “input” text that silently exceeds the actual storage limit.
Where character limits come from
Character limits are almost never arbitrary. They come from technical constraints in the underlying protocol, storage format, or display medium.
SMS: 160 characters (or 153)
GSM 7-bit encoding fits 160 characters in a single SMS. But if your message contains any character outside the GSM 7-bit alphabet (a single emoji, a smart quote, an accented ê), the encoding switches to UCS-2, and the limit drops to 70 characters per segment. A handful of accented letters, including é, ñ, and ü, are part of the GSM alphabet and do not trigger the switch.
Multi-part SMS segments reserve space in every segment for the User Data Header (UDH) that tells the receiving phone how to reassemble the message: 7 characters in GSM 7-bit, 3 characters in UCS-2. So a two-part SMS gives you 153+153 = 306 GSM characters, or 67+67 = 134 UCS-2 characters.
The trap: SMS platforms typically show you a raw character count, not a segment count. You write a 120-character reminder ending in 🎉, assume it’s well under 160, and your provider bills you for a 2-segment UCS-2 message because the emoji triggered the encoding switch and the text no longer fits in 70 characters. Even a short “We’ll see you tomorrow! 🎉” is sent as UCS-2; it just happens to fit in one segment.
Use a character counter that shows both raw count and SMS segment count. The Word Counter shows character count; for SMS segment analysis, paste your text into an SMS-aware counter.
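The segment arithmetic described in this section can be sketched as follows. This is an illustration under stated assumptions, not a carrier-grade implementation: `sms_segments` is a hypothetical helper, the alphabet tables are typed in from the GSM 7-bit default alphabet and its extension table, and the sketch ignores the rule that a two-unit character cannot be split across segments, so real gateways may occasionally report one segment more.

```python
import math

# GSM 7-bit default alphabet (one septet each), without the escape code itself.
GSM_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: each of these costs two septets (escape + character).
GSM_EXTENDED = set("^{}\\[~]|€\f")


def sms_segments(text: str) -> tuple:
    """Return (encoding, units_used, segment_count) for an SMS body."""
    if all(ch in GSM_BASIC or ch in GSM_EXTENDED for ch in text):
        encoding, single, multi = "GSM-7", 160, 153
        units = sum(2 if ch in GSM_EXTENDED else 1 for ch in text)
    else:
        encoding, single, multi = "UCS-2", 70, 67
        # UCS-2 limits are in 16-bit code units, so an emoji costs 2.
        units = len(text.encode("utf-16-le")) // 2
    segments = 1 if units <= single else math.ceil(units / multi)
    return encoding, units, segments


print(sms_segments("a" * 120))                 # ('GSM-7', 120, 1)
print(sms_segments("a" * 120 + "\U0001F389"))  # ('UCS-2', 122, 2)
```

Adding one emoji to a 120-character message doubles the segment count, which is the billing surprise described above.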
Email subject lines: 40–60 characters for display
There’s no hard technical limit on email subject line length (RFC 5322 allows up to 998 characters per line, 78 recommended). The limit is what email clients display:
| Client / context | Subject characters shown |
|---|---|
| Gmail desktop | ~70 characters |
| Gmail mobile | ~30–40 characters |
| Apple Mail mobile | ~35–40 characters |
| Outlook desktop | ~70 characters |
| iPhone push notification | ~45 characters |
The preheader text (the snippet of body text shown next to the subject in inbox views) is 85–140 characters in most clients. Keep subjects under 50 characters as a workable compromise, and put the most important words in the first 30 so they survive the narrowest mobile views.
HTML title tag: 50–60 characters for Google
Google truncates title tags that exceed roughly 580px width at display size. At standard display font, that’s about 55–60 characters. But Google rewrites titles it deems too long, too short, or mismatched to the content — so 50–60 characters that accurately describe the page is the target.
Meta description display limit is about 920px — roughly 155–160 characters. Beyond that, Google truncates with an ellipsis, and the truncated text still shows in the SERP.
Database columns: bytes matter more than characters
A VARCHAR(255) in MySQL holds up to 255 characters, not bytes. But:
- VARCHAR(255) CHARACTER SET utf8mb4 can store up to 255 × 4 = 1,020 bytes
- If an indexed column with this setup participates in a combined index, MySQL may throw “Row size too large” or, on older InnoDB row formats, “Specified key was too long; max key length is 767 bytes”
PostgreSQL varchar(255) limits to 255 code points. It handles multi-byte characters transparently — you can store 255 Chinese characters in a varchar(255), consuming up to 765 UTF-8 bytes.
For backend developers: always count bytes when sizing DB columns for data from external sources (user input, API responses). Plan for multi-byte characters.
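When a value has to fit a byte budget, cutting the encoded bytes at an arbitrary offset can leave half a character at the end. A minimal sketch of a byte-safe cut, assuming UTF-8 storage; `truncate_utf8` is a hypothetical helper name.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Shorten `text` so its UTF-8 encoding fits in `max_bytes`,
    without leaving a partial multi-byte character at the end."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops the incomplete byte sequence left at the cut point.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


print(truncate_utf8("中文", 4))  # 中  (3 bytes; the second character did not fit)
```

This keeps code points whole but can still split a grapheme cluster such as a ZWJ emoji sequence, so treat it as a storage safeguard rather than a display-quality truncation.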
HTTP headers: 8KB total
Web servers (Nginx, Apache, and most load balancers) limit the total size of HTTP request headers. Nginx’s default is large_client_header_buffers 4 8k — four 8KB buffers. The practical limit per header line is around 8,000 characters.
Cookie bloat is the most common trigger: accumulated cookies from analytics, A/B testing, and ad platforms can push request headers over the limit, causing 400 Bad Request or 431 Request Header Fields Too Large responses that look unrelated to cookies. (Nginx reserves 414 Request-URI Too Large for an oversized request line, that is, a URL that is too long.)
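To check whether a set of cookies is getting close to that ceiling, you can measure the header line they produce. The sketch below assumes the 8 KB per-line buffer mentioned above; `cookie_header_size` and `is_over_limit` are hypothetical names, and your server’s actual limit may be configured differently.

```python
HEADER_LINE_LIMIT = 8 * 1024  # assumption: one 8k buffer per header line


def cookie_header_size(cookies: dict) -> int:
    """Byte length of the `Cookie: ...` request header line for these cookies."""
    pairs = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return len(f"Cookie: {pairs}".encode("utf-8"))


def is_over_limit(cookies: dict, limit: int = HEADER_LINE_LIMIT) -> bool:
    return cookie_header_size(cookies) > limit


print(cookie_header_size({"a": "1", "b": "2"}))  # 16
print(is_over_limit({"session": "x" * 9000}))    # True
```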
Twitter / X: 280 characters
Twitter moved from 140 to 280 characters in 2017. Media attachments (images, videos, polls) don’t count, and every URL counts as 23 characters regardless of its original length.
Twitter’s count is weighted rather than a plain count of bytes, UTF-16 code units, or grapheme clusters: most Latin-script characters count as 1, while CJK characters and emoji count as 2 each, even when an emoji is built from several code points. The displayed counter in the Tweet composer is the authoritative reference.
How to use a character counter effectively
For SEO content
When writing meta titles and descriptions:
- Paste the title into the character counter
- Aim for 50–60 characters for titles (mark 55 as your target)
- Aim for 120–158 characters for meta descriptions
- Include your primary keyword in the first 50 characters of the title
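The targets in this checklist are easy to automate in a build step or CMS hook. A rough sketch, assuming the ranges listed above; `check_length` and the two range constants are illustrative names, and because Google truncates by pixel width, a character count is only an approximation.

```python
TITLE_RANGE = (50, 60)          # characters, from the checklist above
DESCRIPTION_RANGE = (120, 158)


def check_length(text: str, low: int, high: int) -> str:
    """Report whether `text` (ignoring surrounding whitespace) is within range."""
    n = len(text.strip())
    if n < low:
        return f"too short ({n} < {low})"
    if n > high:
        return f"too long ({n} > {high})"
    return f"ok ({n})"


print(check_length("Character Counter", *TITLE_RANGE))  # too short (17 < 50)
```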
For body content: Google has not confirmed a minimum word count, but pages under 300 words rarely rank for competitive terms. The sweet spot for tool support pages is 800–1,500 words.
For social media
| Platform | Recommended length | Hard limit |
|---|---|---|
| Twitter / X | 240–270 | 280 |
| LinkedIn post | 600–700 | 3,000 |
| Facebook post | 40–80 | 63,206 |
| Instagram caption | 125 chars before “more” | 2,200 |
| TikTok caption | 80–100 | 2,200 |
For code review
When reviewing database schema changes, count the maximum expected byte length of each column, not just the nominal limit. A name VARCHAR(100) that stores names from global users needs 100 × 4 = 400 bytes reserved for worst-case UTF-8. If it’s part of a composite index, this affects max key length constraints.
For API endpoints that accept text fields, add a server-side character count validation that matches your database column definition. Don’t rely on frontend validation alone — requests can bypass UI.
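A server-side check along these lines might look like the sketch below. It is framework-agnostic and hypothetical: `validate_text_field` is not from any particular library, and the two limits should be copied from your actual column definition (character limit) and any index or row-size budget (byte limit).

```python
from typing import List, Optional


def validate_text_field(
    value: str, max_chars: int, max_bytes: Optional[int] = None
) -> List[str]:
    """Return a list of error messages; an empty list means the value fits."""
    errors = []
    if len(value) > max_chars:  # len() counts code points
        errors.append(f"{len(value)} characters exceeds limit of {max_chars}")
    if max_bytes is not None:
        size = len(value.encode("utf-8"))
        if size > max_bytes:
            errors.append(f"{size} UTF-8 bytes exceeds limit of {max_bytes}")
    return errors


print(validate_text_field("中" * 100, max_chars=100, max_bytes=255))
# ['300 UTF-8 bytes exceeds limit of 255']
```

Checking both units catches the case where a value is within the character limit but too large for a byte-based constraint.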
Counting characters in different tools
In the terminal
# Count bytes (not characters):
echo -n "hello" | wc -c
# Count Unicode characters (wc -m uses locale):
echo -n "café" | wc -m # 4 (correct)
echo -n "🔥" | wc -m # 1 (correct on most modern systems)
# Python for accurate Unicode character count:
python3 -c "print(len('your text here'))"
# Python for grapheme clusters (install the grapheme library first: pip install grapheme):
python3 -c "import grapheme; print(grapheme.length('👨‍👩‍👧'))"
In JavaScript / Node
const text = 'Hello, café! 🔥';
// Code units (UTF-16), which is what .length gives you:
console.log(text.length); // 15
// Code points:
console.log([...text].length); // 14
// Grapheme clusters (true visible characters):
const seg = new Intl.Segmenter('en');
console.log([...seg.segment(text)].length); // 14
In Python
text = 'Hello, café! 🔥'
# Code points (what len() counts):
len(text) # 14, correct for most purposes
# UTF-8 bytes:
len(text.encode('utf-8')) # 18 (é=2 bytes, 🔥=4 bytes)
# For grapheme clusters, use the `grapheme` package:
import grapheme
grapheme.length(text) # 14
Common mistakes
Counting bytes instead of characters for display limits. Your CMS might store text as UTF-8 bytes and apply a “character limit” of 255. If users enter emoji or accented characters, they’ll hit the limit sooner than expected.
Ignoring HTML entities. &amp;amp; is 5 characters in the source but displays as a single &amp;. If your meta description is generated from HTML, decode entities before counting.
Forgetting zero-width characters. Copy-pasted text from Word, PDFs, or some websites includes zero-width non-breaking spaces (U+FEFF) or zero-width joiners that are invisible but count as characters in most systems. A character counter should strip or highlight these.
Not accounting for trailing whitespace. Paste text with a trailing newline into a character counter — that newline is a character. Many counters strip it; some don’t. Know which mode your counter uses.
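Several of these mistakes can be avoided by normalizing text before counting it. The sketch below is one possible approach, with assumptions called out: `clean_for_counting` is a hypothetical helper, the set of invisible characters is a small illustrative selection, and the zero-width joiner (U+200D) is deliberately left alone because emoji sequences and some scripts depend on it.

```python
import html

# Invisible characters that commonly arrive via copy-paste (illustrative, not exhaustive):
# zero-width space, word joiner, zero-width no-break space / BOM.
INVISIBLE = {"\u200b", "\u2060", "\ufeff"}


def clean_for_counting(raw: str) -> str:
    """Decode HTML entities, drop invisible characters, trim trailing whitespace."""
    text = html.unescape(raw)  # "&amp;" becomes "&"
    text = "".join(ch for ch in text if ch not in INVISIBLE)
    return text.rstrip()       # trailing spaces and newlines are not displayed


raw = "Fish &amp; chips\ufeff\n"
print(len(raw), len(clean_for_counting(raw)))  # 18 12
```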
The Word Counter on Xerobit
The Word Counter counts characters with and without spaces separately, which matters for systems that count only visible characters (no spaces) vs. all characters (including spaces). It updates in real time, handles multi-byte Unicode correctly, and also shows word count, sentence count, paragraph count, and estimated reading time — so you have all the metrics you need for a piece of content in one view.
Related tools
- Case Converter — convert between uppercase, lowercase, title case, camelCase, snake_case
- Lorem Ipsum Generator — generate placeholder text of specific lengths
- Text Diff — compare two versions of text to find changes
Written by Mian Ali Khalid. Part of the Dev Productivity pillar.