
Character Counter — What It Counts and Why It Matters

Character count isn't just about Twitter limits. Email subject lines, SMS, SEO meta tags, database columns, and form inputs all have hard character limits that break silently.

Mian Ali Khalid · 7 min read

A character counter tells you exactly how many characters are in a piece of text. That sounds trivial until you’re debugging why your SMS campaign delivered as two messages, why your email subject got truncated in Gmail, or why a database INSERT failed with a value-too-long error at 3 AM on a Saturday.

Character counting is a precision task. This guide covers what gets counted, where limits come from, and how to use a character counter to avoid the most common truncation bugs.

Use the Word Counter on this site — it counts characters (with and without spaces), words, sentences, and paragraphs simultaneously while you type.

What counts as a character

This is less obvious than it sounds.

ASCII characters (the easy case)

For plain English text, a character is a character: a is 1 character, ! is 1 character, a space is 1 character. ASCII covers 128 code points: the letters, digits, punctuation, and control characters you’d find on a standard US keyboard. In this range, and as long as the text is stored as ASCII or UTF-8, “character” and “byte” are interchangeable.

Unicode and multi-byte characters

Most modern text isn’t pure ASCII. Accented characters (é, ñ, ü), emoji (🔥, ✅), CJK characters (中文), and mathematical symbols are encoded as Unicode code points. The number of bytes a character uses depends on the encoding:

Character | UTF-8 bytes | UTF-16 bytes | Displayed as
a | 1 | 2 | 1 character
é | 2 | 2 | 1 character
中 | 3 | 2 | 1 character
🔥 | 4 | 4 | 1 character
👨‍👩‍👧 | 18 | 16 | 1 character (grapheme cluster)

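When a system says “character limit,” it usually means one of:

  • Code points (Unicode's unit of text; close to what humans perceive as characters, but not always the same)
  • Grapheme clusters (what humans actually perceive as one character, even when it is built from several code points)
  • UTF-8 bytes (relevant for database columns and HTTP headers)
  • UTF-16 code units (relevant for JavaScript’s .length property)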

The family emoji 👨‍👩‍👧 is technically multiple Unicode code points joined by zero-width joiners — JavaScript counts it as 8 characters (str.length === 8), but humans see it as 1. This gap causes bugs.
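To make the gap concrete, here is a minimal Python sketch that measures the same string in three of these units. The function name measure is hypothetical, and the family emoji is written with escapes so the zero-width joiners are visible. Grapheme clusters are left out because the Python standard library has no segmenter for them.

```python
def measure(text: str) -> dict:
    """Return the length of `text` in three common units."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # UTF-16 code units are two bytes each, so halve the encoded length.
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


# The family emoji spelled out: man, ZWJ, woman, ZWJ, girl.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"

for sample in ("a", "\u00e9", "\U0001F525", family):
    print(repr(sample), measure(sample))
```

For the family emoji this reports 5 code points, 18 UTF-8 bytes, and 8 UTF-16 code units, even though a reader sees a single character.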

How JavaScript counts characters

'hello'.length        // 5 — correct
'café'.length         // 4 — correct
'🔥'.length           // 2 — WRONG (emoji are UTF-16 surrogate pairs)
'👨‍👩‍👧'.length        // 8 — WRONG (multiple code points + joiners)

// Correct approach for visible characters:
[...'🔥'].length       // 1
[...'👨‍👩‍👧'].length     // Still wrong (5: three emoji plus two zero-width joiners)

// For true grapheme clusters, use Intl.Segmenter:
const seg = new Intl.Segmenter();
[...seg.segment('👨‍👩‍👧')].length // 1 — correct

If you’re building a character-limited input field, you must decide: does the limit apply to code points, UTF-16 code units, grapheme clusters, or bytes? X (Twitter) uses its own weighted count, in which most Latin-script characters count as 1 and emoji and CJK characters count as 2. SMS counts 7-bit GSM units or 16-bit UCS-2 units against a fixed 140-byte payload. Postgres varchar(255) counts characters (code points). Get this wrong and users can “input” text that silently exceeds the actual storage limit.

Where character limits come from

Character limits are almost never arbitrary. They come from technical constraints in the underlying protocol, storage format, or display medium.

SMS: 160 characters (or 153)

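GSM 7-bit encoding fits 160 characters in a single SMS. But if your message contains any character outside the GSM 7-bit alphabet (a single emoji, a smart quote, an accented letter such as ê or á), the encoding switches to UCS-2, and the limit drops to 70 characters per segment. Some common accented letters, including é, è, ñ, and ü, are part of the GSM alphabet and do not trigger the switch.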

Multi-part SMS segments reserve 6 bytes per segment for the User Data Header (UDH) that tells the receiving phone how to reassemble the message. That costs 7 characters in GSM 7-bit encoding and 3 characters in UCS-2. So a two-part SMS gives you 153+153 = 306 GSM characters, or 67+67 = 134 UCS-2 characters.

The trap: SMS platforms typically show you a raw character count, not a segment count. You write a 120-character reminder that ends in 🎉, assume it’s well under 160, and your provider bills you for a 2-segment UCS-2 message because the emoji triggered the encoding switch and cut the single-message limit to 70.

Use a character counter that shows both raw count and SMS segment count. The Word Counter shows character count; for SMS segment analysis, paste your text into an SMS-aware counter.
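The segment arithmetic above can be sketched in Python. This is an illustration rather than a provider's billing logic: the function name sms_segments is hypothetical, the alphabet covers the default GSM 7-bit table without national language shift tables, and extension-table characters such as € are counted as two units. It uses 67 UCS-2 units per concatenated segment, because the 6-byte header takes three 16-bit units out of 70. Real gateways also avoid splitting an escape sequence or surrogate pair across segments, which can add a segment in edge cases.

```python
import math

# Default GSM 7-bit alphabet (basic table, without the ESC code itself).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: each character is sent as ESC + code, so it costs 2 units.
GSM7_EXTENDED = set("^{}\\[~]|€")


def sms_segments(text: str) -> tuple:
    """Return (encoding, units, segments) for an SMS body."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        encoding, single_limit, multi_limit = "GSM-7", 160, 153
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
    else:
        encoding, single_limit, multi_limit = "UCS-2", 70, 67
        # Count 16-bit units, so an emoji outside the BMP costs 2.
        units = len(text.encode("utf-16-be")) // 2
    if units <= single_limit:
        segments = 1
    else:
        segments = math.ceil(units / multi_limit)
    return encoding, units, segments


print(sms_segments("a" * 160))
print(sms_segments("a" * 100 + "\U0001F389"))
```

The first call fits in one GSM-7 segment. The second is only 101 visible characters, but the emoji forces UCS-2 and the message needs two segments.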

Email subject lines: 40–60 characters for display

There’s no hard technical limit on email subject line length (RFC 5322 allows up to 998 characters per line, 78 recommended). The limit is what email clients display:

Client / context | Subject characters shown
Gmail desktop | ~70 characters
Gmail mobile | ~30–40 characters
Apple Mail mobile | ~35–40 characters
Outlook desktop | ~70 characters
iPhone push notification | ~45 characters

The preheader text (the snippet of body text shown next to the subject in inbox views) is 85–140 characters in most clients. Keep subjects under 50 characters to stay safe across all clients. Put the most important word first.

HTML title tag: 50–60 characters for Google

Google truncates title tags that exceed roughly 580px width at display size. At standard display font, that’s about 55–60 characters. But Google rewrites titles it deems too long, too short, or mismatched to the content — so 50–60 characters that accurately describe the page is the target.

Meta description display limit is about 920px — roughly 155–160 characters. Beyond that, Google truncates with an ellipsis, and the truncated text still shows in the SERP.

Database columns: bytes matter more than characters

A VARCHAR(255) in MySQL defaults to 255 characters, not bytes. But:

  • VARCHAR(255) CHARACTER SET utf8mb4 can store up to 255 × 4 = 1,020 bytes
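  • If a column with this setup is indexed, alone or as part of a combined index, MySQL may throw “Specified key was too long; max key length is 767 bytes” on older InnoDB row formats (the DYNAMIC row format raises the limit to 3,072 bytes), and very wide rows can fail with “Row size too large”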

PostgreSQL varchar(255) limits to 255 code points. It handles multi-byte characters transparently — you can store 255 Chinese characters in a varchar(255), consuming up to 765 UTF-8 bytes.

For backend developers: always count bytes when sizing DB columns for data from external sources (user input, API responses). Plan for multi-byte characters.
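The sizing arithmetic is simple enough to keep as a worked example. The helper names below are hypothetical; 4 bytes per character is the utf8mb4 worst case, and 767 is the key limit quoted in the MySQL error above.

```python
def worst_case_bytes(max_chars: int, bytes_per_char: int = 4) -> int:
    """Largest number of bytes a column of max_chars characters can need."""
    return max_chars * bytes_per_char


def max_chars_for_key(key_limit_bytes: int, bytes_per_char: int = 4) -> int:
    """Longest column, in characters, that is guaranteed to fit an index key."""
    return key_limit_bytes // bytes_per_char


print(worst_case_bytes(255))     # 1020
print(max_chars_for_key(767))    # 191
print(worst_case_bytes(255, 3))  # 765
```

The second result is why VARCHAR(191) shows up so often in schemas that index utf8mb4 columns under a 767-byte key limit.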

HTTP headers: about 8KB per header line

Web servers (Nginx, Apache, and most load balancers) limit the total size of HTTP request headers. Nginx’s default is large_client_header_buffers 4 8k — four 8KB buffers. The practical limit per header line is around 8,000 characters.

Cookie bloat is the most common trigger: accumulated cookies from analytics, A/B testing, and ad platforms can push request headers over the limit, causing 400 Bad Request or 431 Request Header Fields Too Large responses that look unrelated to cookies. (414 URI Too Long is the equivalent failure for an oversized request line.)
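A rough way to see how close a client is to the limit is to rebuild the Cookie header and measure it in bytes. This is a sketch under assumptions: the function names are hypothetical, the 8 KB default mirrors the Nginx buffer size mentioned above, and the estimate ignores the trailing line break and any other headers in the request.

```python
def cookie_header_size(cookies: dict) -> int:
    """Approximate size in bytes of the Cookie request header line."""
    header = "Cookie: " + "; ".join(f"{name}={value}" for name, value in cookies.items())
    return len(header.encode("utf-8"))


def exceeds_buffer(cookies: dict, buffer_bytes: int = 8 * 1024) -> bool:
    """True if the Cookie header alone would not fit in one buffer."""
    return cookie_header_size(cookies) > buffer_bytes


cookies = {"session": "abc123", "ab_test": "variant-b", "tracking": "x" * 9000}
print(cookie_header_size(cookies), exceeds_buffer(cookies))
```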

Twitter / X: 280 characters

Twitter moved from 140 to 280 characters in 2017 — but media attachments (images, videos, polls) don’t count, and URLs are always shortened to 23 characters regardless of the original URL length.

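Twitter does not count bytes or UTF-16 code units. It applies a weighted count: most Latin-script characters and common punctuation count as 1, while CJK characters and emoji count as 2, with a multi-codepoint emoji sequence counted as a single emoji. The displayed counter in the Tweet composer is the authoritative reference.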

How to use a character counter effectively

For SEO content

When writing meta titles and descriptions:

  1. Paste the title into the character counter
  2. Aim for 50–60 characters for titles (mark 55 as your target)
  3. Aim for 120–158 characters for meta descriptions
  4. Include your primary keyword in the first 50 characters of the title
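The checklist above can be automated with a few lines of Python. This is a minimal sketch: check_length is a hypothetical helper, the ranges come from the targets listed above, and len() counts code points, which is only an approximation of the pixel width Google actually uses.

```python
TITLE_RANGE = (50, 60)          # target from the steps above
DESCRIPTION_RANGE = (120, 158)  # target from the steps above


def check_length(label: str, text: str, low: int, high: int) -> str:
    """Report whether `text` falls inside the target character range."""
    count = len(text)
    if count < low:
        status = "short"
    elif count > high:
        status = "long"
    else:
        status = "ok"
    return f"{label}: {count} chars ({status}, target {low}-{high})"


title = "Character Counter: What It Counts and Why It Matters"
description = "Learn what a character counter measures and where limits come from."

print(check_length("title", title, *TITLE_RANGE))
print(check_length("description", description, *DESCRIPTION_RANGE))
```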

For body content: Google has not confirmed a minimum word count, but pages under 300 words rarely rank for competitive terms. The sweet spot for tool support pages is 800–1,500 words.

For social media

Platform | Recommended length | Hard limit
Twitter / X | 240–270 | 280
LinkedIn post | 600–700 | 3,000
Facebook post | 40–80 | 63,206
Instagram caption | 125 chars before “more” | 2,200
TikTok caption | 80–100 | 2,200

For code review

When reviewing database schema changes, count the maximum expected byte length of each column, not just the nominal limit. A name VARCHAR(100) that stores names from global users needs 100 × 4 = 400 bytes reserved for worst-case UTF-8. If it’s part of a composite index, this affects max key length constraints.

For API endpoints that accept text fields, add a server-side character count validation that matches your database column definition. Don’t rely on frontend validation alone — requests can bypass UI.
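A server-side check along these lines might look like the sketch below. The function name validate_text_field and its parameters are hypothetical. It assumes the column limit is expressed in code points, as with PostgreSQL varchar(n) or MySQL VARCHAR(n), with an optional byte cap for index or protocol constraints.

```python
from typing import Optional


def validate_text_field(
    name: str, value: str, max_chars: int, max_bytes: Optional[int] = None
) -> str:
    """Raise ValueError if `value` would not fit the column; else return it."""
    chars = len(value)  # Python's len() counts code points
    if chars > max_chars:
        raise ValueError(f"{name}: {chars} characters exceeds limit of {max_chars}")
    if max_bytes is not None:
        size = len(value.encode("utf-8"))
        if size > max_bytes:
            raise ValueError(f"{name}: {size} bytes exceeds limit of {max_bytes}")
    return value


# Example: a name VARCHAR(100) column with a 400-byte worst case.
print(validate_text_field("name", "Zoë Müller", max_chars=100, max_bytes=400))
```

Running the same check in the API layer and in the schema keeps the two limits from drifting apart.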

Counting characters in different tools

In the terminal

# Count bytes (not characters):
echo -n "hello" | wc -c

# Count Unicode characters (wc -m uses locale):
echo -n "café" | wc -m    # 4 (correct)
echo -n "🔥" | wc -m      # 1 (correct on most modern systems)

# Python for accurate Unicode character count:
python3 -c "print(len('your text here'))"

# Python for grapheme clusters (install grapheme library):
python3 -c "import grapheme; print(grapheme.length('👨‍👩‍👧'))"

In JavaScript / Node

const text = 'Hello, café! 🔥';

// UTF-16 code units (what .length gives you):
console.log(text.length);  // 15

// Code points:
console.log([...text].length);  // 14

// Grapheme clusters (true visible characters):
const seg = new Intl.Segmenter('en');
console.log([...seg.segment(text)].length);  // 14

In Python

text = 'Hello, café! 🔥'

# Code points (what len() counts):
len(text)           # 14 (correct for most purposes)

# UTF-8 bytes:
len(text.encode('utf-8'))  # 18 (é=2 bytes, 🔥=4 bytes)

# For grapheme clusters, use the `grapheme` package:
import grapheme
grapheme.length(text)  # 14

Common mistakes

Counting bytes instead of characters for display limits. Your CMS might store text as UTF-8 bytes and apply a “character limit” of 255. If users enter emoji or accented characters, they’ll hit the limit sooner than expected.

Ignoring HTML entities. &amp;amp; is 5 characters in the source but displays as a single &amp;. If your meta description is generated from HTML, decode entities before counting.
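In Python, the standard library's html.unescape handles the decoding step. The helper name displayed_length is hypothetical; the sketch only decodes entities and does not strip tags.

```python
import html


def displayed_length(markup_text: str) -> int:
    """Count characters after decoding HTML entities such as &amp; or &gt;."""
    return len(html.unescape(markup_text))


raw = "Fish &amp; Chips &gt; Salad"
print(len(raw))               # 27 characters in the HTML source
print(displayed_length(raw))  # 20 characters as displayed
```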

Forgetting zero-width characters. Copy-pasted text from Word, PDFs, or some websites includes zero-width non-breaking spaces (U+FEFF) or zero-width joiners that are invisible but count as characters in most systems. A character counter should strip or highlight these.
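One way to highlight these characters is to look for the Unicode "format" category (Cf), which includes U+FEFF, the zero-width space U+200B, and the zero-width joiner. The function names below are hypothetical. Stripping every Cf character is a blunt tool: it also removes the joiners inside emoji sequences and the zero-width non-joiners some scripts rely on, so reporting is usually safer than deleting.

```python
import unicodedata


def find_invisible(text: str) -> list:
    """Return (index, code point, name) for each format-category character."""
    return [
        (index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


def strip_invisible(text: str) -> str:
    """Remove all format-category characters (see the caveat above)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


pasted = "\ufeffhi\u200b"
print(len(pasted), find_invisible(pasted))
print(len(strip_invisible(pasted)))
```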

Not accounting for trailing whitespace. Paste text with a trailing newline into a character counter — that newline is a character. Many counters strip it; some don’t. Know which mode your counter uses.

The Word Counter on Xerobit

The Word Counter counts characters with and without spaces separately, which matters for systems that count only visible characters (no spaces) vs. all characters (including spaces). It updates in real time, handles multi-byte Unicode correctly, and also shows word count, sentence count, paragraph count, and estimated reading time — so you have all the metrics you need for a piece of content in one view.

  • Case Converter — convert between uppercase, lowercase, title case, camelCase, snake_case
  • Lorem Ipsum Generator — generate placeholder text of specific lengths
  • Text Diff — compare two versions of text to find changes


Written by Mian Ali Khalid. Part of the Dev Productivity pillar.