CSV vs JSON — When to Use Each Data Format
CSV is ideal for tabular data with tools like Excel and pandas. JSON handles nested, hierarchical data. Here's when to choose CSV vs JSON for data exchange, APIs, and storage.
CSV and JSON are the two most common data interchange formats for tabular and structured data. CSV is a plain-text table. JSON is a hierarchical document. Choosing the wrong format creates conversion overhead and loses data fidelity. Here’s how to decide which to use.
Use the CSV to JSON Converter to convert between formats online.
Structure comparison
CSV — flat tabular data:
```csv
id,name,email,age,city
1,Alice,alice@example.com,30,New York
2,Bob,bob@example.com,25,Los Angeles
3,Charlie,charlie@example.com,35,Chicago
```
JSON — nested hierarchical data:
```json
[
  {
    "id": 1,
    "name": "Alice",
    "email": "alice@example.com",
    "age": 30,
    "address": {
      "city": "New York",
      "state": "NY",
      "zip": "10001"
    },
    "tags": ["admin", "beta_user"]
  }
]
```
CSV can’t represent nested objects or arrays without flattening (losing structure) or additional encoding. JSON represents any level of nesting natively.
When to use CSV
Spreadsheet compatibility: Excel, Google Sheets, LibreOffice, and Apple Numbers all open CSV natively. If your audience will work with the data in a spreadsheet, CSV is the right choice.
Statistical analysis: Python (pandas), R, and data science tools are optimized for tabular CSV data. pd.read_csv() is one line of code.
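A minimal sketch of that one-liner, assuming pandas is installed (an in-memory string stands in for a file path so the snippet runs standalone):

```python
import io

import pandas as pd

# In practice you'd pass a file path; StringIO keeps the sketch self-contained
csv_data = io.StringIO("id,name,age\n1,Alice,30\n2,Bob,25\n")

# One line: parse the CSV, infer column types, build a DataFrame
df = pd.read_csv(csv_data)
print(df.dtypes)  # id and age come back as integer columns, not strings
```

Note that pandas infers types during parsing, which sidesteps the "everything is a string" problem discussed below.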
Large flat datasets: CSV is the most compact format for flat tabular data. No field name repetition per row:
```text
1,Alice,30                            ← 10 bytes per row in CSV
{"id":1,"name":"Alice","age":30}      ← 32 bytes per row in JSON
```
At millions of rows, CSV can be 3–5× smaller than JSON.
Import/export operations: Most databases export to CSV (PostgreSQL COPY, MySQL SELECT INTO OUTFILE). ETL pipelines, data warehouses, and BI tools expect CSV.
Human editing: CSV is easy to edit in a text editor or spreadsheet. JSON with proper indentation is also readable, but CSV is more familiar to non-developers.
When to use JSON
Nested data structures: One-to-many relationships, objects with arrays, hierarchical configs — these require JSON (or XML):
```json
{
  "order": {
    "id": "ORD-001",
    "customer": { "name": "Alice", "email": "alice@example.com" },
    "items": [
      { "sku": "A100", "qty": 2, "price": 29.99 },
      { "sku": "B200", "qty": 1, "price": 49.99 }
    ],
    "total": 109.97
  }
}
```
In CSV, you’d need to either flatten (repeating customer info per item) or split into multiple files with foreign keys.
REST APIs: JSON is the standard for REST API responses. Every language and browser can parse JSON natively.
Configuration files: YAML is popular for configs, but JSON works for programmatic configs where comments aren’t needed.
Type preservation: JSON has native number, boolean, and null types. CSV stores everything as strings — 30 in CSV is the string “30”, not the integer 30.
```json
{ "age": 30, "active": true, "score": 9.8 }
```

Types are preserved. The same record in CSV arrives as strings, and the application must parse them:

```csv
age,active,score
30,true,9.8
```
Streaming and real-time: JSON Lines (NDJSON) format works for streaming data where each line is a JSON object:
```json
{"event":"click","user_id":1,"timestamp":"2024-03-04T14:20:34Z"}
{"event":"purchase","user_id":2,"timestamp":"2024-03-04T14:21:00Z"}
```
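A minimal sketch of writing and reading JSON Lines with the standard library (the event records here are illustrative):

```python
import json

events = [
    {"event": "click", "user_id": 1},
    {"event": "purchase", "user_id": 2},
]

# Write: one json.dumps() call per record, one record per line
ndjson = "\n".join(json.dumps(e) for e in events)

# Read: each line parses independently, so a consumer can process
# records one at a time without loading the whole file into memory
parsed = [json.loads(line) for line in ndjson.splitlines()]
print(parsed == events)  # True
```

This independence per line is what makes the format stream-friendly: a crashed producer leaves at most one truncated line, and a reader can resume at any line boundary.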
Converting between formats
CSV to JSON (Python)
```python
import csv
import json

def csv_to_json(csv_path, json_path):
    # utf-8-sig strips the BOM that Excel prepends to exported CSV
    with open(csv_path, encoding='utf-8-sig') as f:
        reader = csv.DictReader(f)
        rows = list(reader)
    with open(json_path, 'w', encoding='utf-8') as f:
        json.dump(rows, f, indent=2)

csv_to_json('data.csv', 'data.json')
```
Note: All values are strings after CSV parsing. Cast types explicitly:
```python
def cast_types(row):
    return {
        'id': int(row['id']),
        'name': row['name'],
        'age': int(row['age']),
        'active': row['active'].lower() == 'true',
        'score': float(row['score']),
    }

rows = [cast_types(r) for r in reader]
```
JSON to CSV (Python)
```python
import csv
import json

def json_to_csv(json_path, csv_path):
    with open(json_path, encoding='utf-8') as f:
        data = json.load(f)
    if not data:
        return
    # Union of keys across all rows, preserving first-seen order,
    # so rows with extra or missing fields don't break the header
    fieldnames = list(dict.fromkeys(k for row in data for k in row))
    with open(csv_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval='')
        writer.writeheader()
        writer.writerows(data)

json_to_csv('data.json', 'data.csv')
```
Problem with nested JSON: Nested objects must be flattened:
```python
def flatten_dict(d, parent_key='', sep='_'):
    items = []
    for k, v in d.items():
        new_key = f'{parent_key}{sep}{k}' if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten_dict(v, new_key, sep).items())
        elif isinstance(v, list):
            items.append((new_key, json.dumps(v)))  # lists as JSON strings
        else:
            items.append((new_key, v))
    return dict(items)

# Flattens {"address": {"city": "NY"}} → {"address_city": "NY"}
```
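Applied to a nested order record like the earlier example, the flattener yields a single flat row (the helper is repeated here so the snippet runs on its own):

```python
import json

def flatten_dict(d, parent_key='', sep='_'):
    items = []
    for k, v in d.items():
        new_key = f'{parent_key}{sep}{k}' if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten_dict(v, new_key, sep).items())
        elif isinstance(v, list):
            items.append((new_key, json.dumps(v)))  # lists as JSON strings
        else:
            items.append((new_key, v))
    return dict(items)

order = {
    "order": {
        "id": "ORD-001",
        "customer": {"name": "Alice"},
        "items": [{"sku": "A100", "qty": 2}],
        "total": 109.97,
    }
}

flat = flatten_dict(order)
print(list(flat))
# ['order_id', 'order_customer_name', 'order_items', 'order_total']
```

Note the trade-off: `order_items` is now an opaque JSON string inside a CSV cell, so any consumer that needs the line items must parse it back out.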
File size comparison
```python
import csv
import io
import json

data = [{"id": i, "name": f"User {i}", "email": f"user{i}@example.com", "age": 25 + i}
        for i in range(10000)]

# JSON size:
json_str = json.dumps(data)
print(f"JSON: {len(json_str.encode())/1024:.1f}KB")

# CSV size:
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=data[0].keys())
writer.writeheader()
writer.writerows(data)
print(f"CSV: {len(buf.getvalue().encode())/1024:.1f}KB")

# JSON: ~750KB
# CSV: ~430KB (43% smaller)
# JSON (minified): ~680KB
# Gzipped JSON: ~85KB
# Gzipped CSV: ~70KB
```
For network transfer, gzip compression eliminates most size differences. But for local storage and disk reads, CSV’s compactness at scale is meaningful.
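The compression claim is easy to check with the standard library; a sketch (the record shape is illustrative):

```python
import csv
import gzip
import io
import json

data = [{"id": i, "name": f"User {i}", "email": f"user{i}@example.com"}
        for i in range(10000)]

json_bytes = json.dumps(data).encode()

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "email"])
writer.writeheader()
writer.writerows(data)
csv_bytes = buf.getvalue().encode()

# Repeated field names compress extremely well, so gzip shrinks
# JSON proportionally more than CSV and the gap mostly closes
print(f"JSON: {len(json_bytes)} → {len(gzip.compress(json_bytes))} gzipped")
print(f"CSV:  {len(csv_bytes)} → {len(gzip.compress(csv_bytes))} gzipped")
```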
Decision table
| Use case | CSV | JSON |
|---|---|---|
| Excel/Google Sheets import | ✓ | ✗ |
| Data science (pandas/R) | ✓ | ✓ |
| REST API response | ✗ | ✓ |
| Nested objects | ✗ | ✓ |
| Large flat datasets (>1M rows) | ✓ | ✗ |
| Config files | ✗ | ✓ |
| Database export | ✓ | ✓ |
| Browser consumption | ✗ | ✓ |
| Type-safe data | ✗ | ✓ |
| Non-developer editing | ✓ | ~ |
Related tools
- CSV to JSON Converter — convert CSV to JSON online
- CSV Format Guide — CSV structure and parsing
- CSV Quoting Rules — handling special characters
Related posts
- CSV Quoting and Escaping Rules (the Real Ones, Not the Folklore) — CSV looks trivial until your spreadsheet has a comma in a name field. Here's the…
- CSV Data Validation — Schema Validation, Type Checking, and Error Reporting — Validate CSV files before importing them into a database or processing pipeline.…
- Import CSV to Database — PostgreSQL, MySQL, SQLite, and Node.js — Import CSV files into PostgreSQL, MySQL, and SQLite using COPY commands, LOAD DA…
- YAML vs JSON: Which to Use When (and Why It Matters) — JSON is for machines, YAML is for humans, and choosing the wrong one quietly cos…
- CSV Format Guide — Structure, Delimiters, and Common Parsing Issues — CSV (Comma-Separated Values) is a simple tabular text format. Here's the RFC 418…
Written by Mian Ali Khalid. Part of the Data & Format pillar.