XML to JSON Converter — Transform XML Data to JSON

Converting XML to JSON maps elements, attributes, and text nodes to JSON objects and arrays. Here's how the conversion works, common pitfalls, and how to convert XML to JSON in...

Mian Ali Khalid · 6 min read

XML to JSON conversion maps XML’s element and attribute model to JSON’s key-value model. The conversion isn’t one-to-one — XML has concepts (attributes, text nodes, mixed content, namespaces) that JSON doesn’t represent natively. Understanding the mapping rules prevents data loss.

Use the XML Formatter to inspect and transform XML documents.

The basic mapping

Simple XML maps cleanly to JSON:

<user id="1">
  <name>Alice</name>
  <email>alice@example.com</email>
  <age>30</age>
</user>

Becomes:

{
  "user": {
    "@id": "1",
    "name": "Alice",
    "email": "alice@example.com",
    "age": "30"
  }
}

Key differences:

  • Element attributes become keys with an @ prefix (by convention)
  • All values are strings unless the converter is told to infer types (XML itself has no native numeric type)
  • The root element name becomes the top-level JSON key

Handling repeated elements (arrays)

XML allows repeated sibling elements; JSON uses arrays:

<users>
  <user>Alice</user>
  <user>Bob</user>
  <user>Charlie</user>
</users>

Converts to:

{
  "users": {
    "user": ["Alice", "Bob", "Charlie"]
  }
}

The array ambiguity problem: If there’s only one <user> element, most converters produce a single value (here a plain string), not an array:

<users>
  <user>Alice</user>
</users>

Converts to:

{
  "users": {
    "user": "Alice"
  }
}

This breaks code that always expects an array. Solutions:

  1. Force array mode in your converter for known-repeated elements
  2. Normalize after conversion: const users = [].concat(data.users.user ?? [])
  3. Use an XSD schema to define cardinality before converting
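The second option, normalizing after conversion, can be sketched in Python as a small helper. This is only an illustration: ensure_list is a hypothetical name, not part of any converter library, and the sample dicts stand in for whatever your converter returns.

```python
def ensure_list(value):
    """Wrap a lone value in a list; treat a missing element as an empty list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


# Shapes a converter may emit for <users>, depending on how many children it saw:
one_user = {"users": {"user": "Alice"}}
many_users = {"users": {"user": ["Alice", "Bob", "Charlie"]}}
no_users = {"users": {}}

for data in (one_user, many_users, no_users):
    users = ensure_list(data["users"].get("user"))
    print(users)  # always a list, so downstream loops need no special cases
```

Applying the helper at the point where the data is read keeps the rest of the code free of "is it a list?" checks.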

Converting XML to JSON in JavaScript

import { XMLParser } from 'fast-xml-parser';

const xml = `<?xml version="1.0"?>
<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <price>44.95</price>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <price>5.95</price>
  </book>
</catalog>`;

const parser = new XMLParser({
  ignoreAttributes: false,     // Include attributes (as @_attrName)
  attributeNamePrefix: '@_',   // Prefix for attribute keys
  isArray: (name) => name === 'book',  // Always treat <book> as array
  parseAttributeValue: true,   // Parse attribute values (numbers, booleans)
  parseTagValue: true,         // Parse text content (numbers, booleans)
  ignoreDeclaration: true,     // Keep the <?xml ...?> declaration out of the result
});

const result = parser.parse(xml);
console.log(JSON.stringify(result, null, 2));
/*
{
  "catalog": {
    "book": [
      { "@_id": "bk101", "author": "Gambardella, Matthew", "title": "...", "price": 44.95 },
      { "@_id": "bk102", "author": "Ralls, Kim", "title": "...", "price": 5.95 }
    ]
  }
}
*/

xml2js (alternative)

const xml2js = require('xml2js');

xml2js.parseString(xml, {
  explicitArray: false,   // Don't wrap single items in arrays
  mergeAttrs: true,       // Merge attributes into main object (no $ wrapper)
  normalize: true,        // Trim whitespace in text nodes
}, (err, result) => {
  if (err) throw err;
  console.log(JSON.stringify(result, null, 2));
});

// Promise version (in CommonJS, await has to sit inside an async function):
async function convert(xml) {
  return xml2js.parseStringPromise(xml, { mergeAttrs: true });
}

DOMParser (browser, no library)

function xmlToJson(xml) {
  const parser = new DOMParser();
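  const doc = parser.parseFromString(xml, 'application/xml');
  // Malformed XML does not throw; the result contains a <parsererror> element instead:
  if (doc.querySelector('parsererror')) throw new Error('Invalid XML');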
  return nodeToJson(doc.documentElement);
}

function nodeToJson(node) {
  const obj = {};
  
  // Handle attributes:
  if (node.attributes && node.attributes.length > 0) {
    obj['@attributes'] = {};
    for (const attr of node.attributes) {
      obj['@attributes'][attr.name] = attr.value;
    }
  }
  
  // Handle children:
  if (node.hasChildNodes()) {
    for (const child of node.childNodes) {
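      if (child.nodeType === 3 || child.nodeType === 4) {  // Text or CDATA node
        const text = child.textContent.trim();
        // Append so that text split around child elements is not overwritten:
        if (text) obj['#text'] = obj['#text'] ? `${obj['#text']} ${text}` : text;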
      } else if (child.nodeType === 1) {  // Element node
        const childJson = nodeToJson(child);
        if (obj[child.nodeName] !== undefined) {
          // Convert to array for repeated elements:
          if (!Array.isArray(obj[child.nodeName])) {
            obj[child.nodeName] = [obj[child.nodeName]];
          }
          obj[child.nodeName].push(childJson);
        } else {
          obj[child.nodeName] = childJson;
        }
      }
    }
  }
  
  return obj;
}

Python: XML to JSON

xml.etree.ElementTree (standard library)

import xml.etree.ElementTree as ET
import json

def xml_to_dict(element):
    """Convert an XML element to a Python dict."""
    result = {}
    
    # Add attributes with @ prefix:
    for key, value in element.attrib.items():
        result[f'@{key}'] = value
    
    # Add text content:
    if element.text and element.text.strip():
        if element.attrib or len(element):
            result['#text'] = element.text.strip()
        else:
            return element.text.strip()
    
    # Add children:
    for child in element:
        child_data = xml_to_dict(child)
        if child.tag in result:
            # Convert to list for repeated elements:
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(child_data)
        else:
            result[child.tag] = child_data
    
    return result

xml_string = """<users>
  <user id="1"><name>Alice</name><email>alice@example.com</email></user>
  <user id="2"><name>Bob</name><email>bob@example.com</email></user>
</users>"""

root = ET.fromstring(xml_string)
data = {root.tag: xml_to_dict(root)}
print(json.dumps(data, indent=2))

xmltodict (cleaner library)

import xmltodict
import json

xml_string = """<users>
  <user id="1"><name>Alice</name></user>
  <user id="2"><name>Bob</name></user>
</users>"""

# Simple conversion:
data = xmltodict.parse(xml_string)
print(json.dumps(data, indent=2))

# Force arrays for known-repeated elements:
data = xmltodict.parse(xml_string, force_list={'user'})
# Now data['users']['user'] is always a list

# Convert the dict back to XML:
xml_back = xmltodict.unparse(data, pretty=True)

Handling XML namespaces

Namespaces add complexity:

<root xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title>My Document</dc:title>
  <dc:creator>Alice</dc:creator>
</root>

With namespace stripping:

{
  "root": {
    "title": "My Document",
    "creator": "Alice"
  }
}

With namespace preservation:

{
  "root": {
    "@xmlns:dc": "http://purl.org/dc/elements/1.1/",
    "@xmlns:xsi": "http://www.w3.org/2001/XMLSchema-instance",
    "dc:title": "My Document",
    "dc:creator": "Alice"
  }
}

Most converters let you choose. For data processing, stripping namespaces is usually cleaner. For round-trip conversion (JSON back to XML), preserve them.
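As a rough illustration of namespace stripping with the standard library: ElementTree expands prefixes, so a tag like dc:title arrives as {http://purl.org/dc/elements/1.1/}title. The local_name helper below is a hypothetical name introduced for this sketch.

```python
import xml.etree.ElementTree as ET

XML = """<root xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>My Document</dc:title>
  <dc:creator>Alice</dc:creator>
</root>"""


def local_name(tag):
    """Drop the '{namespace-uri}' part that ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(XML)
qualified = {child.tag: child.text for child in root}
stripped = {local_name(child.tag): child.text for child in root}

print(qualified)  # keys carry the full namespace URI in braces
print(stripped)   # keys are the bare local names
```

Stripping is only safe when no two namespaces in the document share a local name; otherwise the later key silently overwrites the earlier one.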

Common conversion pitfalls

Mixed content: XML supports text and elements mixed together:

<para>This has <b>bold</b> and <i>italic</i> text.</para>

No direct JSON equivalent exists. Converters either flatten this (losing structure) or emit keys such as {"#text": ..., "b": ..., "i": ...}, which keep the pieces but lose the order in which text and elements were interleaved.
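One way to keep mixed content intact is to emit an ordered list of segments rather than a keyed object. The sketch below assumes that representation is acceptable to the consumer; mixed_to_segments is a hypothetical helper, and it deliberately ignores attributes to stay short.

```python
import xml.etree.ElementTree as ET


def mixed_to_segments(element):
    """Return text and child elements in document order as a JSON-friendly list."""
    segments = []
    if element.text:
        segments.append(element.text)
    for child in element:
        # Recurse so nested inline markup keeps its own order too.
        segments.append({child.tag: mixed_to_segments(child)})
        # 'tail' is the text that follows the child's closing tag.
        if child.tail:
            segments.append(child.tail)
    return segments


para = ET.fromstring("<para>This has <b>bold</b> and <i>italic</i> text.</para>")
print(mixed_to_segments(para))
```

The result is more verbose than a keyed object, but it can be turned back into the original markup without guessing where each piece of text belonged.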

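Ordering: The JSON specification treats objects as unordered collections, so consumers should not rely on key order, while XML child elements have a defined sequence. Grouping repeated elements into arrays also discards how they were interleaved with their siblings. If element order matters (as it does in most schema-based XML), verify that the converter preserves it.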
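The following sketch shows how grouping by tag name loses sibling order, and one possible order-preserving shape. The recipe document and the "tag"/"text" keys are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

XML = "<recipe><mix>flour</mix><rest>10 min</rest><mix>eggs</mix></recipe>"
root = ET.fromstring(XML)

# Grouping by tag name collapses the two <mix> steps into one array and
# no longer records that <rest> sat between them.
grouped = {}
for child in root:
    grouped.setdefault(child.tag, []).append(child.text)

# An order-preserving alternative: one small object per child, kept in a list.
ordered = [{"tag": child.tag, "text": child.text} for child in root]

print(json.dumps(grouped))
print(json.dumps(ordered))
```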

Type loss: All XML content is text. Converting <age>30</age> to JSON can produce "30" (string) or 30 (number) depending on the converter and its options. In fast-xml-parser, parseTagValue controls this; turn it off when values such as postal codes or identifiers must stay strings.
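If your converter leaves everything as strings, a conservative coercion pass can restore types afterwards. This is a sketch under the assumption that only canonical integers, decimals, and the literals true/false should be converted; coerce is a hypothetical helper.

```python
import re

_INT = re.compile(r"-?(0|[1-9]\d*)")
_FLOAT = re.compile(r"-?(0|[1-9]\d*)\.\d+")


def coerce(text):
    """Convert canonical integers, decimals, and booleans; leave the rest as strings."""
    if text in ("true", "false"):
        return text == "true"
    if _INT.fullmatch(text):
        return int(text)
    if _FLOAT.fullmatch(text):
        return float(text)
    return text  # values like "007" or "1e5" are kept as strings on purpose


for raw in ("30", "44.95", "007", "true", "Alice"):
    print(repr(raw), "->", repr(coerce(raw)))
```

Rejecting leading zeros is a design choice: it keeps identifiers and postal codes from being silently turned into numbers.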

CDATA sections: Most converters unwrap CDATA during conversion: the text content is preserved but the CDATA wrapper is lost. This is usually what you want, because JSON strings have their own escaping.
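A quick check with Python's standard library shows the behavior described here; the snippet element is an invented example.

```python
import json
import xml.etree.ElementTree as ET

element = ET.fromstring(
    "<snippet><![CDATA[if (a < b && c > d) { run(); }]]></snippet>"
)

# The parser hands back the CDATA content as ordinary text, markers removed.
print(element.text)

# json.dumps applies JSON's own escaping, so no wrapper is needed.
print(json.dumps({element.tag: element.text}))
```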



Written by Mian Ali Khalid. Part of the Data & Format pillar.