And Why Architecture Must Become Token-Aware
For the past 15 years, JSON (JavaScript Object Notation) has been the undisputed king of data exchange. It dethroned XML, survived the rise of Protocol Buffers and Thrift, and became the default language of the web. It is the lingua franca of APIs, configuration files, document stores (like MongoDB), and frontend-backend communication.
JSON succeeded because it hit the sweet spot: it is human-readable enough to debug, yet structured enough for machines to parse efficiently. It is flexible, schema-less when needed, and supported by virtually every programming language on the planet.
But there is a fundamental shift happening in modern software architecture.
We are no longer just building data-driven systems—systems that move data from a database to a UI and back.
We are building AI-first systems—systems where Large Language Models (LLMs), vector databases, and autonomous agents are first-class citizens.
And in this new AI-first world, JSON is starting to show its age. It is becoming a bottleneck, not because it is "bad," but because it was optimized for a different era and a different set of constraints.
The Problem Is Not JSON Itself
To be clear, JSON is an excellent format for what it was designed to do: represent objects in a way that browsers and servers can easily understand.
However, JSON was created for:
- Object Serialization: Turning in-memory objects into text.
- Browser Communication: Sending data to JavaScript clients.
- Human Readability: Allowing developers to inspect payloads.
AI systems, however, operate on a completely different abstraction layer. They don't care about "objects" or "serialization" in the traditional sense.
AI models do not see objects. They see tokens.
The Token Abstraction: A New Fundamental Unit
In traditional computing, we optimize for bytes (storage, bandwidth) and cycles (CPU processing).
In AI computing, we optimize for tokens.
An LLM does not "understand" JSON syntax the way a parser does. It sees a stream of tokens: chunks of characters that may represent words, sub-words, or symbols. To an LLM, a curly brace `{` is a token, a quote `"` is a token, and a colon `:` is a token (or merges with its neighbors into one, depending on the tokenizer).
Every single structural character in a JSON payload consumes tokens. And in the world of LLMs, tokens are the scarcest resource.
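To make this concrete, here is a minimal sketch (assuming the open-source tiktoken library and the cl100k_base encoding; exact counts vary by tokenizer) that measures how many tokens of a small payload are pure JSON syntax:

```python
# A minimal sketch using tiktoken; any BPE tokenizer illustrates the same point.
import json
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

payload = json.dumps({"id": 1, "status": "active", "role": "admin"})
tokens = enc.encode(payload)

# Count tokens whose decoded text consists only of structural JSON characters.
structural = [
    t for t in tokens
    if (piece := enc.decode([t]).strip()) and all(c in '{}[]":,' for c in piece)
]

print(f"total tokens:      {len(tokens)}")
print(f"structural tokens: {len(structural)}")
```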
In AI-first systems:
- Token Limits are Hard Constraints: Context windows (8k, 32k, 128k) are finite. Wasting them on syntax reduces the space available for actual reasoning and knowledge.
- Token Cost is Real Money: You pay per million tokens. Verbose formats literally cost more to process.
- Token Stability is Accuracy: The way data is tokenized directly impacts how well a model can "understand" and retrieve it.
JSON optimizes for structural clarity, not token efficiency.
Problem #1: The "Token Tax" of Syntax
JSON introduces a significant amount of unavoidable overhead, or what I call the "Token Tax."
Consider a standard JSON object:
```json
[
  { "id": 1, "status": "active", "role": "admin" },
  { "id": 2, "status": "inactive", "role": "user" },
  { "id": 3, "status": "active", "role": "user" }
]
```
For an LLM, this structure is full of noise:
- Curly Braces & Brackets: `[`, `]`, `{`, `}`.
- Quotes: `"` around every key and string value.
- Repeated Keys: The words `"id"`, `"status"`, and `"role"` are repeated for every single item.
In a large dataset or a long RAG (Retrieval-Augmented Generation) context, this repetition is disastrous. You might be spending 30-40% of your context window just on JSON syntax and repeated field names.
This leads to:
- Reduced Context: You can fit fewer search results into the prompt.
- Increased Latency: The model has to generate more tokens to output JSON.
- Higher Costs: You are paying to process brackets and quotes.
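To put a rough number on the tax yourself, here is a sketch along the same lines (again assuming tiktoken; the exact ratio depends on the tokenizer and the data) that compares the array above with a simple header-plus-rows rendering of the same records:

```python
import json
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

records = [
    {"id": 1, "status": "active", "role": "admin"},
    {"id": 2, "status": "inactive", "role": "user"},
    {"id": 3, "status": "active", "role": "user"},
]

as_json = json.dumps(records)

# The same data as one header line plus one row per record: each key is
# stated once for the whole dataset instead of once per item.
header = "id,status,role"
rows = [f"{r['id']},{r['status']},{r['role']}" for r in records]
as_table = "\n".join([header, *rows])

print("JSON tokens: ", len(enc.encode(as_json)))
print("table tokens:", len(enc.encode(as_table)))
```

The point is not that CSV is the answer; it is that naming each field once per dataset, rather than once per record, is where most of the savings come from.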
Problem #2: Embedding Instability and the "Butterfly Effect"
This is perhaps the most subtle and dangerous problem for RAG systems.
Vector embeddings (the numerical representation of text used for semantic search) depend heavily on consistency. If you change the input text slightly, the resulting vector changes.
JSON is non-deterministic by default in many serializers:
- Key Order: `{ "a": 1, "b": 2 }` and `{ "b": 2, "a": 1 }` are semantically identical objects, but they produce different token sequences and therefore different embeddings.
- Whitespace: Minified JSON vs. pretty-printed JSON produces different vectors.
This instability silently degrades the performance of your AI system:
- Semantic Search Failures: A document indexed with one key order might not be found by a query using a different key order.
- RAG Recall Drop: Relevant context might be missed because the vector distance is artificially large due to formatting differences.
- Agent Memory Issues: An agent might fail to recognize that a new piece of information is the same as something it stored previously.
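The usual defense, sketched below, is to canonicalize before embedding: sort keys and strip insignificant whitespace so that semantically equal objects always serialize to the same string. This is standard-library Python, not anything format-specific:

```python
import json

a = {"a": 1, "b": 2}
b = {"b": 2, "a": 1}

# Default serialization preserves insertion order: two different strings
# (and two different embeddings) for the same logical object.
print(json.dumps(a) == json.dumps(b))  # False

def canonical(obj) -> str:
    """Deterministic JSON: sorted keys, no insignificant whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))

print(canonical(a) == canonical(b))  # True
```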
Problem #3: Trees vs. Paths (Cognitive Load)
JSON represents data as trees (nested hierarchies).
LLMs, however, tend to reason better over linear paths or flat structures.
Deeply nested JSON structures force the model to maintain a "mental stack" of where it is in the hierarchy.
```json
{
  "company": {
    "departments": [
      {
        "engineering": {
          "teams": [
            {
              "name": "backend",
              "leads": ["Alice"]
            }
          ]
        }
      }
    ]
  }
}
```
To understand that "Alice" is a lead, the model has to attend to the opening braces of `company`, `departments`, `engineering`, and `teams`. If the context is long, the model might lose track of the parent keys (the "Lost in the Middle" phenomenon).
Flattened, explicit paths are much easier for models to parse and reason about because the context is local to the line.
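For illustration, here is a small sketch that flattens the tree above into one explicit path per line (the path syntax is just an example, not a standard):

```python
def flatten(node, path=""):
    """Yield (path, value) pairs with list indices inlined into the path."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from flatten(value, f"{path}[{i}]")
    else:
        yield path, node

doc = {"company": {"departments": [{"engineering": {"teams": [
    {"name": "backend", "leads": ["Alice"]}]}}]}}

for path, value in flatten(doc):
    print(path, "=", value)
# company.departments[0].engineering.teams[0].name = backend
# company.departments[0].engineering.teams[0].leads[0] = Alice
```

Each line now carries its full context, so the model never has to reconstruct the hierarchy from distant braces.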
Problem #4: Logs Become "Dead Data"
In most organizations, logs are the largest source of truth about what the system is actually doing. We log terabytes of JSON every day.
Ideally, this data should be a goldmine for:
- Fine-tuning models to understand system behavior.
- RAG pipelines for automated debugging.
- Anomaly detection agents.
But because JSON logs are so verbose, inconsistent, and noisy, they are effectively "dead data" to AI. They are too expensive to index and too messy to feed into a prompt without massive cleaning.
We are throwing away our most valuable training data because our format is not AI-ready.
Problem #5: Fragile Prompt Construction
Every developer building with LLMs has faced the "JSON formatting" struggle.
You ask the model to output JSON. It mostly works. But sometimes:
- It forgets a closing brace `}`.
- It adds a trailing comma `,` (invalid in standard JSON).
- It hallucinates keys.
- It adds comments `//` (invalid in standard JSON).
We spend an enormous amount of engineering effort writing parsers, retry logic, and validators just to get the model to speak valid JSON. This is because JSON's strict syntax rules are hard for a probabilistic model to get right 100% of the time.
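Here is a sketch of the kind of defensive parsing this forces us to write; the repairs are deliberately simple and illustrative, and real-world repair libraries handle many more failure modes:

```python
import json
import re

def parse_lenient(raw: str):
    """Best-effort repair of common LLM JSON mistakes before parsing."""
    # Drop markdown code-fence lines the model often wraps around output.
    lines = [ln for ln in raw.strip().splitlines()
             if not ln.strip().startswith("```")]
    text = "\n".join(lines)
    # Remove // line comments (invalid in standard JSON).
    text = re.sub(r"^\s*//.*$", "", text, flags=re.MULTILINE)
    # Remove trailing commas before a closing brace or bracket.
    text = re.sub(r",\s*([}\]])", r"\1", text)
    return json.loads(text)

print(parse_lenient('```json\n{\n  // a comment\n  "role": "admin",\n}\n```'))
# {'role': 'admin'}
```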
Conclusion: Towards a Token-Aware Architecture
The industry needs to recognize that data format is an architectural decision in AI systems.
We cannot simply copy-paste our web development practices into the AI era. We need formats that are:
- Token-Efficient: Minimizing syntax overhead.
- Deterministic: Ensuring consistent embeddings.
- Flat: Reducing cognitive load for models.
- Streamable: Easy to generate and parse token-by-token.
JSON will always have its place in the browser and the API contract. But inside the "Intelligence Layer"—where data meets the model—it is time to look for something better.
It is time for Token-Oriented Object Notation (TOON).
What is TOON?
TOON is a structured format designed specifically to sit between your backend systems and your AI models. Unlike JSON, which prioritizes object hierarchy, TOON prioritizes token efficiency and semantic clarity.
It uses a flattened, path-based structure that looks like this:
```
:order.id 9821
:order.total 1499.99
:order.currency "AUD"
:order.customer.type "business"
```
This format is:
- Deterministic: Always produces the same token sequence for the same data.
- Dense: Removes the "token tax" of brackets and repeated quotes.
- Context-Rich: Every line contains the full path to the value, making it perfect for RAG chunking.
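As a toy illustration only, here is an encoder inferred from the four example lines above; the actual specification linked below may define quoting, lists, and escaping differently:

```python
import json

def to_toon(node, path=""):
    """Toy TOON encoder inferred from the example above; not the spec.

    Emits one `:full.path value` line per leaf, quoting strings the way
    the example does and leaving numbers bare. Lists are omitted here.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            yield from to_toon(value, f"{path}.{key}" if path else key)
    else:
        rendered = json.dumps(node) if isinstance(node, str) else node
        yield f":{path} {rendered}"

order = {"order": {"id": 9821, "total": 1499.99, "currency": "AUD",
                   "customer": {"type": "business"}}}

print("\n".join(to_toon(order)))
# :order.id 9821
# :order.total 1499.99
# :order.currency "AUD"
# :order.customer.type "business"
```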
I have written a comprehensive proposal and specification for this new format.
👉 Read the full introduction here: TOON — Token-Oriented Object Notation