Designing Data for AI-First Architectures

For many years, JSON has been the default choice for data exchange. It works well for APIs, browsers, and human-readable payloads.
But modern systems are no longer just data-driven — they are AI-driven.

When Large Language Models, embeddings, RAG pipelines, and autonomous agents become first‑class citizens, tokens become more important than bytes.

This is where TOON (Token‑Oriented Object Notation) fits naturally.


Why I Started Thinking About TOON

In real projects — AI assistants, RAG platforms, observability pipelines — I repeatedly faced the same problems:

  • JSON structures are verbose for LLMs
  • Deep nesting increases token noise
  • Small schema changes break embedding consistency
  • Logs are hard to reuse as AI context
  • Prompt construction becomes fragile

TOON is not meant to replace JSON everywhere.
It is meant to sit between your system and AI.


What Is TOON?

Token‑Oriented Object Notation (TOON) is a structured format where:

  • Tokens are the primary design unit
  • Structure is expressed via token paths
  • Data is flattened but semantically precise
  • Each line is deterministic and embedding‑friendly

Conceptually, every value becomes one line:

:path.to.value <scalar>

That’s it.
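
The scalar side of a line can simply borrow JSON's own quoting rules: strings stay quoted, numbers and booleans stay bare. A minimal Python sketch (the `toon_scalar` name is hypothetical, not part of any spec):

```python
import json

def toon_scalar(value):
    """Render the scalar part of a TOON line.

    json.dumps already gives us string quoting/escaping and canonical
    forms for numbers, booleans and null, so we reuse it directly.
    """
    return json.dumps(value)

print(toon_scalar("AUD"))    # "AUD"
print(toon_scalar(1499.99))  # 1499.99
print(toon_scalar(True))     # true
```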


JSON vs TOON — Practical View

JSON

{
  "order": {
    "id": 9821,
    "total": 1499.99,
    "currency": "AUD",
    "customer": {
      "id": 77,
      "type": "business"
    }
  }
}

TOON

:order.id 9821
:order.total 1499.99
:order.currency "AUD"
:order.customer.id 77
:order.customer.type "business"

Why this matters:

  • LLM tokenization becomes stable
  • Embeddings become more consistent
  • Partial context works reliably
  • Logs become reusable AI input
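
The JSON-to-TOON mapping above is mechanical, so it is easy to automate. A minimal sketch, assuming dict keys recurse into dotted paths and scalars use JSON quoting (the `to_toon` name is my own, not a standard API):

```python
import json

def to_toon(obj, prefix=""):
    """Flatten a nested dict into TOON lines of the form
    ':path.to.value <scalar>'. Dicts recurse into dotted paths;
    scalars are rendered with JSON quoting so strings stay quoted
    and numbers stay bare."""
    lines = []
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            lines.extend(to_toon(value, path))
        else:
            lines.append(f":{path} {json.dumps(value)}")
    return lines

order = {"order": {"id": 9821, "total": 1499.99, "currency": "AUD",
                   "customer": {"id": 77, "type": "business"}}}
print("\n".join(to_toon(order)))
```

Running this on the order above reproduces the five TOON lines shown, one deterministic line per leaf value.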

Where TOON Belongs in Architecture

Backend (Primary Target)

TOON is ideal for backend intelligence layers:

AI & RAG Pipelines

  • Context normalization
  • Prompt assembly
  • Chunking before embeddings
  • Semantic indexing
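
Because each TOON line carries its full path, context normalization and prompt assembly reduce to line filtering. A sketch of partial-context selection by path prefix (the `select_context` helper is illustrative, not an established API):

```python
def select_context(toon_lines, prefixes):
    """Keep only TOON lines whose path matches one of the given
    prefixes. Assumes the ':path.to.value <scalar>' line form."""
    selected = []
    for line in toon_lines:
        path = line.split(" ", 1)[0].lstrip(":")
        if any(path == p or path.startswith(p + ".") for p in prefixes):
            selected.append(line)
    return selected

lines = [
    ':order.id 9821',
    ':order.customer.id 77',
    ':order.customer.type "business"',
    ':order.total 1499.99',
]
# Build a prompt fragment containing only customer context.
print("\n".join(select_context(lines, ["order.customer"])))
```

Each selected line remains meaningful on its own, which is what makes partial context and chunking before embeddings reliable.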

Event Sourcing & Logs

:event.type "PaymentCaptured"
:event.order.id 9821
:event.amount 1499.99
:event.provider "Stripe"

Logs stop being dead data — they become AI knowledge.
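
Reusing such logs as structured AI input means parsing the lines back into objects. A minimal sketch, assuming the ':path value' form with JSON-style scalars (the `parse_toon` name is hypothetical):

```python
import json

def parse_toon(lines):
    """Rebuild a nested dict from TOON lines.

    Each line splits into a dotted path and a JSON-style scalar;
    intermediate path segments become nested dicts.
    """
    root = {}
    for line in lines:
        path, raw = line.lstrip(":").split(" ", 1)
        keys = path.split(".")
        node = root
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = json.loads(raw)
    return root

event = [
    ':event.type "PaymentCaptured"',
    ':event.order.id 9821',
    ':event.amount 1499.99',
    ':event.provider "Stripe"',
]
print(parse_toon(event))
```

The round trip is lossless for dict-shaped data, so the same event lines can feed dashboards, embeddings, or prompts without a separate log schema.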