TOON: A New Format With Interesting Ideas (and a Few Sharp Edges)

A new data format that could replace CSV and JSON.

Two weeks ago, we talked about how something as simple as an empty field in a CSV can carry very different meanings — and how those differences quietly affect everything downstream.

Right on cue, a new data format appeared in the developer world and sparked a surprising amount of discussion.

It’s called TOON, short for Token-Oriented Object Notation.
It is a compact hybrid of YAML- and CSV-style syntax, designed to be human-readable and lean enough that an LLM can read it using fewer tokens.

It’s an unusual motivation, but the idea itself is worth taking seriously:
A concise, readable format that could, in theory, replace JSON.

Let’s look at what TOON gets right, where it becomes fuzzy, and why its stated purpose raises questions.


Where TOON Is Actually Quite Appealing

There’s a real itch here: JSON is reliable but verbose. YAML is flexible but sometimes too implicit. TOON aims for a middle ground:

  • Arrays use CSV-like lists
  • Objects look like simplified YAML
  • Quoting is optional but defined
  • The format stays visually compact
  • It round-trips cleanly to JSON

And importantly:
TOON formally defines booleans and nulls, avoiding the ambiguity that plagues CSV.
Quoted values are strings.
Unquoted true|false|null become actual booleans or null.
Everything else unquoted is treated as a string (or a number if it clearly looks like one).

Not perfect, but at least the rules are written down — which is more than can be said for many “simple” formats we use every day.


Where Things Get Less Comfortable

1. Similar values can end up with different types

Consider this TOON definition of an array with three items:
values[3]: true,null,none

This becomes:

  • true → boolean
  • null → the null value
  • none → string "none"

Visually, all three entries look parallel. They’re not.
It’s easy to imagine this surprising someone later in the pipeline.

The choice makes sense if your priority is token minimalism, but it introduces type uncertainty in places where human readers expect clarity.
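
To make the difference concrete, here is a minimal Python sketch of the typing rules as summarised in this article. It is not the official TOON implementation: the helper names classify_scalar and parse_inline_array are made up for illustration, the number pattern is a simplifying assumption, and escapes or commas inside quoted strings are not handled.

```python
import re

# Assumption: a "number" is an optional minus sign, digits, and an optional
# decimal part. The real specification may accept more forms than this.
_NUMBER = re.compile(r"-?\d+(\.\d+)?")


def classify_scalar(token: str):
    """Map one token to a Python value, following the rules described above."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]  # quoted values are always strings
    if token == "true":
        return True
    if token == "false":
        return False
    if token == "null":
        return None
    if _NUMBER.fullmatch(token):
        return float(token) if "." in token else int(token)
    return token  # everything else unquoted stays a string


def parse_inline_array(line: str):
    """Parse a line such as 'values[3]: true,null,none' (hypothetical helper)."""
    header, _, body = line.partition(":")
    name, _, rest = header.partition("[")
    declared = int(rest.rstrip("]"))
    items = [classify_scalar(tok) for tok in body.strip().split(",")]
    if len(items) != declared:
        raise ValueError(f"expected {declared} items, got {len(items)}")
    return name.strip(), items


print(parse_inline_array("values[3]: true,null,none"))
# ('values', [True, None, 'none'])
```

Three tokens that look alike on the page come back as three different Python types: bool, NoneType, and str.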

2. Whitespace rules in arrays are underspecified

TOON is intentionally lightweight, but whitespace handling in comma-separated lists is only vaguely described. Different implementations may interpret it differently, which is exactly the kind of grey zone that eventually becomes a compatibility issue.

Neither of these problems is fatal. But they are the kind of early friction points that show a format is still settling.

Perhaps this is just my perspective as someone who has built a CSV parser. I’ve seen firsthand how much trouble vague whitespace rules can cause. I wish CSV had nailed this down better, and I’m disappointed that TOON doesn’t define it more clearly from the start.
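
A small Python sketch shows how much hangs on that one decision. Both functions below are hypothetical readings of the same list, not taken from any real TOON library: one trims whitespace around each item, the other keeps every character between the commas. Numbers and quoting are left out to keep the point visible.

```python
def to_value(token: str):
    # Minimal keyword mapping: only true, false, and null get special types.
    return {"true": True, "false": False, "null": None}.get(token, token)


def parse_trimming(body: str):
    """Reading A: whitespace around each item is insignificant."""
    return [to_value(tok.strip()) for tok in body.split(",")]


def parse_verbatim(body: str):
    """Reading B: every character between the commas belongs to the value."""
    return [to_value(tok) for tok in body.split(",")]


body = "true, null"
print(parse_trimming(body))  # [True, None]
print(parse_verbatim(body))  # [True, ' null']
```

One space after the comma decides whether the second item is a null or the string " null". Two implementations that disagree here will silently produce different data from the same file.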


The Motivation: A New Format for LLMs?

This is the part that invites a raised eyebrow.

TOON’s headline goal is to reduce the number of tokens an LLM needs to “read” structured data. The project even includes accuracy benchmarks:

  • TOON extraction accuracy: 73.9%
  • JSON extraction accuracy: 69.7%

A measurable improvement, but still far from the reliability anyone would expect from a production workflow. At 73.9% accuracy, roughly one extracted field in four is wrong, and that is not something you can build serious logic on.
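
The "one in four" figure follows directly from the two accuracy numbers quoted above. A quick check in Python, using only those two values (the variable names are mine):

```python
toon_accuracy = 0.739  # TOON extraction accuracy quoted above
json_accuracy = 0.697  # JSON extraction accuracy quoted above

toon_error = 1 - toon_accuracy
json_error = 1 - json_accuracy
gain_points = (toon_accuracy - json_accuracy) * 100

print(f"TOON error rate: {toon_error:.1%}")             # TOON error rate: 26.1%
print(f"JSON error rate: {json_error:.1%}")             # JSON error rate: 30.3%
print(f"Gain: {gain_points:.1f} percentage points")     # Gain: 4.2 percentage points
print(f"One error every {1 / toon_error:.1f} fields")   # One error every 3.8 fields
```

So TOON moves the error rate from about 30% to about 26%. That is a real gain, but both numbers sit in the same unreliable range.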

If the real goal is to give LLMs access to structured data, creating a new serialization format may not be the most robust path.

We already have a more dependable option: MCP (the Model Context Protocol), which gives models controlled access to actual data sources without relying on ad-hoc parsing.

In that context, TOON feels less like a necessary invention and more like an interesting experiment that caught momentary attention.


Should You Use TOON Today?

Short answer: Probably not yet.

Longer answer:

  • If your data flow works with CSV, keep using CSV.
  • If you need well-structured, widely supported interchange, JSON remains the safe default.
  • If your goal is to provide real data to an LLM, MCP is the tool designed for that job.
  • And if you’re tempted by TOON because of the star count — wait a moment. New formats need time to prove themselves.

TOON is promising. It’s readable. It introduces some thoughtful ideas. But the use case that inspired it isn’t urgent enough to justify switching existing pipelines, and the specification still has some minor gaps.

It’s worth watching — not rushing into.

Thanks for reading,
Stefan

🧮 The Missing Number

18,600 – The number of GitHub stars the TOON project has earned so far.