jsoniter integration

The cats-eo-jsoniter module adds two cursor-backed optics for editing JSON byte streams without ever materialising an AST. They sit on top of jsoniter-scala and reuse the existing cats-eo carriers — no new carrier ships in this module.

libraryDependencies += "dev.constructive" %% "cats-eo-jsoniter" % "0.1"

Why this exists

circe parses to a Json AST first, then walks the tree. That makes circe great for dynamic-shape work (the shapes the tree can take are unbounded; you can drill into anything via cursors). It's also the source of most of circe's runtime cost — most of an HTTP-request-handler's JSON time is in the parse, not the drill.

jsoniter-scala takes the opposite trade. Codecs are derived at compile time and JSON is streamed straight into the codec, so no AST is materialised. Throughput is 5–10× circe on parse-heavy workloads. The cost: every focus type needs a JsonValueCodec[A] in scope, and dynamic-shape work isn't natural.

cats-eo-jsoniter is the optic surface for that style. JsoniterPrism focuses a JSONPath inside a JSON byte buffer, decodes only when the user reads the focus, encodes-and-splices on write. JsoniterTraversal fans out over [*] array paths. The realistic comparison vs eo-circe on the hot path:

Operation                        | eo-circe | eo-jsoniter | Speedup
Read scalar Long at depth 3      | ~810 ns  | ~50 ns      | 16.2×
Read scalar String at depth 4    | ~830 ns  | ~102 ns     | 8.1×
Read miss (path doesn't resolve) | ~810 ns  | ~68 ns      | 11.8×
Fold over 10-element array ([*]) | ~820 ns  | ~340 ns     | 2.4×
Write .replace Long at depth 3   | ~1450 ns | ~100 ns     | 14.3×
Write .modify Long at depth 3    | ~1440 ns | ~87 ns      | 16.5×

Numbers from the JsoniterBench suite — see benchmarks → JsoniterBench for the full table with confidence intervals and caveats. The traversal speedup narrows because per-element decode + array allocation accumulates; larger arrays push it back up. Writes don't degrade vs reads on the jsoniter side because the splice is bounded by O(src.length) memcpy — for hot-path 250-byte payloads that's ~50 ns.

eo-jsoniter is complementary to eo-circe, not a replacement. Reach for circe when shapes are dynamic; reach for jsoniter when the shape is fixed at compile time and you care about throughput.

JsoniterPrism

A JsoniterPrism[A] focuses a single value at a JSONPath inside an Array[Byte]. Construct it from a path string, with a JsonValueCodec[A] in scope:

import com.github.plokhotnyuk.jsoniter_scala.core.JsonValueCodec
import com.github.plokhotnyuk.jsoniter_scala.macros.JsonCodecMaker

import dev.constructive.eo.jsoniter.JsoniterPrism
import dev.constructive.eo.optics.Optic.*

given JsonValueCodec[Long]   = JsonCodecMaker.make
given JsonValueCodec[String] = JsonCodecMaker.make

val sample: Array[Byte] =
  """{"payload":{"user":{"id":42,"email":"alice@example.com"}}}""".getBytes("UTF-8")

val idP    = JsoniterPrism[Long]("$.payload.user.id")
val emailP = JsoniterPrism[String]("$.payload.user.email")

Reading

.foldMap decodes the focus through the codec only when the focus hits; on miss it returns the Monoid identity:

import cats.instances.long.given
import cats.instances.string.given

idP.foldMap(identity[Long])(sample)
// res0: Long = 42L
emailP.foldMap(identity[String])(sample)
// res1: String = "alice@example.com"

// Miss: path doesn't resolve, returns Monoid.empty
val absentP = JsoniterPrism[Long]("$.payload.user.absent")
// absentP: Optic[Array[Byte], Array[Byte], Long, Long, Affine]
absentP.foldMap(identity[Long])(sample)
// res2: Long = 0L

.headOption / .length / .exists light up the same way through the underlying Affine carrier — no separate code path.

Writing — .replace / .modify

The Hit branch encodes the new focus via JsonValueCodec[A] and splices it into the source bytes at the recorded [start, end) span: three arraycopy calls into a fresh Array[Byte]. Length-changing encodings (a number gaining or losing digits, a string growing or shrinking) are handled transparently:

import dev.constructive.eo.data.Affine.given

new String(idP.replace(99L)(sample), "UTF-8")
// res3: String = "{\"payload\":{\"user\":{\"id\":99,\"email\":\"alice@example.com\"}}}"
new String(idP.replace(1234567L)(sample), "UTF-8")  // longer
// res4: String = "{\"payload\":{\"user\":{\"id\":1234567,\"email\":\"alice@example.com\"}}}"
new String(emailP.replace("bob@x.org")(sample), "UTF-8")  // shorter
// res5: String = "{\"payload\":{\"user\":{\"id\":42,\"email\":\"bob@x.org\"}}}"
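
The three-copy splice described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the module's actual implementation: the splice helper and its parameter names are hypothetical, and the span is taken to be the half-open [start, end) range mentioned earlier.

```scala
// Hypothetical helper illustrating the splice; not part of cats-eo-jsoniter.
// Replaces the bytes in the half-open span [start, end) of src with replacement.
def splice(src: Array[Byte], start: Int, end: Int, replacement: Array[Byte]): Array[Byte] =
  val out = new Array[Byte](start + replacement.length + (src.length - end))
  System.arraycopy(src, 0, out, 0, start)                                        // 1. prefix before the focus
  System.arraycopy(replacement, 0, out, start, replacement.length)              // 2. newly encoded focus
  System.arraycopy(src, end, out, start + replacement.length, src.length - end) // 3. suffix after the focus
  out

@main def spliceDemo(): Unit =
  val src = """{"id":42}""".getBytes("UTF-8")
  // The digits 42 occupy bytes [6, 8); the replacement is longer than the original.
  val out = splice(src, 6, 8, "1234567".getBytes("UTF-8"))
  println(new String(out, "UTF-8")) // {"id":1234567}
```

Because the output size is computed from the replacement length, growing and shrinking encodings need no special case.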

.modify runs f on the decoded focus, then encodes the result and splices:

new String(idP.modify(_ * 10)(sample), "UTF-8")
// res6: String = "{\"payload\":{\"user\":{\"id\":420,\"email\":\"alice@example.com\"}}}"

Miss path is a no-op — the original bytes pass through unchanged:

val sameBytes = absentP.replace(99L)(sample)
// sameBytes: Array[Byte] = Array(123, 34, 112, 97, 121, 108, 111, 97, 100, 34, ...)
sameBytes.toSeq == sample.toSeq
// res7: Boolean = true

JsoniterTraversal

JsoniterTraversal[A] accepts wildcard paths ([*]) and fans out over every element of the focused array. Its carrier is MultiFocus[PSVec], the same carrier Traversal.each uses, so the surface composes uniformly with the rest of cats-eo's traversal machinery.

import cats.instances.int.given

import dev.constructive.eo.data.MultiFocus.given
import dev.constructive.eo.jsoniter.JsoniterTraversal

val cart: Array[Byte] =
  """{"cart":{"items":[1,2,3,4,5,6,7,8,9,10]}}""".getBytes("UTF-8")

val itemsT = JsoniterTraversal[Long]("$.cart.items[*]")
itemsT.foldMap(identity[Long])(cart)            // sum, via Monoid[Long]
// res8: Long = 55L
itemsT.foldMap(_ => 1)(cart)                    // count, via Monoid[Int]
// res9: Int = 10

Spans whose decode throws are silently dropped — foldMap reads the focuses that exist and ignores the rest. JsoniterPrism rejects wildcard paths at construction with an explicit redirect to this factory.

Phase-1.5 ships a read-only JsoniterTraversal. Write-back over [*] is a phase-3 question: each length-changing splice would shift the byte offsets of every later span, so a naive loop over the recorded spans is wrong. It stays deferred unless a real workload demands it.
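
To make the span-invalidation problem concrete, here is a stdlib-only sketch of one way a multi-span write could sidestep it: rebuild the buffer in a single left-to-right pass, always reading offsets from the original bytes. It illustrates the reasoning above and is not a design the module has committed to. Span is a stand-in case class with an assumed shape, spliceAll is a hypothetical name, and the spans are assumed not to overlap.

```scala
import java.io.ByteArrayOutputStream

// Stand-in for a recorded byte span, half-open [start, end). Hypothetical shape.
final case class Span(start: Int, end: Int)

// Applies every (span, replacement) edit in one pass over the ORIGINAL bytes,
// so no edit ever sees offsets shifted by an earlier one.
// Assumes the spans do not overlap.
def spliceAll(src: Array[Byte], edits: List[(Span, Array[Byte])]): Array[Byte] =
  val out = new ByteArrayOutputStream(src.length)
  var cursor = 0 // next unread offset in src
  edits.sortBy(_._1.start).foreach { case (span, replacement) =>
    out.write(src, cursor, span.start - cursor) // untouched bytes before the span
    out.write(replacement, 0, replacement.length)
    cursor = span.end                           // skip the old focus bytes
  }
  out.write(src, cursor, src.length - cursor)   // tail after the last span
  out.toByteArray

@main def spliceAllDemo(): Unit =
  val src   = "[1,2,3]".getBytes("UTF-8")
  val spans = List(Span(1, 2), Span(3, 4), Span(5, 6)) // one span per element
  val edits = spans.zip(List("10", "20", "30").map(_.getBytes("UTF-8")))
  println(new String(spliceAll(src, edits), "UTF-8")) // [10,20,30]
```

A loop that spliced each span in turn would instead write the second replacement at offset 3 of a buffer whose second element had already moved to offset 4.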

JSONPath subset

The accepted grammar is deliberately small:

path  := '$' (step)*
step  := '.' ident | '[' int ']' | '[*]'
ident := [A-Za-z_][A-Za-z_0-9]*
int   := [0-9]+

So:

Path                     | Meaning
$                        | The whole document
$.foo.bar                | Field bar of object foo
$.items[0]               | First element of items array
$.items[*]               | Every element of items array (Traversal)
$.users[*].profile.email | email of every user (Traversal)

Filter expressions (?(@.x > 5)), recursive descent (..), and property wildcards (.*) are out of scope. For dynamic-shape paths, fall back to eo-circe.
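
The grammar is small enough that a recogniser fits in a few lines. The following is a hedged sketch of how such a path could be parsed, written directly from the grammar above. It is not the module's parser: Step, parsePath, and the error messages are hypothetical names introduced for illustration.

```scala
// Hypothetical AST for the path grammar; not the module's own types.
enum Step:
  case Field(name: String)
  case Index(i: Int)
  case Wildcard

// One alternative per production of `step`: '.' ident | '[' int ']' | '[*]'
private val StepRe = """\.([A-Za-z_][A-Za-z_0-9]*)|\[([0-9]+)\]|\[\*\]""".r

def parsePath(path: String): Either[String, List[Step]] =
  @annotation.tailrec
  def loop(rest: String, acc: List[Step]): Either[String, List[Step]] =
    if rest.isEmpty then Right(acc.reverse)
    else
      StepRe.findPrefixMatchOf(rest) match
        case None => Left(s"unsupported step at '$rest'")
        case Some(m) =>
          val step: Option[Step] =
            if m.group(1) != null then Some(Step.Field(m.group(1)))
            else if m.group(2) != null then m.group(2).toIntOption.map(Step.Index(_))
            else Some(Step.Wildcard)
          step match
            case None    => Left(s"index out of Int range at '$rest'")
            case Some(s) => loop(rest.substring(m.end), s :: acc)

  if path.startsWith("$") then loop(path.substring(1), Nil)
  else Left("path must start with '$'")

@main def parsePathDemo(): Unit =
  println(parsePath("$.users[*].profile.email")) // Right(List(Field(users), Wildcard, Field(profile), Field(email)))
  println(parsePath("$..name").isLeft)           // true: recursive descent is rejected
```

Anything outside the three step forms, including filters, recursive descent, and property wildcards, fails at the first unmatched prefix.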

Path-walk on the wire

The JsonPathScanner is a hand-rolled byte walker (~280 LoC) that resolves a path to a [start, end) byte span without parsing the document. It skips JSON strings (honouring all backslash escapes), numbers, objects, arrays, and literals, and reports a Miss on the first structural mismatch. The existential X on JsoniterPrism carries (bytes, start, end), which is just enough context for phase-2 splice writes to memcpy three slices into a fresh buffer.
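
As an illustration of what skipping a value without parsing it means, here is a minimal sketch of the string case, which is the one where escapes matter. It is not taken from JsonPathScanner: skipString and its conventions (start just past the opening quote, return -1 when unterminated) are assumptions made for this example.

```scala
// Hypothetical sketch of skipping a JSON string on raw bytes.
// `i` is the offset just after the opening quote. Returns the offset just
// past the closing quote, or -1 if the string is unterminated.
@annotation.tailrec
def skipString(bytes: Array[Byte], i: Int): Int =
  if i >= bytes.length then -1
  else
    bytes(i).toChar match
      case '"'  => i + 1                      // unescaped quote closes the string
      case '\\' => skipString(bytes, i + 2)   // skip the backslash and the byte it escapes
      case _    => skipString(bytes, i + 1)   // ordinary byte, including UTF-8 continuation bytes

@main def skipStringDemo(): Unit =
  // The document fragment is: "a\"b":1  (the key contains an escaped quote)
  val bytes = "\"a\\\"b\":1".getBytes("UTF-8")
  println(skipString(bytes, 1)) // 6, the offset of the ':' after the closing quote
```

Skipping two bytes on a backslash is enough even for \uXXXX escapes, since the hex digits that follow can never be a quote or a backslash.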

The [*] wildcard expands the current array to a List[Span] via JsonPathScanner.findAll, then JsoniterTraversal.to decodes each span into a single Array[AnyRef] and wraps it via PSVec.unsafeWrap. One allocation per traversal, no per-element cons-cell.

Carrier reuse

This module ships zero new carriers. JsoniterPrism is an Optic[Array[Byte], Array[Byte], A, A, Affine]; JsoniterTraversal is an Optic[Array[Byte], Array[Byte], A, A, MultiFocus[PSVec]]. The standard cats-eo extensions on those carriers light up automatically:

On Optic[..., Affine]   | Lights up via
.foldMap / .headOption  | ForgetfulFold[Affine]
.modify / .replace      | ForgetfulFunctor[Affine]
.modifyA / .all         | ForgetfulTraverse[Affine, Applicative]
.andThen (same-carrier) | AssociativeFunctor[Affine, _, _]

On Optic[..., MultiFocus[PSVec]] | Lights up via
.foldMap / .length / .exists     | mfFold[PSVec]
.modify / .replace               | mfFunctor[PSVec]
.andThen (same-carrier)          | mfAssocPSVec

So JsoniterTraversal[A].andThen(Traversal.each[List, A]) — chaining a JSON-byte traversal with the classical Scala-collection traversal — composes through the standard Optic.andThen with zero-copy per-element reassembly, no Composer hop required.

When to reach for which

Task                                                     | Use
Read one scalar from a JSON byte buffer, no AST          | JsoniterPrism[A]("$.path")
Write one scalar back into a JSON byte buffer            | JsoniterPrism[A].replace(b)(bytes)
Sum / count over an array on the wire                    | JsoniterTraversal[A]("$.path[*]").foldMap(...)
Drill into dynamic shapes (no codec for surrounding)     | eo-circe codecPrism[…] (different module)
Edit deeply through [*] on the wire                      | Phase-3 (not yet shipped); fall through to eo-circe today
Chain a JSON traversal with a Scala collection traversal | JsoniterTraversal[A].andThen(Traversal.each[F, A])

For the full failure-mode matrix (decode failures silently dropping to Miss, malformed JSON, length-changing splice mechanics) see the JsoniterPrismSpec, JsoniterPrismWriteSpec, and JsoniterTraversalSpec.