<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://kcsrk.info/feed.xml" rel="self" type="application/atom+xml" /><link href="https://kcsrk.info/" rel="alternate" type="text/html" /><updated>2026-05-13T00:30:26+00:00</updated><id>https://kcsrk.info/feed.xml</id><title type="html">KC Sivaramakrishnan</title><subtitle>KC Sivaramakrishnan</subtitle><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><entry><title type="html">Shrinking the OxCaml js_of_ocaml bundle: 285 MB to 4 MB</title><link href="https://kcsrk.info/ocaml/oxcaml/modes/2026/05/10/shrinking-the-oxcaml-bundle/" rel="alternate" type="text/html" title="Shrinking the OxCaml js_of_ocaml bundle: 285 MB to 4 MB" /><published>2026-05-10T11:00:00+00:00</published><updated>2026-05-10T11:00:00+00:00</updated><id>https://kcsrk.info/ocaml/oxcaml/modes/2026/05/10/shrinking-the-oxcaml-bundle</id><content type="html" xml:base="https://kcsrk.info/ocaml/oxcaml/modes/2026/05/10/shrinking-the-oxcaml-bundle/"><![CDATA[<p>In the
<a href="/ocaml/oxcaml/modes/blogging/2026/05/08/capsules-in-oxcaml/">previous post</a> on
capsules, I cheated. The lecture I was adapting (from my
<a href="https://github.com/fplaunchpad/cs6868_s26">CS6868</a> course on
language abstractions for parallelism) used
<code class="language-plaintext highlighter-rouge">Await_capsule.Mutex.with_lock</code>, the recommended non-deprecated way
to acquire a capsule mutex, but the post shipped
<code class="language-plaintext highlighter-rouge">Capsule_blocking_sync.Mutex</code> instead with the deprecation alert
silenced. The reason was bundle size: the <code class="language-plaintext highlighter-rouge">await</code> library, once we
chased its transitive dependencies through <code class="language-plaintext highlighter-rouge">base</code>, <code class="language-plaintext highlighter-rouge">sexplib0</code>,
<code class="language-plaintext highlighter-rouge">base_quickcheck</code> and the rest of Jane Street’s runtime, would have
ballooned the in-browser toplevel by roughly <em>285 MB</em>. The right
API would not even fit through GitHub’s 100 MB per-file push limit,
let alone be reasonable to send to a reader’s browser.</p>

<p>This post is the story of how we got from 285 MB down to 4 MB and
made the resulting bundle compose cleanly with the in-browser
toplevel, so the lecture’s <code class="language-plaintext highlighter-rouge">Await_capsule</code> form works end-to-end in
the cell at the bottom of this post. Most of the work happened on a
<a href="https://github.com/kayceesrk/js_of_ocaml/tree/kc-toplevel-extend">branch of <code class="language-plaintext highlighter-rouge">ocsigen/js_of_ocaml</code></a>,
with a smaller piece in
<a href="https://github.com/kayceesrk/x-ocaml/tree/oxcaml"><code class="language-plaintext highlighter-rouge">art-w/x-ocaml</code></a>,
the WebComponent that powers the cells.</p>

<!--more-->

<h2 id="why-bundle-size-matters">Why bundle size matters</h2>

<p>I teach two OCaml-heavy courses at IIT Madras:
<a href="https://github.com/fplaunchpad/cs3100_m20">CS3100</a>, the
undergraduate functional programming course, and
<a href="https://github.com/fplaunchpad/cs6868_s26">CS6868</a>, the more
recent graduate course on language abstractions for parallelism.
The lecture notes, examples and homework for both would be much
more useful as <em>interactive books</em> that a student can read, edit
and run entirely client-side, with no local installation. The same
shape would help us when we run hands-on OCaml and OxCaml
workshops, where the first session routinely gets eaten by the
<em>installation hump</em>: getting <code class="language-plaintext highlighter-rouge">opam</code>, the compiler and the required
libraries working on every attendee’s machine over patchy
conference WiFi, before the teaching can begin.</p>

<p>The broader effort to make installation painless is the OCaml
Platform <a href="https://ocaml.org/tools/platform-roadmap">roadmap</a>, which
we have been working on at Tarides as a “zero to OCaml in one
click” experience. That roadmap targets a developer who wants a
real local toolchain, with the full editor, debugger and
project-management story, and a generous latency budget since this
is a one-time setup. A workshop attendee has a much narrower
target: just <em>enough</em> OCaml to complete the exercises in front of
them. The client-side <code class="language-plaintext highlighter-rouge">x-ocaml</code> toplevel fits that target
naturally, because everything ships as static assets and there is
no installation step. The bundle, in this setting, <em>is</em> the latency
budget: 285 MB makes the in-browser path unshippable, 4 MB makes
it a realistic alternative to a local toolchain for a 90-minute
session.</p>

<h2 id="why-285-mb">Why 285 MB?</h2>

<p>The recipe <code class="language-plaintext highlighter-rouge">x-ocaml</code> already had for “load extra libraries into a
running in-browser toplevel” goes like this. For each <code class="language-plaintext highlighter-rouge">cma</code> you want
to ship, run</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ js_of_ocaml --toplevel &lt;library&gt;.cma -o lib.js
</code></pre></div></div>

<p>then concatenate the per-cma outputs into a single bundle and load
it via <code class="language-plaintext highlighter-rouge">&lt;script src-load=...&gt;</code>. Each per-cma output is <code class="language-plaintext highlighter-rouge">kind=cma</code>:
it registers the cma’s modules into the existing toplevel without
clobbering anything, the modules light up, and you can <code class="language-plaintext highlighter-rouge">open</code> them
from a cell. This works, and it is what the previous two OxCaml
posts already use.</p>

<p>The trouble is that <em>dead code elimination runs one cma at a time</em>.
If you ship <code class="language-plaintext highlighter-rouge">base</code>, you ship <em>all</em> of <code class="language-plaintext highlighter-rouge">base</code>, because the per-cma
DCE pass never gets to look at the linked-together program and
notice that the <code class="language-plaintext highlighter-rouge">await</code> library you actually want only touches a
small slice of <code class="language-plaintext highlighter-rouge">sexplib0</code>. So the bundle for the closure
<code class="language-plaintext highlighter-rouge">await + capsule + basement</code> ends up being the union of every cma
in the transitive dependency closure, fully linked, which comes out to
roughly 285 MB after the OxCaml compiler’s normal optimisations and
before any of the JavaScript-side cleverness has had a chance to
run.</p>

<p>In other words, the <code class="language-plaintext highlighter-rouge">await</code>-based blessed API is unshippable not
because the bundling tooling is broken, but because per-cma DCE is
the wrong granularity for this problem.</p>

<h2 id="the-other-recipe-and-why-it-doesnt-compose">The other recipe, and why it doesn’t compose</h2>

<p>It turns out <code class="language-plaintext highlighter-rouge">js_of_ocaml</code> already has a second recipe that
<a href="https://ocsigen.org/js_of_ocaml/latest/manual/build-toplevel">does perform cross-cma DCE</a>,
which I learned about when <a href="https://x.com/rickyvetter/status/2053215352301998424">Ricky
Vetter</a>
pointed me at the <code class="language-plaintext highlighter-rouge">--export</code> route on X. The recipe has two steps:</p>

<ol>
  <li>Build a single bytecode that links every library you want, with
<code class="language-plaintext highlighter-rouge">-linkall</code> so nothing gets pruned at the bytecode level.</li>
  <li>Hand that single bytecode to <code class="language-plaintext highlighter-rouge">js_of_ocaml --toplevel --export
units.txt</code>. The export list names the compilation units that
should remain visible to the toplevel; everything else is fair
game for DCE on the unified intermediate representation.</li>
</ol>

<p>For the same <code class="language-plaintext highlighter-rouge">await + capsule + basement</code> set, this recipe produces
a <em>4 MB</em> bundle, almost two orders of magnitude smaller than the
per-cma concatenation. The link-step DCE works at function
granularity over the whole linked program, so the unused parts of
<code class="language-plaintext highlighter-rouge">sexplib0</code>, <code class="language-plaintext highlighter-rouge">base</code>, <code class="language-plaintext highlighter-rouge">base_quickcheck</code> and the rest of the dependency
closure get pruned away cleanly.</p>
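
<p>To see why the granularity matters, here is a toy model in
JavaScript. It is a sketch under stated assumptions, not
<code class="language-plaintext highlighter-rouge">js_of_ocaml</code>’s actual algorithm: the call graph and the function
names in it are made up for illustration. Per-unit elimination
has to keep every function of every shipped unit, since a later
unit might call any of them, while whole-program elimination
keeps only what is reachable from the exported roots.</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Toy call graph: unit name, then function name, then callees ("unit.fn").
// All names are hypothetical and exist only to illustrate the idea.
const program = {
  await: { with_lock: ["base.ignore"] },
  base: { ignore: [], map: ["sexplib0.to_sexp"], fold: [] },
  sexplib0: { to_sexp: [], of_sexp: [] },
};

// Per-unit DCE: each unit is processed alone, so every top-level
// function is kept in case some other unit calls it.
function perUnitKept(prog) {
  const kept = [];
  for (const unit of Object.keys(prog)) {
    for (const fn of Object.keys(prog[unit])) {
      kept.push(unit + "." + fn);
    }
  }
  return kept.sort();
}

// Whole-program DCE: keep only what is reachable from the exported roots.
function wholeProgramKept(prog, roots) {
  const kept = new Set();
  const work = roots.slice();
  while (work.length !== 0) {
    const name = work.pop();
    if (kept.has(name)) continue;
    kept.add(name);
    const parts = name.split(".");
    const unit = parts[0];
    const fn = parts[1];
    for (const callee of prog[unit][fn]) {
      work.push(callee);
    }
  }
  return Array.from(kept).sort();
}

console.log(perUnitKept(program).length); // 6: everything ships
console.log(wholeProgramKept(program, ["await.with_lock"]).length); // 2
</code></pre></div></div>

<p>In the toy graph only <code class="language-plaintext highlighter-rouge">await.with_lock</code> and <code class="language-plaintext highlighter-rouge">base.ignore</code> survive
the whole-program pass, which is the same effect, in miniature, as
the drop from 285 MB to 4 MB.</p>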

<p>So why aren’t we already using this recipe?</p>

<p>Because the artifact that comes out the other end is <code class="language-plaintext highlighter-rouge">kind=exe</code>,
that is, a self-contained executable rather than a library. When
you load such a bundle into a Web Worker that already has a running
toplevel, its initialisation runs in <code class="language-plaintext highlighter-rouge">caml_main</code> style and starts
by overwriting the host’s globals:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code>caml_global_data.symbols    = &lt;bundle's symbol table&gt;
caml_global_data.sections   = &lt;bundle's bytecode sections&gt;
caml_global_data.prim_count = &lt;bundle's primitive count&gt;
caml_global_data.aliases    = &lt;bundle's alias table&gt;
</code></pre></div></div>

<p>Those four assignments <em>overwrite</em> the host toplevel’s tables. After
the bundle loads, <code class="language-plaintext highlighter-rouge">caml_get_global_data().symbols</code> is the bundle’s
symbols, not the worker’s, and anything in the host toplevel that
does name-based symbol resolution (<code class="language-plaintext highlighter-rouge">Toploop</code>, hover-types lookup,
<code class="language-plaintext highlighter-rouge">open Stdlib</code>) now consults a table that does not know about the
modules the worker had already linked. The toplevel survives, but
its typing environment is wrong, and cells stop being able to
<code class="language-plaintext highlighter-rouge">open</code> anything. We hit this dead end in the <a href="/ocaml/oxcaml/modes/blogging/2026/05/08/capsules-in-oxcaml/">capsules
post</a>; the bundle was
correct and the sizes were great, but the integration step that
connects an exe-shaped bundle to an existing toplevel was the
missing piece.</p>

<h2 id="the-patch">The patch</h2>

<p>The fix is a flag, <code class="language-plaintext highlighter-rouge">--toplevel-extend</code>, that I added on a <a href="https://github.com/kayceesrk/js_of_ocaml/tree/kc-toplevel-extend">branch of
<code class="language-plaintext highlighter-rouge">ocsigen/js_of_ocaml</code></a>.
When it is set, <code class="language-plaintext highlighter-rouge">js_of_ocaml --toplevel --export …</code> emits exactly
the same DCE output as before, with the same bytes for the
registered modules and the same <code class="language-plaintext highlighter-rouge">.cmi</code> files embedded under
<code class="language-plaintext highlighter-rouge">/static/cmis/</code>, but with three small changes:</p>

<ul>
  <li>packed as <code class="language-plaintext highlighter-rouge">~standalone:false</code>, so there is no <code class="language-plaintext highlighter-rouge">globalThis</code>
polyfill IIFE wrapping the output,</li>
  <li>with the four <code class="language-plaintext highlighter-rouge">caml_js_set</code> writes to <code class="language-plaintext highlighter-rouge">caml_global_data</code> from
above skipped, and</li>
  <li>tagged as <code class="language-plaintext highlighter-rouge">kind=cma</code> in the buildInfo header.</li>
</ul>

<p>The bundle’s modules still register themselves on load via the
ordinary
<a href="https://github.com/kayceesrk/js_of_ocaml/blob/kc-toplevel-extend/runtime/js/stdlib.js#L265"><code class="language-plaintext highlighter-rouge">caml_register_global(n, v, name)</code></a>,
which the runtime correctly merges via <code class="language-plaintext highlighter-rouge">symidx</code> into the host’s
existing tables. The result is <em>additive, not destructive</em>: the host
toplevel’s symbol table, sections, primitives and aliases all
survive intact, and its typing environment continues to resolve
<code class="language-plaintext highlighter-rouge">Stdlib</code>, <code class="language-plaintext highlighter-rouge">Toploop</code> and everything else that was already linked.
The new modules from the bundle simply show up as new symbols on
top.</p>
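
<p>The difference between the two load styles can be modelled in a
few lines of JavaScript. This is a simplified sketch, not the
runtime’s code: <code class="language-plaintext highlighter-rouge">makeHost</code>, <code class="language-plaintext highlighter-rouge">loadExeStyle</code> and
<code class="language-plaintext highlighter-rouge">loadExtendStyle</code> are hypothetical names, the tables are reduced to
a name map and a module array, and the real
<code class="language-plaintext highlighter-rouge">caml_register_global</code> does more bookkeeping than this.</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical stand-in for the host toplevel's global tables.
function makeHost() {
  return {
    symbols: { Stdlib: 0, Toploop: 1 }, // module name to index
    modules: ["stdlib", "toploop"], // index to module value
  };
}

// exe-style load: the bundle's tables replace the host's wholesale.
function loadExeStyle(globalData, bundle) {
  globalData.symbols = bundle.symbols;
  globalData.modules = bundle.modules;
}

// extend-style load: each bundle module is appended under a fresh
// index, and the host's existing entries are left untouched.
function loadExtendStyle(globalData, bundle) {
  for (const name of Object.keys(bundle.symbols)) {
    const value = bundle.modules[bundle.symbols[name]];
    globalData.symbols[name] = globalData.modules.length;
    globalData.modules.push(value);
  }
}

const bundle = { symbols: { Await_capsule: 0 }, modules: ["await_capsule"] };

const clobbered = makeHost();
loadExeStyle(clobbered, bundle);
console.log("Stdlib" in clobbered.symbols); // false: the host lost Stdlib

const extended = makeHost();
loadExtendStyle(extended, bundle);
console.log("Stdlib" in extended.symbols); // true: the host tables survive
console.log(extended.symbols.Await_capsule); // 2: the new module sits on top
</code></pre></div></div>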

<p>The initial diff is small:
<a href="https://github.com/kayceesrk/js_of_ocaml/commit/6ec194f"><code class="language-plaintext highlighter-rouge">parse_bytecode.ml</code></a>
gates the <code class="language-plaintext highlighter-rouge">caml_js_set</code> block on the new flag, and
<code class="language-plaintext highlighter-rouge">cmd_arg.ml</code>/<code class="language-plaintext highlighter-rouge">compile.ml</code> thread it through. That gets the bundle
composable. The actual debugging is what came afterwards, when we
tried to make the composed bundle behave the way the host
expected, and that is what the rest of this post is about.</p>

<h2 id="wiring-it-through-x-ocaml">Wiring it through <code class="language-plaintext highlighter-rouge">x-ocaml</code></h2>

<p>On the <code class="language-plaintext highlighter-rouge">x-ocaml</code> side, the
<a href="https://github.com/kayceesrk/x-ocaml/blob/oxcaml/bin/x_ocaml.ml#L139"><code class="language-plaintext highlighter-rouge">--dce</code> flag</a>
drives the single-bytecode + <code class="language-plaintext highlighter-rouge">--export</code> build, invoking
<code class="language-plaintext highlighter-rouge">js_of_ocaml --toplevel-extend --export units.txt</code> for the bundles
we ship. The
<a href="https://github.com/kayceesrk/x-ocaml/tree/oxcaml">oxcaml branch of my <code class="language-plaintext highlighter-rouge">x-ocaml</code> fork</a>
carries the change; the patched <code class="language-plaintext highlighter-rouge">js_of_ocaml</code> needs to be on <code class="language-plaintext highlighter-rouge">PATH</code>
when the build runs.</p>

<h2 id="numbers">Numbers</h2>

<p>For the full closure of libraries the lecture’s <code class="language-plaintext highlighter-rouge">gensym</code> example
uses, the two paths come out very differently:</p>

<table>
  <thead>
    <tr>
      <th>Library set</th>
      <th>Per-cma</th>
      <th><code class="language-plaintext highlighter-rouge">--dce --toplevel-extend</code></th>
      <th>Ratio</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>basement + capsule0 only</td>
      <td>1.0 MB</td>
      <td>1.0 MB</td>
      <td>1×</td>
    </tr>
    <tr>
      <td>+ capsule + await + portable</td>
      <td>285 MB</td>
      <td>4.0 MB</td>
      <td><em>~70×</em></td>
    </tr>
  </tbody>
</table>

<p>The <code class="language-plaintext highlighter-rouge">basement + capsule0</code> row is essentially a wash, because the
bundle size at that scale is dominated by the <code class="language-plaintext highlighter-rouge">.cmi</code> files and there
is very little JavaScript code for DCE to chew on. Once <code class="language-plaintext highlighter-rouge">await</code> and
the curated <code class="language-plaintext highlighter-rouge">capsule</code> API enter the picture the per-cma path
balloons, because it has to ship every cma in the transitive
closure of <code class="language-plaintext highlighter-rouge">base</code>, <code class="language-plaintext highlighter-rouge">sexplib0</code>, <code class="language-plaintext highlighter-rouge">bin_prot</code>, <code class="language-plaintext highlighter-rouge">base_quickcheck</code> and the
<code class="language-plaintext highlighter-rouge">ppx_*</code> runtime libraries, while <code class="language-plaintext highlighter-rouge">--dce</code> keeps only the functions
that are actually reachable from the export list, plus the <code class="language-plaintext highlighter-rouge">.cmi</code>
files the toplevel needs to elaborate the signatures the user types
into a cell.</p>

<h2 id="a-few-more-snags">A few more snags</h2>

<p>Getting the bundle to be <code class="language-plaintext highlighter-rouge">kind=cma</code> was the easy part. Composing it
with an already-running toplevel turned out to surface a small
zoo of follow-on issues, each of which had a short fix once we
understood it. They all live on the same
<a href="https://github.com/kayceesrk/js_of_ocaml/tree/kc-toplevel-extend"><code class="language-plaintext highlighter-rouge">kc-toplevel-extend</code></a>
branch; the commit messages have the gory details.</p>

<ul>
  <li>
    <p><em>Predefined-exception identity drifts across bundles.</em> The
bundle’s re-allocated <code class="language-plaintext highlighter-rouge">Not_found</code>, <code class="language-plaintext highlighter-rouge">Sys_error</code> and friends are
physically distinct from the host’s copies, so a
<code class="language-plaintext highlighter-rouge">try ... with Not_found -&gt;</code> in stdlib code (the first place we
hit this was <code class="language-plaintext highlighter-rouge">Hashtbl.randomized_default</code> reading
<code class="language-plaintext highlighter-rouge">OCAMLRUNPARAM</code>) fails to catch the host’s <code class="language-plaintext highlighter-rouge">Not_found</code> raised by
the runtime. The fix is to bind each predefined-exception variable in the
bundle to a runtime <code class="language-plaintext highlighter-rouge">caml_get_global_data</code> lookup so the bundle
reuses the host’s instances.</p>
  </li>
  <li>
    <p><em>The pseudo-filesystem raises on duplicate <code class="language-plaintext highlighter-rouge">.cmi</code> registrations.</em>
The bundle wants to re-emit <code class="language-plaintext highlighter-rouge">/static/cmis/stdlib.cmi</code>, which the
host has already registered at boot, and <code class="language-plaintext highlighter-rouge">MlFakeDevice.register</code>
refuses to overwrite. Making <code class="language-plaintext highlighter-rouge">register</code> idempotent removes the
conflict without losing anything: the two copies of <code class="language-plaintext highlighter-rouge">stdlib.cmi</code>
agree, since they come from the same <code class="language-plaintext highlighter-rouge">opam</code> switch.</p>
  </li>
  <li>
    <p><em>Stdlib re-registration overwrites host modules.</em> Without a
guard, <code class="language-plaintext highlighter-rouge">caml_register_global</code> was cheerfully replacing the host’s
<code class="language-plaintext highlighter-rouge">caml_global_data["Format"]</code> (and every other stdlib module the
bundle’s bytecode happens to link in) with the bundle’s freshly
initialised copy. Adding an early return when the name is already
known fixes this without changing the behaviour for any name the
host does not yet have.</p>
  </li>
  <li>
    <p><em><code class="language-plaintext highlighter-rouge">Domain.DLS</code> slot collisions silently broke hover types.</em> The
bundle re-runs stdlib’s module init when it loads, and
<code class="language-plaintext highlighter-rouge">Stdlib__Domain.DLS</code>’s <code class="language-plaintext highlighter-rouge">let key_counter = Atomic.make 0</code>
re-allocates the counter and starts it from zero. The bundle’s
<code class="language-plaintext highlighter-rouge">Format.stdbuf_key</code> then ends up at a low DLS index the host had
already assigned to <em>its</em> <code class="language-plaintext highlighter-rouge">Format.stdbuf_key</code>, and <code class="language-plaintext highlighter-rouge">DLS.set</code>
overwrites the host’s entry in the shared <code class="language-plaintext highlighter-rouge">caml_domain_dls</code>
array. The host’s <code class="language-plaintext highlighter-rouge">Format.flush_str_formatter</code> then reads from
the bundle’s empty buffer, merlin’s type-enclosing printer
(which flushes through <code class="language-plaintext highlighter-rouge">Format.str_formatter</code>) returns <code class="language-plaintext highlighter-rouge">""</code> for
every query, and hover-on-identifier tooltips come up blank. The
fix is at the bundle-load boundary in
<a href="https://github.com/kayceesrk/x-ocaml/blob/oxcaml/build_portable_js_extend.sh"><code class="language-plaintext highlighter-rouge">build_portable_js_extend.sh</code></a>:
snapshot <code class="language-plaintext highlighter-rouge">caml_domain_dls_get ()</code> before the bundle’s IIFE runs,
and restore the host-owned slots afterwards. The bundle’s <em>new</em>
high-index slots are left alone; only the host’s previously
populated slots get restored. I spent a while convinced this was
a Format DCE bug before tracing through the OCaml 5+
<code class="language-plaintext highlighter-rouge">Domain.DLS</code> init path.</p>
  </li>
  <li>
    <p><em>Bundle build:</em> the curated <code class="language-plaintext highlighter-rouge">capsule</code> and <code class="language-plaintext highlighter-rouge">await.capsule</code> APIs
both <code class="language-plaintext highlighter-rouge">open! Base</code> at the top of their files, so their interfaces
mention <code class="language-plaintext highlighter-rouge">Base.unit</code> rather than <code class="language-plaintext highlighter-rouge">Stdlib.unit</code>. To elaborate those
signatures the host toplevel needs a small chain of <code class="language-plaintext highlighter-rouge">base</code> and
<code class="language-plaintext highlighter-rouge">sexplib0</code> <code class="language-plaintext highlighter-rouge">.cmi</code> files at <code class="language-plaintext highlighter-rouge">/static/cmis/</code>. We ship three <code class="language-plaintext highlighter-rouge">base</code>
cmis (<code class="language-plaintext highlighter-rouge">base.cmi</code>, <code class="language-plaintext highlighter-rouge">base__.cmi</code>, <code class="language-plaintext highlighter-rouge">base__Unit.cmi</code>) and two
<code class="language-plaintext highlighter-rouge">sexplib0</code> cmis via <code class="language-plaintext highlighter-rouge">js_of_ocaml</code>’s <code class="language-plaintext highlighter-rouge">--file=&lt;src&gt;:/static/cmis/</code>
flag, which embeds the file directly without putting <code class="language-plaintext highlighter-rouge">base</code> on
the bytecode-link line (which would drag the whole of <code class="language-plaintext highlighter-rouge">base</code>
back in).</p>
  </li>
</ul>
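
<p>The <code class="language-plaintext highlighter-rouge">Domain.DLS</code> collision and its snapshot-and-restore fix are the
least obvious of these, so here is a small JavaScript model of
them. It is a sketch under assumptions rather than the code in the
branch: <code class="language-plaintext highlighter-rouge">makeDlsModule</code>, <code class="language-plaintext highlighter-rouge">loadPreservingHostSlots</code> and <code class="language-plaintext highlighter-rouge">scenario</code> are
hypothetical names, and the shared slot array is a plain
JavaScript array standing in for the runtime’s DLS storage.</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// One copy of a DLS-like module writing into a shared slot array.
// Every name here is hypothetical; this is not the runtime's code.
function makeDlsModule(slots) {
  let keyCounter = 0; // restarts at zero each time module init runs
  return {
    newKey: function () {
      const key = keyCounter;
      keyCounter += 1;
      return key;
    },
    set: function (key, value) {
      slots[key] = value;
    },
    get: function (key) {
      return slots[key];
    },
  };
}

// Snapshot the host's populated slots, run the loader, then put them
// back. Slots created at indices the host never used are left alone.
function loadPreservingHostSlots(slots, load) {
  const saved = slots.slice();
  load();
  saved.forEach(function (value, index) {
    if (value !== undefined) {
      slots[index] = value;
    }
  });
}

function scenario(useSnapshot) {
  const slots = [];
  const host = makeDlsModule(slots);
  const hostBufKey = host.newKey(); // slot 0
  host.set(hostBufKey, "host buffer");

  // The bundle re-runs module init, so its counter also starts at zero.
  const loadBundle = function () {
    const bundle = makeDlsModule(slots);
    bundle.set(bundle.newKey(), "bundle buffer"); // collides with slot 0
    bundle.set(bundle.newKey(), "bundle extra"); // fresh slot 1
  };

  if (useSnapshot) {
    loadPreservingHostSlots(slots, loadBundle);
  } else {
    loadBundle();
  }
  return { hostSees: host.get(hostBufKey), newSlot: slots[1] };
}

console.log(scenario(false).hostSees); // bundle buffer (the bug)
console.log(scenario(true).hostSees); // host buffer (the fix)
console.log(scenario(true).newSlot); // bundle extra (new slot kept)
</code></pre></div></div>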

<h2 id="the-cell">The cell</h2>

<p>The cell below uses the lecture’s
<a href="https://github.com/fplaunchpad/cs6868_s26/blob/main/lectures/11_oxcaml/code/08_capsules/gensym_capsule.ml"><code class="language-plaintext highlighter-rouge">gensym_capsule.ml</code></a>
shape directly: <code class="language-plaintext highlighter-rouge">Await_capsule.Mutex.with_lock</code> taking an
<code class="language-plaintext highlighter-rouge">Await_kernel.Await.t</code> witness, with <code class="language-plaintext highlighter-rouge">Capsule_expert.Data.create</code>
and <code class="language-plaintext highlighter-rouge">Capsule_expert.Data.unwrap</code> for the brand-locked counter. No
<code class="language-plaintext highlighter-rouge">[@@@alert "-deprecated"]</code>, no shim. Press Run.</p>

<x-ocaml>
let make_gensym () =
  let (P mutex) = Await_capsule.Mutex.create () in
  let counter = Capsule_expert.Data.create (fun () -&gt; ref 0) in
  let fetch_and_incr (w : Await_kernel.Await.t) =
    Await_capsule.Mutex.with_lock w mutex
      ~f:(fun access -&gt;
        let c = Capsule_expert.Data.unwrap ~access counter in
        incr c;
        !c)
  in
  fun w prefix -&gt; prefix ^ "_" ^ string_of_int (fetch_and_incr w)

let gensym = make_gensym ()
let w = Await_blocking.await Await_kernel.Terminator.never
let () = Printf.printf "%s %s\n" (gensym w "x") (gensym w "y")
</x-ocaml>

<p>We cannot actually demonstrate parallelism in the browser worker
(it is single-domain), so we pass <code class="language-plaintext highlighter-rouge">with_lock</code> the trivial blocking witness
<code class="language-plaintext highlighter-rouge">Await_blocking.await Await_kernel.Terminator.never</code>. The mode-checking is the
<a href="/ocaml/oxcaml/modes/blogging/2026/05/08/capsules-in-oxcaml/#with_lock-produces-an-access-token-briefly">same brand/local/once dance</a>
as in the capsules post; what is new is that the same dance now
type-checks against the blessed <code class="language-plaintext highlighter-rouge">Await_capsule</code> API directly,
without the <code class="language-plaintext highlighter-rouge">Capsule_blocking_sync</code> shim.</p>

<h2 id="what-next">What next?</h2>

<p>The fork is small enough that <code class="language-plaintext highlighter-rouge">--toplevel-extend</code> is plausibly
upstreamable into <code class="language-plaintext highlighter-rouge">ocsigen/js_of_ocaml</code>. The DLS-snapshot dance
would also be cleaner as a proper runtime primitive than as a
wrapper around the bundle’s IIFE. The
<a href="https://github.com/kayceesrk/js_of_ocaml/tree/kc-toplevel-extend">branch</a>
is open for review.</p>

<p>For <code class="language-plaintext highlighter-rouge">x-ocaml</code> itself, the obvious next step is to broaden the set
of extension bundles. The current <code class="language-plaintext highlighter-rouge">portable.js</code> covers
<code class="language-plaintext highlighter-rouge">basement + capsule0 + capsule + await + portable</code>, which is just
enough for the capsule and parallelism material in
<a href="https://github.com/fplaunchpad/cs6868_s26">CS6868</a>; the
<a href="https://github.com/fplaunchpad/cs3100_m20">CS3100</a> material would
want a different slice, and an <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">Eio</code>-flavoured bundle
would let the same cells host concurrency examples. Once a small
library of these bundles exists, turning a lecture set or a
workshop tutorial into an interactive book becomes mostly a matter
of picking the right bundle, which is the workshop-scale “zero to
OxCaml” story I want to get to.</p>

<p><code class="language-plaintext highlighter-rouge">x-ocaml</code> is one of <a href="https://github.com/art-w">Arthur Wendling’s</a>
hacking expeditions, and it remains a pleasure to build on. Thanks
to <a href="https://x.com/rickyvetter">Ricky Vetter</a> for the <code class="language-plaintext highlighter-rouge">--export</code>
pointer that got the whole thing started, and to the OxCaml team
for the libraries.</p>

<hr />

<p><em>Written together with <a href="https://www.anthropic.com/news/claude-opus-4-7">Claude Opus 4.7 (1M
context)</a>. The
<a href="https://github.com/ocsigen/js_of_ocaml/compare/c3da0bb58eafe2d9ad3387cbbe9b8faf9ec91fb1...kayceesrk:js_of_ocaml:kc-toplevel-extend"><code class="language-plaintext highlighter-rouge">js_of_ocaml</code>
diff</a>
against the <code class="language-plaintext highlighter-rouge">+ox</code> base and the
<a href="https://github.com/kayceesrk/x-ocaml/compare/d9160c5...oxcaml"><code class="language-plaintext highlighter-rouge">x-ocaml</code> integration
diff</a>
are both on GitHub.</em></p>]]></content><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><category term="OCaml" /><category term="OxCaml" /><category term="Modes" /><summary type="html"><![CDATA[In the previous post on capsules, I cheated. The lecture I was adapting (from my CS6868 course on language abstractions for parallelism) used Await_capsule.Mutex.with_lock, the recommended non-deprecated way to acquire a capsule mutex, but the post shipped Capsule_blocking_sync.Mutex instead with the deprecation alert silenced. The reason was bundle size: the await library, once we chased its transitive dependencies through base, sexplib0, base_quickcheck and the rest of Jane Street’s runtime, would have ballooned the in-browser toplevel by roughly 285 MB. The right API would not even fit through GitHub’s 100 MB per-file push limit, let alone be reasonable to send to a reader’s browser. This post is the story of how we got from 285 MB down to 4 MB and made the resulting bundle compose cleanly with the in-browser toplevel, so the lecture’s Await_capsule form works end-to-end in the cell at the bottom of this post. Most of the work happened on a branch of ocsigen/js_of_ocaml, with a smaller piece in art-w/x-ocaml, the WebComponent that powers the cells.]]></summary></entry><entry><title type="html">Capsules: compile-time lock discipline in OxCaml</title><link href="https://kcsrk.info/ocaml/oxcaml/modes/blogging/2026/05/08/capsules-in-oxcaml/" rel="alternate" type="text/html" title="Capsules: compile-time lock discipline in OxCaml" /><published>2026-05-08T10:00:00+00:00</published><updated>2026-05-08T10:00:00+00:00</updated><id>https://kcsrk.info/ocaml/oxcaml/modes/blogging/2026/05/08/capsules-in-oxcaml</id><content type="html" xml:base="https://kcsrk.info/ocaml/oxcaml/modes/blogging/2026/05/08/capsules-in-oxcaml/"><![CDATA[<p>In the
<a href="/ocaml/oxcaml/x-ocaml/blogging/2026/05/07/data-race-freedom-in-oxcaml/">previous post</a> we
fixed the racy <code class="language-plaintext highlighter-rouge">gensym</code> with <code class="language-plaintext highlighter-rouge">Portable.Atomic</code>. That worked because
the shared state was a single integer with atomic primitives. What
about state that needs a hash table, a multi-step update, or any
structure where atomics aren’t enough? OxCaml’s answer is the
<strong>capsule</strong>: a way to bundle state with its lock so the lock
discipline becomes a type-checker job rather than a programmer
convention.</p>

<!--more-->

<p>The example here is the capsule version of <code class="language-plaintext highlighter-rouge">gensym</code> from my
<a href="https://github.com/fplaunchpad/cs6868_s26">CS6868</a> lecture
(<a href="https://github.com/fplaunchpad/cs6868_s26/blob/main/lectures/11_oxcaml/handout.md#part-5-capsules--safe-shared-mutable-state">handout</a>).
Capsules don’t introduce a new mode axis; they’re a small library that
composes the axes we’ve already met:
<a href="/ocaml/oxcaml/x-ocaml/blogging/2026/05/07/data-race-freedom-in-oxcaml/">contention and portability</a>,
<a href="/ocaml/modes/oxcaml/2025/05/29/uniqueness_and_behavioural_types/">uniqueness</a>,
and <a href="/ocaml/modes/oxcaml/2025/06/04/linearity_and_uniqueness/">linearity</a>.
The cells below run in the same in-browser OxCaml toplevel as the
previous post; the <a href="#a-note-on-the-api"><em>A note on the API</em></a> section
at the end explains why we use a slightly older flavour of the capsule
API (it’s a bundle-size thing, not pedagogical).</p>

<h2 id="why-atomics-arent-enough">Why atomics aren’t enough</h2>

<p>For a single integer counter, atomics are perfect. Drop the size
constraint and they’re not. Suppose <code class="language-plaintext highlighter-rouge">gensym</code> had to consult a hash
table of already-issued names, or update two counters in lockstep, or
do a multi-step modify-then-record. None of that fits an atomic word.
The standard OCaml answer is <code class="language-plaintext highlighter-rouge">Mutex.t</code>:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">mutex</span> <span class="o">=</span> <span class="nn">Mutex</span><span class="p">.</span><span class="n">create</span> <span class="bp">()</span>
<span class="k">let</span> <span class="n">table</span> <span class="o">=</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">create</span> <span class="mi">16</span>

<span class="k">let</span> <span class="n">safe_insert</span> <span class="n">k</span> <span class="n">v</span> <span class="o">=</span>
  <span class="nn">Mutex</span><span class="p">.</span><span class="n">lock</span> <span class="n">mutex</span><span class="p">;</span>
  <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">table</span> <span class="n">k</span> <span class="n">v</span><span class="p">;</span>
  <span class="nn">Mutex</span><span class="p">.</span><span class="n">unlock</span> <span class="n">mutex</span>

<span class="c">(* Nothing stops you from forgetting to lock: *)</span>
<span class="k">let</span> <span class="n">unsafe_insert</span> <span class="n">k</span> <span class="n">v</span> <span class="o">=</span>
  <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">table</span> <span class="n">k</span> <span class="n">v</span>   <span class="c">(* races, but compiles. *)</span>
</code></pre></div></div>

<p>Two things are wrong here. First, the mutex and the data are separate
values; nothing at the type level connects <code class="language-plaintext highlighter-rouge">mutex</code> to <code class="language-plaintext highlighter-rouge">table</code>. Second,
“always lock before access” is a programmer convention. The compiler
can’t tell which mutex you meant for which state, and an unlocked
<code class="language-plaintext highlighter-rouge">Hashtbl.add</code> is a runtime data race that nothing catches.</p>
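<p>For comparison, the strongest conventional defence in stock OCaml is to hide both the mutex and the state behind a module signature. The sketch below assumes OCaml 5.1 or later (for <code class="language-plaintext highlighter-rouge">Mutex.protect</code>); the module <code class="language-plaintext highlighter-rouge">Issued</code> and its functions are hypothetical names, not taken from the lecture. It narrows the places where the discipline can be broken to one module body, but inside that body the lock is still a convention the compiler does not check.</p>

```ocaml
(* Stock OCaml 5.1+ sketch; Issued, fresh and count are hypothetical names. *)
module Issued : sig
  val fresh : string -> string   (* next id for a prefix, e.g. "x_1" *)
  val count : unit -> int        (* total ids issued so far *)
end = struct
  let mutex = Mutex.create ()
  let table : (string, int) Hashtbl.t = Hashtbl.create 16
  let total = ref 0

  (* A multi-step update: per-prefix counter and global total move together. *)
  let fresh prefix =
    Mutex.protect mutex (fun () ->
      let n = 1 + Option.value (Hashtbl.find_opt table prefix) ~default:0 in
      Hashtbl.replace table prefix n;
      incr total;
      prefix ^ "_" ^ string_of_int n)

  let count () = Mutex.protect mutex (fun () -> !total)
end

let () =
  let a = Issued.fresh "x" in
  let b = Issued.fresh "x" in
  let c = Issued.fresh "y" in
  Printf.printf "%s %s %s (%d issued)\n" a b c (Issued.count ())
```

<p>Deleting the <code class="language-plaintext highlighter-rouge">Mutex.protect</code> wrapper inside <code class="language-plaintext highlighter-rouge">fresh</code> would still compile, which is the gap the rest of this post is about.</p>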

<p>Yes, OxCaml’s contention rule from the previous post would reject
direct mutation of a <code class="language-plaintext highlighter-rouge">@ contended</code> <code class="language-plaintext highlighter-rouge">table</code> in a parallel context. But
that only helps if the type system can see that <code class="language-plaintext highlighter-rouge">table</code> is contended
in the first place. Once you have a top-level shared mutable value, you need
a structural reason for the type checker to <em>believe</em> that “you are
holding the lock right now.” That’s exactly what a capsule provides.</p>

<h2 id="gensym-in-a-capsule">Gensym in a capsule</h2>

<p>The capsule pattern packs the counter inside a brand-locked container,
making the lock the only handle to the state:</p>

<x-ocaml>
[@@@alert "-deprecated"]

module Cap = Capsule_expert
module Mtx = Capsule_blocking_sync.Mutex

let make_counter () =
  let Cap.Key.P key = Cap.create () in
  let counter = Cap.Data.create (fun () -&gt; ref 0) in
  let mutex   = Mtx.create key in
  fun () -&gt;
    Mtx.with_lock mutex ~f:(fun password -&gt;
      Cap.access ~password ~f:(fun access -&gt;
        let c = Cap.Data.unwrap ~access counter in
        incr c;
        !c))

let next   = make_counter ()
let gensym prefix = prefix ^ "_" ^ string_of_int (next ())

let () = Printf.printf "%s %s\n" (gensym "x") (gensym "y")
</x-ocaml>

<p>Run it; you get two distinct ids, and the toplevel binds
<code class="language-plaintext highlighter-rouge">val next : unit -&gt; int</code> and <code class="language-plaintext highlighter-rouge">val gensym : string -&gt; string</code>. No
handle to the inner <code class="language-plaintext highlighter-rouge">ref</code> is in scope outside <code class="language-plaintext highlighter-rouge">Mtx.with_lock</code>. Let’s
read what’s making that true.</p>

<h3 id="the-brand-k-an-existential-tied-to-a-fresh-capsule">The brand <code class="language-plaintext highlighter-rouge">'k</code>: an existential tied to a fresh capsule</h3>

<p><code class="language-plaintext highlighter-rouge">Cap.create ()</code> returns a key wrapped in an existential constructor
<code class="language-plaintext highlighter-rouge">Cap.Key.P</code>:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">packed</span> <span class="o">=</span> <span class="nc">P</span> <span class="o">:</span> <span class="k">'</span><span class="n">k</span> <span class="nn">Key</span><span class="p">.</span><span class="n">t</span> <span class="o">-&gt;</span> <span class="n">packed</span> <span class="p">[</span><span class="o">@@</span><span class="n">unboxed</span><span class="p">]</span>
<span class="k">val</span> <span class="n">create</span> <span class="o">:</span> <span class="kt">unit</span> <span class="o">-&gt;</span> <span class="n">packed</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">let Cap.Key.P key = Cap.create ()</code> pattern unwraps it and
introduces a fresh type variable, call it <code class="language-plaintext highlighter-rouge">$k</code>, in the surrounding
scope. <code class="language-plaintext highlighter-rouge">key</code> is then bound at type <code class="language-plaintext highlighter-rouge">$k Cap.Key.t</code>. A second
<code class="language-plaintext highlighter-rouge">Cap.create ()</code> would introduce <code class="language-plaintext highlighter-rouge">$k1</code>, distinct and unrelated. Each
<code class="language-plaintext highlighter-rouge">P</code> pattern brings a new type into the world.</p>

<p>That brand is the static glue between capsule, data, and mutex. When
we call <code class="language-plaintext highlighter-rouge">Cap.Data.create (fun () -&gt; ref 0)</code> next, the data’s brand
parameter starts out as an undetermined inference variable. The first
<code class="language-plaintext highlighter-rouge">Cap.Data.unwrap</code> unifies it with the brand of the access it is unwrapped
under, so <code class="language-plaintext highlighter-rouge">counter : (int ref, $k) Cap.Data.t</code>. The
mutex created from <code class="language-plaintext highlighter-rouge">key</code> (consuming it as <code class="language-plaintext highlighter-rouge">@ unique</code>) is also branded
<code class="language-plaintext highlighter-rouge">$k</code>. From then on, this data only opens under this mutex’s lock.</p>

<p>(Why is the existential pattern wrapped inside <code class="language-plaintext highlighter-rouge">make_counter ()</code>
rather than at the top level? OCaml refuses top-level let-bindings
that introduce existentials, so we hide the unwrap inside a function.
The closure returned by <code class="language-plaintext highlighter-rouge">make_counter</code> carries the <code class="language-plaintext highlighter-rouge">$k</code>-branded
<code class="language-plaintext highlighter-rouge">mutex</code> and <code class="language-plaintext highlighter-rouge">counter</code> in its environment.)</p>
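<p>The branding mechanics can be mimicked in stock OCaml with a GADT and phantom type parameters. This is only a toy, assuming nothing beyond the standard language: the module <code class="language-plaintext highlighter-rouge">Brand</code> and its <code class="language-plaintext highlighter-rouge">wrap</code>/<code class="language-plaintext highlighter-rouge">unwrap</code> functions are hypothetical and are not the <code class="language-plaintext highlighter-rouge">Capsule_expert</code> API. It models the brand and nothing else: there is no lock, and no modes to stop the contents escaping.</p>

```ocaml
(* Stock OCaml sketch of existential brands. Brand is a hypothetical toy,
   not the Capsule_expert API. *)
module Brand : sig
  type 'k key
  type packed = P : 'k key -> packed
  val create : unit -> packed

  type ('a, 'k) data
  val wrap : 'k key -> 'a -> ('a, 'k) data
  val unwrap : 'k key -> ('a, 'k) data -> 'a
end = struct
  type 'k key = Key
  type packed = P : 'k key -> packed
  let create () = P Key

  type ('a, 'k) data = Data of 'a
  let wrap Key x = Data x
  let unwrap Key (Data x) = x
end

(* The unpacking happens inside a function, mirroring make_counter. *)
let demo () =
  let Brand.P key1 = Brand.create () in   (* introduces a fresh brand *)
  let Brand.P key2 = Brand.create () in   (* introduces another, unrelated one *)
  let d1 = Brand.wrap key1 "guarded by key1" in
  let d2 = Brand.wrap key2 "guarded by key2" in
  (* Brand.unwrap key2 d1 would be rejected: the two brands do not unify. *)
  (Brand.unwrap key1 d1, Brand.unwrap key2 d2)
```

<p>The result type of <code class="language-plaintext highlighter-rouge">demo</code> mentions neither brand, so both existentials stay confined to the function body.</p>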

<h3 id="the-bare-ref-is-unreachable-from-outside-the-capsule">The bare <code class="language-plaintext highlighter-rouge">ref</code> is unreachable from outside the capsule</h3>

<p>Look at where the <code class="language-plaintext highlighter-rouge">ref 0</code> lives:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">counter</span> <span class="o">=</span> <span class="nn">Cap</span><span class="p">.</span><span class="nn">Data</span><span class="p">.</span><span class="n">create</span> <span class="p">(</span><span class="k">fun</span> <span class="bp">()</span> <span class="o">-&gt;</span> <span class="n">ref</span> <span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>

<p>That <code class="language-plaintext highlighter-rouge">ref</code> has no binding outside the closure handed to <code class="language-plaintext highlighter-rouge">Cap.Data.create</code>.
The only handle out is the <code class="language-plaintext highlighter-rouge">Cap.Data.t</code> value branded <code class="language-plaintext highlighter-rouge">$k</code>. There is no
aliased reference to smuggle out: the <code class="language-plaintext highlighter-rouge">ref</code>’s reachability is
single-pathed. This is exactly the configuration the
<a href="/ocaml/modes/oxcaml/2025/05/29/uniqueness_and_behavioural_types/">uniqueness post</a>
was about, a unique reference with no aliases to it, but here we get
it for free from how the API is shaped, without writing <code class="language-plaintext highlighter-rouge">@ unique</code>
anywhere ourselves. The capsule API is <em>exploiting</em> uniqueness as a
construction principle.</p>

<h3 id="capdatat-mode-crosses-contention-and-portability"><code class="language-plaintext highlighter-rouge">Cap.Data.t</code> mode-crosses contention and portability</h3>

<p>Just like <code class="language-plaintext highlighter-rouge">Portable.Atomic.t</code> from the previous post, the type
<code class="language-plaintext highlighter-rouge">('a, 'k) Cap.Data.t</code> carries the kind annotation
<code class="language-plaintext highlighter-rouge">value mod contended portable</code>. A closure that captures a capsule
value stays <code class="language-plaintext highlighter-rouge">@ portable</code>, and any domain may hold the capsule. So the
closure returned by <code class="language-plaintext highlighter-rouge">make_counter</code> (which captures both <code class="language-plaintext highlighter-rouge">mutex</code> and
<code class="language-plaintext highlighter-rouge">counter</code>) type-checks at portable mode and is safe to send to another
domain.</p>

<p>This is the same mode-crossing move we saw with <code class="language-plaintext highlighter-rouge">Portable.Atomic</code>,
just applied to a more general container.</p>

<h3 id="with_lock-produces-an-access-token-briefly"><code class="language-plaintext highlighter-rouge">with_lock</code> produces an <code class="language-plaintext highlighter-rouge">access</code> token, briefly</h3>

<p><code class="language-plaintext highlighter-rouge">Mtx.with_lock</code> is the only way into the critical section. Its
signature:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">val</span> <span class="n">with_lock</span>
  <span class="o">:</span> <span class="k">'</span><span class="n">k</span> <span class="n">t</span>
  <span class="o">-&gt;</span> <span class="n">f</span><span class="o">:</span><span class="p">(</span><span class="k">'</span><span class="n">k</span> <span class="nn">Cap</span><span class="p">.</span><span class="nn">Password</span><span class="p">.</span><span class="n">t</span> <span class="o">@</span> <span class="n">local</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">@</span> <span class="n">once</span> <span class="n">unique</span><span class="p">)</span> <span class="o">@</span> <span class="n">local</span> <span class="n">once</span>
  <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">@</span> <span class="n">once</span> <span class="n">unique</span>
</code></pre></div></div>

<p>Three different things in this signature are doing real work.</p>

<ul>
  <li><strong>Brand <code class="language-plaintext highlighter-rouge">'k</code></strong>: the password’s brand matches the mutex’s brand. A
password from a <em>different</em> mutex (different <code class="language-plaintext highlighter-rouge">$k1</code>) cannot unwrap
our <code class="language-plaintext highlighter-rouge">counter</code>. The type checker refuses to unify the two
existentials and the program does not compile.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">@ local</code></strong>: the password is local to the body’s region. It
cannot escape <code class="language-plaintext highlighter-rouge">f</code> by being returned, stored in a global, or stuffed
into a closure that outlives the call. When <code class="language-plaintext highlighter-rouge">with_lock</code> returns the
lock is released, and there’s no <code class="language-plaintext highlighter-rouge">password</code> left in scope to attempt
to touch the data with. (Locality is the <a href="/ocaml/modes/oxcaml/2025/06/04/linearity_and_uniqueness/">stack-allocation /
no-escape axis we covered</a>;
same mechanism, different use.)</li>
  <li><strong><code class="language-plaintext highlighter-rouge">@ once</code></strong> on the body: <code class="language-plaintext highlighter-rouge">f</code> can be called at most once. This is
the linearity guarantee. The body cannot be re-entered with the
same password, and the runtime cannot smuggle it out for later. Each
<code class="language-plaintext highlighter-rouge">with_lock</code> use is a single shot.</li>
</ul>

<p>Inside <code class="language-plaintext highlighter-rouge">f</code>, <code class="language-plaintext highlighter-rouge">Cap.access ~password ~f:(fun access -&gt; ...)</code> converts
the password into a brand-matching <code class="language-plaintext highlighter-rouge">'k Access.t</code> for the inner body.
<code class="language-plaintext highlighter-rouge">Cap.Data.unwrap ~access counter</code> then accepts the access and returns
the underlying <code class="language-plaintext highlighter-rouge">int ref</code> at <strong><code class="language-plaintext highlighter-rouge">@ uncontended</code></strong>. We hold the lock; no
other domain can be touching the data. The next two lines, <code class="language-plaintext highlighter-rouge">incr c;
!c</code>, are ordinary OCaml mutation; the contention rule is satisfied by
construction because <code class="language-plaintext highlighter-rouge">unwrap</code> promised uncontended access.</p>

<p>Outside the lock body, neither <code class="language-plaintext highlighter-rouge">password</code> nor <code class="language-plaintext highlighter-rouge">access</code> exists in
scope, so there is no path from the outer program to <code class="language-plaintext highlighter-rouge">incr c</code> that
doesn’t go through <code class="language-plaintext highlighter-rouge">with_lock</code>.</p>

<h2 id="brand-mismatch-is-a-type-error">Brand mismatch is a type error</h2>

<p>The cell below tries the genuinely unsound case: protecting one
<code class="language-plaintext highlighter-rouge">Cap.Data.t</code> with two distinct mutexes. If this compiled, two
threads holding distinct mutexes could enter critical sections on the
same data simultaneously. The type checker refuses, reporting that the
two existentials don’t match.</p>

<x-ocaml>
[@@@alert "-deprecated"]

let abuse () =
  let Cap.Key.P key1 = Cap.create () in
  let Cap.Key.P key2 = Cap.create () in
  let data  = Cap.Data.create (fun () -&gt; ref 0) in
  let m1 = Mtx.create key1 in
  let m2 = Mtx.create key2 in
  (* First unwrap pins data's brand to key1's $k. *)
  let _ = Mtx.with_lock m1 ~f:(fun password -&gt;
            Cap.access ~password ~f:(fun access -&gt;
              !(Cap.Data.unwrap ~access data))) in
  (* This second unwrap fails: data is branded with the FIRST key's $k,
     not the second's $k1. *)
  Mtx.with_lock m2 ~f:(fun password -&gt;
    Cap.access ~password ~f:(fun access -&gt;
      !(Cap.Data.unwrap ~access data)))
</x-ocaml>

<p>The error names the two existentials and points out they cannot
unify. The handout’s <code class="language-plaintext highlighter-rouge">capsule_abuse.ml</code> walks through two more
attempted escape hatches:</p>

<ul>
  <li><strong>Leak the inner ref through a top-level <code class="language-plaintext highlighter-rouge">Portable.Atomic</code>.</strong> The
store compiles, but anything inside a portable atomic is
<code class="language-plaintext highlighter-rouge">@ contended</code>, and the contention rule from the
<a href="/ocaml/oxcaml/x-ocaml/blogging/2026/05/07/data-race-freedom-in-oxcaml/">previous post</a>
forbids reading or writing a mutable field of a contended value. So
you can park an alias, but you can’t dereference it.</li>
  <li><strong>One mutex protecting two <code class="language-plaintext highlighter-rouge">Cap.Data.t</code> values.</strong> <em>Allowed</em>, and
correctly so. Both data values are branded with the same <code class="language-plaintext highlighter-rouge">$k</code>, and
a single critical section can update both. (Equivalent to one mutex
guarding two fields of a struct.)</li>
</ul>

<p>The contention rule makes the obvious leak path useless (an alias you
can park but never dereference); brand mismatch closes the
aliased-mutex case; the one-mutex-two-data case shows the rule isn’t
gratuitously restrictive.</p>
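<p>The allowed case, one mutex guarding two data values, can be approximated in stock OCaml 5.1 or later with phantom brands and an ordinary <code class="language-plaintext highlighter-rouge">Mutex.t</code>. Everything in the <code class="language-plaintext highlighter-rouge">Toy</code> module below is a hypothetical stand-in, not the <code class="language-plaintext highlighter-rouge">Capsule_expert</code> or <code class="language-plaintext highlighter-rouge">Capsule_blocking_sync</code> API. Without OxCaml’s modes, nothing here stops the access token or the inner refs from escaping the lock body, so this sketch only shows the brand bookkeeping.</p>

```ocaml
(* Stock OCaml 5.1+ toy model of brands plus a lock. Toy and everything in it
   are hypothetical names; this is not the capsule library. *)
module Toy : sig
  type 'k key
  type packed = P : 'k key -> packed
  val create : unit -> packed

  type 'k mutex
  val mutex : 'k key -> 'k mutex

  type ('a, 'k) data
  val data : (unit -> 'a) -> ('a, 'k) data

  type 'k access
  val with_lock : 'k mutex -> f:('k access -> 'a) -> 'a
  val unwrap : access:'k access -> ('a, 'k) data -> 'a
end = struct
  type 'k key = Key
  type packed = P : 'k key -> packed
  let create () = P Key

  type 'k mutex = Mtx of Mutex.t
  let mutex Key = Mtx (Mutex.create ())

  type ('a, 'k) data = Data of 'a
  let data init = Data (init ())

  type 'k access = Access
  let with_lock (Mtx m) ~f = Mutex.protect m (fun () -> f Access)
  let unwrap ~access:Access (Data x) = x
end

(* One mutex, two data values: both end up with the same brand, so a single
   critical section may update them together. *)
let make_stats () =
  let Toy.P key = Toy.create () in
  let hits = Toy.data (fun () -> ref 0) in
  let misses = Toy.data (fun () -> ref 0) in
  let m = Toy.mutex key in
  fun ~hit ->
    Toy.with_lock m ~f:(fun access ->
      let h = Toy.unwrap ~access hits in
      let s = Toy.unwrap ~access misses in
      if hit then incr h else incr s;
      (!h, !s))

let record = make_stats ()
```

<p>Calling <code class="language-plaintext highlighter-rouge">record ~hit:true</code> and then <code class="language-plaintext highlighter-rouge">record ~hit:false</code> returns the running pair of counts. Once <code class="language-plaintext highlighter-rouge">hits</code> has been unwrapped under this mutex’s access, an access from a second key would no longer type-check against it.</p>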

<h2 id="what-each-mode-is-doing">What each mode is doing</h2>

<p>Pulling the threads together, with one line per axis:</p>

<ul>
  <li><strong>Contention</strong> ensures shared mutable state can’t be read or
written from outside a critical section. Inside <code class="language-plaintext highlighter-rouge">with_lock</code> we get
an <code class="language-plaintext highlighter-rouge">@ uncontended</code> view via <code class="language-plaintext highlighter-rouge">unwrap</code>; outside, the contents aren’t
in scope at all.</li>
  <li><strong>Portability</strong> lets the closure cross domain boundaries.
<code class="language-plaintext highlighter-rouge">Cap.Data.t</code> mode-crosses, so the closure capturing it stays
<code class="language-plaintext highlighter-rouge">@ portable</code>.</li>
  <li><strong>Locality</strong> confines the password (and the access derived from
it) to the body of <code class="language-plaintext highlighter-rouge">with_lock</code>. No way to keep a token past lock
release.</li>
  <li><strong>Linearity</strong> (<code class="language-plaintext highlighter-rouge">@ once</code> on the body) makes the critical section
single-shot. The body cannot be re-entered with the same password.</li>
  <li><strong>Uniqueness</strong> isn’t an annotation we wrote; it’s the property
that the inner <code class="language-plaintext highlighter-rouge">ref</code> has exactly one path of access, through the
capsule. The API shape forces it.</li>
</ul>

<h2 id="a-note-on-the-api">A note on the API</h2>

<p>Two API choices in the cells above are worth flagging for anyone
transcribing this to a real <code class="language-plaintext highlighter-rouge">.ml</code> file.</p>

<p><strong>Curated vs expert capsule API.</strong> The full <code class="language-plaintext highlighter-rouge">capsule</code> opam library
wraps the brand-based primitives in a curated layer
(<code class="language-plaintext highlighter-rouge">Capsule.Isolated</code>, <code class="language-plaintext highlighter-rouge">Capsule.Guard</code>, <code class="language-plaintext highlighter-rouge">Capsule.Shared</code>,
<code class="language-plaintext highlighter-rouge">Capsule.Data</code>) that hides most of the <code class="language-plaintext highlighter-rouge">Key.P</code>/<code class="language-plaintext highlighter-rouge">password</code>/<code class="language-plaintext highlighter-rouge">access</code>
plumbing for common patterns. For most application code you reach for
those first. We use <code class="language-plaintext highlighter-rouge">Capsule_expert</code> directly here because the curated
wrapper pulls in <code class="language-plaintext highlighter-rouge">base</code>, <code class="language-plaintext highlighter-rouge">sexplib0</code>, and friends that would balloon
the in-browser bundle by ~270 MB. The lecture’s <code class="language-plaintext highlighter-rouge">Capsule.Data</code> calls
are syntactically identical to ours.</p>

<p><strong>Mutex flavour.</strong> The lecture uses <code class="language-plaintext highlighter-rouge">Await_capsule.Mutex.with_lock</code>,
the recommended (non-deprecated) primitive that hands the body an
<code class="language-plaintext highlighter-rouge">Access.t</code> directly. Bundling the <code class="language-plaintext highlighter-rouge">await</code> library into the in-browser
toplevel pulls in roughly <strong>280 MB</strong> of transitive deps (<code class="language-plaintext highlighter-rouge">sexplib</code>,
<code class="language-plaintext highlighter-rouge">stdio</code>, <code class="language-plaintext highlighter-rouge">ppx_*</code>, <code class="language-plaintext highlighter-rouge">base.shadow_stdlib</code>, …), so we use the
deprecated-but-equivalent <code class="language-plaintext highlighter-rouge">Capsule_blocking_sync.Mutex</code> with the
deprecation alert silenced. It hands the body a <code class="language-plaintext highlighter-rouge">'k Password.t @ local</code>,
which we convert to a <code class="language-plaintext highlighter-rouge">'k Access.t</code> via <code class="language-plaintext highlighter-rouge">Capsule_expert.access</code>. One
extra line of plumbing; same modes are doing the work. In a real
<code class="language-plaintext highlighter-rouge">.ml</code> file with <code class="language-plaintext highlighter-rouge">await</code> available, swap <code class="language-plaintext highlighter-rouge">Capsule_blocking_sync.Mutex</code>
for <code class="language-plaintext highlighter-rouge">Await_capsule.Mutex</code> and drop the <code class="language-plaintext highlighter-rouge">Capsule_expert.access</code> step;
the lecture’s verbatim form in the
<a href="https://github.com/fplaunchpad/cs6868_s26/blob/main/lectures/11_oxcaml/handout.md#part-5-capsules--safe-shared-mutable-state">handout</a>
is what you want.</p>

<h2 id="whats-left">What’s left</h2>

<p>The capsule pattern is enough to put a hash table or a more elaborate
shared structure under a single mutex with a compile-time guarantee
that nothing touches it without the lock. To actually run gensym from
two domains in parallel we need the parallel scheduler:
<code class="language-plaintext highlighter-rouge">Parallel.fork_join2</code> and the <code class="language-plaintext highlighter-rouge">@ portable once</code> story for closures
crossing the fork boundary. That’s the next post.</p>

<p>The point of stopping here is that the hard part of safe shared
mutable state in OxCaml isn’t the parallel primitive, it’s the lock
discipline. And the lock discipline turns out to be a small
composition of the mode axes we already had to learn anyway. The new
library is small; the framework was already in place.</p>]]></content><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><category term="OCaml" /><category term="OxCaml" /><category term="Modes" /><category term="Blogging" /><summary type="html"><![CDATA[In the previous post we fixed the racy gensym with Portable.Atomic. That worked because the shared state was a single integer with atomic primitives. What about state that needs a hash table, a multi-step update, or any structure where atomics aren’t enough? OxCaml’s answer is the capsule: a way to bundle state with its lock so the lock discipline becomes a type-checker job rather than a programmer convention.]]></summary></entry><entry><title type="html">Data race freedom in OxCaml</title><link href="https://kcsrk.info/ocaml/oxcaml/x-ocaml/blogging/2026/05/07/data-race-freedom-in-oxcaml/" rel="alternate" type="text/html" title="Data race freedom in OxCaml" /><published>2026-05-07T10:00:00+00:00</published><updated>2026-05-07T10:00:00+00:00</updated><id>https://kcsrk.info/ocaml/oxcaml/x-ocaml/blogging/2026/05/07/data-race-freedom-in-oxcaml</id><content type="html" xml:base="https://kcsrk.info/ocaml/oxcaml/x-ocaml/blogging/2026/05/07/data-race-freedom-in-oxcaml/"><![CDATA[<p>A while back I <a href="/ocaml/x-ocaml/blogging/2025/06/20/xocaml/">wired up <code class="language-plaintext highlighter-rouge">x-ocaml</code></a> so this
blog could embed live, editable OCaml notebooks. That post used a vanilla
OCaml 5 toplevel. Today the toplevel running in your browser is built
from <a href="https://oxcaml.org">OxCaml</a>, the Jane Street fork of the compiler.
That means we can prove a small parallel program is data-race free,
interactively, without ever spawning a thread.</p>

<!--more-->

<p>The examples below are adapted from the OxCaml team’s excellent
<a href="https://oxcaml.org/documentation/tutorials/01-intro-to-parallelism-part-1/">Intro to Parallelism (Part 1)</a>
tutorial, by way of my recent
<a href="https://github.com/fplaunchpad/cs6868_s26">CS6868</a> lecture on OxCaml
(<a href="https://github.com/fplaunchpad/cs6868_s26/blob/main/lectures/11_oxcaml/handout.md">handout</a>).
The tutorial is the canonical reference and is worth reading in full;
this post tries to capture the essence in a more bite-sized form,
focused on the two new mode axes that together rule out data races at
compile time — and on one subtlety about portability that’s easy to
misread.</p>

<h2 id="a-quick-aside-data-races-in-ocaml-are-less-scary">A quick aside: data races in OCaml are less scary</h2>

<p>Before we dive in, it’s worth pausing on why we want to rule out data
races at all in OCaml. In C, C++, or unsafe Rust, a data race is
catastrophic: the standard licenses the compiler to do <em>anything</em>,
including silent memory corruption, on the basis that your program had
no defined meaning to begin with. OCaml is considerably gentler.
<a href="https://ocaml.org/manual/5.4/memorymodel.html">OCaml’s memory model</a>
guarantees that even programs with races preserve type safety and
memory safety — a racy program may observe weakly-consistent values
across domains, but it will not crash, won’t read uninitialised memory,
and won’t violate the type system’s invariants.</p>

<p>So a race in OCaml is a logic bug, not a runtime footgun. Why bother
catching them statically, then? Because once a program is data-race
free, you get <strong>sequential consistency</strong>: any observed behaviour can
be explained as some interleaving of the operations of the different
domains, with each domain’s own operations executed in program order.
That’s still concurrent reasoning — there are many possible
interleavings — but it’s the simplest model concurrent code can have,
and equational reasoning, induction, and the usual program-logic tools
all transfer over to each interleaving. Drop race freedom and you lose
this: observed behaviours can only be justified by allowing
<em>intra-thread</em> reorderings as well, where a single domain’s operations
appear to execute out of program order to other domains. Sequential
consistency is the real prize. Race freedom is what buys it.</p>

<h2 id="hello-oxcaml">Hello, OxCaml</h2>

<p>A quick sanity check that your browser really is running an OxCaml
toplevel. The <code class="language-plaintext highlighter-rouge">@ local</code> annotation is OxCaml-only syntax; a stock OCaml
compiler would reject it as a syntax error.</p>

<x-ocaml>
let use_locally (x @ local) = x + 1

let () = Printf.printf "use_locally 42 = %d\n" (use_locally 42)
</x-ocaml>

<h2 id="a-racy-gensym">A racy <code class="language-plaintext highlighter-rouge">gensym</code></h2>

<p>Here’s the example we’ll keep coming back to: a symbol generator that
hands out distinct ids by incrementing a captured counter. Sequentially
it works; the cell prints two ids. The last line tries to ship <code class="language-plaintext highlighter-rouge">gensym</code>
to another domain, and that’s where the type checker stops us.</p>

<x-ocaml>
[@@@alert "-do_not_spawn_domains"]

let gensym =
  let count = ref 0 in
  fun prefix -&gt;
    count := !count + 1;
    prefix ^ "_" ^ string_of_int !count

let () = Printf.printf "%s %s\n" (gensym "x") (gensym "x")

let _ = Domain.Safe.spawn (fun () -&gt; gensym "x")
</x-ocaml>

<p>Run it from two domains in parallel and you’d tick every box on the
four-ingredient recipe for a data race: two domains, a shared <code class="language-plaintext highlighter-rouge">count</code>,
both writing, and <code class="language-plaintext highlighter-rouge">count</code> is a plain <code class="language-plaintext highlighter-rouge">ref</code> — not atomic. On stock OCaml
5, the compiler would happily let you ship this closure to another
domain. OxCaml refuses, before anything runs. The error names two things
— <em>nonportable</em> and <em>portable</em> — that aren’t in vanilla OCaml’s
vocabulary. What do they mean?</p>
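<p>Before unpacking those words, here is roughly what stock OCaml 5 lets through. This is a minimal sketch for an ordinary OCaml 5 compiler, not the OxCaml toplevel on this page; <code class="language-plaintext highlighter-rouge">run_on_two_domains</code>, <code class="language-plaintext highlighter-rouge">racy_total</code> and <code class="language-plaintext highlighter-rouge">atomic_total</code> are names I made up for the illustration. It has all four ingredients, compiles without complaint, and can lose updates at runtime.</p>

```ocaml
(* Stock OCaml 5 sketch (not OxCaml): the racy version compiles and runs. *)
let iterations = 100_000

(* Run [tick] [iterations] times on each of two domains in parallel. *)
let run_on_two_domains tick =
  let worker () = for _ = 1 to iterations do tick () done in
  let d1 = Domain.spawn worker in
  let d2 = Domain.spawn worker in
  Domain.join d1;
  Domain.join d2

(* Plain ref: the read-modify-write is not atomic, so increments can be lost. *)
let racy_total () =
  let count = ref 0 in
  run_on_two_domains (fun () -> count := !count + 1);
  !count

(* Atomic counter: each increment is indivisible, so none are lost. *)
let atomic_total () =
  let count = Atomic.make 0 in
  run_on_two_domains (fun () -> Atomic.incr count);
  Atomic.get count

let () =
  Printf.printf "racy = %d (at most %d), atomic = %d\n"
    (racy_total ()) (2 * iterations) (atomic_total ())
```

<p>The atomic total is always <code class="language-plaintext highlighter-rouge">2 * iterations</code>. The racy total can never exceed that and may fall short, by an amount that varies from run to run and machine to machine.</p>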

<h2 id="two-new-modes-for-data-race-freedom">Two new modes for data race freedom</h2>

<p>OxCaml extends OCaml’s type system with several
<a href="https://oxcaml.org/jane/doc/extensions/modes/intro/">modes</a> — annotations
that describe <em>how</em> a value can be used, orthogonal to its type. I’ve
already written about two of them on this blog:
<a href="/ocaml/modes/oxcaml/2025/05/29/uniqueness_and_behavioural_types/">uniqueness</a>,
which tracks whether a value has at most one reference, and
<a href="/ocaml/modes/oxcaml/2025/06/04/linearity_and_uniqueness/">linearity</a>, which
tracks how often a value may be used. Today’s post is about a different
pair — the two modes that, together, rule out data races at compile time:</p>

<ul>
  <li><strong>contention</strong> — <code class="language-plaintext highlighter-rouge">uncontended</code> / <code class="language-plaintext highlighter-rouge">contended</code>: tracks whether multiple
domains can simultaneously access a value. A <code class="language-plaintext highlighter-rouge">contended</code> value might
be in another domain’s hands right now, so the type system limits
what you can do with it.</li>
  <li><strong>portability</strong> — <code class="language-plaintext highlighter-rouge">portable</code> / <code class="language-plaintext highlighter-rouge">nonportable</code>: tracks whether a value
can safely <em>cross</em> a domain boundary at all. Closures that are sent
to other domains must be <code class="language-plaintext highlighter-rouge">portable</code>.</li>
</ul>

<p>The pair is enough to catch the <code class="language-plaintext highlighter-rouge">gensym</code> race. Let’s look at each
restriction in isolation, then come back to the closure.</p>

<h3 id="contention-rejects-mutable-writes">Contention rejects mutable writes</h3>

<p>A <code class="language-plaintext highlighter-rouge">contended</code> value might be mutated by another domain right now. So
OxCaml refuses to let you read or write its mutable fields:</p>

<x-ocaml>
type mood = Happy | Neutral | Sad
type thing = { price : float; mutable mood : mood }

(* Reading an immutable field of a contended value is fine. *)
let price_contended (t @ contended) = t.price

(* Writing a mutable field is not. *)
let cheer_up_contended (t @ contended) = t.mood &lt;- Happy
</x-ocaml>

<p>The error on the last line names exactly the rule: <em>“This value is
contended but is expected to be uncontended because its mutable field
mood is being written.”</em> Even <em>reading</em> a mutable field on a <code class="language-plaintext highlighter-rouge">contended</code>
value is rejected — another domain might be mid-write at the same
instant.</p>

<h3 id="portability-rejects-captured-refs">Portability rejects captured refs</h3>

<p>Portability is about closures. A <code class="language-plaintext highlighter-rouge">@ portable</code> closure is one the
compiler has verified is safe to ship to another domain, with one
critical catch: every value the closure <em>captures</em> from its enclosing
scope is treated as <code class="language-plaintext highlighter-rouge">contended</code> inside the closure body. A pure
function that doesn’t mutate anything is trivially portable — there’s
nothing the contention rule can object to:</p>

<x-ocaml>
let test_portable () =
  let (f @ portable) = fun x y -&gt; x + y in
  f 1 2

let () = Printf.printf "test_portable () = %d\n" (test_portable ())
</x-ocaml>

<p>Capturing a <code class="language-plaintext highlighter-rouge">ref</code> is fine on its own; <em>mutating</em> one that’s been
captured is what runs afoul of the rule:</p>

<x-ocaml>
let test_nonportable () =
  let r = ref 0 in
  let (counter @ portable) () = incr r; !r in
  counter ()
</x-ocaml>

<p>Read the error carefully — it tells you <em>exactly why</em> <code class="language-plaintext highlighter-rouge">counter</code> isn’t
portable. The closure is <code class="language-plaintext highlighter-rouge">@ portable</code>, so the captured <code class="language-plaintext highlighter-rouge">r</code> is treated
as <code class="language-plaintext highlighter-rouge">contended</code> inside the body. But <code class="language-plaintext highlighter-rouge">incr r</code> is a mutation, and writing
through a ref requires it to be <code class="language-plaintext highlighter-rouge">uncontended</code>. The two rules collide.
And now the original <code class="language-plaintext highlighter-rouge">gensym</code> rejection makes sense: it does exactly
the same thing — mutates a captured <code class="language-plaintext highlighter-rouge">count</code>.</p>

<h2 id="the-trap-and-the-actual-rule">The trap, and the actual rule</h2>

<p>Read those last two cells together and it’s tempting to draw the wrong
moral: that to make a function safe to ship to another domain, you have
to give up side effects. That would be far too restrictive.</p>

<p>The actual rule is narrower, and the difference matters.</p>

<p>Portability constrains what a closure <strong>captures from its enclosing
scope</strong>. Captured values become contended — that’s the rule, and it’s
why <code class="language-plaintext highlighter-rouge">gensym</code> got rejected. But a closure’s <strong>parameters</strong> are not
captures. They’re handed in fresh at each call, so they can be passed at
any mode the type spells out — including <code class="language-plaintext highlighter-rouge">@ uncontended</code>. The obligation
to provide an uncontended argument shifts to whoever is doing the
calling. That’s a much weaker requirement than “no side effects.”</p>

<h3 id="captured-vs-parameter-in-code">Captured vs parameter, in code</h3>

<p>Here’s the example that makes the distinction concrete. <code class="language-plaintext highlighter-rouge">loop</code> is
<code class="language-plaintext highlighter-rouge">@ portable</code>, and its body mutates an <code class="language-plaintext highlighter-rouge">int ref</code>. That works because the
ref is a <em>parameter</em> annotated <code class="language-plaintext highlighter-rouge">@ uncontended</code>, not something captured:</p>

<x-ocaml>
let (factorial_portable @ portable) n =
  let a = ref 1 in
  let rec (loop @ portable) (a @ uncontended) i =
    if i &gt; 0 then begin
      a := !a * i;
      loop a (i - 1)
    end
  in
  loop a n;
  !a

let () = Printf.printf "factorial_portable 10 = %d\n" (factorial_portable 10)
</x-ocaml>

<p>Two <code class="language-plaintext highlighter-rouge">@ portable</code> annotations are doing work here. The inner one says
<code class="language-plaintext highlighter-rouge">loop</code> is shippable to another domain — that’s the interesting one,
and it works because <code class="language-plaintext highlighter-rouge">a</code> is <code class="language-plaintext highlighter-rouge">loop</code>’s <em>parameter</em>, not a capture: when
<code class="language-plaintext highlighter-rouge">loop</code> is eventually called from somewhere parallel, that somewhere
has to prove the <code class="language-plaintext highlighter-rouge">a</code> it passes in is uncontended. Portability didn’t
ban mutation; it pushed the proof to the call site. The outer
annotation says the whole <code class="language-plaintext highlighter-rouge">factorial_portable</code> is portable too — it
allocates a fresh <code class="language-plaintext highlighter-rouge">a</code> on each call and captures nothing from outside,
so we can ship the whole function to another domain. The compiler
verifies both annotations as part of accepting the program.</p>

<h2 id="a-formal-aside-defaults-and-submoding">A formal aside: defaults and submoding</h2>

<p>We’ve been writing <code class="language-plaintext highlighter-rouge">@ contended</code> and <code class="language-plaintext highlighter-rouge">@ portable</code> as if they were the
interesting modes and their absence was nothing in particular. There’s
actually a small lattice on each axis, with a default and a direction
of “how strong” the guarantee is. The handout summarises it like this:</p>

<table>
  <thead>
    <tr>
      <th>Axis</th>
      <th>Modes (<code class="language-plaintext highlighter-rouge">⊑</code>)</th>
      <th>Default</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Contention</td>
      <td><code class="language-plaintext highlighter-rouge">uncontended</code> ⊑ <code class="language-plaintext highlighter-rouge">shared</code> ⊑ <code class="language-plaintext highlighter-rouge">contended</code></td>
      <td><code class="language-plaintext highlighter-rouge">uncontended</code></td>
    </tr>
    <tr>
      <td>Portability</td>
      <td><code class="language-plaintext highlighter-rouge">portable</code> ⊑ <code class="language-plaintext highlighter-rouge">nonportable</code></td>
      <td><code class="language-plaintext highlighter-rouge">nonportable</code></td>
    </tr>
  </tbody>
</table>

<p>The relation <code class="language-plaintext highlighter-rouge">A ⊑ B</code> is the <strong>submoding order</strong>: a value at mode <code class="language-plaintext highlighter-rouge">A</code>
may be used wherever mode <code class="language-plaintext highlighter-rouge">B</code> is expected, because <code class="language-plaintext highlighter-rouge">A</code> carries the
stronger guarantee and <code class="language-plaintext highlighter-rouge">B</code> is the looser expectation. An <code class="language-plaintext highlighter-rouge">uncontended</code>
value satisfies a <code class="language-plaintext highlighter-rouge">@ contended</code> parameter (the callee only
assumes that other domains <em>might</em> touch it; if none ever does, that
assumption is harmlessly conservative). A <code class="language-plaintext highlighter-rouge">portable</code> closure satisfies a
<code class="language-plaintext highlighter-rouge">@ nonportable</code> slot. But not the other way around — submoding goes
one way only.</p>

<p>The defaults are what you get when you write plain OCaml. Every value
starts out <code class="language-plaintext highlighter-rouge">uncontended</code> — no other domain has it, so reads and writes
are unrestricted. Every closure starts out <code class="language-plaintext highlighter-rouge">nonportable</code> — we make no
claim about shipping it elsewhere. That’s why ordinary OCaml code keeps
type-checking under OxCaml: defaults are the most permissive end of
each axis, and you only meet the new restrictions when you (or a
parallel API) ask for a stronger guarantee. The post uses only the two
endpoints of the contention chain — <code class="language-plaintext highlighter-rouge">uncontended</code> and <code class="language-plaintext highlighter-rouge">contended</code> — and
skips <code class="language-plaintext highlighter-rouge">shared</code>, which permits read-only access across domains and isn’t
needed for the gensym story.</p>
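
<p>To see submoding and the defaults working together, here is a minimal
sketch. The record type <code class="language-plaintext highlighter-rouge">item</code> and the function <code class="language-plaintext highlighter-rouge">cost_of</code> are
made up for this illustration and are not part of the handout. A record
built by ordinary code starts out <code class="language-plaintext highlighter-rouge">uncontended</code>, and the call should be
accepted because <code class="language-plaintext highlighter-rouge">uncontended</code> ⊑ <code class="language-plaintext highlighter-rouge">contended</code>:</p>

<x-ocaml>
(* Hypothetical record type, used only in this sketch. *)
type item = { cost : int; mutable stock : int }

(* The parameter is contended, so the body may only do what is safe when
   other domains can reach the record, such as reading an immutable field. *)
let cost_of (i @ contended) = i.cost

let () =
  (* Plain OCaml construction: i is uncontended, the default. *)
  let i = { cost = 7; stock = 3 } in
  (* uncontended sits below contended, so i fits the contended parameter. *)
  Printf.printf "cost_of i = %d\n" (cost_of i)
</x-ocaml>

<p>The reverse direction, handing a <code class="language-plaintext highlighter-rouge">contended</code> value to code that expects
<code class="language-plaintext highlighter-rouge">uncontended</code>, is the case the earlier rejected cells ran into.</p>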

<x-ocaml class="hidden">
module Portable = struct
  module Atomic : sig
    type !'a t : value mod contended portable
    val make : 'a @ portable contended -&gt; 'a t
    val fetch_and_add : int t @ local -&gt; int -&gt; int
  end = struct
    type !'a t : value mod contended portable =
      'a Basement.Portable_atomic.t
    let make = Basement.Portable_atomic.make
    let fetch_and_add = Basement.Portable_atomic.fetch_and_add
  end
end
</x-ocaml>

<h2 id="back-to-gensym-the-fix">Back to <code class="language-plaintext highlighter-rouge">gensym</code>: the fix</h2>

<p>The captured-vs-parameter distinction is enough to fix simple loops, but
<code class="language-plaintext highlighter-rouge">gensym</code> is a different shape: there’s a single counter that must be
shared across calls. We need a counter type that’s <em>itself</em> portable —
something whose values mode-cross both contention and portability, so a
closure capturing it stays portable. That’s exactly <code class="language-plaintext highlighter-rouge">Portable.Atomic.t</code>,
from OxCaml’s
<a href="https://github.com/oxcaml/oxcaml/tree/main/portable"><code class="language-plaintext highlighter-rouge">portable</code></a>
library: a <code class="language-plaintext highlighter-rouge">'a Portable.Atomic.t</code> is always portable and always
uncontended, regardless of where it lives.</p>

<p>Wrap the counter and <code class="language-plaintext highlighter-rouge">gensym</code> in a module so the toplevel can see the
whole bundle as portable, then actually do what got us in trouble at
the start: ship <code class="language-plaintext highlighter-rouge">gensym</code> to a freshly-spawned domain. The compiler
accepts it (it verified <code class="language-plaintext highlighter-rouge">Gen</code> is portable), and the program runs:</p>

<x-ocaml>
[@@@alert "-do_not_spawn_domains"]

module Gen = struct
  open Portable
  let count = Atomic.make 0
  let gensym prefix =
    let n = Atomic.fetch_and_add count 1 in
    prefix ^ "_" ^ string_of_int n
end

let d  = Domain.Safe.spawn (fun () -&gt; Gen.gensym "y")
let s1 = Gen.gensym "x"
let s2 = Domain.join d
let () = Printf.printf "%s %s\n" s1 s2
</x-ocaml>

<p>The toplevel reports <code class="language-plaintext highlighter-rouge">module Gen : sig ... end @@ portable</code> — mode
inference noticed every value inside is portable and made the whole
module portable, so <code class="language-plaintext highlighter-rouge">Gen.gensym</code> survives extraction at portable mode.
You might wonder why we don’t just write <code class="language-plaintext highlighter-rouge">let (gensym @ portable)
prefix = ...</code> at the top level, the same shape we used for
<code class="language-plaintext highlighter-rouge">factorial_portable</code> above. That’s a quirk of the toplevel: a bare
<code class="language-plaintext highlighter-rouge">let</code> lands in the implicit toplevel module, which itself sits at the
default <code class="language-plaintext highlighter-rouge">nonportable</code> mode, so when <code class="language-plaintext highlighter-rouge">Domain.Safe.spawn</code> later reads
<code class="language-plaintext highlighter-rouge">gensym</code> back out, it sees a nonportable binding and refuses — even
though the closure was verified portable at its binding site. The
<code class="language-plaintext highlighter-rouge">factorial_portable</code> cell got away with the simpler form only because
nothing else tried to extract the binding at portable mode. Wrapping
in <code class="language-plaintext highlighter-rouge">module Gen</code> gives the closure its own portable home, which is what
lets it actually cross a domain boundary. In a real <code class="language-plaintext highlighter-rouge">.ml</code> file the
whole compilation unit is a module that mode inference can mark
portable wholesale, so this dance isn’t necessary.</p>

<p>(A note on the <code class="language-plaintext highlighter-rouge">Portable.Atomic</code> you see here: in a real program you’d
get it by <code class="language-plaintext highlighter-rouge">open Portable</code> from the
<a href="https://github.com/oxcaml/oxcaml/tree/main/portable"><code class="language-plaintext highlighter-rouge">portable</code></a> opam
library. To keep the page weight reasonable, this notebook ships only
<a href="https://github.com/oxcaml/oxcaml/tree/main/basement"><code class="language-plaintext highlighter-rouge">basement</code></a> — the
small library that provides the actual atomic operations — and a
hidden setup cell at the top of the page wraps it with the kind
annotation <code class="language-plaintext highlighter-rouge">: value mod contended portable</code>. That kind is what tells the
compiler this type mode-crosses both axes — a closure capturing one
stays portable, and any domain may touch it.) The
<a href="https://github.com/fplaunchpad/cs6868_s26/blob/main/lectures/11_oxcaml/handout.md">handout</a>
notes that for state more elaborate than a single atomic counter, the
right answer is <strong>capsules</strong> — a structural way to bundle mutable state
with its access protocol so the whole package is portable.</p>

<h2 id="whats-left">What’s left</h2>

<p>The race-freedom guarantee here is independent of any test run: every
rejection above happened at the same compile step an OxCaml binary on
disk would go through, before any code executed. The final spawn is a
demonstration, not a proof — it’s the compiler accepting the program
in the first place that tells us no race is possible. Two mode axes, a
kind annotation, and a clear rule about what “portable” means were
enough to make data-race freedom a property the type checker enforces.</p>

<p>The lecture this is drawn from goes further into capsules (for shared
state more elaborate than an atomic) and <code class="language-plaintext highlighter-rouge">Parallel.fork_join2</code> (for
structured parallelism). Material for another post.</p>]]></content><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><category term="OCaml" /><category term="OxCaml" /><category term="X-OCaml" /><category term="Blogging" /><summary type="html"><![CDATA[A while back I wired up x-ocaml so this blog could embed live, editable OCaml notebooks. That post used a vanilla OCaml 5 toplevel. Today the toplevel running in your browser is built from OxCaml, the Jane Street fork of the compiler. That means we can prove a small parallel program is data-race free, interactively, without ever spawning a thread.]]></summary></entry><entry><title type="html">From Convergence to Confidence: Push-button verification for RDTs</title><link href="https://kcsrk.info/verification/rdts/lean/2026/04/28/from-convergence-to-confidence/" rel="alternate" type="text/html" title="From Convergence to Confidence: Push-button verification for RDTs" /><published>2026-04-28T10:00:00+00:00</published><updated>2026-04-28T10:00:00+00:00</updated><id>https://kcsrk.info/verification/rdts/lean/2026/04/28/from-convergence-to-confidence</id><content type="html" xml:base="https://kcsrk.info/verification/rdts/lean/2026/04/28/from-convergence-to-confidence/"><![CDATA[<p>What does it mean for a replicated data type to be <em>correct</em>? For most of the
literature, my own prior work included, the answer has been convergence: two
replicas that have applied the same operations end up in the same state. I
argued <a href="/slides/RDT_verification_papoc_2026.pdf">in my PaPoC 2026 keynote</a> last
week that for many useful data types convergence is not enough, and agentic
proof-oriented programming can help close the gap between convergence and
confidence.</p>

<!--more-->

<p>A stronger answer,
<a href="https://dl.acm.org/doi/10.1145/3314221.3314617">replication-aware linearizability</a>
from Wang et al. (PLDI 2019), asks for full functional equivalence with a
sequential specification: the merged state should behave as if the
operations performed at every replica had run in some sequential interleaving.
RA-linearizability is what our verification work has aimed at for the past
few years, and for a class of useful data types it still falls short: the
ones where the state is a grow-only bag and the interesting semantics live
in the read function. This post is the longer-form version of the keynote
argument, plus a digest of recent work on
<a href="https://github.com/fplaunchpad/sal">Sal</a>.</p>

<p><a href="https://papoc-workshop.github.io/2026/">PaPoC 2026</a> was co-located
with EuroSys in Edinburgh, and was a wonderful workshop. Thanks to
the program chairs, <a href="https://decomposition.al/">Lindsey Kuper</a> and
<a href="https://jopereira.github.io/">José Orlando Pereira</a>, for running
it. A permanent entry for the keynote is on <a href="/talks#papoc_2026">my talks
page</a>.</p>

<h2 id="convergence-is-not-enough">Convergence is not enough</h2>

<p>The canonical example is the
<a href="https://inria.hal.science/inria-00555588/document">OR-Set</a>. The
state is two sets, an <em>adds</em> set
and a <em>tombstones</em> set. Adding an element tags it with a fresh id and
stores the <code class="language-plaintext highlighter-rouge">(element, id)</code> pair in <em>adds</em>. Removing an element does not
delete anything; it adds a tombstone entry for each currently observed
<code class="language-plaintext highlighter-rouge">(element, id)</code> pair to <em>tombstones</em>. Reading filters <em>adds</em> against
<em>tombstones</em>, and merging takes the union of each component. The data type converges, but tombstones accumulate without
bound. The fix is a bit cleverer than just tagging: each replica tracks
causality explicitly, by maintaining a context of operations it has
observed, and the merge uses that context to recognise when an element
was removed at another replica without having to keep a tombstone for it.
The result is the “efficient” OR-Set.</p>
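
<p>To make the tombstone-based design concrete, here is a minimal OCaml
sketch of it. This is an illustration written for this post, not code from
Sal or from the OR-Set paper: elements are strings, ids are
<code class="language-plaintext highlighter-rouge">(replica, counter)</code> pairs supplied by the caller, and all the names are
hypothetical.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* An (element, unique id) pair; ids are (replica, counter). *)
module Tag = struct
  type t = string * (int * int)
  let compare = Stdlib.compare
end
module TagSet = Set.Make (Tag)

type orset = { adds : TagSet.t; tombstones : TagSet.t }

let empty = { adds = TagSet.empty; tombstones = TagSet.empty }

(* Add: store the element together with a fresh id. *)
let add x id s = { s with adds = TagSet.add (x, id) s.adds }

(* Remove: tombstone every (x, id) pair this replica currently observes. *)
let remove x s =
  let observed = TagSet.filter (fun (y, _) -&gt; y = x) s.adds in
  { s with tombstones = TagSet.union observed s.tombstones }

(* Read: elements with at least one add that has not been tombstoned. *)
let read s =
  TagSet.diff s.adds s.tombstones
  |&gt; TagSet.elements |&gt; List.map fst |&gt; List.sort_uniq compare

(* Merge: union each component. *)
let merge a b =
  { adds = TagSet.union a.adds b.adds;
    tombstones = TagSet.union a.tombstones b.tombstones }

let base = add "milk" (0, 1) empty
let r0 = remove "milk" base          (* replica 0 removes "milk" *)
let r1 = add "milk" (1, 1) base      (* replica 1 concurrently re-adds it *)
let merged = merge r0 r1

let () =
  assert (read merged = [ "milk" ]);              (* the unseen add survives *)
  assert (read (merge r1 r0) = read merged);      (* merge order is irrelevant *)
  assert (TagSet.cardinal merged.tombstones = 1)  (* the tombstone stays forever *)
</code></pre></div></div>

<p>The last assertion is the cost the paragraph above describes: nothing in
this design ever shrinks <code class="language-plaintext highlighter-rouge">tombstones</code>.</p>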

<p>In practice this is harder than it sounds. Riak’s
<a href="https://github.com/basho/riak_dt">riak_dt</a>, a production CRDT library,
shipped an optimisation to its OR-Set Map that broke monotonicity
(<a href="https://github.com/basho/riak_dt/issues/79">issue #79</a>). Christopher
Meiklejohn, one of the authors of the <a href="https://doi.org/10.1145/2596631.2596633">Riak DT Map
paper</a>, <a href="https://christophermeiklejohn.com/erlang/lasp/2019/03/08/monotonicity.html">later
wrote</a>
about how easy it is to get the inflationary, deterministic,
least-upper-bound conditions wrong, and noted that the team <em>“have even
got this wrong a few times”</em> themselves. Martin Kleppmann’s <a href="https://martin.kleppmann.com/2019/03/25/papoc-interleaving-anomalies.html">2019 PaPoC
paper on collaborative-text
anomalies</a>
had errata of the same flavour, appended to the publication page a few
years after the fact: the non-interleaving condition proposed in §2.1
“cannot be guaranteed by any algorithm,” and the algorithm in §3.1 does
not guarantee convergence.</p>

<p>A well-defined join and a join that does what the user expects are not the
same thing, and the gap between the two is where most of these bugs live.</p>

<h2 id="a-short-history-of-how-we-tried-to-close-that-gap">A short history of how we tried to close that gap</h2>

<p>My group has been chipping at this for a few years. In
<a href="/papers/certified_mrdt.pdf">Peepul</a> (PLDI 2022) we tried to capture intent
axiomatically: write the spec as a fold over a partially ordered event
graph, then prove the implementation simulates it. The proofs closed, but
the cost was telling. The queue MRDT case study took 1,123 lines of proof
spread over 75 lemmas against a 32-line implementation, with each F*
verification run taking about 4,753 seconds (roughly 79 minutes). F*’s SMT-aided
proofs were brittle (Z3 upgrades broke them, and the solver only ever
returned yes / no / timeout with no counterexample), and the methodology
did not really scale past a handful of carefully-chosen examples.</p>

<p><a href="/papers/mrdtconverge_jan_25.pdf">Neem</a> (OOPSLA 2025) replaced the axiomatic
spec with RA-linearizability, with one twist:
instead of taking a separate sequential specification as input, Neem treats
the operational definition of <code class="language-plaintext highlighter-rouge">do_</code> itself as the spec. The merge is
correct if the resulting state behaves like some sequential interleaving of
the updates under that definition. Neem also gave us a fixed schedule of 24
verification conditions (six on <code class="language-plaintext highlighter-rouge">do_</code>, twelve on <code class="language-plaintext highlighter-rouge">merge</code>, six on conflict
resolution) that, if all closed, certify RA-linearizability. That was a
real upgrade. But it was still in F*, still SMT-fragile, and there was a
class of useful RDTs for which closing the 24 VCs left the correctness
statement itself near-vacuous.</p>

<p>The pattern that exposes that vacuity is recognisable. When porting an
op-based CRDT to a state-based one, a common move is to dump every
operation into a grow-only set and leave the read function to
reconstruct the result. The merge is then set union, any sequence of
operations is trivially its own linearisation, and the 24
RA-linearizability VCs close without much work. The data type can still
be wrong, because nothing in that proof has said anything about
<em>intent</em>: which characters are bold, which key sits at the head of the
queue. Peepul did try to capture intent, but it expressed the spec over
a partially ordered graph of events with no further structure, and that
turned out to be both an awkward medium to write specs in and a brittle
one to prove against.</p>
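
<p>A toy version of the pattern, written for this post and not taken from
Neem or Sal, shows where the gap sits. The state is a grow-only set of
timestamped writes, the merge is set union, and two hypothetical read
functions are defined over the same state. The merge laws cannot tell them
apart:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* State: a grow-only set of (timestamp, value) writes. *)
module Op = struct
  type t = int * string
  let compare = Stdlib.compare
end
module Log = Set.Make (Op)

(* Merge is plain set union. *)
let merge = Log.union

(* Two candidate reads over the same state. *)
let read_latest log = Option.map snd (Log.max_elt_opt log)  (* last write wins *)
let read_oldest log = Option.map snd (Log.min_elt_opt log)  (* first write wins *)

let a = Log.of_list [ (1, "draft"); (3, "final") ]
let b = Log.of_list [ (2, "review") ]

let () =
  (* The merge laws hold, and they never mention a read function. *)
  assert (Log.equal (merge a b) (merge b a));
  assert (Log.equal (merge a a) a);
  (* Only a statement about the read separates the two designs. *)
  assert (read_latest (merge a b) = Some "final");
  assert (read_oldest (merge a b) = Some "draft")
</code></pre></div></div>

<p>If the intended behaviour is last-writer-wins, <code class="language-plaintext highlighter-rouge">read_oldest</code> is wrong, yet
every obligation about <code class="language-plaintext highlighter-rouge">merge</code> is satisfied either way.</p>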

<p>We have started calling these vacuous-convergence cases <em>Tier-C</em>
RDTs, in a three-tier taxonomy that we settled on this past month:</p>

<ul>
  <li><strong>Tier A</strong>: state <em>is</em> the semantic content. LWW-Register, PN-Counter,
Bounded-Counter, MAX-Map. The RA-linearizability VCs reduce to lattice
arithmetic and cover what one wants to know.</li>
  <li><strong>Tier B</strong>: merge does non-trivial computation. Multi-Valued-Register,
LWW-Element-Set. Still substantive.</li>
  <li><strong>Tier C</strong>: state is a grow-only bag, and the semantics live in the
<em>read</em>. OR-Set, RGA, Add-Win Priority Queue, and Peritext (which the
rest of the post comes back to).</li>
</ul>

<p>Roughly half the Sal suite, and most of the RDTs people actually want to
use, sit in Tier C. For Tier C, “verified” without read-side theorems is
an overclaim.</p>

<h2 id="what-sal-actually-is">What Sal actually is</h2>

<p>Sal is the work I presented at PaPoC, joint with Pranav Ramesh and Vimala
Soundarapandian. The wider line of work this builds on also involves
Adarsh Kamath, Aseem Rastogi, and Kartik Nagar. It ports Neem to Lean 4
and adds a multi-modal proof tactic. The repo is at
<a href="https://github.com/fplaunchpad/sal">https://github.com/fplaunchpad/sal</a>.</p>

<p>The Lean choice is not original to us. Ilya Sergey is the one who
convinced me to take Lean seriously, and Sal’s multi-modal tactic stack
is directly inspired by <a href="https://verse-lab.github.io/">Loom</a>, the
multi-modal proof-orchestration layer his group has been building.
<a href="https://proofsandintuitions.net/2026/01/21/multi-modal-verification-velvet/">Velvet</a>,
the Lean library it sits inside, verifies imperative programs; Sal
verifies RDTs. The toolchain is shared.</p>

<p>When you write <code class="language-plaintext highlighter-rouge">by sal</code> in front of a verification condition, three things
happen in sequence. First, <code class="language-plaintext highlighter-rouge">dsimp + grind</code>: pure rewriting plus Lean’s
SMT-style automation. If this closes the goal, the proof term is checked
by the Lean kernel and the trusted base is just Lean. If it doesn’t, Sal
hands the goal to Z3 via
<a href="https://github.com/argumentcomputer/LeanBlaster">lean-blaster</a>. Fast, but
Z3 is now in the trusted base, and Sal records this with a <code class="language-plaintext highlighter-rouge">Blaster_admit</code>
annotation so the borrowed trust is visible. If that does not close
either, an AI agent (Claude Code or <a href="https://harmonic.fun">Aristotle</a>)
tries to write a proof term, which Lean then kernel-checks. The trusted
base shrinks back to just Lean.</p>

<p>For the LWW-Register all 24 VCs close at stage 1. The whole specification
fits on a screen:</p>

<div class="language-lean highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">abbrev</span> <span class="n">concrete_st</span> := <span class="o">ℕ</span> <span class="o">×</span> <span class="o">ℕ</span>
<span class="k">def</span> <span class="n">init_st</span> : <span class="n">concrete_st</span> := (<span class="mi">0</span>, <span class="mi">0</span>)

<span class="k">def</span> <span class="n">lex_max</span> (<span class="n">a</span> <span class="n">b</span> : <span class="o">ℕ</span> <span class="o">×</span> <span class="o">ℕ</span>) : <span class="o">ℕ</span> <span class="o">×</span> <span class="o">ℕ</span> :=
  <span class="n">if</span> <span class="n">a</span><span class="o">.1</span> <span class="o">&gt;</span> <span class="n">b</span><span class="o">.1</span> <span class="n">then</span> <span class="n">a</span>
  <span class="n">else</span> <span class="n">if</span> <span class="n">b</span><span class="o">.1</span> <span class="o">&gt;</span> <span class="n">a</span><span class="o">.1</span> <span class="n">then</span> <span class="n">b</span>
  <span class="n">else</span> <span class="n">if</span> <span class="n">a</span><span class="o">.2</span> <span class="o">≥</span> <span class="n">b</span><span class="o">.2</span> <span class="n">then</span> <span class="n">a</span> <span class="n">else</span> <span class="n">b</span>

<span class="k">def</span> <span class="n">do_</span> (<span class="n">s</span> : <span class="n">concrete_st</span>) : <span class="n">op_t</span> <span class="o">→</span> <span class="n">concrete_st</span>
  <span class="o">|</span> (<span class="n">ts</span>, <span class="n">_</span>, <span class="o">.</span><span class="n">Write</span> <span class="n">v</span>) <span class="o">=&gt;</span> <span class="n">lex_max</span> <span class="n">s</span> (<span class="n">ts</span>, <span class="n">v</span>)

<span class="k">def</span> <span class="n">merge</span> (<span class="n">a</span> <span class="n">b</span> : <span class="n">concrete_st</span>) : <span class="n">concrete_st</span> := <span class="n">lex_max</span> <span class="n">a</span> <span class="n">b</span>

<span class="k">theorem</span> <span class="n">merge_comm</span> (<span class="n">a</span> <span class="n">b</span> : <span class="n">concrete_st</span>) : <span class="n">eq</span> (<span class="n">merge</span> <span class="n">a</span> <span class="n">b</span>) (<span class="n">merge</span> <span class="n">b</span> <span class="n">a</span>) := <span class="k">by</span> <span class="n">sal</span>
<span class="k">theorem</span> <span class="n">merge_idem</span> (<span class="n">s</span> : <span class="n">concrete_st</span>) : <span class="n">eq</span> (<span class="n">merge</span> <span class="n">s</span> <span class="n">s</span>) <span class="n">s</span>            := <span class="k">by</span> <span class="n">sal</span><span class="cd">
-- ... 22 more, all closed by `by sal`</span>
</code></pre></div></div>
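
<p>For readers who do not have Lean to hand, the two theorems shown can be
sanity-checked with a small OCaml transcription. This is a sketch of my
own, not part of Sal: it assumes that the state equivalence <code class="language-plaintext highlighter-rouge">eq</code> is
plain structural equality for this data type, and it only tests a small
grid of states, so it is a test and not a proof.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* OCaml transcription of lex_max: the larger timestamp wins,
   and ties are broken on the value. *)
let lex_max (a : int * int) (b : int * int) =
  if fst a &gt; fst b then a
  else if fst b &gt; fst a then b
  else if snd a &gt;= snd b then a else b

let merge = lex_max

(* Every (timestamp, value) pair with both components in 0..3. *)
let states =
  let range = [ 0; 1; 2; 3 ] in
  List.concat_map (fun t -&gt; List.map (fun v -&gt; (t, v)) range) range

let () =
  List.iter
    (fun a -&gt;
      assert (merge a a = a);                                   (* merge_idem *)
      List.iter (fun b -&gt; assert (merge a b = merge b a)) states) (* merge_comm *)
    states
</code></pre></div></div>

<p>What <code class="language-plaintext highlighter-rouge">by sal</code> adds over a check like this is the universal quantifier: the
Lean statements cover every pair of states, not sixteen of them.</p>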

<p>Across the current Sal suite (28 RDTs: 17 CRDTs and 11 MRDTs, with 648
verification conditions in total) the breakdown from the paper was roughly 69%
closed at stage 1, 28% via Z3, and 3% via AI-completed ITP. That is the
“push-button” claim, and for Tier A and B it holds up well.</p>

<p>Two practical notes. First, <a href="https://harmonic.fun">Aristotle</a>
has been the workhorse at stage 3: <code class="language-plaintext highlighter-rouge">LWW-Map</code>, <code class="language-plaintext highlighter-rouge">Shopping-Cart</code>,
and <code class="language-plaintext highlighter-rouge">MAX-Map</code> each had VCs that neither <code class="language-plaintext highlighter-rouge">grind</code> nor Z3 closed, and
Aristotle produced kernel-checked intermediate lemmas that did. Second,
on the brittleness-of-tooling worry, the suite was bumped from Lean
4.26 to 4.28 between the paper draft and PaPoC. One Shopping-Cart
obligation drifted under the new <code class="language-plaintext highlighter-rouge">grind</code> and needed a tweak; everything
else carried over. That is a different experience from the F* /
Z3-upgrade story above.</p>

<p>Tier C is where things get more involved.</p>

<h2 id="the-peritext-story">The Peritext story</h2>

<p><a href="https://www.inkandswitch.com/peritext/">Peritext</a> is a CRDT for
collaborative rich-text editing, by Geoffrey Litt, Sarah Lim, Martin
Kleppmann, and Peter van Hardenberg (CSCW 2022). The design has four
state components (characters, parent pointers, tombstones, formatting
marks). What makes the paper unusually pleasant to mechanise is the
care its authors put into specifying intent. §3 walks through eight
worked examples of intent preservation, each spelled out as a small
concrete scenario, and appendix §A.2 catalogues them as a single
table for reference.</p>

<p>In one, Alice bolds the entire sentence “The fox jumped” while Bob
inserts the word “brown” in the middle, and the result should be
<strong>The brown fox jumped</strong>. In another, working with the same sentence,
Alice bolds “The fox” while Bob bolds “fox jumped”, and the
overlapping word “fox” ends up bold under both. A third concerns link
spans: a link has a hard right edge, so a concurrent insert at the
end of the link should not expand it. And so on for the rest of the
eight. The paper argues informally that the design handles each
example correctly, and ships a TypeScript reference plus
property-based tests, but no machine-checked proofs.</p>

<p>The 24 RA-linearizability VCs are trivial here. Peritext’s state is
four grow-only components and componentwise union commutes; <code class="language-plaintext highlighter-rouge">by sal</code>
closes them in seconds. That work lives in <code class="language-plaintext highlighter-rouge">Peritext_CRDT.lean</code> (571
lines, mostly state plumbing), and at the level of RA-linearizability
the data type is verified.</p>

<p>The interesting questions, however, are about the <em>read</em>: the function
that takes the state and renders it as formatted text. That function
lives in <code class="language-plaintext highlighter-rouge">Peritext_ReadSide.lean</code>, currently 1,316 lines. It answers
questions like: do the start- and end-side anchor bits matter? do
overlapping bolds compose? does tombstoning a character preserve the
formatting of everything else? do insertions at a span boundary fall
inside or outside the span under the paper’s expand-vs-contract rules?
None of these are anywhere in the 24 VCs.</p>

<p>The eight worked examples in §3 turn out to be a much better
starting point than the VCs for actually mechanising the data type. Each
example is small enough to write down as a <em>concrete</em> state and a
<em>concrete</em> claim about its read. Nik Swamy and Shuvendu Lahiri have
<a href="https://risemsr.github.io/blog/2026-04-16-spotting-specs/">a recent
post</a>
calling this kind of artifact a <em>Small Proof-Oriented Test</em>, or SPOT:
“a small, concrete, verified test case, proving that the test always
succeeds.” By writing §3 the way they did, the Peritext authors had
effectively done the specification-engineering half of that job for
us; all that was left was to translate each example into Lean and
prove it against the implementation. So we did.</p>

<p>Take Example 1 from the paper. Alice bolds the entire sentence
“The fox jumped” while Bob concurrently inserts the word “brown”
in the middle. Convergence is not the question: both replicas will
agree on a state once they merge. The question is what the rendered
text looks like. Peritext’s expand-on-the-bold-side rule says the
inserted text should fall inside Alice’s bold span, so the merged
result reads <strong>“The brown fox jumped”</strong>, with “brown” formatted bold
along with everything else. In Sal we simplify Bob’s insert to a
single character ‘b’ (enough to exercise the rule, and the proof
stays tractable), and the SPOT translates almost directly:</p>

<div class="language-lean highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cd">-- Alice (rid 0) types "The fox jumped" and bolds the whole thing.</span>
<span class="k">def</span> <span class="n">ex1_pre</span> : <span class="n">Scenario</span> :=
  <span class="n">the_fox_jumped</span><span class="o">.</span><span class="n">bold</span> <span class="mi">0</span>
    [<span class="err">'</span><span class="n">T</span><span class="err">'</span>,<span class="err">'</span><span class="n">h</span><span class="err">'</span>,<span class="err">'</span><span class="n">e</span><span class="err">'</span>,<span class="err">'</span> <span class="err">'</span>,<span class="err">'</span><span class="n">f</span><span class="err">'</span>,<span class="err">'</span><span class="n">o</span><span class="err">'</span>,<span class="err">'</span><span class="n">x</span><span class="err">'</span>,<span class="err">'</span> <span class="err">'</span>,<span class="err">'</span><span class="n">j</span><span class="err">'</span>,<span class="err">'</span><span class="n">u</span><span class="err">'</span>,<span class="err">'</span><span class="n">m</span><span class="err">'</span>,<span class="err">'</span><span class="n">p</span><span class="err">'</span>,<span class="err">'</span><span class="n">e</span><span class="err">'</span>,<span class="err">'</span><span class="n">d</span><span class="err">'</span>]

<span class="cd">-- Bob (rid 1) concurrently inserts 'b' between "The " and "fox".</span>
<span class="k">def</span> <span class="n">ex1_post</span> : <span class="n">Scenario</span> := <span class="n">ex1_pre</span><span class="o">.</span><span class="n">insertCharAfter</span> <span class="mi">1</span> (<span class="mi">4</span>, <span class="mi">0</span>) <span class="err">'</span><span class="n">b</span><span class="err">'</span><span class="cd">

-- Alice's bold mark, opId (15, 0), spans the original sentence.</span>
<span class="k">def</span> <span class="n">ex1_mark</span> : <span class="n">MarkOp</span> := <span class="n">Mark</span><span class="o">.</span><span class="n">bold</span> (<span class="mi">15</span>, <span class="mi">0</span>) (<span class="mi">1</span>, <span class="mi">0</span>) (<span class="mi">14</span>, <span class="mi">0</span>)

<span class="cd">-- The intent claim: Bob's 'b' ends up inside Alice's bold span.</span>
<span class="k">example</span> : <span class="n">in_span_visible</span> <span class="n">ex1_post</span><span class="o">.</span><span class="n">state</span> <span class="n">ex1_mark</span> (<span class="mi">16</span>, <span class="mi">1</span>) := <span class="k">by</span> <span class="o">...</span>
</code></pre></div></div>

<p>The other seven worked examples take the same shape: an English
description in the paper, a few lines of DSL builder to set up the
concrete scenario, and a one-line theorem statement that captures
the paper’s intent claim. <code class="language-plaintext highlighter-rouge">Peritext_SPOT.lean</code> is 323 lines for all
eight. The proofs themselves are unremarkable; the work was in
setting up the DSL so the scenario reads like the paper.</p>

<p>The read-side theorems in <code class="language-plaintext highlighter-rouge">Peritext_ReadSide.lean</code> are universally
quantified versions of the same statements: instead of “Bob’s ‘b’ is
in Alice’s bold span at this concrete state,” the theorem
<code class="language-plaintext highlighter-rouge">insert_within_span_in_span_visible</code> says “for any state, any mark,
any character inserted strictly inside the span at a fresh opId, the
inserted character is in the span.” Each SPOT then closes via that
universal theorem applied to its concrete state. Generalisation of
the SPOT <em>is</em> the read-side proof, which I find a useful way to
think about the relationship.</p>

<p>We have since written SPOTs for all 28 RDTs in the suite, and the
generalised read-side theorems for the Tier-C ones. The Tier-A SPOTs
are mostly there as documentation: a reader can use them to confirm
that the read function does what the lattice arithmetic suggests.</p>

<p>It is worth saying how an early draft of the Peritext read-side
went wrong, since the failure is a sharp version of the talk’s
title. I had Claude draft a predicate called <code class="language-plaintext highlighter-rouge">in_span_boundary</code>
from a careful reading of paper §3.3, with four cases (start of
span, end of span, after-of-start, after-of-end) and a boolean side
bit. The proofs went through. About 400 lines of theorems closed
across both the CRDT and the MRDT versions, and I was about to
commit. When I re-read §3.3 with the predicate in front of me, it
turned out to be backward. The <code class="language-plaintext highlighter-rouge">after_of c endId →
endSide</code> clause encoded the opposite of the link-contract case: it
included post-<code class="language-plaintext highlighter-rouge">endId</code> inserts in the span where the paper says they
should be excluded. The proofs validated the predicate, not the spec.
Tests pass, code is wrong. The bug was caught only when I wrote an
alternative predicate (<code class="language-plaintext highlighter-rouge">in_span_visible</code>) and the two disagreed on
Example 8.</p>

<p>The fix was a four-rule inductive characterising the RGA visible
order (parent-child, sibling via <code class="language-plaintext highlighter-rouge">opid_max</code>,
left-descendant-of-older-sibling, transitive) plus a separate
<code class="language-plaintext highlighter-rouge">bold_expand_reach</code> predicate for the bold-expand semantics in
Example 7. About 800 lines of the buggy parallel track were
deleted in the process. Eight <code class="language-plaintext highlighter-rouge">_visible</code> theorems now map
one-to-one to the paper’s eight worked examples; there is a
<a href="https://github.com/fplaunchpad/sal/blob/main/docs/peritext-vs-paper.md">crosswalk</a>
for anyone who wants the full table.</p>

<p>The piece of methodology I would recommend to someone starting out:
the SPOTs catch this kind of failure if the SPOT itself is faithful
to the paper’s example. Had I started by formalising the worked
examples as SPOTs and only then generalised, the disagreement on
Example 8 would have shown up immediately rather than after 400
lines of proofs against the wrong predicate. John Regehr’s <a href="https://john.regehr.org/writing/zero_dof_programming.html">zero-degree-of-freedom
LLM coding</a>
post argues a similar point in a different domain: pin the agent
with fast, deterministic, executable oracles, and the agent has
nowhere to drift. The Peritext SPOTs are an executable oracle for
the read-side spec.</p>

<h2 id="beyond-peritext">Beyond Peritext</h2>

<p>Peritext is one example of a broader pattern. PaPoC 2026 had several
other talks where the hard work was not in the merge but in writing
down what the data type was supposed to <em>mean</em> in the presence of
concurrent edits. Three I want to flag, because they suggest the
intent-formalisation problem has finally become the bottleneck
people are willing to talk about openly.</p>

<p><a href="https://doi.org/10.1145/3806077.3806695">AegisSheet</a> (Florian
Pfeil, David Scandurra, and Julian Haas, TU Darmstadt) studies
collaborative spreadsheets. Their starting move is honest empirical
work: they tabulate how Google Sheets and Notion behave under all
combinations of concurrent <code class="language-plaintext highlighter-rouge">EditCell</code> / <code class="language-plaintext highlighter-rouge">InsertR/C</code> / <code class="language-plaintext highlighter-rouge">RemoveR/C</code> /
<code class="language-plaintext highlighter-rouge">MoveR/C</code> and classify each outcome as <em>desirable</em>, <em>suboptimal</em>,
or <em>destructive</em>. Most of the cells in those tables are red. They
then design a compositional spreadsheet CRDT that turns the red
cells green. The CRDT is the easy half; the table of intended
semantics is the hard half, and they did it.</p>

<p><a href="https://doi.org/10.1145/3806077.3806691">ERA</a> (Kegan Dougal,
Element Creations) addresses the <em>duelling admins</em> problem in
group-management CRDTs in Matrix and Keyhive: two
equally-permissioned admins concurrently revoke each other’s
permissions. Whose revocation wins? Kleppmann’s PaPoC 2025
keynote suggested seniority ranking, but a Byzantine admin can
backdate events to roll back their own revocation. ERA proposes
<em>epoch events</em>: an external arbiter batches concurrent events into
epochs and imposes a bounded total order, which gives finality
without sacrificing availability. Like the spreadsheet case, the
contribution is mostly about specifying what <em>correct</em> looks like
under adversarial concurrency, with the data structure designed to
match.</p>

<p>The closing lightning talk by Florian Jacob, Johanna Stuber, and
Hannes Hartenstein (KIT) sketches a path to formal verification of
local-first access control. Same shape: write the intent down
precisely, then prove an implementation matches.</p>

<p>None of these are RDT-specific. Each is a different domain
(rich-text, spreadsheets, group membership, access control) where
the merges are easy and the intent is hard. SPOTs, oracles, and the multi-modal Lean stack are
all reactions to the same shift in where the difficulty lives.</p>

<h2 id="the-past-three-weeks">The past three weeks</h2>

<p>Since the Sal paper was finalised, the repo has had something like 200
commits, overwhelmingly mine, with Pranav having landed the
architectural heavy lifting earlier on. The headline change is in
coverage: 13 new CRDTs and 2 new MRDTs since the paper, taking the
suite from 13 RDTs to 28. Most of the additions are smaller scalar,
set, and map types (MAX-Register, MIN-Register, LWW-Register,
LWW-Map, MAX-Map, Grow-Only-Set, Grow-Only-Multiset,
LWW-Element-Set, Shopping-Cart, and a few more) that fill out
Tier-A and Tier-B coverage and serve as documentation that the
obvious read functions do the obvious thing. The three additions
that took real work, and that the rest of this section is about,
are the paper-only designs that finally got machine-checked Lean
ports.</p>

<p><strong>Peritext</strong> is the largest of the three: 1,316 lines of read-side, 571
of CRDT, 291 of a small DSL for the §3 worked examples, and 323 of
SPOTs. Eight intent-preservation theorems matching the paper’s worked
examples one-to-one, plus a <code class="language-plaintext highlighter-rouge">wf_afters</code> acyclicity invariant and a
<code class="language-plaintext highlighter-rouge">bold_expand_reach</code> predicate for Example 7. Roughly two days of
agent-paired work, mostly spent on the spec rather than on the proofs.</p>

<p>The <strong>Add-Wins Priority Queue</strong> of <a href="https://doi.org/10.1145/3609437.3609452">Zhang et al. (Internetware
2023)</a> is a 391-line CRDT
plus a 325-line read-side. The translation is interesting
in its own right. The paper has a per-record <code class="language-plaintext highlighter-rouge">count</code> field that ties
commutativity in knots; in Sal we flatten it into a separate <code class="language-plaintext highlighter-rouge">I</code>
component for increment records and use a snapshot-in-op-payload trick
(the <code class="language-plaintext highlighter-rouge">Rmv</code> op carries the observed-set as a parameter), which gives us
<code class="language-plaintext highlighter-rouge">rc := Either</code> everywhere and lets the RA-linearizability VCs drop out
without any SMT.</p>

<p>The <strong>Bounded Counter</strong> of <a href="https://doi.org/10.1109/SRDS.2015.32">Balegas et al. (SRDS
2015)</a>, in the <a href="https://www.bartoszsypytkowski.com/state-based-crdts-bounded-counter/">state-based
formulation</a>
that Sypytkowski wrote up in 2019, is a 465-line CRDT that is structurally a
PN-Counter plus a transfer matrix. The 24 VCs are
trivial. The bound itself is <em>not</em> part of the verified model: it is
enforced operationally at the client boundary (a replica refuses to emit
a <code class="language-plaintext highlighter-rouge">Dec</code> that would push its quota negative), and that part is currently
unverified. The headline number alone might suggest more than is there,
so it is worth calling out.</p>
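
<p>To make the split between the verified lattice and the unverified client
check concrete, here is a minimal OCaml sketch of a state-based bounded
counter. It is not the Lean port: the record fields and function names are
hypothetical, chosen for illustration, and amounts are assumed to be
non-negative. <code class="language-plaintext highlighter-rouge">merge</code> is the pointwise-max lattice part, while
<code class="language-plaintext highlighter-rouge">decrement</code> and <code class="language-plaintext highlighter-rouge">transfer</code> carry the quota check that sits at the
client boundary.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical sketch; names are illustrative, not taken from Sal. *)
type t = {
  inc : int array;         (* inc.(i): total incremented at replica i *)
  dec : int array;         (* dec.(i): total decremented at replica i *)
  xfer : int array array;  (* xfer.(i).(j): quota moved from i to j *)
}

let init n =
  { inc = Array.make n 0; dec = Array.make n 0; xfer = Array.make_matrix n n 0 }

let sum = Array.fold_left ( + ) 0

(* Lattice part: pointwise max on every component. *)
let merge a b = {
  inc = Array.map2 max a.inc b.inc;
  dec = Array.map2 max a.dec b.dec;
  xfer = Array.map2 (Array.map2 max) a.xfer b.xfer;
}

(* The counter's value ignores transfers: they only move quota around. *)
let value s = sum s.inc - sum s.dec

(* Quota that replica [i] may still spend locally. *)
let quota s i =
  let received = sum (Array.map (fun row -&gt; row.(i)) s.xfer) in
  let sent = sum s.xfer.(i) in
  s.inc.(i) - s.dec.(i) + received - sent

let copy s =
  { inc = Array.copy s.inc; dec = Array.copy s.dec;
    xfer = Array.map Array.copy s.xfer }

let increment i n s =
  let s = copy s in
  s.inc.(i) &lt;- s.inc.(i) + n;
  s

(* Client-boundary check: refuse a decrement that would overdraw the quota. *)
let decrement i n s =
  if n &gt; quota s i then None
  else begin
    let s = copy s in
    s.dec.(i) &lt;- s.dec.(i) + n;
    Some s
  end

(* Moving quota to another replica is subject to the same check. *)
let transfer ~src ~dst n s =
  if n &gt; quota s src then None
  else begin
    let s = copy s in
    s.xfer.(src).(dst) &lt;- s.xfer.(src).(dst) + n;
    Some s
  end

let () =
  let s = init 2 |&gt; increment 0 5 in
  assert (decrement 1 1 s = None);  (* replica 1 holds no quota yet *)
  match transfer ~src:0 ~dst:1 2 s with
  | None -&gt; assert false
  | Some s -&gt;
    assert (quota s 0 = 3);
    assert (quota s 1 = 2);
    (match decrement 1 2 s with
     | None -&gt; assert false
     | Some s -&gt; assert (value s = 3))
</code></pre></div></div>

<p>In this sketch nothing in <code class="language-plaintext highlighter-rouge">merge</code> knows about the bound. The invariant
holds only because every replica goes through the checked operations, which
is the part that remains outside the verified model.</p>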

<p>A separate cleanup is worth noting. About 85 Blaster-admits across the
suite closed via a single pattern, <em>per-component decomposition</em>:
instead of one monolithic SMT call on the full state, split the state
into components, prove each component independently, then combine. The
OR-Set MRDT alone went from 20 VCs trusting Z3 down to 3 (closing 17
in a single commit). The pattern was human-found; the agent applied it
consistently across files.</p>
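
<p>As a rough executable analogy for why the decomposition helps, the OCaml
sketch below uses brute-force enumeration in place of the SMT calls. The
two-component state and all names are hypothetical, not taken from Sal. The
monolithic check enumerates every pair of full states, while the
per-component check enumerates each component separately and relies on the
merge acting on each component independently.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical sketch: exhaustive checks stand in for SMT queries. *)

(* Does [merge] commute on every pair drawn from [dom]? *)
let commutes merge dom =
  List.for_all
    (fun a -&gt; List.for_all (fun b -&gt; merge a b = merge b a) dom)
    dom

(* Two small component domains. *)
let ints = List.init 8 Fun.id
let bools = [ false; true ]

(* Component merges: max on integers, disjunction on flags. *)
let merge_max = max
let merge_or = ( || )

(* The full state is a pair, merged one component at a time. *)
let merge_pair (a1, b1) (a2, b2) = (merge_max a1 a2, merge_or b1 b2)

(* All 16 full states. *)
let pairs = List.concat_map (fun a -&gt; List.map (fun b -&gt; (a, b)) bools) ints

let () =
  (* Monolithic: 16 * 16 = 256 pairs of full states. *)
  assert (commutes merge_pair pairs);
  (* Per-component: 8 * 8 + 2 * 2 = 68 cases in total. *)
  assert (commutes merge_max ints);
  assert (commutes merge_or bools)
</code></pre></div></div>

<p>The gap widens multiplicatively as components are added, which is one
plausible reading of why a single solver call on the full state struggles
where several small ones do not.</p>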

<p>The qualifier from the Peritext section applies to all three of these
ports as well: the RA-linearizability VCs are easy, and the work that
took time was the spec.</p>

<p>For readers who want to play with the data types directly,
<a href="https://fplaunchpad.org/sal">fplaunchpad.org/sal</a> hosts an
interactive playground for every RDT in the suite. The CRDT pages let
you drive two replicas in parallel and merge one into the other;
toggling “show concrete state” exposes the lattice that the
convergence proofs are about. The MRDT pages render the operation
history as a git-style commit DAG and let you do three-way merges
over the lowest common ancestor. None of this is load-bearing for
the verification story, but it is a more direct way to see what the
data types actually do than reading the proofs.</p>
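
<p>For readers who have not met the two merge styles before, here is a
minimal OCaml sketch of the difference, using counters. The module and
function names are hypothetical and this is not the playground’s code: the
CRDT merges two states with a lattice join, while the MRDT merges two
versions against their lowest common ancestor.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical sketch; not taken from the Sal repository. *)

(* State-based CRDT: a grow-only counter with one slot per replica.
   Merge is the pointwise maximum, i.e. the join of the lattice. *)
module G_counter = struct
  type t = int array

  let init n : t = Array.make n 0

  let incr (rid : int) (s : t) : t =
    let s' = Array.copy s in
    s'.(rid) &lt;- s'.(rid) + 1;
    s'

  let merge (a : t) (b : t) : t = Array.map2 max a b
  let read (s : t) : int = Array.fold_left ( + ) 0 s
end

(* MRDT: an integer counter merged three ways. Each branch contributes
   its delta relative to the lowest common ancestor. *)
module Mrdt_counter = struct
  type t = int

  let merge ~(lca : t) (a : t) (b : t) : t = a + b - lca
end

let () =
  let open G_counter in
  let a = init 2 |&gt; incr 0 |&gt; incr 0 in  (* replica 0 increments twice *)
  let b = init 2 |&gt; incr 1 in            (* replica 1 increments once *)
  assert (read (merge a b) = 3);
  assert (merge a b = merge b a);
  (* Branches at 12 and 11 forked from 10: deltas +2 and +1. *)
  assert (Mrdt_counter.merge ~lca:10 12 11 = 13)
</code></pre></div></div>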

<h2 id="closing">Closing</h2>

<p>Agents have made proof-engineering noticeably cheaper. Spec engineering
still sits with the human author, and SPOTs, executable oracles, and the
multi-modal Lean stack each approach that from a different direction.
None of them is sufficient on its own; in combination they look like a
workable methodology.</p>

<p>The slides are <a href="/slides/RDT_verification_papoc_2026.pdf">here</a>, the repo
is at <a href="https://github.com/fplaunchpad/sal">https://github.com/fplaunchpad/sal</a>, and the question I closed
the talk with was: <em>what will you prove next?</em></p>]]></content><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><category term="Verification" /><category term="RDTs" /><category term="Lean" /><summary type="html"><![CDATA[What does it mean for a replicated data type to be correct? For most of the literature, my own prior work included, the answer has been convergence: two replicas that have applied the same operations end up in the same state. I argued in my PaPoC 2026 keynote last week that for many useful data types convergence is not enough, and agentic proof-oriented programming can help close the gap between convergence and confidence.]]></summary></entry><entry><title type="html">Foundations for hacking on OCaml</title><link href="https://kcsrk.info/ocaml/2025/11/10/hacking/" rel="alternate" type="text/html" title="Foundations for hacking on OCaml" /><published>2025-11-10T10:35:00+00:00</published><updated>2025-11-10T10:35:00+00:00</updated><id>https://kcsrk.info/ocaml/2025/11/10/hacking</id><content type="html" xml:base="https://kcsrk.info/ocaml/2025/11/10/hacking/"><![CDATA[<p>How do you acquire the fundamental computer skills to hack on a complex
systems project like OCaml? What’s missing and how do you go about
bridging the gap?</p>

<!--more-->

<p>There are many fundamental systems skills that go into working on a
language like OCaml that only come with soaking in systems programming. By
systems programming, I mean the ability to use tools like the command-line,
editors, version control, build systems, compilers, debuggers, bash scripting,
and so on.  This is often something that one takes for granted when working on
such projects, but is often inscrutable for new contributors, who may not have
had the opportunity to develop these skills.</p>

<p>I struggle with this in my own research group. Students approach me to work on
the OCaml compiler because they have studied OS, Compilers and Computer
Architecture in class. But once they understand that working on OCaml involves
actually hacking on systems, they are often lost. How do you build the compiler
from source? How do you manage your changes? Do I have to build the entire
compiler if I make a small change in the runtime system? The compiler crashes
with a segfault – how do I debug it? Worse, the students do not even know what
questions to ask, and come back with “This is all new to me, I don’t know where
to begin. ChatGPT doesn’t help.”</p>

<p>The CS education in India often lacks a focus on these practical systems skills,
which can make it challenging for new contributors to get involved in systems
programming.  Looking at my own past, my undergraduate CS education, like many
others in India (and potentially elsewhere), had mandatory OS and Compiler
Construction courses. But neither had a dedicated lab component. It is natural
that these theoretical courses do not prepare the students for the practical
aspects of systems programming.</p>

<p>I was privileged to have a computer at my school, an IBM PC AT Model 5170 and
later an IBM PC 340, and surprisingly, had an education where I got to do
programming from a very young age. There was lots of BASIC programming but also
just tinkering with the system, learning how to use DOS, and later Windows 3.1,
95, and of course playing games (Doom and Prince of Persia, mostly). This early
exposure to computers and systems programming gave me a head start. Many
students, especially those from less privileged backgrounds, do not have this
early exposure. They may have learned some programming, but not had the time to
tinker with systems for extended periods of time.</p>

<p>This challenge of bridging the gap between theoretical CS education and
practical systems programming skills is a common one faced by professors working
in the broad systems area. The problem is compounded by the fact that these
skills are difficult to teach in a traditional classroom setting—they require
hands-on experience, experimentation, and often many hours of frustration and
debugging. These are skills that come from doing, not from reading or watching
lectures. I would be curious to hear from others about their experiences and how
they have addressed this challenge.</p>

<p>That said, there are resources available online that can help new contributors
acquire these skills. This list is biased towards the areas of the compiler that I
work on. I mainly work on the backend and the runtime system. The only reason I
usually touch the frontend is to lower the features that I care about to the
backend. Here are some I have found useful for working on the OCaml compiler:</p>

<ul>
  <li>Systems programming
    <ul>
      <li><a href="https://missing.csail.mit.edu/">Course: MIT Missing Semester</a>: This is a
fantastic resource that covers a wide range of topics related to systems
programming, including command-line tools, version control, editors, and
more. The course is available online for free and includes video lectures,
notes, and exercises. I encourage you to read the <a href="https://missing.csail.mit.edu/motivation.html">motivation for this
course</a>.</li>
      <li><a href="https://cs45.stanford.edu/">Course: Stanford CS45</a>: CS45 is an extended version
of the MIT course, and delves into the topics in more detail.</li>
      <li><a href="https://www.youtube.com/watch?v=PorfLSr3DDI">Video: CppCon 2015: Greg Law “Give me 15 minutes &amp; I’ll change your view of
GDB”</a>: The talk explores GDB’s
less-known features and sheds light on some advanced debugging techniques.</li>
      <li><a href="https://rr-project.org/">Tool: rr - Lightweight Recording and Deterministic Debugging</a>:
rr is a powerful tool for recording and replaying program execution, which
can be invaluable for debugging complex issues in systems programming. I’ve
stopped using <code class="language-plaintext highlighter-rouge">gdb</code> directly for anything non-trivial and have switched to
<code class="language-plaintext highlighter-rouge">rr</code>.</li>
    </ul>
  </li>
  <li>OCaml
    <ul>
      <li><a href="https://github.com/fplaunchpad/cs3100_m20">Course: CS3100 Paradigms of Programming</a>:
The course covers a significant chunk of the OCaml language. You should be able
to self-study the course to get a good understanding of the language. That said,
the course deliberately does not cover the build system (dune), package manager
(opam), command-line tools for the compiler (ocamlc, ocamlopt), editor
integration (merlin, ocaml-lsp, ocamlformat), etc.</li>
      <li><a href="https://realworldocaml.org/">Book: Real World OCaml</a>: The book has a section on the
compiler and the runtime system, which gives a great overview of the memory
representation, garbage collection, and other aspects of the runtime system.</li>
    </ul>
  </li>
  <li>Diving deeper
    <ul>
      <li><a href="https://www.brendangregg.com/blog/2020-07-15/systems-performance-2nd-edition.html">Book: Systems Performance: Enterprise and the Cloud, 2nd Edition</a>:
This book provides an in-depth look at systems performance, covering topics
such as CPU architecture, memory hierarchy, storage systems, and networking.
It is a valuable resource for understanding the underlying principles of
systems programming and performance optimization.</li>
      <li><a href="https://gchandbook.org/">Book: The Garbage Collection Handbook</a>: This book
offers a comprehensive overview of garbage collection techniques, algorithms,
and implementations. It is an essential resource for understanding memory
management in programming languages like OCaml.</li>
      <li><a href="https://www.elsevier.com/books/the-art-of-multiprocessor-programming/herlihy/978-0-12-397337-5">Book: The Art of Multiprocessor Programming</a>:
This book provides a deep dive into concurrent programming and
synchronization techniques, which are crucial for understanding
multi-threaded runtime systems like OCaml 5’s multicore runtime and the
programming model.</li>
    </ul>
  </li>
</ul>

<p> </p>

<p>I will probably keep editing this post as I find more resources. If you have
suggestions for other useful resources or experiences to share, please feel free
to reach out to me.</p>]]></content><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><category term="OCaml" /><summary type="html"><![CDATA[How do you acquire the fundamental computer skills to hack on a complex systems project like OCaml? What’s missing and how do you go about bridging the gap?]]></summary></entry><entry><title type="html">Testing x-ocaml, OCaml notebooks as a WebComponent</title><link href="https://kcsrk.info/ocaml/x-ocaml/blogging/2025/06/20/xocaml/" rel="alternate" type="text/html" title="Testing x-ocaml, OCaml notebooks as a WebComponent" /><published>2025-06-20T10:00:00+00:00</published><updated>2025-06-20T10:00:00+00:00</updated><id>https://kcsrk.info/ocaml/x-ocaml/blogging/2025/06/20/xocaml</id><content type="html" xml:base="https://kcsrk.info/ocaml/x-ocaml/blogging/2025/06/20/xocaml/"><![CDATA[<p>Can we have OCaml notebooks as pure client-side code? Can these notebooks have
rich editor support (highlighting, formatting, types on hover, autocompletion,
inline diagnostics, etc.)? Can you take packages from OPAM and use them in these
notebooks?</p>

<p>The answer to all of these turns out to be a resounding yes thanks to
<a href="https://github.com/art-w/x-ocaml">x-ocaml</a>. This post is my experiment playing
with <code class="language-plaintext highlighter-rouge">x-ocaml</code> and integrating that into this blog.</p>

<!--more-->

<p>The most wonderful thing about programming is that it lets you experiment
freely. You can try out an idea, get instant feedback, and learn by doing—much
like playing with Lego bricks or sketching on a canvas. The two main courses
that I teach at IITM, <a href="https://github.com/fplaunchpad/cs3100_m20">CS3100</a> and
<a href="https://github.com/fplaunchpad/cs6225_s25_iitm/">CS6225</a>, both involve me
live-coding during every lecture. However, blogging about OCaml where the code
is static and non-interactive always felt a bit unsatisfying.</p>

<p>Enter <a href="https://github.com/art-w/x-ocaml">x-ocaml</a>, which provides a way to
embed OCaml notebooks into any webpage thanks to WebComponents. All you need to
do is load some JavaScript in your webpage, and you can start embedding code
cells using the <code class="language-plaintext highlighter-rouge">&lt;x-ocaml&gt;</code> tag. The snippet below:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;x-ocaml&gt;
print_endline "Hello, world"
&lt;/x-ocaml&gt;
</code></pre></div></div>

<p>renders to:</p>

<x-ocaml>
print_endline "Hello, world"
</x-ocaml>

<p>The code is interpreted in the browser thanks to the OCaml interpreter compiled
to JavaScript through the
<a href="https://ocsigen.org/js_of_ocaml/latest/manual/overview">Js_of_ocaml</a> compiler.
There is also support for <a href="https://github.com/ocaml/merlin">Merlin</a> and
<a href="https://github.com/ocaml-ppx/ocamlformat">OCamlformat</a> in the code editor. Try
hovering over the functions and writing some code. You should see inferred types
and auto-completion suggestions. It turns out that this solution integrates well
with Jekyll, which is what I use for this blog.</p>

<h2 id="reverse-mode-ad-using-effects">Reverse-mode AD using Effects</h2>

<p>Js_of_ocaml also supports <a href="https://ocsigen.org/js_of_ocaml/latest/manual/effects">effect
handlers</a>. Here’s
<a href="https://github.com/ocaml-multicore/effects-examples/blob/master/algorithmic_differentiation.ml">reverse-mode algorithmic
differentiation</a>,
implemented using effect handlers and running in the browser.</p>

<x-ocaml>
open Effect
open Effect.Deep

module F : sig
  type t

  val mk : float -&gt; t
  val ( +. ) : t -&gt; t -&gt; t
  val ( *. ) : t -&gt; t -&gt; t
  val grad : (t -&gt; t) -&gt; float -&gt; float
  val grad2 : (t * t -&gt; t) -&gt; float * float -&gt; float * float
end = struct
  type t = { v : float; mutable d : float }

  let mk v = { v; d = 0.0 }

  type _ eff += Add : t * t -&gt; t eff
  type _ eff += Mult : t * t -&gt; t eff

  let run f =
    ignore (match f () with
      | r -&gt; r.d &lt;- 1.0; r;
      | effect (Add(a,b)), k -&gt;
          let x = {v = a.v +. b.v; d = 0.0} in
          ignore (continue k x);
          a.d &lt;- a.d +. x.d;
          b.d &lt;- b.d +. x.d;
          x
      | effect (Mult(a,b)), k -&gt;
          let x = {v = a.v *. b.v; d = 0.0} in
          ignore (continue k x);
          a.d &lt;- a.d +. (b.v *. x.d);
          b.d &lt;- b.d +. (a.v *. x.d);
          x)

  let grad f x =
    let x = mk x in
    run (fun () -&gt; f x);
    x.d

  let grad2 f (x, y) =
    let x, y = (mk x, mk y) in
    run (fun () -&gt; f (x, y));
    (x.d, y.d)

  let ( +. ) a b = perform (Add (a, b))
  let ( *. ) a b = perform (Mult (a, b))
end
</x-ocaml>

<p>Here are some tests.</p>

<x-ocaml>

(* XXX KC: `assert` from standard library doesn't work. Why? *)
let assert' c = 
  if not c then raise (Failure "assertion failed!")

let test1 =
  (* f = x + x^3 =&gt;
     df/dx = 1 + 3 * x^2 *)
  for x = 0 to 10 do
    let x = float_of_int x in
    let d1 = F.(grad (fun x -&gt; x +. (x *. x *. x)) x) in
    let d2 = 1.0 +. (3.0 *. x *. x) in
    Printf.printf "%f %f\n" d1 d2;
    assert' ( d1 = d2 )
  done

let test2 = 
  (* f = x^2 + x^3 =&gt;
     df/dx = 2*x + 3 * x^2 *)
  for x = 0 to 10 do
    let x = float_of_int x in
    assert' (
      F.(grad (fun x -&gt; (x *. x) +. (x *. x *. x)) x)
      = (2.0 *. x) +. (3.0 *. x *. x))
  done

let test3 =
  (* f = x^2 * y^4 =&gt;
     df/dx = 2 * x * y^4
     df/dy = 4 * x^2 * y^3 *)
  for x = 0 to 10 do
    for y = 0 to 10 do
      let x = float_of_int x in
      let y = float_of_int y in
      assert' (
        F.(grad2 (fun (x, y) -&gt; x *. x *. y *. y *. y *. y) (x, y))
        = (2.0 *. x *. y *. y *. y *. y, 4.0 *. x *. x *. y *. y *. y))
    done
  done
</x-ocaml>

<h2 id="using-other-libraries">Using other libraries</h2>

<p><code class="language-plaintext highlighter-rouge">x-ocaml</code> also supports loading any js_of_ocaml compatible library into the
webpage. Let’s use <a href="https://github.com/mirage/digestif"><code class="language-plaintext highlighter-rouge">digestif</code></a>.</p>

<p>For any library that you want to export, install the library using opam.
<code class="language-plaintext highlighter-rouge">x-ocaml</code> provides a command-line utility to export the library.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>x-ocaml <span class="nt">--effects</span> digestif.ocaml <span class="nt">-o</span> digestif.js
</code></pre></div></div>

<p>This produces the JavaScript artifact that can be used in the webpage. It may be
instructive to look at the
<a href="https://github.com/kayceesrk/kayceesrk.github.io/blame/54ef5eea28c660aa0d8b3cd2e32d8e93d713ab19/_posts/2025-06-20-xocaml.md">source</a>
of this post to see how the compiler and the libraries are integrated into this
blog post. There is a little script at the top of the file:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="o">&lt;</span><span class="nx">script</span> <span class="k">async</span>
  <span class="nx">src</span><span class="o">=</span><span class="dl">"</span><span class="s2">{{ '/assets/x-ocaml.js' | absolute_url }}</span><span class="dl">"</span>
  <span class="nx">src</span><span class="o">-</span><span class="nx">worker</span><span class="o">=</span><span class="dl">"</span><span class="s2">{{ '/assets/x-ocaml.worker+effects.js' | absolute_url }}</span><span class="dl">"</span>
  <span class="nx">src</span><span class="o">-</span><span class="nx">load</span><span class="o">=</span><span class="dl">"</span><span class="s2">{{ '/assets/digestif.js' | absolute_url }}</span><span class="dl">"</span>
<span class="o">&gt;&lt;</span><span class="sr">/script</span><span class="err">&gt;
</span>
</code></pre></div></div>

<x-ocaml>
let hash = Digestif.MD5.(digest_string "hello" |&gt; to_hex)
</x-ocaml>

<h2 id="what-next">What next?</h2>

<p>There are a number of rough edges in <code class="language-plaintext highlighter-rouge">x-ocaml</code>. This is expected since this
project appears to be one of <a href="https://github.com/art-w">Arthur’s</a> hacking
expeditions (which, as usual, is pushing the state of the art forward).</p>

<p>It would be fun to use this for teaching
<a href="https://github.com/fplaunchpad/cs3100_m20">CS3100</a> and also
<a href="https://github.com/fplaunchpad/learn-ocaml-workshop-2024">other</a>
<a href="https://github.com/ocaml-multicore/ocaml-effects-tutorial">OCaml</a>
<a href="https://github.com/ocaml-multicore/parallel-programming-in-multicore-ocaml">tutorials</a>.
Perhaps we could even have an interactive version of the
<a href="https://dev.realworldocaml.org/">Real World OCaml</a> book.</p>

<p>Not all OCaml libraries can be compiled to JavaScript. The common reason is
that they depend on features not available in JavaScript. In writing this
post, I tried unsuccessfully for a long time to get
<a href="https://github.com/mirage/mirage-crypto/"><code class="language-plaintext highlighter-rouge">mirage-crypto</code></a> working.
<code class="language-plaintext highlighter-rouge">mirage-crypto</code> has a <a href="https://github.com/mirage/mirage-crypto/tree/main/src/native">large C
dependency</a>, which
does not work with Js_of_ocaml. Js_of_ocaml promises to take any opam library
installed on your opam switch and compile it to JavaScript. However, at that
point, we’re really cross compiling the opam packages installed on your switch to
JavaScript since the installed package may make some assumptions about the
platform that it is supposed to run on. Hence, JavaScript compilation of
arbitrary OCaml packages is unlikely to work in the general case. Unfortunately,
the error was difficult to debug since the failure was at runtime, and was not
apparent in the error messages (at least for me, who has little JavaScript
experience). It would be nice to have the opam packages explicitly say whether
they are JavaScript compatible, and have build tooling that reports errors like
these early.</p>]]></content><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><category term="OCaml" /><category term="X-OCaml" /><category term="Blogging" /><summary type="html"><![CDATA[Can we have OCaml notebooks as pure client-side code? Can these notebooks have rich editor support (highlighting, formatting, types on hover, autocompletion, inline diagnostics, etc.)? Can you take packages from OPAM and use them in these notebooks? The answer to all of these turns out to be a resounding yes thanks for x-ocaml. This post is my experiment playing with x-ocaml and integrating that into this blog.]]></summary></entry><entry><title type="html">Linearity and uniqueness</title><link href="https://kcsrk.info/ocaml/modes/oxcaml/2025/06/04/linearity_and_uniqueness/" rel="alternate" type="text/html" title="Linearity and uniqueness" /><published>2025-06-04T10:00:00+00:00</published><updated>2025-06-04T10:00:00+00:00</updated><id>https://kcsrk.info/ocaml/modes/oxcaml/2025/06/04/linearity_and_uniqueness</id><content type="html" xml:base="https://kcsrk.info/ocaml/modes/oxcaml/2025/06/04/linearity_and_uniqueness/"><![CDATA[<p>In the <a href="/ocaml/modes/oxcaml/2025/05/29/uniqueness_and_behavioural_types/">last post</a>,
we looked at <em>uniqueness</em> mode and how uniqueness may be used to optimise. As we
will see, uniqueness alone is insufficient in practice, and we also need a
concept of <em>linearity</em> for uniqueness to be useful.</p>

<!--more-->

<h2 id="capturing-unique-values">Capturing unique values</h2>

<p>Let’s start with an example. Recall the signature of the unique reference
module.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="k">type</span> <span class="nc">Unique_ref</span> <span class="o">=</span> <span class="k">sig</span>
  <span class="k">type</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span>
  <span class="k">val</span> <span class="n">alloc</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span>
  <span class="k">val</span> <span class="n">free</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="kt">unit</span>
  <span class="k">val</span> <span class="n">get</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="nn">Modes</span><span class="p">.</span><span class="nn">Aliased</span><span class="p">.</span><span class="n">t</span> <span class="o">*</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span>
  <span class="k">val</span> <span class="n">set</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Assume that we also have an implementation of the module:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">Unique_ref</span> <span class="o">:</span> <span class="nc">Unique_ref</span>
</code></pre></div></div>

<p>Consider the following example, which works fine:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">works</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">t</span> <span class="o">=</span> <span class="n">alloc</span> <span class="mi">42</span> <span class="k">in</span> <span class="c">(* Allocate a unique reference *)</span>
  <span class="n">free</span> <span class="n">t</span> <span class="c">(* free it *)</span>
</code></pre></div></div>

<p>Now consider this modified example:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">wat</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">t</span> <span class="o">=</span> <span class="n">alloc</span> <span class="mi">42</span> <span class="k">in</span> <span class="c">(* Allocate a unique reference *)</span>
  <span class="k">let</span> <span class="n">f</span> <span class="bp">()</span> <span class="o">=</span> <span class="n">free</span> <span class="n">t</span> <span class="k">in</span> <span class="c">(* capture free in a closure *)</span>
  <span class="n">f</span> <span class="bp">()</span><span class="p">;</span> <span class="c">(* free it *)</span>
  <span class="n">f</span> <span class="bp">()</span> <span class="c">(* free it again??? *)</span>
</code></pre></div></div>

<p>Observe that <code class="language-plaintext highlighter-rouge">f</code> has captured <code class="language-plaintext highlighter-rouge">t</code> in the closure, and frees <code class="language-plaintext highlighter-rouge">t</code> when called. It
should be clear that calling <code class="language-plaintext highlighter-rouge">f</code> <em>more than once</em> is bad, since it leads to a
double free! What property do we want of <code class="language-plaintext highlighter-rouge">f</code>? Uniqueness is insufficient;
we have a unique reference to <code class="language-plaintext highlighter-rouge">f</code> in this program, with which we call <code class="language-plaintext highlighter-rouge">f</code> twice.</p>

<p>What we want to enforce is that <code class="language-plaintext highlighter-rouge">f</code> can be called <em>at most once</em>. The compiler
has a <em>linearity</em> mode which captures the idea of how many times a value can be
used. We have two modes on the linearity axis: <code class="language-plaintext highlighter-rouge">once</code>, which stands for
“at most once”, and <code class="language-plaintext highlighter-rouge">many</code> (the default for all values), which allows values
to be used an arbitrary number of times.</p>

<p>Whenever a unique value is captured by a closure, the closure gets the <code class="language-plaintext highlighter-rouge">once</code>
mode, which allows the closure to be called at most once. The <code class="language-plaintext highlighter-rouge">wat</code> function above
is therefore rightly rejected by the compiler:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nc">File</span> <span class="s2">"./unique_ref.ml"</span><span class="o">,</span> <span class="n">line</span> <span class="mi">32</span><span class="o">,</span> <span class="n">characters</span> <span class="mi">2</span><span class="o">-</span><span class="mi">3</span><span class="o">:</span>
<span class="mi">32</span> <span class="o">|</span>   <span class="n">f</span> <span class="bp">()</span> <span class="c">(* free it again??? *)</span>
       <span class="o">^</span>
<span class="nc">Error</span><span class="o">:</span> <span class="nc">This</span> <span class="n">value</span> <span class="n">is</span> <span class="n">used</span> <span class="n">here</span><span class="o">,</span>
       <span class="n">but</span> <span class="n">it</span> <span class="n">is</span> <span class="n">defined</span> <span class="k">as</span> <span class="n">once</span> <span class="ow">and</span> <span class="n">has</span> <span class="n">already</span> <span class="n">been</span> <span class="n">used</span><span class="o">:</span>
<span class="nc">File</span> <span class="s2">"./unique_ref.ml"</span><span class="o">,</span> <span class="n">line</span> <span class="mi">31</span><span class="o">,</span> <span class="n">characters</span> <span class="mi">2</span><span class="o">-</span><span class="mi">3</span><span class="o">:</span>
<span class="mi">31</span> <span class="o">|</span>   <span class="n">f</span> <span class="bp">()</span><span class="p">;</span> <span class="c">(* free it *)</span>
       <span class="o">^</span>
</code></pre></div></div>
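
<p>For contrast, stock OCaml without modes can only enforce the “at most once”
property at run time. The following is a minimal sketch of such a dynamic check.
The <code class="language-plaintext highlighter-rouge">call_once</code> helper is hypothetical (it is not part of OxCaml or the standard
library), and the counter merely stands in for <code class="language-plaintext highlighter-rouge">free</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical helper: wrap [f] so that a second call raises at run time
   instead of running [f] again. *)
let call_once (f : unit -&gt; 'a) : unit -&gt; 'a =
  let used = ref false in
  fun () -&gt;
    if !used then invalid_arg "call_once: closure already called";
    used := true;
    f ()

let () =
  let frees = ref 0 in                      (* counts simulated frees *)
  let f = call_once (fun () -&gt; incr frees) in
  f ();                                     (* first call runs the body *)
  (match f () with                          (* second call is refused *)
   | () -&gt; assert false
   | exception Invalid_argument _ -&gt; ());
  assert (!frees = 1)
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">once</code> mode moves this check to compile time, so neither the flag nor the
exception is needed.</p>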

<h2 id="a-linear-ref">A linear ref</h2>

<p>Now, one might wonder whether the unique reference that we’ve implemented could
instead be implemented with the linearity mode. The answer is yes.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="k">type</span> <span class="nc">Linear_ref</span> <span class="o">=</span> <span class="k">sig</span>
  <span class="k">type</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span>
  <span class="k">val</span> <span class="n">alloc</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">once</span>
  <span class="k">val</span> <span class="n">free</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">once</span> <span class="o">-&gt;</span> <span class="kt">unit</span>
  <span class="k">val</span> <span class="n">get</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">once</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">*</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">once</span>
  <span class="k">val</span> <span class="n">set</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">once</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">once</span>
<span class="k">end</span>

<span class="k">module</span> <span class="nc">Linear_ref</span> <span class="o">:</span> <span class="nc">Linear_ref</span> <span class="o">=</span> <span class="k">struct</span>
  <span class="k">type</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">=</span> <span class="p">{</span> <span class="k">mutable</span> <span class="n">value</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="p">}</span>
  <span class="k">let</span> <span class="n">alloc</span> <span class="n">x</span> <span class="o">=</span> <span class="p">{</span> <span class="n">value</span> <span class="o">=</span> <span class="n">x</span> <span class="p">}</span>
  <span class="k">let</span> <span class="n">free</span> <span class="n">t</span> <span class="o">=</span> <span class="bp">()</span>
  <span class="k">let</span> <span class="n">get</span> <span class="n">t</span> <span class="o">=</span>
    <span class="n">t</span><span class="o">.</span><span class="n">value</span><span class="o">,</span> <span class="n">t</span>
  <span class="k">let</span> <span class="n">set</span> <span class="n">t</span> <span class="n">x</span> <span class="o">=</span>
    <span class="n">t</span><span class="o">.</span><span class="n">value</span> <span class="o">&lt;-</span> <span class="n">x</span><span class="p">;</span>
    <span class="n">t</span>
<span class="k">end</span>
</code></pre></div></div>

<p>This works as expected:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">open</span> <span class="nc">Linear_ref</span>

<span class="k">let</span> <span class="n">works</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">r</span> <span class="o">=</span> <span class="n">alloc</span> <span class="mi">42</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">v</span><span class="o">,</span><span class="n">r</span> <span class="o">=</span> <span class="n">get</span> <span class="n">r</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">r</span> <span class="o">=</span> <span class="n">set</span> <span class="n">r</span> <span class="p">(</span><span class="n">v</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">v</span><span class="o">,</span><span class="n">r</span> <span class="o">=</span> <span class="n">get</span> <span class="n">r</span> <span class="k">in</span>
  <span class="n">print_int</span> <span class="n">v</span><span class="p">;</span>
  <span class="n">free</span> <span class="n">r</span><span class="p">;</span>
  <span class="bp">()</span>

<span class="k">let</span> <span class="n">fails</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">r</span> <span class="o">=</span> <span class="n">alloc</span> <span class="mi">42</span> <span class="k">in</span>
  <span class="n">free</span> <span class="n">r</span><span class="p">;</span>
  <span class="n">get</span> <span class="n">r</span> <span class="c">(* fails here *)</span>
</code></pre></div></div>

<p>The compiler rejects <code class="language-plaintext highlighter-rouge">fails</code> with the error message:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nc">File</span> <span class="s2">"./linear_ref.ml"</span><span class="o">,</span> <span class="n">line</span> <span class="mi">34</span><span class="o">,</span> <span class="n">characters</span> <span class="mi">6</span><span class="o">-</span><span class="mi">7</span><span class="o">:</span>
<span class="mi">34</span> <span class="o">|</span>   <span class="n">get</span> <span class="n">r</span> <span class="c">(* fails here *)</span>
           <span class="o">^</span>
<span class="nc">Error</span><span class="o">:</span> <span class="nc">This</span> <span class="n">value</span> <span class="n">is</span> <span class="n">used</span> <span class="n">here</span><span class="o">,</span>
       <span class="n">but</span> <span class="n">it</span> <span class="n">is</span> <span class="n">defined</span> <span class="k">as</span> <span class="n">once</span> <span class="ow">and</span> <span class="n">has</span> <span class="n">already</span> <span class="n">been</span> <span class="n">used</span><span class="o">:</span>
<span class="nc">File</span> <span class="s2">"./linear_ref.ml"</span><span class="o">,</span> <span class="n">line</span> <span class="mi">33</span><span class="o">,</span> <span class="n">characters</span> <span class="mi">7</span><span class="o">-</span><span class="mi">8</span><span class="o">:</span>
<span class="mi">33</span> <span class="o">|</span>   <span class="n">free</span> <span class="n">r</span><span class="p">;</span>
</code></pre></div></div>

<h2 id="why-both-linearity-and-uniqueness">Why both linearity and uniqueness?</h2>

<p>Given this example, you might be wondering: if the <em>safe</em> reference may be
implemented equivalently using either uniqueness or linearity, why do we need
both? Obviously, there’s something interesting going on where unique values
captured in a closure need linearity. Does that mean linearity is sufficient?</p>

<p>It turns out that only recently was the relationship between the two formally
studied in the same type system. While linear types and uniqueness types have a
long history of being studied independently, Marshall et al. in their paper,
<a href="https://starsandspira.ls/docs/esop22-draft.pdf">“Linearity and Uniqueness: An Entente
Cordiale”</a>, present the ideas in
the same type system. They provide some key insights.</p>

<p>The first insight is that</p>

<blockquote>
  <p>in a setting where all values must be linear, we can also guarantee that every value is unique, and vice versa! Intuitively, if it is never possible to duplicate a value, then it will never be possible for said value to have multiple references.</p>
</blockquote>

<p>In our <code class="language-plaintext highlighter-rouge">Unique_ref</code> and <code class="language-plaintext highlighter-rouge">Linear_ref</code> every operation that operates on the ref
requires uniqueness or linearity, respectively. Hence, they seem almost
equivalent in expressive power.</p>

<blockquote>
  <p>It is when we also have the ability for unrestricted use (non-linear/non-unique) that differences between linearity and uniqueness begin to arise, as we will soon see.</p>
</blockquote>

<p>In our language, we do have the ability for unrestricted use. That is, in the
linearity axis, <code class="language-plaintext highlighter-rouge">many</code> is the default mode attributed to all the values not
tagged or inferred as <code class="language-plaintext highlighter-rouge">once</code>. Similarly, <code class="language-plaintext highlighter-rouge">aliased</code> is the default mode
attributed to all the values not tagged or inferred as <code class="language-plaintext highlighter-rouge">unique</code>.</p>

<p>The type system has <em>submoding</em>: values may move freely to <em>greater</em> modes
(which typically restrict what can be done with those values) but not to
<em>lesser</em> modes. For example, a <code class="language-plaintext highlighter-rouge">many</code> value may be safely used in a context where
a <code class="language-plaintext highlighter-rouge">once</code> value is expected.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">works</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">set_to_20</span> <span class="p">(</span><span class="n">r</span> <span class="o">@</span> <span class="n">once</span><span class="p">)</span> <span class="o">=</span>
    <span class="n">r</span> <span class="o">:=</span> <span class="mi">20</span>
  <span class="k">in</span>
  <span class="k">let</span> <span class="n">r</span> <span class="o">@</span> <span class="n">many</span> <span class="o">=</span> <span class="n">ref</span> <span class="mi">10</span> <span class="k">in</span>
  <span class="n">set_to_20</span> <span class="n">r</span> <span class="c">(* [r @ many] is passed to a function that expects [int ref @ once] *)</span>
</code></pre></div></div>

<p>Similarly, you can use a <code class="language-plaintext highlighter-rouge">unique</code> value in a context where an <code class="language-plaintext highlighter-rouge">aliased</code> value is
expected.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">dup</span> <span class="n">r</span> <span class="o">=</span> <span class="n">r</span><span class="o">,</span><span class="n">r</span>

<span class="k">let</span> <span class="n">works</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">r</span> <span class="o">=</span> <span class="nn">Unique_ref</span><span class="p">.</span><span class="n">alloc</span> <span class="mi">42</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">a</span><span class="o">,</span><span class="n">b</span> <span class="o">=</span> <span class="n">dup</span> <span class="n">r</span> <span class="k">in</span>
  <span class="n">a</span><span class="o">,</span><span class="n">b</span>
</code></pre></div></div>

<p>The type of the <code class="language-plaintext highlighter-rouge">works</code> function is <code class="language-plaintext highlighter-rouge">val works : unit -&gt; int Unique_ref.t * int
Unique_ref.t</code>, which crucially does not record that the references are at the unique
mode. We can’t call any functions from the <code class="language-plaintext highlighter-rouge">Unique_ref</code> module with these
references, since all of them expect a reference at the <code class="language-plaintext highlighter-rouge">unique</code> mode.</p>

<h2 id="uniqueness-is-more-appropriate-for-safe-refs">Uniqueness is more appropriate for safe refs</h2>

<p>In our running example of implementing a safe ref, it turns out that uniqueness
is more appropriate. Consider the type signature of <code class="language-plaintext highlighter-rouge">free</code> in <code class="language-plaintext highlighter-rouge">Unique_ref</code>:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">val</span> <span class="n">free</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="kt">unit</span>
</code></pre></div></div>

<p>The type signature says that there are no other aliases to this reference.
Hence, its memory may be safely deallocated. However, consider the
<code class="language-plaintext highlighter-rouge">Linear_ref.free</code> signature:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">val</span> <span class="n">free</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">once</span> <span class="o">-&gt;</span> <span class="kt">unit</span>
</code></pre></div></div>

<p>The signature says that this reference must be used at most once. In particular,
just by looking at the signature, we cannot conclude that there are no other
aliases to this reference. But we know that the API is safe since the only way
to create a safe reference is through the <code class="language-plaintext highlighter-rouge">alloc</code> function, which returns a
once-usable reference, and every other operation also expects and returns a
once-usable reference.</p>

<p>The correctness of the linear version depends on reasoning over the <em>whole
API</em>, whereas the safety of the unique version follows just from
the signature of the <code class="language-plaintext highlighter-rouge">free</code> function. This modular reasoning makes
uniqueness more appropriate for our safe reference API.</p>

<h2 id="past-and-the-future">Past and the future</h2>

<p>In a sense, uniqueness and linearity are duals of each other. Uniqueness talks
about the <em>past</em>: whether a value may have been aliased in the past. It is okay to
alias a unique value in the future and lose the uniqueness mode. Linearity talks
about the <em>future</em>: whether a value may be used more than once in the future.
You can take any value and ascribe a linear mode to it, restricting its use in
the future. However, there may be other aliases to this value from the past.</p>
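
<p>Here is a small sketch of the “future” reading, modelled on the <code class="language-plaintext highlighter-rouge">set_to_20</code>
example from earlier in this post. It uses the same mode annotations, so it needs the
mode-aware compiler rather than stock OCaml. The names <code class="language-plaintext highlighter-rouge">past_and_future</code> and
<code class="language-plaintext highlighter-rouge">consume</code> are made up for illustration, and since the language is evolving, treat it
as an untested sketch rather than a checked program.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let past_and_future () =
  (* [r] is [many]; it is used twice below, so it is also aliased. *)
  let r = ref 10 in
  (* Inside [consume], the argument may be used at most once. *)
  let consume (x @ once) = x := 20 in
  (* Submoding: a [many] value is passed where a [once] value is expected. *)
  consume r;
  (* The alias [r] from the past is still usable after the call. *)
  !r
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">once</code> annotation restricts only what <code class="language-plaintext highlighter-rouge">consume</code> may do with its argument. It says
nothing about the alias that the caller still holds.</p>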

<h2 id="conclusions">Conclusions</h2>

<p>The code examples are available
<a href="https://github.com/kayceesrk/code-snippets/tree/master/oxcaml/linearity_june_2025">here</a>.
Section 2.1 of <a href="https://starsandspira.ls/docs/esop22-draft.pdf">Marshall et al.’s
paper</a> is quite readable and
explains the distinction between linearity and uniqueness with some historical
context. I highly recommend it.</p>

<h2 id="acknowledgements">Acknowledgements</h2>

<p>Thanks to <a href="https://richarde.dev/">Richard Eisenberg</a> for the discussions which
spurred this post.</p>]]></content><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><category term="OCaml" /><category term="Modes" /><category term="OxCaml" /><summary type="html"><![CDATA[In the last post, we looked at uniqueness mode and how uniqueness may be used to optimise. As we will see, uniqueness alone is insufficient in practice, and we also need a concept of linearity for uniqueness to be useful.]]></summary></entry><entry><title type="html">Uniqueness for Behavioural Types</title><link href="https://kcsrk.info/ocaml/modes/oxcaml/2025/05/29/uniqueness_and_behavioural_types/" rel="alternate" type="text/html" title="Uniqueness for Behavioural Types" /><published>2025-05-29T17:56:00+00:00</published><updated>2025-05-29T17:56:00+00:00</updated><id>https://kcsrk.info/ocaml/modes/oxcaml/2025/05/29/uniqueness_and_behavioural_types</id><content type="html" xml:base="https://kcsrk.info/ocaml/modes/oxcaml/2025/05/29/uniqueness_and_behavioural_types/"><![CDATA[<p>Jane Street has been developing modal types for OCaml – an extension to the
type system where modes track properties of values, such as their scope, thread
sharing, and aliasing. These modes restrict which operations are permitted on
values, enabling safer and more efficient systems programming. In this post, I
focus on the uniqueness mode, which tracks aliasing, and show how it can
eliminate certain runtime checks.</p>

<!--more-->

<p>My intention in this post is not to explain how the different modes work. There
are a number of blog posts and academic papers written about modes. I recommend
that the interested reader have a look at them. The following table summarizes the
main properties tracked by modes, the corresponding mode names, and resources
for further reading:</p>

<table>
  <thead>
    <tr>
      <th>Property</th>
      <th>Modes</th>
      <th>Resources</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Scope</td>
      <td>Locality</td>
      <td><a href="https://blog.janestreet.com/oxidizing-ocaml-locality/">Blog</a>, <a href="https://dl.acm.org/doi/10.1145/3674642">Paper</a></td>
    </tr>
    <tr>
      <td>Sharing between threads</td>
      <td>Portability, Contention</td>
      <td><a href="https://blog.janestreet.com/oxidizing-ocaml-parallelism/">Blog</a>, <a href="https://dl.acm.org/doi/10.1145/3704859">Paper</a></td>
    </tr>
    <tr>
      <td>Aliasing</td>
      <td>Uniqueness, Linearity</td>
      <td><a href="https://blog.janestreet.com/oxidizing-ocaml-ownership/">Blog</a>, <a href="https://dl.acm.org/doi/10.1145/3674642">Paper</a></td>
    </tr>
  </tbody>
</table>

<p>The OCaml compiler extended with modes is <a href="https://github.com/ocaml-flambda/flambda-backend">developed in the
open</a>, and is used in
production at Jane Street. The repo also has some
<a href="https://github.com/ocaml-flambda/flambda-backend/tree/main/jane/doc">documentation</a>
of the extensions.</p>

<p>Be warned that the compiler and the language features are evolving fast. The
code examples presented in the blog posts and papers referenced above are likely
not to work any more. I expect the same for the code examples in this post in the near
future, but that’s what one should expect with these bleeding-edge features.</p>

<h2 id="behavioural-types-and-runtime-overhead">Behavioural types and runtime overhead</h2>

<p>Back in 2016, I wrote a post on <a href="https://kcsrk.info/ocaml/types/2016/06/30/behavioural-types/">behavioural
types</a>, where the
types capture the sequence of operations that may be performed on the values
with those types. The correctness of the system depended on the linear use of
the resources. Since OCaml does not provide support for enforcing linearity
statically, the implementation used a dynamic check: a fresh ref cell that
gets <em>consumed</em> every time the type state changes. If we are statically guaranteed that the
resource is not aliased, then there’s no need for the dynamic check.
This is where <em>uniqueness</em> helps.</p>
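
<p>To make the cost concrete, here is a minimal sketch of such a dynamic check in
stock OCaml. It is not the code from the earlier post: the names <code class="language-plaintext highlighter-rouge">handle</code>,
<code class="language-plaintext highlighter-rouge">make</code>, <code class="language-plaintext highlighter-rouge">step</code> and <code class="language-plaintext highlighter-rouge">Linearity_violation</code> are assumptions made for illustration.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>exception Linearity_violation

(* A handle pairs a value with a liveness flag held in a ref cell. *)
type 'a handle = { value : 'a; live : bool ref }

let make v = { value = v; live = ref true }

(* Each state change checks the flag, invalidates the old handle, and
   returns a new handle with a fresh ref cell. *)
let step f h =
  if not !(h.live) then raise Linearity_violation;
  h.live := false;
  { value = f h.value; live = ref true }

let () =
  let h0 = make 1 in
  let h1 = step succ h0 in
  assert (h1.value = 2);
  (* Reusing the stale handle [h0] is caught, but only at run time. *)
  assert (match step succ h0 with
          | _ -&gt; false
          | exception Linearity_violation -&gt; true)
</code></pre></div></div>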

<p>Uniqueness mode allows the OCaml compiler to statically guarantee that certain
values are not aliased. This enables optimizations and eliminates the need for
some runtime checks, which is particularly valuable in systems programming for
ensuring memory safety and efficient resource management.</p>

<h2 id="setting-up-ocaml-with-modes">Setting up OCaml with modes</h2>

<p>An opam repository with the modes extensions and packages supporting modes is
available
<a href="https://github.com/janestreet/opam-repository/tree/with-extensions">here</a>.
Here’s how you can set up the new compiler:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># this will take time</span>
opam switch create 5.2.0+flambda2 <span class="nt">--repos</span> with-extensions<span class="o">=</span>git+https://github.com/janestreet/opam-repository.git#with-extensions,default
<span class="nb">eval</span> <span class="si">$(</span>opam <span class="nb">env</span> <span class="nt">--switch</span> 5.2.0+flambda2<span class="si">)</span>
</code></pre></div></div>

<h2 id="an-explicitly-memory-managed-reference">An explicitly memory-managed reference</h2>

<p>Suppose you want to implement a mutable reference whose memory is explicitly
managed (not managed by the GC). You might go for the following interface:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="k">type</span> <span class="nc">S</span> <span class="o">=</span> <span class="k">sig</span>
  <span class="k">type</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span>
  <span class="k">val</span> <span class="n">alloc</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span>
  <span class="k">val</span> <span class="n">free</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">-&gt;</span> <span class="kt">unit</span> <span class="c">(* unsafe *)</span>
  <span class="k">val</span> <span class="n">get</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span>
  <span class="k">val</span> <span class="n">set</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="kt">unit</span>
<span class="k">end</span>
</code></pre></div></div>
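
<p>As a sketch of what can go wrong with this signature, here is a GC-backed
stand-in written in stock OCaml. It is only an illustrative model, not a real manual
allocator: the module name <code class="language-plaintext highlighter-rouge">Checked</code>, the option-based representation and the
<code class="language-plaintext highlighter-rouge">Failure</code> messages are assumptions, and a real implementation would corrupt memory
rather than raise.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module type S = sig
  type 'a t
  val alloc : 'a -&gt; 'a t
  val free : 'a t -&gt; unit
  val get : 'a t -&gt; 'a
  val set : 'a t -&gt; 'a -&gt; unit
end

(* [None] models freed memory, so misuse shows up as an exception. *)
module Checked : S = struct
  type 'a t = 'a option ref
  let alloc x = ref (Some x)
  let free t =
    match !t with
    | Some _ -&gt; t := None
    | None -&gt; failwith "double free"
  let get t =
    match !t with
    | Some v -&gt; v
    | None -&gt; failwith "use after free"
  let set t x =
    match !t with
    | Some _ -&gt; t := Some x
    | None -&gt; failwith "use after free"
end

let () =
  let r = Checked.alloc 42 in
  let alias = r in                 (* nothing stops us from aliasing [r] *)
  Checked.free r;
  (* Both misuses type check; they are detected only at run time. *)
  assert (try ignore (Checked.get alias); false with Failure _ -&gt; true);
  assert (try Checked.free alias; false with Failure _ -&gt; true)
</code></pre></div></div>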

<p>This interface provides an explicit <code class="language-plaintext highlighter-rouge">free</code>, which releases the memory associated
with this reference. This opens up the possibility of memory safety bugs such as
use-after-free and double-free. We can use the uniqueness mode to get a <em>safe</em>
API. Here’s the interface:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="k">type</span> <span class="nc">S</span> <span class="o">=</span> <span class="k">sig</span>
  <span class="k">type</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span>
  <span class="k">val</span> <span class="n">alloc</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span>
  <span class="k">val</span> <span class="n">free</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="kt">unit</span>
  <span class="k">val</span> <span class="n">get</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="nn">Modes</span><span class="p">.</span><span class="nn">Aliased</span><span class="p">.</span><span class="n">t</span> <span class="o">*</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span>
  <span class="k">val</span> <span class="n">set</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span>
<span class="k">end</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">unique</code> annotation states that the value is not aliased. Every operation on
the reference expects the reference to be unaliased. Observe that, unlike the original
interface, <code class="language-plaintext highlighter-rouge">get</code> and <code class="language-plaintext highlighter-rouge">set</code> take the unique reference and also return it.
You can use this like so:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">let</span> <span class="n">okay</span> <span class="n">r</span> <span class="o">=</span>
    <span class="k">let</span> <span class="n">v</span><span class="o">,</span> <span class="n">r</span> <span class="o">=</span> <span class="n">get</span> <span class="n">r</span> <span class="k">in</span>
    <span class="k">let</span> <span class="n">r</span> <span class="o">=</span> <span class="n">set</span> <span class="n">r</span> <span class="mi">20</span> <span class="k">in</span>
    <span class="n">free</span> <span class="n">r</span><span class="p">;;</span>
<span class="k">val</span> <span class="n">okay</span> <span class="o">:</span> <span class="kt">int</span> <span class="nn">M</span><span class="p">.</span><span class="n">t</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="kt">unit</span> <span class="o">=</span> <span class="o">&lt;</span><span class="k">fun</span><span class="o">&gt;</span>
</code></pre></div></div>

<p>The key bit is that <code class="language-plaintext highlighter-rouge">free</code> <em>consumes</em> the unique reference; you can
no longer produce a unique handle to the same reference and hence, you cannot
call <code class="language-plaintext highlighter-rouge">free</code>, <code class="language-plaintext highlighter-rouge">get</code> or <code class="language-plaintext highlighter-rouge">set</code> on this reference which has been freed.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">let</span> <span class="n">wont_work</span> <span class="n">r</span> <span class="o">=</span>
    <span class="n">free</span> <span class="n">r</span><span class="p">;</span>
    <span class="n">get</span> <span class="n">r</span>
  <span class="p">;;</span>
<span class="nc">Error</span><span class="o">:</span> <span class="nc">This</span> <span class="n">value</span> <span class="n">is</span> <span class="n">used</span> <span class="n">here</span><span class="o">,</span> <span class="n">but</span> <span class="n">it</span> <span class="n">has</span> <span class="n">already</span> <span class="n">been</span> <span class="n">used</span> <span class="k">as</span> <span class="n">unique</span><span class="o">:</span>
<span class="nc">Line</span> <span class="mi">2</span><span class="o">,</span> <span class="n">characters</span> <span class="mi">7</span><span class="o">-</span><span class="mi">8</span><span class="o">:</span>
</code></pre></div></div>

<h3 id="modesaliasedt">Modes.Aliased.t</h3>

<p>Uniqueness applies deeply. If a value is marked as unique, then the transitive
closure of the reachable parts of the object is also expected to be unique. The
return value of <code class="language-plaintext highlighter-rouge">get</code> is a pair, which is marked as <code class="language-plaintext highlighter-rouge">unique</code><sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>. Hence, both
the components of the pair are expected to be unique. However, we don’t want to
impose uniqueness of the value stored in the reference. The language allows
parts of the value to be marked as aliased. <code class="language-plaintext highlighter-rouge">Modes.Aliased.t</code> is defined as:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">Aliased</span> <span class="o">:</span> <span class="k">sig</span>
  <span class="k">type</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">=</span> <span class="p">{</span> <span class="n">aliased</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="o">@@</span> <span class="n">aliased</span> <span class="p">}</span> <span class="p">[</span><span class="o">@@</span><span class="n">unboxed</span><span class="p">]</span>
<span class="k">end</span>
</code></pre></div></div>

<p>The language allows record fields to be annotated as <code class="language-plaintext highlighter-rouge">aliased</code>, while the record
itself may be uniquely referenced.</p>

<h3 id="implementation">Implementation</h3>

<p>Here’s an implementation that satisfies the signature.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">M</span> <span class="o">:</span> <span class="nc">S</span> <span class="o">=</span> <span class="k">struct</span>
  <span class="k">type</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">=</span> <span class="p">{</span> <span class="k">mutable</span> <span class="n">value</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="p">}</span>
  <span class="k">let</span> <span class="n">alloc</span> <span class="n">x</span> <span class="o">=</span> <span class="p">{</span> <span class="n">value</span> <span class="o">=</span> <span class="n">x</span> <span class="p">}</span>
  <span class="k">let</span> <span class="n">free</span> <span class="n">t</span> <span class="o">=</span> <span class="bp">()</span>
  <span class="k">let</span> <span class="n">get</span> <span class="n">t</span> <span class="o">=</span>
    <span class="k">let</span> <span class="n">a</span> <span class="o">=</span> <span class="nn">Modes</span><span class="p">.</span><span class="nn">Aliased</span><span class="p">.{</span><span class="n">aliased</span> <span class="o">=</span> <span class="n">t</span><span class="o">.</span><span class="n">value</span> <span class="p">}</span> <span class="k">in</span>
    <span class="n">a</span><span class="o">,</span> <span class="n">t</span>
  <span class="k">let</span> <span class="n">set</span> <span class="n">t</span> <span class="n">x</span> <span class="o">=</span>
    <span class="n">t</span><span class="o">.</span><span class="n">value</span> <span class="o">&lt;-</span> <span class="n">x</span><span class="p">;</span>
    <span class="n">t</span>
<span class="k">end</span>
</code></pre></div></div>

<p>There’s nothing surprising about this implementation. Note that the compiler is
doing a lot of work behind the scenes to ensure that the functions do in fact
satisfy the uniqueness requirements. For example, if you change the
implementation of <code class="language-plaintext highlighter-rouge">set</code> to do something <em>innocuous</em> where the compiler cannot
prove that the value is not aliased, the program no longer compiles:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">let</span> <span class="n">set</span> <span class="n">t</span> <span class="n">x</span> <span class="o">=</span>
    <span class="n">t</span><span class="o">.</span><span class="n">value</span> <span class="o">&lt;-</span> <span class="n">x</span><span class="p">;</span>
    <span class="k">let</span> <span class="n">t'</span> <span class="o">=</span> <span class="nn">Fun</span><span class="p">.</span><span class="n">id</span> <span class="n">t</span> <span class="k">in</span> <span class="c">(* compiler cannot prove [t'] is not aliased *)</span>
    <span class="n">t'</span>
<span class="nc">Error</span><span class="o">:</span> <span class="o">&lt;</span><span class="n">snip</span><span class="o">&gt;</span>
<span class="nc">Values</span> <span class="k">do</span> <span class="n">not</span> <span class="k">match</span><span class="o">:</span>
 <span class="k">val</span> <span class="n">set</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span>
<span class="n">is</span> <span class="n">not</span> <span class="n">included</span> <span class="k">in</span>
 <span class="k">val</span> <span class="n">set</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span>
<span class="nc">The</span> <span class="k">type</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="n">is</span> <span class="n">not</span> <span class="n">compatible</span> <span class="k">with</span> <span class="n">the</span> <span class="k">type</span>
 <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="n">t</span> <span class="o">@</span> <span class="n">unique</span>
</code></pre></div></div>

<h2 id="refs-that-explain-their-work">Refs that explain their work</h2>

<p>The <a href="https://kcsrk.info/ocaml/types/2016/06/30/behavioural-types/#refs-that-explain-their-work">earlier blog
post</a>
used polymorphic variants to encode the <em>protocol</em> of operations that are
permitted on a ref cell. The implementation is reproduced below:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">exception</span> <span class="nc">LinearityViolation</span>

<span class="k">module</span> <span class="k">type</span> <span class="nc">Ref</span> <span class="o">=</span>
<span class="k">sig</span>
  <span class="k">type</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span> <span class="k">constraint</span> <span class="k">'</span><span class="n">b</span> <span class="o">=</span> <span class="p">[</span><span class="o">&gt;</span><span class="p">]</span>

  <span class="k">val</span> <span class="n">ref</span>   <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span>
  <span class="k">val</span> <span class="n">read</span>  <span class="o">:</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="p">[</span><span class="nt">`Read</span> <span class="k">of</span> <span class="k">'</span><span class="n">b</span><span class="p">])</span> <span class="n">ref</span>
              <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="o">*</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span>
  <span class="k">val</span> <span class="n">write</span> <span class="o">:</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="p">[</span><span class="nt">`Write</span> <span class="k">of</span> <span class="k">'</span><span class="n">b</span><span class="p">])</span> <span class="n">ref</span>
              <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span>
              <span class="o">-&gt;</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span>
<span class="k">end</span>
<span class="k">module</span> <span class="nc">Ref</span> <span class="o">:</span> <span class="nc">Ref</span> <span class="o">=</span>
<span class="k">struct</span>

  <span class="k">type</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span> <span class="o">=</span>
    <span class="p">{</span><span class="n">contents</span>     <span class="o">:</span> <span class="k">'</span><span class="n">a</span><span class="p">;</span>
     <span class="k">mutable</span> <span class="n">live</span> <span class="o">:</span> <span class="kt">bool</span><span class="p">}</span> <span class="c">(* For linearity *)</span>
     <span class="k">constraint</span> <span class="k">'</span><span class="n">b</span> <span class="o">=</span> <span class="p">[</span><span class="o">&gt;</span><span class="p">]</span>

  <span class="k">let</span> <span class="n">ref</span> <span class="n">v</span> <span class="o">=</span> <span class="p">{</span><span class="n">contents</span> <span class="o">=</span> <span class="n">v</span><span class="p">;</span> <span class="n">live</span> <span class="o">=</span> <span class="bp">true</span><span class="p">}</span>

  <span class="k">let</span> <span class="n">check</span> <span class="n">r</span> <span class="o">=</span>
    <span class="k">if</span> <span class="n">not</span> <span class="n">r</span><span class="o">.</span><span class="n">live</span> <span class="k">then</span> <span class="k">raise</span> <span class="nc">LinearityViolation</span><span class="p">;</span>
    <span class="n">r</span><span class="o">.</span><span class="n">live</span> <span class="o">&lt;-</span> <span class="bp">false</span>

  <span class="k">let</span> <span class="n">fresh</span> <span class="n">r</span> <span class="o">=</span> <span class="p">{</span><span class="n">r</span> <span class="k">with</span> <span class="n">live</span> <span class="o">=</span> <span class="bp">true</span><span class="p">}</span>

  <span class="k">let</span> <span class="n">read</span> <span class="n">r</span> <span class="o">=</span>
    <span class="n">check</span> <span class="n">r</span><span class="p">;</span>
    <span class="p">(</span><span class="n">r</span><span class="o">.</span><span class="n">contents</span><span class="o">,</span> <span class="n">fresh</span> <span class="n">r</span><span class="p">)</span>

  <span class="k">let</span> <span class="n">write</span> <span class="n">r</span> <span class="n">v</span> <span class="o">=</span>
    <span class="n">check</span> <span class="n">r</span><span class="p">;</span>
    <span class="p">{</span> <span class="n">contents</span> <span class="o">=</span> <span class="n">v</span><span class="p">;</span> <span class="n">live</span> <span class="o">=</span> <span class="bp">true</span> <span class="p">}</span>

  <span class="k">let</span> <span class="n">branch</span> <span class="n">r</span> <span class="n">_</span> <span class="o">=</span> <span class="n">check</span> <span class="n">r</span><span class="p">;</span> <span class="n">fresh</span> <span class="n">r</span>
<span class="k">end</span>
</code></pre></div></div>
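
<p>The dynamic check above can be exercised in isolation. Below is a minimal sketch in plain OCaml that drops the protocol type parameter to stay short; the names <code class="language-plaintext highlighter-rouge">lref</code> and <code class="language-plaintext highlighter-rouge">make</code> are placeholders of mine and are not part of the original interface:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>exception LinearityViolation

(* A ref cell handle carrying a liveness flag (hypothetical name). *)
type 'a lref = { contents : 'a; mutable live : bool }

let make v = { contents = v; live = true }

(* Consume the handle: fail if it was already used, then mark it dead. *)
let check r =
  if not r.live then raise LinearityViolation;
  r.live &lt;- false

(* Each operation kills the old handle and returns a fresh one. *)
let read r =
  check r;
  (r.contents, { r with live = true })

let write r v =
  check r;
  { contents = v; live = true }

let () =
  let r0 = make 10 in
  let v, r1 = read r0 in
  assert (v = 10);
  let r2 = write r1 20 in
  let v', _r3 = read r2 in
  assert (v' = 20);
  (* Reusing the stale handle r0 type checks; it is caught only at run time. *)
  match read r0 with
  | _ -&gt; assert false
  | exception LinearityViolation -&gt; print_endline "stale handle rejected"
</code></pre></div></div>

<p>The second <code class="language-plaintext highlighter-rouge">read r0</code> is accepted by the type checker, and the mistake surfaces only when that call runs.</p>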

<p>Observe that we use a dynamic check to enforce linearity. It requires a <em>fresh</em>
ref cell for each operation performed on this reference. With uniqueness, we can
enforce this statically, avoiding the dynamic check and the fresh ref cell
requirement.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="k">type</span> <span class="nc">Ref</span> <span class="o">=</span>
<span class="k">sig</span>
  <span class="k">type</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span> <span class="k">constraint</span> <span class="k">'</span><span class="n">b</span> <span class="o">=</span> <span class="p">[</span><span class="o">&gt;</span><span class="p">]</span>
  <span class="c">(* 'b is the behavioural type variable *)</span>

  <span class="k">val</span> <span class="n">ref</span>   <span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span> <span class="o">@</span> <span class="n">unique</span>
  <span class="k">val</span> <span class="n">read</span>  <span class="o">:</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="p">[</span><span class="nt">`Read</span> <span class="k">of</span> <span class="k">'</span><span class="n">b</span><span class="p">])</span> <span class="n">ref</span> <span class="o">@</span> <span class="n">unique</span>
              <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span> <span class="nn">Modes</span><span class="p">.</span><span class="nn">Aliased</span><span class="p">.</span><span class="n">t</span> <span class="o">*</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span> <span class="o">@</span> <span class="n">unique</span>
  <span class="k">val</span> <span class="n">write</span> <span class="o">:</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="p">[</span><span class="nt">`Write</span> <span class="k">of</span> <span class="k">'</span><span class="n">b</span><span class="p">])</span> <span class="n">ref</span> <span class="o">@</span> <span class="n">unique</span>
              <span class="o">-&gt;</span> <span class="k">'</span><span class="n">a</span>
              <span class="o">-&gt;</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span> <span class="o">@</span> <span class="n">unique</span>
  <span class="k">val</span> <span class="n">branch</span> <span class="o">:</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="p">[</span><span class="o">&gt;</span><span class="p">]</span> <span class="k">as</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span> <span class="o">@</span> <span class="n">unique</span>
               <span class="o">-&gt;</span> <span class="p">((</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="p">[</span><span class="o">&gt;</span><span class="p">]</span> <span class="k">as</span> <span class="k">'</span><span class="n">c</span><span class="p">)</span> <span class="n">ref</span> <span class="o">@</span> <span class="n">unique</span> <span class="o">-&gt;</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span>
               <span class="o">-&gt;</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">c</span><span class="p">)</span> <span class="n">ref</span> <span class="o">@</span> <span class="n">unique</span>
<span class="k">end</span>

<span class="k">module</span> <span class="nc">Ref</span> <span class="o">:</span> <span class="nc">Ref</span> <span class="o">=</span>
<span class="k">struct</span>
  <span class="k">type</span> <span class="p">(</span><span class="k">'</span><span class="n">a</span><span class="o">,</span> <span class="k">'</span><span class="n">b</span><span class="p">)</span> <span class="n">ref</span> <span class="o">=</span> <span class="p">{</span><span class="k">mutable</span> <span class="n">contents</span> <span class="o">:</span> <span class="k">'</span><span class="n">a</span><span class="p">}</span> <span class="k">constraint</span> <span class="k">'</span><span class="n">b</span> <span class="o">=</span> <span class="p">[</span><span class="o">&gt;</span><span class="p">]</span>

  <span class="k">let</span> <span class="n">ref</span> <span class="n">v</span> <span class="o">=</span> <span class="p">{</span><span class="n">contents</span> <span class="o">=</span> <span class="n">v</span><span class="p">}</span>

  <span class="k">let</span> <span class="n">read</span> <span class="n">r</span> <span class="o">=</span>
    <span class="k">let</span> <span class="n">c</span> <span class="o">=</span> <span class="nn">Modes</span><span class="p">.</span><span class="nn">Aliased</span><span class="p">.{</span><span class="n">aliased</span> <span class="o">=</span> <span class="n">r</span><span class="o">.</span><span class="n">contents</span><span class="p">}</span> <span class="k">in</span>
    <span class="n">c</span><span class="o">,</span> <span class="nn">Obj</span><span class="p">.</span><span class="n">magic_at_unique</span> <span class="n">r</span>

  <span class="k">let</span> <span class="n">write</span> <span class="n">r</span> <span class="n">v</span> <span class="o">=</span>
    <span class="n">r</span><span class="o">.</span><span class="n">contents</span> <span class="o">&lt;-</span> <span class="n">v</span><span class="p">;</span>
    <span class="nn">Obj</span><span class="p">.</span><span class="n">magic_at_unique</span> <span class="n">r</span>

  <span class="k">let</span> <span class="n">branch</span> <span class="n">r</span> <span class="n">_</span> <span class="o">=</span> <span class="nn">Obj</span><span class="p">.</span><span class="n">magic_at_unique</span> <span class="n">r</span>
<span class="k">end</span>
</code></pre></div></div>

<p>The only changes necessary in the signature were a number of uniqueness and
aliasing annotations. Notice that the implementation no longer needs the
dynamic check! <code class="language-plaintext highlighter-rouge">Obj.magic_at_unique</code> has the type <code class="language-plaintext highlighter-rouge">'a @ unique -&gt; 'b @ unique</code>,
and is the version of <code class="language-plaintext highlighter-rouge">Obj.magic</code> with uniqueness annotations. We use it to
<em>advance</em> the protocol type state.</p>
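
<p>To see how the protocol parameter advances, here is a sketch in plain OCaml without any mode annotations. It uses the ordinary <code class="language-plaintext highlighter-rouge">Obj.magic</code> as a stand-in for <code class="language-plaintext highlighter-rouge">Obj.magic_at_unique</code>, so it tracks the protocol but does nothing to rule out aliasing. The module name <code class="language-plaintext highlighter-rouge">Proto</code> and the <code class="language-plaintext highlighter-rouge">`Stop</code> tag are hypothetical:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Proto : sig
  type ('a, 'b) ref constraint 'b = [&gt; ]
  val ref : 'a -&gt; ('a, 'b) ref
  val read : ('a, [ `Read of 'b ]) ref -&gt; 'a * ('a, 'b) ref
  val write : ('a, [ `Write of 'b ]) ref -&gt; 'a -&gt; ('a, 'b) ref
end = struct
  type ('a, 'b) ref = { mutable contents : 'a } constraint 'b = [&gt; ]

  let ref v = { contents = v }

  (* Obj.magic only changes the phantom protocol parameter;
     the runtime representation of the record is untouched. *)
  let read r = (r.contents, Obj.magic r)

  let write r v =
    r.contents &lt;- v;
    Obj.magic r
end

let () =
  (* The annotation fixes the protocol: read, then write, then read. *)
  let r : (int, [ `Read of [ `Write of [ `Read of [ `Stop ] ] ] ]) Proto.ref =
    Proto.ref 10
  in
  let v, r = Proto.read r in
  let r = Proto.write r (v + 1) in
  let v', _r = Proto.read r in
  assert (v = 10 &amp;&amp; v' = 11);
  Printf.printf "%d %d\n" v v'
</code></pre></div></div>

<p>Each call peels one layer off the protocol type, so calling <code class="language-plaintext highlighter-rouge">Proto.write</code> where the type expects a read is rejected at compile time. What this plain version cannot do is stop a caller from holding on to an earlier handle, which is the gap the <code class="language-plaintext highlighter-rouge">unique</code> mode closes.</p>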

<h2 id="where-next">Where next</h2>

<p>The rest of the examples in the <a href="https://kcsrk.info/ocaml/types/2016/06/30/behavioural-types/">original
post</a> should also
benefit from uniqueness annotations to remove the runtime overheads.</p>

<p>The complete code examples are available
<a href="https://github.com/kayceesrk/code-snippets/tree/master/oxcaml/uniqueness_may_2025">here</a>.
You can also play with the code examples <a href="https://tinyurl.com/y7ku8r5h">directly in the
browser</a> thanks to <a href="https://patrick.sirref.org/index/index.xml">Patrick
Ferris’</a> OCaml with extensions
<a href="https://patrick.sirref.org/try-oxcaml/index.xml">js_of_ocaml top-level</a>.</p>

<p>Since the modes features are constantly evolving, there are no stability
guarantees yet. However, I’m excited about the possibility of modes improving
how we do safe systems programming in OCaml.</p>

<h2 id="addendum">Addendum</h2>

<p>Looks like there’s a <a href="https://kcsrk.info/ocaml/modes/oxcaml/2025/06/04/linearity_and_uniqueness/">part 2</a> of this post.</p>

<h2 id="footnotes">Footnotes</h2>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Unclear whether it is possible to return a pair where one of the
components is unique, but the other one is not. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><category term="OCaml" /><category term="Modes" /><category term="OxCaml" /><summary type="html"><![CDATA[Jane Street has been developing modal types for OCaml – an extension to the type system where modes track properties of values, such as their scope, thread sharing, and aliasing. These modes restrict which operations are permitted on values, enabling safer and more efficient systems programming. In this post, I focus on the uniqueness mode, which tracks aliasing, and show how it can eliminate certain runtime checks.]]></summary></entry><entry><title type="html">Joining my group</title><link href="https://kcsrk.info/ocaml/iitm/community/2025/04/28/working-with-me/" rel="alternate" type="text/html" title="Joining my group" /><published>2025-04-28T12:10:00+00:00</published><updated>2025-04-28T12:10:00+00:00</updated><id>https://kcsrk.info/ocaml/iitm/community/2025/04/28/working-with-me</id><content type="html" xml:base="https://kcsrk.info/ocaml/iitm/community/2025/04/28/working-with-me/"><![CDATA[<p>Recently, I posted on <a href="https://x.com/kc_srk/status/1912008952340164804">X</a> and
<a href="https://www.linkedin.com/posts/kc-sivaramakrishnan-25061a14_kc-sivaramakrishnan-activity-7317777561936183296-8hH-/">LinkedIn</a>
that I am always looking for excellent people to join my group. I received a lot
of enquiries, some of which led to internship hires (yay!). But mostly, I seemed
to offer similar advice. I thought I’d write a post that summarises my responses.</p>

<!--more-->

<p>At IIT Madras, my <a href="https://github.com/prismlab">research group</a> develops
programming language abstractions to solve systems problems. The group is
composed of research associates (fixed-term project staff), PhD, MS and MTech
students, undergraduate research students (who are typically BTech students from
IIT Madras) and interns. I made the following post a few weeks ago, for which I
received a lot of enquiries. I have been busy writing similar responses to
many of them, which I summarise below.</p>

<center>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">PSA: I&#39;m always looking for excellent folks to join my research group at IIT Madras to work on building &quot;functional&quot; systems. This includes internships, MS and PhD studentships, research staff positions, and post-baccalaureate fellowships. <br /><br />Reach out to me if you are keen!</p>&mdash; KC Sivaramakrishnan (@kc_srk) <a href="https://twitter.com/kc_srk/status/1912008952340164804?ref_src=twsrc%5Etfw">April 15, 2025</a></blockquote> <script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
</center>

<h2 id="internship-positions">Internship positions</h2>

<p>Internship enquiries are the most frequent ones that I receive. Here’s how you
can make it work.  Please do go through my <a href="https://kcsrk.info">web page</a> to
look at what areas I work on. Write to me about what interests you and what your
goals are.</p>

<p>My group works on systems. To make the internship work well, we require that you
have demonstrable systems building experience. Do build projects that go beyond
your coursework.  Make sure that the projects are developed publicly on GitHub
or other similar platforms so that one can take a look at what you’ve
built. Even better are contributions to other open-source projects.</p>

<p>The group solves systems problems with functional programming. If you have prior
experience with functional programming, such as building small projects with
OCaml, Haskell, Scala, Scheme or other languages, it is easier for me to assess
your interest. That said, if you are great at any programming language, having
built non-trivial projects in any language, then you have the right skills for
internships in my group. Generally, I expect interns to have done
coursework on OS, compilers and computer architecture. Significant projects in any of
those areas are a huge plus.</p>

<p>I should clarify that my recommendation letters for graduate programs will
reflect my honest assessment of the internship. I will decline writing a
recommendation letter if I think I may not be able to provide a strong one.</p>

<p>I do not work on projects that are primarily AI/ML or Web Development. If you
write to me looking for projects in those areas, it is very likely that you
won’t hear from me. Please don’t bulk email faculty CCing or BCCing everyone in
the department. It is likely that no one will read such an email.</p>

<h2 id="phdmsmtech-positions">PhD/MS/MTech positions</h2>

<p>For academic positions, please have a look at <a href="https://research.iitm.ac.in/">https://research.iitm.ac.in/</a>.
There are alternative ways to enter MS and PhD positions by being a research
associate and completing some coursework at IITM. For more information, see
<a href="https://cystar.iitm.ac.in/join-us/#:~:text=Pathways%20to%20IIT%20Madras">here</a>.</p>

<h2 id="contributing-to-the-ocaml-community">Contributing to the OCaml community</h2>

<p>A significant chunk of the enquiries were from folks who hold full-time
positions looking to be involved in the research group. Unfortunately, making
part-time positions work is a challenge for both sides. I would encourage
contributions to the wider OCaml community.</p>

<p>There are several great ways to get involved with the community. Here’s what I
usually recommend.</p>

<ul>
  <li>Learn the basics.
    <ul>
      <li>Go through the OCaml part of my <a href="https://github.com/fplaunchpad/cs3100_m20">CS3100 course</a>. The course has a YouTube
playlist and programming assignments. Complete the programming assignments.</li>
      <li>Read the <a href="https://dev.realworldocaml.org/">Real World OCaml</a> book.</li>
      <li>There are lots of other resources at <a href="https://ocaml.org/">OCaml.org</a>, the official website of the OCaml community and the ecosystem.</li>
    </ul>
  </li>
  <li>Join the community.
    <ul>
      <li>OCaml <a href="https://discord.com/invite/ZBgYuvR">discord</a> and <a href="https://discuss.ocaml.org/">discuss</a> are great places to hang out with other OCaml folks and ask questions.</li>
      <li>Discord is better for quick clarifications and discuss for longer form discussions.</li>
    </ul>
  </li>
  <li>Look for “good first issues” in the OCaml projects and work on them
    <ul>
      <li>Check out the core platform tools under the <a href="https://github.com/search?q=label%3A%22good+first+issue%22+language%3AOCaml+state%3Aopen+org%3Aocaml&amp;type=issues">OCaml github org</a>. See <a href="https://github.com/ocaml/ocaml/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22">OCaml compiler</a>, <a href="https://github.com/ocaml/dune/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22">dune build system</a>, <a href="https://github.com/ocaml/opam/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22easy%20first%20issue%22">opam package manager</a>, <a href="https://github.com/ocaml/ocaml.org/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22">ocaml.org</a>, etc.</li>
      <li>Across the wider ecosystem – <a href="https://github.com/semgrep/semgrep/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22%20">SemGrep</a>, <a href="https://github.com/opengrep/opengrep/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22">OpenGrep</a>, <a href="https://github.com/rocq-prover/rocq/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22">Rocq</a>, etc.</li>
    </ul>
  </li>
  <li>Work on self-directed projects. Here is my <a href="https://github.com/tarides/hackocaml">list of ideas</a>.</li>
</ul>

<p>The OCaml community also participates in <a href="https://ocaml.org/outreachy">Outreachy
internships</a>, which are paid
internships for underrepresented groups. They are a great way to contribute to the
community while being mentored by folks from the OCaml community. Here’s a <a href="https://www.youtube.com/watch?v=5eLRm8riAnI&amp;t=970s">nice
intro (in Tamil)</a> to the
impact that the Outreachy program had on an Outreachy intern. Look out for
<a href="https://discuss.ocaml.org/t/outreachy-june-2025/16154">announcements</a> about
Outreachy internships on the OCaml discuss forum.</p>

<h2 id="research-associate-positions">Research Associate positions</h2>

<p>This is for folks who want to contribute to the core research programme but do
not see themselves joining academic programs. The expectation here is that you
are an experienced systems engineer who would easily qualify
for the internship positions in the group.</p>

<p>One useful way to look at this position is as something similar to a research software
development engineer, who helps build out the systems used for research or
translates research into practice. In the past, research associates have <a href="https://kcsrk.info/ocaml/multicore/job/2019/09/16/1115-multicore-job/">helped
upstream multicore
OCaml</a>.
The easiest way to get into this role would be to do an internship, see whether
you like this area, do well in the internship and then choose to apply for a
research associate position.</p>

<p>Another variant is a post-bacc or a pre-doc position aimed at highly motivated
recent graduates, who are looking to build research experience. The expectation
here is that we get papers into top venues in PL and Systems. For such students,
I recommend going through my <a href="https://github.com/fplaunchpad/cs6225_s25_iitm">CS6225 Programs and Proofs
course</a>, watching the <a href="https://www.youtube.com/playlist?list=PLt0HgEXFOHdkfd7phdKKmTIuwHEvPX0qb">video
lectures</a>
and completing the
<a href="https://github.com/fplaunchpad/cs6225_s25_iitm/tree/main/assignments">assignments</a>.
The course is not an easy one, but will expose you to the broad area of PL and
specifically to deductive program verification. At the very least, you will come
out with an understanding of what it is to think rigorously about program
correctness.</p>

<p>Research associate positions are fixed-term positions. For this to work, the
tenure should be at least 18 months.</p>

<h2 id="summary">Summary</h2>

<p>While I may not be hiring actively all the time, do reach out to me if you are
interested in any of the above. Please follow me on
<a href="https://www.linkedin.com/in/kc-sivaramakrishnan-25061a14/">LinkedIn</a>,
<a href="https://x.com/kc_srk">X</a> or <a href="https://bsky.app/profile/kcsrk.info">Bluesky</a>,
where I am likely to announce any open positions.</p>]]></content><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><category term="OCaml" /><category term="IITM" /><category term="Community" /><summary type="html"><![CDATA[Recently, I posted on X and LinkedIn that I am always looking for excellent people to join my group. I received a lot of enquiries, some of which led to internship hires (yay!). But mostly, I seemed to offer similar advice. I thought I’d write a post that summarises my responses.]]></summary></entry><entry><title type="html">Off-CPU-time analysis</title><link href="https://kcsrk.info/ocaml/offcputime/bpfcc/2024/07/24/offcputime-analysis/" rel="alternate" type="text/html" title="Off-CPU-time analysis" /><published>2024-07-24T09:48:00+00:00</published><updated>2024-07-24T09:48:00+00:00</updated><id>https://kcsrk.info/ocaml/offcputime/bpfcc/2024/07/24/offcputime-analysis</id><content type="html" xml:base="https://kcsrk.info/ocaml/offcputime/bpfcc/2024/07/24/offcputime-analysis/"><![CDATA[<p>Off-CPU analysis records and analyses the behavior of a program while it is
not running on the CPU. See <a href="https://www.brendangregg.com/offcpuanalysis.html">Brendan Gregg’s eBPF-based off-CPU
analysis</a>. While on-CPU
performance monitoring tools such as <code class="language-plaintext highlighter-rouge">perf</code> give you an idea of where the
program is <em>actively</em> spending its time, they won’t tell you where the program
is spending time <em>blocked</em> waiting for an action. Off-CPU analysis reveals
information about where the program is spending time <em>passively</em>.</p>

<!--more-->

<h2 id="installation">Installation</h2>

<p>Install the tools from <a href="https://github.com/iovisor/bcc/">https://github.com/iovisor/bcc/</a>.</p>

<h2 id="enabling-frame-pointers">Enabling frame pointers</h2>

<p>The off-CPU stack trace collection, <code class="language-plaintext highlighter-rouge">offcputime-bpfcc</code>, requires the programs to
be compiled with frame pointers for full backtraces.</p>

<h3 id="ocaml">OCaml</h3>

<p>For OCaml, you’ll need a compiler variant with frame pointers enabled. If you
are installing a released compiler using <code class="language-plaintext highlighter-rouge">opam</code>, you can create one with the following
switch command: <code class="language-plaintext highlighter-rouge">opam switch create 5.2.0+fp 5.2.0 ocaml-option-fp</code>. Change out
<code class="language-plaintext highlighter-rouge">5.2.0</code> for your preferred OCaml version.</p>

<p>Alternatively, if you are building the OCaml compiler from source, <code class="language-plaintext highlighter-rouge">configure</code> the
compiler with the <code class="language-plaintext highlighter-rouge">--enable-frame-pointers</code> option:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ./configure --enable-frame-pointers
</code></pre></div></div>

<p>Lastly, there is an option to create an opam switch with the development branch
of the compiler. The instructions are in <code class="language-plaintext highlighter-rouge">ocaml/HACKING.adoc</code>. In order to
create an opam switch from the current working directory, do:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ opam switch create . 'ocaml-option-fp' --working-dir
</code></pre></div></div>
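
<p>Before profiling, it can be worth confirming that the compiler in the active switch was actually built with frame pointers. The following is a minimal sketch, assuming <code class="language-plaintext highlighter-rouge">ocamlopt</code> is on your <code class="language-plaintext highlighter-rouge">PATH</code> and that your compiler version supports <code class="language-plaintext highlighter-rouge">-config-var</code> and reports a <code class="language-plaintext highlighter-rouge">with_frame_pointers</code> configuration variable. The function name <code class="language-plaintext highlighter-rouge">ocaml_has_frame_pointers</code> is made up for illustration.</p>

```shell
# Hypothetical helper: succeeds (exit status 0) only when the ocamlopt found
# on PATH reports that it was configured with frame pointers.
# Assumption: the compiler exposes a `with_frame_pointers` config variable.
ocaml_has_frame_pointers() {
  [ "$(ocamlopt -config-var with_frame_pointers)" = "true" ]
}

if ocaml_has_frame_pointers; then
  echo "frame pointers: enabled"
else
  echo "frame pointers: not enabled (or ocamlopt not found)"
fi
```

<p>If the check reports that frame pointers are not enabled, the switch was probably created without <code class="language-plaintext highlighter-rouge">ocaml-option-fp</code>, and the stack traces collected later are likely to be truncated.</p>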

<h2 id="glibc">glibc</h2>

<p>The libc is not compiled with frame pointers by default. This will lead to many
truncated stack traces. On Ubuntu, I did the following to get a glibc with frame
pointers enabled:</p>

<ol>
  <li>Install glibc with frame pointers
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo apt install libc6-prof
</code></pre></div>    </div>
  </li>
  <li>LD_PRELOAD the glibc with frame pointers
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ LD_PRELOAD=/lib/libc6-prof/x86_64-linux-gnu/libc.so.6 ./myapp.exe
</code></pre></div>    </div>
  </li>
</ol>

<h2 id="running">Running</h2>

<p>In one terminal, run the program that you want to analyze:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ LD_PRELOAD=/lib/libc6-prof/x86_64-linux-gnu/libc.so.6 ./ocamlfoo.exe
</code></pre></div></div>

<p>In another terminal, run the <code class="language-plaintext highlighter-rouge">offcputime-bpfcc</code> tool:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo offcputime-bpfcc --stack-storage-size 2097152 -p $(pgrep -f ocamlfoo.exe) 10 &gt; offcputime.out
</code></pre></div></div>

<p>The command instruments the program, watches it for 10 seconds and then writes out the stack traces
corresponding to blocking calls to <code class="language-plaintext highlighter-rouge">offcputime.out</code>. We use a large stack
storage size argument so as to not lose stack traces. Otherwise, you will see
many <code class="language-plaintext highlighter-rouge">[Missing User Stack]</code> errors in the backtraces.</p>
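
<p>The raw output can be long, so a quick ranking of the stacks by total blocked time may help. The sketch below assumes the tool was run with its folded-output flag (<code class="language-plaintext highlighter-rouge">-f</code>), so that each line has the form <code class="language-plaintext highlighter-rouge">frame;frame;... count</code>, with the count being off-CPU time in microseconds. The helper name <code class="language-plaintext highlighter-rouge">top_offcpu</code> and the file name <code class="language-plaintext highlighter-rouge">offcputime.folded</code> are hypothetical, and this is not part of the workflow described above.</p>

```shell
# Hypothetical helper: top_offcpu FILE [N]
# Sums the trailing count (assumed to be microseconds off-CPU) for each
# distinct folded stack in FILE and prints the N largest totals (default 10)
# as "TOTAL STACK", largest first.
top_offcpu() {
  awk '{
         us = $NF                 # last field is the count
         $NF = ""                 # drop it, leaving only the stack
         sub(/ +$/, "")           # trim the trailing separator
         total[$0] += us
       }
       END { for (s in total) printf "%d %s\n", total[s], s }' "$1" |
    sort -rn | head -n "${2:-10}"
}
```

<p>For example, <code class="language-plaintext highlighter-rouge">top_offcpu offcputime.folded 5</code> would list the five stacks with the most accumulated off-CPU time. Note that rebuilding the line in <code class="language-plaintext highlighter-rouge">awk</code> collapses runs of whitespace inside frame names, which is acceptable for a quick look but not for exact symbol matching.</p>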

<h2 id="caveats">Caveats</h2>

<p><code class="language-plaintext highlighter-rouge">offcputime-bpfcc</code> must run longer than the program being instrumented by a few
seconds so that the function symbols are resolved. Otherwise you may see
<code class="language-plaintext highlighter-rouge">[unknown]</code> in the backtrace for function names.</p>

<h2 id="oddities">Oddities</h2>

<p>I still see an order of magnitude difference between the maximum pauses observed
using <code class="language-plaintext highlighter-rouge">offcputime-bpfcc</code> and <code class="language-plaintext highlighter-rouge">olly trace</code>. Something is off.</p>

<h2 id="other-links">Other links</h2>

<ul>
  <li><a href="https://www.pingcap.com/blog/how-to-trace-linux-system-calls-in-production-with-minimal-impact-on-performance/">https://www.pingcap.com/blog/how-to-trace-linux-system-calls-in-production-with-minimal-impact-on-performance/</a></li>
</ul>]]></content><author><name>KC Sivaramakrishnan</name><email>sk826@cl.cam.ac.uk</email></author><category term="OCaml" /><category term="offcputime" /><category term="bpfcc" /><summary type="html"><![CDATA[Off-CPU analysis records and analyses the behavior of a program while it is not running on the CPU. See Brendan Gregg’s eBPF-based off-CPU analysis. While on-CPU performance monitoring tools such as perf give you an idea of where the program is actively spending its time, they won’t tell you where the program is spending time blocked waiting for an action. Off-CPU analysis reveals information about where the program is spending time passively.]]></summary></entry></feed>