KC Sivaramakrishnan Building Functional Systems

Shrinking the OxCaml js_of_ocaml bundle: 285 MB to 4 MB

In the previous post on capsules, I cheated. The lecture I was adapting (from my CS6868 course on language abstractions for parallelism) used Await_capsule.Mutex.with_lock, the recommended non-deprecated way to acquire a capsule mutex, but the post shipped Capsule_blocking_sync.Mutex instead with the deprecation alert silenced. The reason was bundle size: the await library, once we chased its transitive dependencies through base, sexplib0, base_quickcheck and the rest of Jane Street’s runtime, would have ballooned the in-browser toplevel by roughly 285 MB. The right API would not even fit through GitHub’s 100 MB per-file push limit, let alone be reasonable to send to a reader’s browser.

This post is the story of how we got from 285 MB down to 4 MB and made the resulting bundle compose cleanly with the in-browser toplevel, so the lecture’s Await_capsule form works end-to-end in the cell at the bottom of this post. Most of the work happened on a branch of ocsigen/js_of_ocaml, with a smaller piece in art-w/x-ocaml, the WebComponent that powers the cells.

Capsules: compile-time lock discipline in OxCaml

In the previous post we fixed the racy gensym with Portable.Atomic. That worked because the shared state was a single integer with atomic primitives. What about state that needs a hash table, a multi-step update, or any structure where atomics aren’t enough? OxCaml’s answer is the capsule: a way to bundle state with its lock so the lock discipline becomes a type-checker job rather than a programmer convention.

Data race freedom in OxCaml

A while back I wired up x-ocaml so this blog could embed live, editable OCaml notebooks. That post used a vanilla OCaml 5 toplevel. Today the toplevel running in your browser is built from OxCaml, the Jane Street fork of the compiler. That means we can prove a small parallel program is data-race free, interactively, without ever spawning a thread.

From Convergence to Confidence: Push-button verification for RDTs

What does it mean for a replicated data type to be correct? For most of the literature, my own prior work included, the answer has been convergence: two replicas that have applied the same operations end up in the same state. I argued in my PaPoC 2026 keynote last week that for many useful data types convergence is not enough, and agentic proof-oriented programming can help close the gap between convergence and confidence.

Foundations for hacking on OCaml

How do you acquire the fundamental computer skills to hack on a complex systems project like OCaml? What’s missing and how do you go about bridging the gap?

Testing x-ocaml, OCaml notebooks as a WebComponent

Can we have OCaml notebooks as pure client-side code? Can these notebooks have rich editor support (highlighting, formatting, types on hover, autocompletion, inline diagnostics, etc.)? Can you take packages from OPAM and use them in these notebooks?

The answer to all of these turns out to be a resounding yes, thanks to x-ocaml. This post is my experiment playing with x-ocaml and integrating it into this blog.

Linearity and uniqueness

In the last post, we looked at uniqueness mode and how uniqueness may be used to optimise. As we will see, uniqueness alone is insufficient in practice, and we also need a concept of linearity for uniqueness to be useful.

Uniqueness for Behavioural Types

Jane Street has been developing modal types for OCaml – an extension to the type system where modes track properties of values, such as their scope, thread sharing, and aliasing. These modes restrict which operations are permitted on values, enabling safer and more efficient systems programming. In this post, I focus on the uniqueness mode, which tracks aliasing, and show how it can eliminate certain runtime checks.

Joining my group

Recently, I posted on X and LinkedIn that I am always looking for excellent people to join my group. I received a lot of enquiries, some of which led to internship hires (yay!). But mostly, I seemed to offer similar advice. I thought I’d write a post that summarises my responses.

Off-CPU-time analysis

Off-CPU analysis records and analyses a program’s behaviour while it is not running on a CPU. See Brendan Gregg’s eBPF-based off-CPU analysis. While on-CPU performance monitoring tools such as perf give you an idea of where the program is actively spending its time, they won’t tell you where the program is spending time blocked waiting for an action. Off-CPU analysis reveals where the program is spending time passively.