KC Sivaramakrishnan

Shrinking the OxCaml js_of_ocaml bundle: 285 MB to 4 MB

2026-05-10T11:00:00+00:00

In the previous post on capsules, I cheated. The lecture I was adapting (from my CS6868 course on language abstractions for parallelism) used Await_capsule.Mutex.with_lock, the recommended non-deprecated way to acquire a capsule mutex, but the post shipped Capsule_blocking_sync.Mutex instead with the deprecation alert silenced. The reason was bundle size: the await library, once we chased its transitive dependencies through base, sexplib0, base_quickcheck and the rest of Jane Street’s runtime, would have ballooned the in-browser toplevel by roughly 285 MB. The right API would not even fit through GitHub’s 100 MB per-file push limit, let alone be reasonable to send to a reader’s browser.

This post is the story of how we got from 285 MB down to 4 MB and made the resulting bundle compose cleanly with the in-browser toplevel, so the lecture’s Await_capsule form works end-to-end in the cell at the bottom of this post. Most of the work happened on a branch of ocsigen/js_of_ocaml, with a smaller piece in art-w/x-ocaml, the WebComponent that powers the cells.

Why bundle size matters

I teach two OCaml-heavy courses at IIT Madras: CS3100, the undergraduate functional programming course, and CS6868, the more recent graduate course on language abstractions for parallelism. The lecture notes, examples and homework for both would be much more useful as interactive books that a student can read, edit and run entirely client-side, with no local installation. The same shape would help us when we run hands-on OCaml and OxCaml workshops, where the first session routinely gets eaten by the installation hump: getting opam, the compiler and the required libraries working on every attendee’s machine over patchy conference WiFi, before the teaching can begin.

The broader effort to make installation painless is the OCaml Platform roadmap, which we have been working on at Tarides as a “zero to OCaml in one click” experience. That roadmap targets a developer who wants a real local toolchain, with the full editor, debugger and project-management story, and a generous latency budget since this is a one-time setup. A workshop attendee has a much narrower target: just enough OCaml to complete the exercises in front of them. The client-side x-ocaml toplevel fits that target naturally, because everything ships as static assets and there is no installation step. The bundle, in this setting, is the latency budget: 285 MB makes the in-browser path unshippable, 4 MB makes it a realistic alternative to a local toolchain for a 90-minute session.

Why 285 MB?

The recipe x-ocaml already had for “load extra libraries into a running in-browser toplevel” goes like this. For each cma you want to ship, run

$ js_of_ocaml --toplevel .cma -o lib.js

then concatenate the per-cma outputs into a single bundle and load it via

Internship positions



Internship enquiries are the most frequent ones that I receive. Here’s how you
can make it work.  Please do go through my web page to
look at what areas I work on. Write to me about what interests you and what your
goals are.

My group works on systems. To make the internship work well, we require that you
have demonstrable systems building experience. Do build projects that go beyond
your coursework.  Make sure that the projects are developed publicly on GitHub
or other similar platforms so that one can take a look at what you’ve
built. Even better are contributions to other open-source projects.

The group solves systems problems with functional programming. If you have prior
experience with functional programming, such as building small projects with
OCaml, Haskell, Scala, Scheme or other languages, it is easier for me to assess
your interest. That said, if you are great at any programming language, having
built non-trivial projects in any language, then you have the right skills for
internships in my group. Generally, I expect the interns to have done coursework
on OS, compilers and computer architecture. Significant projects in any of
those areas are a huge plus.

I should clarify that my recommendation letters for graduate programs will
reflect my honest assessment of the internship. I will decline writing a
recommendation letter if I think I may not be able to provide a strong one.

I do not work on projects that are primarily AI/ML or Web Development. If you
write to me looking for projects in those areas, it is very likely that you
won’t hear from me. Please don’t bulk email faculty CCing or BCCing everyone in
the department. It is likely that no one will read such an email.

PhD/MS/MTech positions

For academic positions, please have a look at https://research.iitm.ac.in/.
There are alternative ways to enter MS and PhD positions by being a research
associate and completing some coursework at IITM. For more information, see
here.

Contributing to the OCaml community

A significant chunk of the enquiries were from folks who hold full-time
positions looking to be involved in the research group. Unfortunately, making
part-time positions work is a challenge for both sides. I would encourage
contributions to the wider OCaml community.

There are several great ways to get involved with the community. Here’s what I
usually recommend.


  Learn the basics.
    
      Go through the OCaml part of my CS3100 course. The course has a YouTube
playlist and programming assignments. Complete the programming assignments.
      Read the Real World OCaml book.
      There are lots of other resources at OCaml.org, the official website of the OCaml community and the ecosystem.
    
  
  Join the community.
    
      OCaml discord and discuss are great places to hang out with other OCaml folks and ask questions.
      Discord is better for quick clarifications and discuss for longer form discussions.
    
  
  Look for “good first issues” in the OCaml projects and work on them
    
      Check out the core platform tools under the OCaml GitHub org. See OCaml compiler, dune build system, opam package manager, ocaml.org, etc.
      Across the wider ecosystem – SemGrep, OpenGrep, Rocq, etc.
    
  
  Work on self-directed projects. Here is my list of ideas.


The OCaml community also participates in Outreachy
internships. Outreachy internships are paid
internships for underrepresented groups. They are a great way to contribute to the
community while being mentored by folks from the OCaml community. Here’s a nice
intro (in Tamil) to the
impact that the Outreachy program had on an Outreachy intern. Look out for
announcements about
Outreachy internships in the OCaml discuss forum.

Research Associate positions

This is for folks who want to contribute to the core research programme but do
not see themselves joining academic programs. The expectation here is that you
are an experienced systems engineer who would easily qualify
for the internship positions in the group.

One useful way to look at this position is similar to a research software
development engineer who helps build out the systems used for research or
translate research to practice. In the past, research associates have helped
upstream multicore
OCaml.
The easiest way to get into this role would be to do an internship, see whether
you like this area, do well in the internship and then choose to apply for a
research associate position.

Another variant is a post-bacc or a pre-doc position aimed at highly motivated
recent graduates, who are looking to build research experience. The expectation
here is that we get papers into top venues in PL and Systems. For such students,
I recommend going through my CS6225 Programs and Proofs
course: watch the video lectures and complete the assignments.
The course is not an easy one, but will expose you to the broad area of PL and
specifically to deductive program verification. At the very least, you will come
out with an understanding of what it is to think rigorously about program
correctness.

Research associate positions are fixed-term positions. To make this
work, the tenure should be at least 18 months.

Summary

While I may not be hiring actively all the time, do reach out to me if you are
interested in any of the above. Please follow me on
LinkedIn,
X or Bluesky,
where I am likely to announce any open positions.



Off-CPU-time analysis
2024-07-24T09:48:00+00:00
Off-CPU analysis records and analyses what a program is doing while it is not
running on a CPU. See Brendan Gregg’s eBPF-based off-CPU
analysis. While on-CPU
performance monitoring tools such as perf give you an idea of where the
program is actively spending its time, they won’t tell you where the program
is spending time blocked waiting for an action. Off-CPU analysis reveals
information about where the program is spending time passively.



Installation

Install the tools from https://github.com/iovisor/bcc/.

Enabling frame pointers

The off-CPU stack trace collection, offcputime-bpfcc, requires the programs to
be compiled with frame pointers for full backtraces.

OCaml

For OCaml, you’ll need a compiler variant with frame pointers enabled. If you
are installing a released compiler using opam, you can create one with the
following switch command: opam switch create 5.2.0+fp 5.2.0 ocaml-option-fp.
Replace 5.2.0 with your preferred OCaml version.

Instead, if you are building the OCaml compiler from source, configure the
compiler with the --enable-frame-pointers option:

$ ./configure --enable-frame-pointers


Lastly, there is an option to create an opam switch with the development branch
of the compiler. The instructions are in ocaml/HACKING.adoc. In order to
create an opam switch from the current working directory, do:

$ opam switch create . 'ocaml-option-fp' --working-dir
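
Whichever route you take, it is worth confirming that the resulting compiler really has frame pointers enabled before collecting traces. One way is to inspect the output of ocamlopt -config, which includes a with_frame_pointers field. The following is a minimal Python sketch of that check; the helper names has_frame_pointers and check_current_switch are hypothetical and not part of any tool mentioned here, and the sketch assumes the ocamlopt of the switch under test is first on PATH.

```python
import subprocess


def has_frame_pointers(config_text: str) -> bool:
    """Return True if `ocamlopt -config` output reports frame pointers."""
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "with_frame_pointers":
            return value.strip() == "true"
    # Field not present: treat the compiler as not having frame pointers.
    return False


def check_current_switch() -> bool:
    """Run `ocamlopt -config` (assumed to be on PATH) and check its output."""
    result = subprocess.run(
        ["ocamlopt", "-config"], capture_output=True, text=True, check=True
    )
    return has_frame_pointers(result.stdout)
```

Calling check_current_switch() inside the new switch should return True; if it returns False, the backtraces collected later are likely to be truncated.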


glibc

The libc is not compiled with frame pointers by default. This will lead to many
truncated stack traces. On Ubuntu, I did the following to get a glibc with frame
pointers enabled:


  Install glibc with frame pointers
    $ sudo apt install libc6-prof
    
  
  LD_PRELOAD the glibc with frame pointers
    $ LD_PRELOAD=/lib/libc6-prof/x86_64-linux-gnu/libc.so.6 ./myapp.exe
    
  


Running

On one terminal run the program that you want to analyze:

$ LD_PRELOAD=/lib/libc6-prof/x86_64-linux-gnu/libc.so.6 ./ocamlfoo.exe


On another terminal run offcputime-bpfcc tool:

$ sudo offcputime-bpfcc --stack-storage-size 2097152 -p $(pgrep -f ocamlfoo.exe) 10 > offcputime.out


The command instruments the program, watches it for 10 seconds, and then writes
the stack traces corresponding to blocking calls to offcputime.out. We use a
large stack storage size argument so as not to lose stack traces. Otherwise, you
will see many [Missing User Stack] errors in the backtraces.
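
The raw output can be long, so a small script helps to find the stacks that account for most of the blocked time. The sketch below assumes the tool was run with the additional -f flag, which is not used in the command above. With -f, offcputime-bpfcc emits folded output: one stack per line, frames separated by semicolons, followed by a space and the total off-CPU time in microseconds. The function name top_offcpu_stacks is hypothetical.

```python
from collections import Counter


def top_offcpu_stacks(folded_lines, n=5):
    """Return the n stacks with the most off-CPU time as (stack, microseconds)."""
    totals = Counter()
    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        # The time is the last space-separated field; frames may contain spaces.
        stack, _, micros = line.rpartition(" ")
        if not stack or not micros.isdigit():
            continue  # skip lines that are not in folded format
        totals[stack] += int(micros)
    return totals.most_common(n)


def summarize(path, n=5):
    """Print the top n stacks from a folded offcputime output file."""
    with open(path) as f:
        for stack, micros in top_offcpu_stacks(f, n):
            print(f"{micros / 1000:.1f} ms  {stack}")
```

For example, summarize("offcputime.out") prints the five stacks with the largest total blocked time, in milliseconds. The same folded file can also be fed to a flame graph generator.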

Caveats

offcputime-bpfcc must run longer than the program being instrumented by a few
seconds so that the function symbols are resolved. Otherwise you may see
[unknown] in the backtrace for function names.

Oddities

I still see an order of magnitude difference between the maximum pauses observed
using offcputime-bpfcc and olly trace. Something is off.

Other links


  https://www.pingcap.com/blog/how-to-trace-linux-system-calls-in-production-with-minimal-impact-on-performance/