Continuous Benchmarking & Call for Benchmarks

13 Sep 2018

Over the past few weeks at OCaml Labs, we’ve deployed continuous benchmarking infrastructure for Multicore OCaml. Live results are available at http://ocamllabs.io/multicore. Continuous benchmarking has already enabled us to make informed decisions about the impact of our changes, and should come in handy over the next few months as we polish and tune the multicore runtime.

Currently, the benchmarks are all single-threaded and run on x86-64. Our current aim is to quantify the performance impact of running single-threaded OCaml programs using the multicore compiler. Moving forward, we would like to include multi-threaded benchmarks and other architectures.

The benchmarks and the benchmarking infrastructure were adapted from OCamlPro’s benchmark suite aimed at benchmarking Flambda optimisation passes. The difference with the new infrastructure is that all the data is generated as static HTML and CSV files with data processing performed on the client side in JavaScript. I find the new setup easier to manage and deploy.

Quality of benchmarks

If you look at the results, you will see that multicore is slowest relative to trunk OCaml on menhir-standard and menhir-fancy. But if you look closely, these benchmarks complete in less than 10 milliseconds. This is not enough time to faithfully compare the implementations, as constant factors such as runtime initialisation and the cost of a single untimely major GC dominate any useful work. In fact, almost half of the benchmarks complete within a second. The quality of this benchmark suite ought to be improved.
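To make the concern about short-running benchmarks concrete, here is a minimal Python sketch that computes the slowdown of multicore relative to trunk and flags benchmarks that finish too quickly for the ratio to mean much. The benchmark names, the timings and the one-second cut-off are all made up for illustration; they are not taken from the live results, and this is not how the actual infrastructure processes its data.

```python
# Hypothetical wall-clock times in seconds: name -> (trunk, multicore).
# These numbers are invented for illustration; they are not measured results.
TIMINGS = {
    "short-bench": (0.008, 0.012),
    "medium-bench": (0.6, 0.66),
    "long-bench": (12.0, 12.6),
}

# Assumed cut-off below which we distrust the comparison.
MIN_RUNTIME_S = 1.0


def classify(timings, min_runtime_s=MIN_RUNTIME_S):
    """Return (name, slowdown, trustworthy) tuples, largest slowdown first."""
    rows = []
    for name, (trunk, multicore) in timings.items():
        slowdown = multicore / trunk  # > 1.0 means multicore is slower
        # A run is only considered trustworthy if both variants ran long
        # enough for start-up and GC noise to be amortised.
        trustworthy = min(trunk, multicore) >= min_runtime_s
        rows.append((name, slowdown, trustworthy))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, slowdown, trustworthy in classify(TIMINGS):
        note = "" if trustworthy else "  (too short to trust)"
        print(f"{name:<14} {slowdown:5.2f}x{note}")
```

In this made-up data set the benchmark with the worst-looking slowdown is also the shortest one, which is exactly the situation where the ratio says more about constant costs than about the compiler.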

Call for benchmarks

While we want longer-running benchmarks, we would also like those benchmarks to represent real OCaml programs found in the wild. If you have long-running, real OCaml programs, please consider adding them to the benchmark suite. Your contribution will ensure that performance-oriented OCaml features such as multicore and flambda are evaluated on representative OCaml programs.

How to contribute

Make a PR to the multicore branch of ocamllabs/ocamlbench-repo. The packages directory contains many examples of how to prepare programs for benchmarking. Among these, numerical-analysis-bench and menhir-bench are simple and illustrative.

The benchmarks themselves are run using these scripts.

Dockerfile

There is a handy Dockerfile to test the benchmarking setup:

$ docker build -t multicore-cb -f Dockerfile . #takes a while; grab a coffee

This builds the Docker image for the benchmarking infrastructure. You can run the benchmarks as follows:

$ docker run -p 8080:8080 -it multicore-cb bash
$ cd ~/ocamlbench-scripts
$ ./run-bench.sh --nowait --lazy #takes a while; grab lunch

You can view the results by serving them over HTTP from inside the container:

$ cd ~/logs/operf
$ python -m SimpleHTTPServer 8080

Now on your host machine, point your browser to localhost:8080 to interactively visualise the benchmark results.
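Note that the SimpleHTTPServer module only exists in Python 2; in Python 3 the same functionality lives in http.server (for example, python3 -m http.server 8080). If the Python in your environment is Python 3, the following is a small sketch of an equivalent server. The helper name make_server is hypothetical, and the default directory is simply the ~/logs/operf path used above.

```python
import functools
import http.server
import os


def make_server(directory="~/logs/operf", port=8080):
    """Build (but do not start) a static-file HTTP server for `directory`.

    `make_server` is a hypothetical helper for this sketch, not part of
    the benchmarking scripts.
    """
    root = os.path.expanduser(directory)
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=root
    )
    # Bind on all interfaces so that the -p 8080:8080 port mapping passed
    # to `docker run` can reach the server from the host.
    return http.server.ThreadingHTTPServer(("0.0.0.0", port), handler)
```

Calling make_server().serve_forever() inside the container should then serve the results on port 8080, just as the SimpleHTTPServer command does.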

Caveats

Aim to get your benchmark compiling with OCaml 4.06.1. You might have trouble getting your benchmark to compile with the multicore compiler for several reasons:

The multicore compiler has syntax extensions for algebraic effect handlers, which break packages that use ppx.
The multicore compiler has a different C API, which breaks core dependencies such as Lwt.
Certain features, such as marshalling closures and custom tag objects, are unimplemented.

If you run into trouble submitting benchmarks, please open an issue on the kayceesrk/ocamlbench-scripts repo.

KC Sivaramakrishnan Building Functional Systems
