Continuous Benchmarking & Call for Benchmarks
13 Sep 2018
Over the past few weeks, at OCaml Labs, we’ve deployed continuous benchmarking infrastructure for Multicore OCaml. Live results are available at http://ocamllabs.io/multicore. Continuous benchmarking has already enabled us to make informed decisions about the impact of our changes, and should come in handy over the next few months as we polish and tune the multicore runtime.
Currently, the benchmarks are all single-threaded and run on x86-64. Our current aim is to quantify the performance impact of running single-threaded OCaml programs using the multicore compiler. Moving forward, we would like to include multi-threaded benchmarks and other architectures.
The benchmarks and the benchmarking infrastructure were adapted from OCamlPro’s benchmark suite aimed at benchmarking Flambda optimisation passes. The difference with the new infrastructure is that all the data is generated as static HTML and CSV files with data processing performed on the client side in JavaScript. I find the new setup easier to manage and deploy.
Quality of benchmarks
If you look at the results, you will see that multicore is slowest relative to trunk OCaml on menhir-standard and menhir-fancy. But if you look closely, these benchmarks complete in less than 10 milliseconds. This is not enough time to faithfully compare the implementations, as constant factors such as runtime initialisation and the cost of a single untimely major GC dominate any useful work. In fact, almost half of the benchmarks complete within a second. The quality of this benchmark suite ought to be improved.
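As a rough way to check whether a candidate program runs long enough to be a useful benchmark, something like the following sketch could be used. The names time_it, workload, min_seconds and long_enough are hypothetical, the workload is a stand-in for a real program, and the one-second threshold is an assumption chosen for illustration rather than a rule of the benchmark suite.

```ocaml
(* Hypothetical helper: run a thunk and report the processor time it used,
   measured with Sys.time from the standard library. *)
let time_it f =
  let start = Sys.time () in
  let result = f () in
  let elapsed = Sys.time () -. start in
  (result, elapsed)

(* Stand-in workload; replace with the real program's entry point. *)
let workload n =
  let acc = ref 0 in
  for i = 1 to n do
    acc := (!acc + i * i) mod 1_000_003
  done;
  !acc

(* Assumed threshold: runs shorter than this are considered too short
   for constant costs (startup, a stray major GC) to be negligible. *)
let min_seconds = 1.0

let long_enough elapsed = elapsed >= min_seconds

let () =
  let _, elapsed = time_it (fun () -> workload 10_000_000) in
  Printf.printf "elapsed: %.3fs, long enough: %b\n" elapsed (long_enough elapsed)
```

If the run is reported as too short, increasing the input size is usually preferable to looping over a tiny input, since a larger input is more likely to exercise the allocator and GC the way a real program would.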
Call for benchmarks
While we want longer running benchmarks, we would also like those benchmarks to represent real OCaml programs found in the wild. If you have long running real OCaml programs, please consider adding them to the benchmark suite. Your contribution will ensure that performance-oriented OCaml features such as multicore and flambda are evaluated on representative OCaml programs.
How to contribute
Make a PR to the multicore branch of ocamllabs/ocamlbench-repo.
The packages directory contains many examples of how to prepare programs for benchmarking. Among these, numerical-analysis-bench and menhir-bench are simple and illustrative.
The benchmarks themselves are run using these scripts.
Dockerfile
There is a handy Dockerfile to test the benchmarking setup:
This builds the docker image for the benchmarking infrastructure. You can run the benchmarks as:
You can view the results by:
Now on your host machine, point your browser to localhost:8080 to interactively visualise the benchmark results.
Caveats
Aim to get your benchmark compiling with OCaml 4.06.1. You might have trouble getting your benchmark to compile with the multicore compiler for several reasons:
- The multicore compiler has syntax extensions for algebraic effect handlers, which break packages that use ppx.
- The multicore compiler has a different C API, which breaks core dependencies such as Lwt.
- Certain features such as marshalling closures and custom tag objects are unimplemented.
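To get a feel for the closure marshalling caveat, the sketch below uses the standard library's Marshal module to test whether a value can be serialised without the Marshal.Closures flag. The helper name marshals_without_closures is hypothetical. Note that Invalid_argument is also raised for abstract values, so a false result only indicates that the value may contain a closure, not that it definitely does.

```ocaml
(* Hypothetical helper: true if the value can be marshalled without
   the Marshal.Closures flag, i.e. it holds no functional (or abstract)
   values as far as the marshaller is concerned. *)
let marshals_without_closures v =
  match Marshal.to_string v [] with
  | _ -> true
  | exception Invalid_argument _ -> false

let () =
  (* Plain data marshals without any flags. *)
  Printf.printf "list: %b\n" (marshals_without_closures [1; 2; 3]);
  (* A function value needs Marshal.Closures, which the caveat above
     lists as unimplemented in the multicore compiler. *)
  Printf.printf "closure: %b\n" (marshals_without_closures (fun x -> x + 1))
```

A program whose data passes this check everywhere it calls Marshal is less likely to run into this particular limitation.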
If you encounter trouble submitting benchmarks, please open an issue on the kayceesrk/ocamlbench-scripts repo.