OCaml Meta-programming With PPX
OCaml PPX
OCaml does not include a built-in metaprogramming system as part of the language. Instead, OCaml
provides extension nodes and attributes in its syntax. Extension nodes have the form [%<identifier> <payload>]; the number of % signs depends on where the node appears (for example, [%%<identifier> <payload>] for a top-level item). There is also a shorthand form, let%<identifier> x = ... in ..., which is equivalent to [%<identifier> let x = ... in ...]. Attributes use a similar
syntax, for example [@<identifier> <payload>]. Both are explained in more detail in the OCaml user
manual. The PPX metaprogramming system builds upon these syntax features and runs after parsing, before type checking and compilation. PPX stands for PreProcessor eXtension.
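To make the syntax concrete, here is a small sketch that compiles with the plain OCaml compiler, without any PPX. It only uses attributes and the one extension node that the compiler expands on its own. The attribute name note is made up for illustration, and the names double, answer and ctor_name are not from the original text.

```ocaml
(* Floating attribute (three @ signs): applies to the rest of the file.
   Here it switches off warning 32 (unused value). *)
[@@@warning "-32"]

(* Item attribute (two @ signs): attached to the whole definition.
   [@@inline] is a hint that the compiler understands on its own. *)
let double x = 2 * x [@@inline]

(* Expression attribute (one @ sign): attached to a single expression.
   "note" is a made-up attribute name; unknown attributes are ignored. *)
let answer = (double 21) [@note "hypothetical attribute, no effect"]

(* Extension node: [%extension_constructor ...] is expanded by the
   compiler itself, so it works without any PPX installed. *)
let ctor_name =
  Obj.Extension_constructor.name [%extension_constructor Not_found]

let () = Printf.printf "%d %s\n" answer ctor_name
```

Any other extension node, such as [%hashtbl ...], is rejected by the compiler unless a PPX rewrites it first.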
OCaml has had several metaprogramming systems over the years, but one has prevailed: PPX. There are two types of PPXs: derivers and extensions. Derivers derive a new AST node from an already existing
type. Extensions take an existing AST and return a modified AST as output.
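The idea of "AST in, modified AST out" can be sketched on a toy tree. This is only a model under loud assumptions: the expr type below is a three-constructor stand-in for the real Parsetree, and the "double" extension is hypothetical. It is not how ppxlib is used, only what an extension does conceptually.

```ocaml
(* A toy AST, standing in for OCaml's real Parsetree. *)
type expr =
  | Int of int
  | Add of expr * expr
  | Ext of string * expr  (* models an extension node [%name payload] *)

(* An "extension"-style pass: rewrite every [%double e] node into e + e
   and reject any extension node it does not know. *)
let rec expand = function
  | Int n -> Int n
  | Add (a, b) -> Add (expand a, expand b)
  | Ext ("double", e) ->
      let e' = expand e in
      Add (e', e')
  | Ext (name, _) -> failwith ("uninterpreted extension: " ^ name)

(* A tiny evaluator, only used to check the rewritten tree. *)
let rec eval = function
  | Int n -> n
  | Add (a, b) -> eval a + eval b
  | Ext (name, _) -> failwith ("cannot evaluate extension: " ^ name)

let () =
  let before = Add (Int 1, Ext ("double", Int 20)) in
  let after = expand before in
  assert (after = Add (Int 1, Add (Int 20, Int 20)));
  Printf.printf "%d\n" (eval after)
```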
Why I like the PPX system in OCaml
Short and concise:
- PPX transformations operate on the parsed, untyped AST (the Parsetree), not on raw text, which makes them structured and less error-prone than textual macros.
- They integrate seamlessly with the OCaml compiler toolchain and build systems like dune.
- PPXs are composable: multiple PPXs can safely run in sequence on the same codebase.
- Writing PPXs deepens your understanding of OCaml’s compiler internals and the Parsetree.
- Once you understand PPX, you can create your own domain-specific mini-languages within OCaml.
Derivers
Derivers generate code that you could almost write by hand, for example, converting a type to a string. They operate on a type definition and generate new code. More precisely, they generate a new AST node in
the program based on an existing one. More often than not, this node is tied to one specific
type. To generate a new node you have to construct a (sub-)Parsetree. This can be done with the
Ast_builder.Default helper functions from ppxlib. However, this is error-prone and can quickly become hard to read. Instead, you can use the ppxlib.metaquot extensions. It’s a PPX for writing
PPXs: it takes OCaml syntax and converts it to the equivalent Parsetree.
(* example from "Meta-programming with PPX" *)
let rec expr_of_type typ =
  let loc = typ.ptyp_loc in
  match typ with
  | [%type: int] -> [%expr string_of_int]
  | [%type: string] -> [%expr fun i -> i]
  | [%type: bool] -> [%expr string_of_bool]
  | [%type: float] -> [%expr string_of_float]
  | [%type: [%t? t] list] ->
      [%expr
        fun lst ->
          "["
          ^ List.fold_left
              (fun acc s -> acc ^ [%e expr_of_type t] s ^ ";")
              "" lst
          ^ "]"]
  | _ ->
      Location.raise_errorf ~loc "No support for this type: %s"
        (string_of_core_type typ)
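To see what this deriver produces, it can help to write out its result by hand. The sketch below is plain OCaml, not PPX code: it shows the expressions that expr_of_type would build for the types int list and bool list list, with each [%e ...] splice replaced by the generated sub-expression. The names show_int_list and show_bool_list_list are chosen here for illustration.

```ocaml
(* What expr_of_type would produce for [%type: int list], written out by
   hand: the int case (string_of_int) is spliced into the list case. *)
let show_int_list =
  fun lst ->
    "["
    ^ List.fold_left (fun acc s -> acc ^ string_of_int s ^ ";") "" lst
    ^ "]"

(* For [%type: bool list list] the list case is spliced into itself. *)
let show_bool_list_list =
  fun lst ->
    "["
    ^ List.fold_left
        (fun acc s ->
          acc
          ^ (fun lst ->
              "["
              ^ List.fold_left
                  (fun acc s -> acc ^ string_of_bool s ^ ";")
                  "" lst
              ^ "]")
              s
          ^ ";")
        "" lst
    ^ "]"

let () =
  print_endline (show_int_list [ 1; 2; 3 ]);
  print_endline (show_bool_list_list [ [ true ]; [ false; true ] ])
```

Note that every element is followed by a semicolon, including the last one, because the fold appends ";" after each item.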
Extensions
Extension rewriters do not add a new node to the AST; instead, they modify an existing AST node. The
core functionality of such an extension is defined in the expand function which, besides the labelled ~ctxt argument, has the signature
Ppxlib.expression -> Ppxlib.expression. The following example from the blog “Meta-programming with PPX” demonstrates this idea. It shows how to use an extension to convert a list of type `('key * 'value) list` into a hash table.
let get_tuple ~loc = function
  | { pexp_desc = Pexp_tuple [ key; value ]; _ } -> (key, value)
  | _ -> Location.raise_errorf ~loc "Expected a list of tuple pairs"
let rec handle_list ~loc = function
  | [%expr []] -> []
  | [%expr [%e? pair] :: [%e? tl]] ->
      let k, v = get_tuple ~loc pair in
      let add = [%expr fun tbl -> Hashtbl.add tbl [%e k] [%e v]] in
      let rest = handle_list ~loc tl in
      add :: rest
  | _ -> Location.raise_errorf ~loc "Expected a list of tuple pairs"
let expand ~ctxt expr =
  let loc = Expansion_context.Extension.extension_point_loc ctxt in
  match expr with
  | [%expr []] -> [%expr Hashtbl.create 10]
  | [%expr [%e? _pair] :: [%e? _]] ->
      let fun_list = handle_list ~loc expr in
      let len = List.length fun_list in
      [%expr
        Hashtbl.create [%e eint ~loc len] |> fun tbl ->
        List.iter (fun f -> f tbl) [%e elist ~loc fun_list];
        tbl]
  | _ -> Location.raise_errorf ~loc "Expected a list"
I’ll try to explain this snippet in my own words. The function expand takes
a named parameter ctxt and an expression (the expression that will be rewritten). The expansion
context is used to extract the source location, which is useful for error messages and debugging.
For an empty list, a hash table with an initial size of 10 is created. The
main case matches any list of the form “head :: tail”. This list is passed to the handle_list
function, which generates a list of “hash-table-add” functions that is bound to fun_list. To achieve this, handle_list uses
the helper function get_tuple, which extracts the key and the value from each pair in the list.
If the expression has any other form, the last match case raises an error with the location.
To better understand what the whole thing does I’ll show a literal example:
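let tbl = [%hashtbl [ ("Hello", 1) ]] in
(* note: let%hashtbl tbl2 = ... in ... would mean [%hashtbl let tbl2 = ... in ...],
   which the expand function above rejects, so the full form is used here too *)
let tbl2 = [%hashtbl [ ("Hello", 2) ]] in
Printf.printf "%d\n" (Hashtbl.find tbl "Hello");
Printf.printf "%d\n" (Hashtbl.find tbl2 "Hello")
(* output *)
1
2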
The first binding expands to code equivalent to:
let tbl = Hashtbl.create 1 in
Hashtbl.add tbl "Hello" 1;
Printf.printf "%d\n" (Hashtbl.find tbl "Hello")
First, the extension takes the list expression [("Hello", 1)], which has length one, and emits
Hashtbl.create 1. Then it adds the elements with Hashtbl.add tbl "Hello" 1. In short, this
extension expands to code that creates a hash table whose initial size matches the list length and adds the elements
of the list to it. This can save the programmer a lot of keystrokes.
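The literal output of expand is slightly more verbose than the simplified version. As a sketch, this is a hand-written equivalent of what the expand function above returns for a hypothetical two-element input, [%hashtbl [ ("Hello", 1); ("World", 2) ]]. It is plain OCaml and needs no PPX. The type annotation is added here for readability and is not part of the generated code.

```ocaml
(* Hand-written equivalent of what expand returns for
   [%hashtbl [ ("Hello", 1); ("World", 2) ]].
   The type annotation is an addition for readability. *)
let tbl : (string, int) Hashtbl.t =
  Hashtbl.create 2 |> fun tbl ->
  List.iter
    (fun f -> f tbl)
    [
      (fun tbl -> Hashtbl.add tbl "Hello" 1);
      (fun tbl -> Hashtbl.add tbl "World" 2);
    ];
  tbl

let () =
  Printf.printf "%d %d\n" (Hashtbl.find tbl "Hello") (Hashtbl.find tbl "World")
```

Each list element becomes one small function that performs a single Hashtbl.add, and List.iter applies them in the order of the original list.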
NOTE: The metaquot extensions in ppxlib make extension and deriver code much more readable and improve developer ergonomics. Writing extensions and derivers with them also shows how powerful extensions can be. Another good example of an extension is the
tyxml library for creating HTML documents within OCaml.