OCaml Meta-programming With PPX


OCaml PPX

OCaml does not include a built-in metaprogramming system as part of the language. Instead, OCaml provides extension nodes and attributes in its syntax. Extension nodes have the form [%<identifier> <expr>], with an additional % sign ([%%<identifier> ...]) when the node stands in for a structure or signature item. There is also a shorthand form, let%<identifier> <expr>, which is equivalent to the bracketed form. Attributes use a similar syntax, for example [@<identifier> <payload>]. Both are explained in more detail in the OCaml user manual.

The PPX metaprogramming system builds upon these syntax features and runs before compilation. PPX stands for "preprocessing extension". OCaml has had several metaprogramming systems over the years, but one has prevailed: PPX. There are two types of PPXs: derivers and extensions. Derivers derive a new AST node from an already existing type. Extensions take an existing AST and return a modified AST as output.
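
As a small illustration of the two syntactic forms, the sketch below uses only what the compiler understands without any PPX installed. It assumes that the compiler ignores attributes it does not recognise (my_attr is a made-up name) and uses the built-in extension_constructor extension node described in the OCaml manual. The type shape and its constructors are hypothetical.

```ocaml
(* An attribute attached to an expression. `my_attr` is a made-up name:
   no PPX consumes it here, so the expression is evaluated unchanged. *)
let answer = (40 + 2) [@my_attr "any payload"]

(* An extension node that the compiler handles itself: it returns the
   runtime slot of a constructor of an extensible variant type. *)
type shape = ..
type shape += Circle of float | Square of float

let circle_slot = [%extension_constructor Circle]
let square_slot = [%extension_constructor Square]

(* Two different constructors have two different slots. *)
let () = Printf.printf "%d %b\n" answer (circle_slot <> square_slot)
```

Any other extension node, such as a hypothetical [%my_ext ...], is rejected by the compiler unless a PPX rewrites it first.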

Why I like the PPX system in OCaml

Short and concise:

  • PPX transformations operate on the parsed, untyped AST (the Parsetree), not on raw text, which makes them more structured and less error-prone than textual macros.
  • They integrate seamlessly with the OCaml compiler toolchain and build systems like dune.
  • PPXs are composable: multiple PPXs can run in sequence on the same codebase.
  • Writing PPXs deepens your understanding of OCaml’s compiler internals and the Parsetree.
  • Once you understand PPX, you can create your own domain-specific mini-languages within OCaml.

Derivers

Derivers generate code that you could almost write by hand, for example a function that converts a value of some type to a string. They operate on a type definition and generate new code. More precisely, they generate a new AST node in the program based on an existing one. More often than not, this node is limited to a specific type. To generate a new node you have to construct a (sub-)Parsetree. This can be done with ppxlib's Ast_builder.Default helper functions. However, this is error-prone and can quickly become hard to read. Instead, you can use the ppxlib.metaquot extensions: a PPX for writing PPXs. It takes OCaml syntax and converts it to the equivalent Parsetree.

(* example from "Meta-programming with PPX"*)
let rec expr_of_type typ =
  let loc = typ.ptyp_loc in
  match typ with
  | [%type: int] -> [%expr string_of_int]
  | [%type: string] -> [%expr fun i -> i]
  | [%type: bool] -> [%expr string_of_bool]
  | [%type: float] -> [%expr string_of_float]
  | [%type: [%t? t] list] ->
      [%expr
        fun lst ->
          "["
          ^ List.fold_left
              (fun acc s -> acc ^ [%e expr_of_type t] s ^ ";")
              "" lst
          ^ "]"]
  | _ ->
      Location.raise_errorf ~loc "No support for this type: %s"
        (string_of_core_type typ)
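
To make the result more concrete, here is a hand-written sketch of the functions that expr_of_type would build for the types int list and string list, assuming the [%e ...] splice is simply replaced by the expression returned for the element type. The names string_of_int_list and string_of_string_list are hypothetical. A real deriver would produce the expression, not these bindings.

```ocaml
(* Hand-written equivalent of the expression built for `int list`:
   the splice [%e expr_of_type t] becomes `string_of_int`. *)
let string_of_int_list : int list -> string =
 fun lst ->
  "["
  ^ List.fold_left (fun acc s -> acc ^ string_of_int s ^ ";") "" lst
  ^ "]"

(* For `string list`, the splice becomes the identity function
   `fun i -> i`, exactly as in the `string` case of expr_of_type. *)
let string_of_string_list : string list -> string =
 fun lst ->
  "["
  ^ List.fold_left (fun acc s -> acc ^ (fun i -> i) s ^ ";") "" lst
  ^ "]"

let () = print_endline (string_of_int_list [ 1; 2; 3 ])
(* prints [1;2;3;] *)
```

Note the trailing semicolon before the closing bracket: the fold appends ";" after every element, including the last one.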

Extensions

Extensions do not add a new node to the AST. Instead, they modify an existing AST node. The core functionality of an extension is defined in the expand function, which, apart from the expansion context it also receives, has the signature Ppxlib.expression -> Ppxlib.expression. The following example from the blog “Meta-programming with PPX” demonstrates this idea. It shows how to use an extension to convert a list of the form `('key * 'value) list` into a hash table.

let get_tuple ~loc = function
  | { pexp_desc = Pexp_tuple [ key; value ]; _ } -> (key, value)
  | _ -> Location.raise_errorf ~loc "Expected a list of tuple pairs"

let rec handle_list ~loc = function
  | [%expr []] -> []
  | [%expr [%e? pair] :: [%e? tl]] ->
      let k, v = get_tuple ~loc pair in
      let add = [%expr fun tbl -> Hashtbl.add tbl [%e k] [%e v]] in
      let rest = handle_list ~loc tl in
      add :: rest
  | _ -> Location.raise_errorf ~loc "Expected a list of tuple pairs"

let expand ~ctxt expr =
  let loc = Expansion_context.Extension.extension_point_loc ctxt in
  match expr with
  | [%expr []] -> [%expr Hashtbl.create 10]
  | [%expr [%e? _pair] :: [%e? _]] ->
      let fun_list = handle_list ~loc expr in
      let len = List.length fun_list in
      [%expr
        Hashtbl.create [%e eint ~loc len] |> fun tbl ->
        List.iter (fun f -> f tbl) [%e elist ~loc fun_list];
        tbl]
  | _ -> Location.raise_errorf ~loc "Expected a list"

I’ll try to explain this snippet in my own words. The function expand takes a labelled parameter ctxt and an expression (the expression that will be modified). The expansion context is used to extract the source location, which is useful for error messages and debugging. In the case of an empty list, a hash table with an initial capacity of 10 is created. The main case matches any non-empty list of the form “head :: tail”. This list is passed to the handle_list function, which generates a list of functions (bound to fun_list), each of which adds one key-value pair to a hash table. To achieve this, the helper function get_tuple extracts the key and value from each tuple in the list. If the expression has any other form, the last match case raises an error with the location. To better understand what the whole thing does, I’ll show a literal example:

let tbl = [%hashtbl [ ("Hello", 1) ]] in (* is the same as...*)
let%hashtbl tbl2 = [ ("Hello", 2) ] in (* only other syntax *)
print_int (Hashtbl.find tbl "Hello");
print_int (Hashtbl.find tbl2 "Hello")

(* output; print_int does not emit a newline *)
12

The first binding is roughly equivalent to:

let tbl = Hashtbl.create 1 in
Hashtbl.add tbl "Hello" 1;
print_int (Hashtbl.find tbl "Hello")

First, the extension takes the list expression [("Hello", 1)], which has length one, and converts it to Hashtbl.create 1. Then it adds the elements with Hashtbl.add tbl "Hello" 1. In short, the extension expands to code that creates a hash table with an initial capacity equal to the length of the list and adds the elements of the list to it. This can save the programmer a lot of keystrokes.
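
The snippet above is a simplification. Reading the expand function literally, the generated expression keeps the pipe and the list of closures. Below is a hand-written sketch of that shape for a two-element list, assuming metaquot splices the pieces exactly as they are written in expand. The keys and values are made up for illustration.

```ocaml
(* Hand-written version of what
     [%hashtbl [ ("Hello", 1); ("World", 2) ]]
   would expand to, following the shape used in `expand`. *)
let tbl =
  (* `eint ~loc len` becomes the literal list length. *)
  Hashtbl.create 2 |> fun tbl ->
  (* `elist ~loc fun_list` becomes a list of one closure per pair. *)
  List.iter
    (fun f -> f tbl)
    [
      (fun tbl -> Hashtbl.add tbl "Hello" 1);
      (fun tbl -> Hashtbl.add tbl "World" 2);
    ];
  tbl

let () = Printf.printf "%d %d\n" (Hashtbl.find tbl "Hello") (Hashtbl.length tbl)
(* prints 1 2 *)
```

Building a list of closures keeps handle_list simple, since each pair maps to exactly one expression. A sequence of direct Hashtbl.add calls would behave the same way.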

NOTE: The metaquot extensions in ppxlib make extension and deriver code much more readable and improve developer ergonomics. In the process, they also show how powerful extensions can be. Another good example of an extension is the tyxml library for creating HTML documents within OCaml.

Sources