Publications
2025
Rao, Xiaojia; Radziuk, Stefan; Watt, Conrad; Gardner, Philippa
Progressful Interpreters for Efficient WebAssembly Mechanisation Journal Article
In: Proc. ACM Program. Lang., vol. 9, no. POPL, 2025.
@article{ProgressfulInterpreter,
title = {Progressful Interpreters for Efficient WebAssembly Mechanisation},
author = {Xiaojia Rao and Stefan Radziuk and Conrad Watt and Philippa Gardner},
url = {https://doi.org/10.1145/3704858},
doi = {10.1145/3704858},
year = {2025},
date = {2025-01-01},
urldate = {2025-01-01},
journal = {Proc. ACM Program. Lang.},
volume = {9},
number = {POPL},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
abstract = {Mechanisations of programming language specifications are now increasingly common, providing machine-checked modelling of the specification and verification of desired properties such as type safety. However it is challenging to maintain these mechanisations, particularly in the face of an evolving specification. Existing mechanisations of the W3C WebAssembly (Wasm) standard have so far been able to keep pace as the standard evolves, helped enormously by the W3C Wasm standard's choice to state the language's semantics in terms of a fully formal specification. However a substantial incoming extension to Wasm, the 2.0 feature set, motivates the investigation of strategies for more efficient production of the core verification artefacts currently associated with the WasmCert-Coq mechanisation of Wasm.
In the classic formalisation of a typed operational semantics as followed by the W3C Wasm standard, both the type system and runtime operational semantics are defined as inductive relations, with associated type soundness properties (progress and preservation) and an independent sound interpreter. We investigate two more efficient strategies for producing these artefacts, which are currently all separately defined by WasmCert-Coq. First, the approach of Kokke, Siek, and Wadler for deriving a sound interpreter from a constructive progress proof — we show that this approach scales to the W3C Wasm 1.0 standard, but results in an inefficient interpreter in our setting. Second, inspired by results from intrinsically-typed languages, we define a progressful interpreter which uses Coq's dependent types to certify not only its own soundness, but also the progress property. We show that this interpreter can implement several performance optimisations while maintaining these certifications, which are fully erasable when the interpreter is extracted from Coq. Using this approach, we extend the WasmCert-Coq mechanisation to the significantly larger Wasm 2.0 feature set, discovering and correcting several errors in the expanded specification's type system.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
In the classic formalisation of a typed operational semantics as followed by the W3C Wasm standard, both the type system and runtime operational semantics are defined as inductive relations, with associated type soundness properties (progress and preservation) and an independent sound interpreter. We investigate two more efficient strategies for producing these artefacts, which are currently all separately defined by WasmCert-Coq. First, the approach of Kokke, Siek, and Wadler for deriving a sound interpreter from a constructive progress proof — we show that this approach scales to the W3C Wasm 1.0 standard, but results in an inefficient interpreter in our setting. Second, inspired by results from intrinsically-typed languages, we define a progressful interpreter which uses Coq’s dependent types to certify not only its own soundness, but also the progress property. We show that this interpreter can implement several performance optimisations while maintaining these certifications, which are fully erasable when the interpreter is extracted from Coq. Using this approach, we extend the WasmCert-Coq mechanisation to the significantly larger Wasm 2.0 feature set, discovering and correcting several errors in the expanded specification’s type system.
2024
Youn, Dongjun; Shin, Wonho; Lee, Jaehyun; Ryu, Sukyoung; Breitner, Joachim; Gardner, Philippa; Lindley, Sam; Pretnar, Matija; Rao, Xiaojia; Watt, Conrad; Rossberg, Andreas
Bringing the WebAssembly Standard up to Speed with SpecTec Journal Article
In: Proc. ACM Program. Lang., vol. 8, no. PLDI, 2024.
@article{10.1145/3656440,
title = {Bringing the WebAssembly Standard up to Speed with SpecTec},
author = {Dongjun Youn and Wonho Shin and Jaehyun Lee and Sukyoung Ryu and Joachim Breitner and Philippa Gardner and Sam Lindley and Matija Pretnar and Xiaojia Rao and Conrad Watt and Andreas Rossberg},
url = {https://doi.org/10.1145/3656440},
doi = {10.1145/3656440},
year = {2024},
date = {2024-06-01},
journal = {Proc. ACM Program. Lang.},
volume = {8},
number = {PLDI},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
abstract = {WebAssembly (Wasm) is a portable low-level bytecode language and virtual machine that has seen increasing use in a variety of ecosystems. Its specification is unusually rigorous – including a full formal semantics for the language – and every new feature must be specified in this formal semantics, in prose, and in the official reference interpreter before it can be standardized. With the growing size of the language, this manual process with its redundancies has become laborious and error-prone, and in this work, we offer a solution.We present SpecTec, a domain-specific language (DSL) and toolchain that facilitates both the Wasm specification and the generation of artifacts necessary to standardize new features. SpecTec serves as a single source of truth — from a SpecTec definition of the Wasm semantics, we can generate a typeset specification, including formal definitions and prose pseudocode descriptions, and a meta-level interpreter. Further backends for test generation and interactive theorem proving are planned. We evaluate SpecTec’s ability to represent the latest Wasm 2.0 and show that the generated meta-level interpreter passes 100% of the applicable official test suite. We show that SpecTec is highly effective at discovering and preventing errors by detecting historical errors in the specification that have been corrected and ten errors in five proposals ready for inclusion in the next version of Wasm. Our ultimate aim is that SpecTec should be adopted by the Wasm standards community and used to specify future versions of the standard.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2023
Rao, Xiaojia*; Georges, Aïna Linn* (Joint First Authors); Legoupil, Maxime; Watt, Conrad; Pichon-Pharabod, Jean; Gardner, Philippa; Birkedal, Lars
Iris-Wasm: Robust and Modular Verification of WebAssembly Programs Journal Article
In: Proc. ACM Program. Lang., vol. 7, no. PLDI, 2023.
@article{10.1145/3591265,
title = {Iris-Wasm: Robust and Modular Verification of WebAssembly Programs},
author = {Xiaojia* Rao and Aïna Linn* (Joint First Authors) Georges and Maxime Legoupil and Conrad Watt and Jean Pichon-Pharabod and Philippa Gardner and Lars Birkedal},
url = {https://doi.org/10.1145/3591265},
doi = {10.1145/3591265},
year = {2023},
date = {2023-06-01},
urldate = {2023-06-01},
journal = {Proc. ACM Program. Lang.},
volume = {7},
number = {PLDI},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
abstract = {WebAssembly makes it possible to run C/C++ applications on the web with near-native performance. A WebAssembly program is expressed as a collection of higher-order ML-like modules, which are composed together through a system of explicit imports and exports using a host language, enabling a form of higher- order modular programming. We present Iris-Wasm, a mechanized higher-order separation logic building on a specification of Wasm 1.0 mechanized in Coq and the Iris framework. Using Iris-Wasm, we are able to specify and verify individual modules separately, and then compose them modularly in a simple host language featuring the core operations of the WebAssembly JavaScript Interface. Building on Iris-Wasm, we develop a logical relation that enforces robust safety: unknown, adversarial code can only affect other modules through the functions that they explicitly export. Together, the program logic and the logical relation allow us to formally verify functional correctness of WebAssembly programs, even when they invoke and are invoked by unknown code, thereby demonstrating that WebAssembly enforces strong isolation between modules.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2021
Watt, Conrad; Rao, Xiaojia; Pichon-Pharabod, Jean; Bodin, Martin; Gardner, Philippa
Two Mechanisations of WebAssembly 1.0 Proceedings Article
In: Formal Methods: 24th International Symposium, FM 2021, Virtual Event, November 20–26, 2021, Proceedings, pp. 61–79, Springer-Verlag, Berlin, Heidelberg, 2021, ISBN: 978-3-030-90869-0.
@inproceedings{10.1007/978-3-030-90870-6_4,
title = {Two Mechanisations of WebAssembly 1.0},
author = {Conrad Watt and Xiaojia Rao and Jean Pichon-Pharabod and Martin Bodin and Philippa Gardner},
url = {https://doi.org/10.1007/978-3-030-90870-6_4},
doi = {10.1007/978-3-030-90870-6_4},
isbn = {978-3-030-90869-0},
year = {2021},
date = {2021-01-01},
booktitle = {Formal Methods: 24th International Symposium, FM 2021, Virtual Event, November 20–26, 2021, Proceedings},
pages = {61–79},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
abstract = {WebAssembly (Wasm) is a new bytecode language supported by all major Web browsers, designed primarily to be an efficient compilation target for low-level languages such as C/C++ and Rust. It is unusual in that it is officially specified through a formal semantics. An initial draft specification was published in 2017 [14], with an associated mechanised specification in Isabelle/HOL published by Watt that found bugs in the original specification, fixed before its publication [37].The first official W3C standard, WebAssembly 1.0, was published in 2019 [45]. Building on Watt’s original mechanisation, we introduce two mechanised specifications of the WebAssembly 1.0 semantics, written in different theorem provers: WasmCert-Isabelle and WasmCert-Coq. Wasm’s compact design and official formal semantics enable our mechanisations to be particularly complete and close to the published language standard. We present a high-level description of the language’s updated type soundness result, referencing both mechanisations. We also describe the current state of the mechanisation of language features not previously supported: WasmCert-Isabelle includes a verified executable definition of the instantiation phase as part of an executable verified interpreter; WasmCert-Coq includes executable parsing and numeric definitions as on-going work towards a more ambitious end-to-end verified interpreter which does not require an OCaml harness like WasmCert-Isabelle.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Reflection
My PhD research has mostly been on projects related to WasmCert-Coq, a Rocq (known as Coq previously) mechanisation of the WebAssembly language specification, which I’ve been maintaing on Github since 2019.
I believe implementing important mechanisations in a maintainable way is valuable and requires careful planning, which can be substantially more difficult than making done-and-forgot one. Even if the underlying theory is not moving (such as mathematical definitions), the theorem prover languages or libraries themselves could be changed and refactored, demanding updates to the mechanisation which will otherwise gradually bit-rot. Some of these updates can be non-trivial depending on how backward-compatible the changes to the languages are (fortunately, Rocq has been pretty good on this aspect so far). There is a lot to be said on good ‘mechanisation engineering practices’, and I only come across them when I regret about a particular choice I have made in hindsight.
I also wish to study the foundations of proof assistants through active usage and experiments, in the hope of finding something useful for other usages. Eventually, it would be cool to do real mathematical researches using a proof assistant as a tool which can actually assist in coming up with the proofs, instead of merely for verifying the existing ones.
