Research – Rao Xiaojia

Selected Publications

2025

Rao, Xiaojia; Radziuk, Stefan; Watt, Conrad; Gardner, Philippa

Progressful Interpreters for Efficient WebAssembly Mechanisation Journal Article

In: Proc. ACM Program. Lang., vol. 9, no. POPL, 2025.

@article{ProgressfulInterpreter,

title = {Progressful Interpreters for Efficient WebAssembly Mechanisation},

author = {Xiaojia Rao and Stefan Radziuk and Conrad Watt and Philippa Gardner},

url = {https://doi.org/10.1145/3704858},

doi = {10.1145/3704858},

year  = {2025},

date = {2025-01-01},

urldate = {2025-01-01},

journal = {Proc. ACM Program. Lang.},

volume = {9},

number = {POPL},

publisher = {Association for Computing Machinery},

address = {New York, NY, USA},

abstract = {Mechanisations of programming language specifications are now increasingly common, providing machine-checked modelling of the specification and verification of desired properties such as type safety. However it is challenging to maintain these mechanisations, particularly in the face of an evolving specification. Existing mechanisations of the W3C WebAssembly (Wasm) standard have so far been able to keep pace as the standard evolves, helped enormously by the W3C Wasm standard's choice to state the language's semantics in terms of a fully formal specification. However a substantial incoming extension to Wasm, the 2.0 feature set, motivates the investigation of strategies for more efficient production of the core verification artefacts currently associated with the WasmCert-Coq mechanisation of Wasm. 

In the classic formalisation of a typed operational semantics as followed by the W3C Wasm standard, both the type system and runtime operational semantics are defined as inductive relations, with associated type soundness properties (progress and preservation) and an independent sound interpreter. We investigate two more efficient strategies for producing these artefacts, which are currently all separately defined by WasmCert-Coq. First, the approach of Kokke, Siek, and Wadler for deriving a sound interpreter from a constructive progress proof — we show that this approach scales to the W3C Wasm 1.0 standard, but results in an inefficient interpreter in our setting. Second, inspired by results from intrinsically-typed languages, we define a progressful interpreter which uses Coq's dependent types to certify not only its own soundness, but also the progress property. We show that this interpreter can implement several performance optimisations while maintaining these certifications, which are fully erasable when the interpreter is extracted from Coq. Using this approach, we extend the WasmCert-Coq mechanisation to the significantly larger Wasm 2.0 feature set, discovering and correcting several errors in the expanded specification's type system.},

keywords = {},

pubstate = {published},

tppubtype = {article}

}


2023

Rao, Xiaojia*; Georges, Aïna Linn* (Joint First Authors); Legoupil, Maxime; Watt, Conrad; Pichon-Pharabod, Jean; Gardner, Philippa; Birkedal, Lars

Iris-Wasm: Robust and Modular Verification of WebAssembly Programs Journal Article

In: Proc. ACM Program. Lang., vol. 7, no. PLDI, 2023.


2021

Watt, Conrad; Rao, Xiaojia; Pichon-Pharabod, Jean; Bodin, Martin; Gardner, Philippa

Two Mechanisations of WebAssembly 1.0 Proceedings Article

In: Formal Methods: 24th International Symposium, FM 2021, Virtual Event, November 20–26, 2021, Proceedings, pp. 61–79, Springer-Verlag, Berlin, Heidelberg, 2021, ISBN: 978-3-030-90869-0.


@inproceedings{10.1007/978-3-030-90870-6_4,

title = {Two Mechanisations of WebAssembly 1.0},

author = {Conrad Watt and Xiaojia Rao and Jean Pichon-Pharabod and Martin Bodin and Philippa Gardner},

url = {https://doi.org/10.1007/978-3-030-90870-6_4},

doi = {10.1007/978-3-030-90870-6_4},

isbn = {978-3-030-90869-0},

year  = {2021},

date = {2021-01-01},

booktitle = {Formal Methods: 24th International Symposium, FM 2021, Virtual Event, November 20–26, 2021, Proceedings},

pages = {61--79},

publisher = {Springer-Verlag},

address = {Berlin, Heidelberg},

abstract = {WebAssembly (Wasm) is a new bytecode language supported by all major Web browsers, designed primarily to be an efficient compilation target for low-level languages such as C/C++ and Rust. It is unusual in that it is officially specified through a formal semantics. An initial draft specification was published in 2017 [14], with an associated mechanised specification in Isabelle/HOL published by Watt that found bugs in the original specification, fixed before its publication [37]. The first official W3C standard, WebAssembly 1.0, was published in 2019 [45]. Building on Watt’s original mechanisation, we introduce two mechanised specifications of the WebAssembly 1.0 semantics, written in different theorem provers: WasmCert-Isabelle and WasmCert-Coq. Wasm’s compact design and official formal semantics enable our mechanisations to be particularly complete and close to the published language standard. We present a high-level description of the language’s updated type soundness result, referencing both mechanisations. We also describe the current state of the mechanisation of language features not previously supported: WasmCert-Isabelle includes a verified executable definition of the instantiation phase as part of an executable verified interpreter; WasmCert-Coq includes executable parsing and numeric definitions as on-going work towards a more ambitious end-to-end verified interpreter which does not require an OCaml harness like WasmCert-Isabelle.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Reflection

My research has mostly been on WasmCert-Coq, a Coq mechanisation of the WebAssembly language specification, which I actively maintain on GitHub. Beyond my interest in the development of the language itself, I’m also keen on:

- implementing a long-term maintainable mechanisation of an industrial language standard: some existing mechanisation projects can be implemented with a done-and-forget approach, because the underlying theory is not subject to change. However, this often leads to an unmaintainable mechanisation artefact. There is a lot to be said about good ‘mechanisation engineering practices’, and I only come across them when, in hindsight, I regret a particular choice I have made.

- studying the foundations of proof assistants through active usage and experiments, in the hope of finding something useful for other purposes. Eventually, it would be cool to do real mathematical research using a proof assistant as a tool that can actually assist in coming up with proofs, instead of merely verifying existing ones.