WebAssembly


WebAssembly (Wasm) is an open standard that defines a portable binary-code format for executable programs, a corresponding textual assembly language, and interfaces for facilitating interactions between such programs and their host environment. The main goal of WebAssembly is to enable high-performance applications on web pages, but the format is designed to be executed and integrated in other environments as well, including standalone ones.
WebAssembly became a World Wide Web Consortium recommendation on 5 December 2019 and, alongside HTML, CSS, and JavaScript, is the fourth language to run natively in browsers. To use Wasm in browsers, developers may use the Emscripten SDK to compile C++ source code into a binary file that runs in the same sandbox as regular JavaScript code. Emscripten provides bindings for several commonly used environment interfaces such as WebGL. There is no direct Document Object Model access; however, it is possible to create proxy functions for this, for example through stdweb, web_sys, and js_sys when using the Rust language.
WebAssembly is usually either ahead-of-time or just-in-time compiled, although interpreters also exist, such as the "WebAssembly Micro Runtime, an interpreter-based WebAssembly runtime for embedded devices".
The World Wide Web Consortium maintains the standard with contributions from Mozilla, Microsoft, Google, Apple, Fastly, Intel, and Red Hat.

History

WebAssembly was first announced in 2015, and the first demonstration was executing Unity's Angry Bots in Firefox, Google Chrome, and Microsoft Edge. The precursor technologies were asm.js from Mozilla and Google Native Client, and the initial implementation was based on the feature set of asm.js. The asm.js technology already provides near-native code execution speeds and can be considered a viable alternative for browsers that don't support WebAssembly or have it disabled for security reasons.
In March 2017, the design of the minimum viable product (MVP) was declared to be finished and the preview phase ended. Safari 11 was subsequently released with support. In February 2018, the WebAssembly Working Group published three public working drafts for the Core Specification, JavaScript Interface, and Web API.

Implementations

While WebAssembly was initially designed to enable near-native code execution speed in the web browser, it has also been considered valuable outside the browser, in more generalized contexts. Since WebAssembly's runtime environments (RE) are low-level virtual stack machines that can be embedded into host applications, some of them have found their way into standalone runtime environments such as Wasmtime and Wasmer.

Web Browsers

In November 2017, Mozilla declared support "in all major browsers", after WebAssembly was enabled by default in Edge 16. The support includes mobile web browsers for iOS and Android. 91.18% of installed browsers support WebAssembly.
For older browsers, Wasm can be compiled into asm.js by a JavaScript polyfill.

Compilers

Because WebAssembly executables are precompiled, it is possible to use a variety of programming languages to make them. This is achieved either through direct compilation to Wasm, or through implementation of the corresponding virtual machines in Wasm. Around 40 programming languages have been reported to support Wasm as a compilation target.
Emscripten compiles C and C++ to Wasm using Binaryen and LLVM as backends.
As of version 8, a standalone Clang can compile C and C++ to Wasm.
WebAssembly's initial aim is to support compilation from C and C++, though support for other source languages such as Rust and .NET languages is also emerging. After the MVP release, there are plans to support multithreading and garbage collection, which would make WebAssembly a compilation target for garbage-collected programming languages like C#, F#, Python, and even JavaScript where the browser's just-in-time compilation speed is considered too slow. A number of other languages have some support, including Java, Julia, Ruby, and Go.

Security considerations

In June 2018, a security researcher presented the possibility of using WebAssembly to circumvent browser mitigations for Spectre and Meltdown security vulnerabilities once support for threads with shared memory is added. Due to this concern, WebAssembly developers put the feature on hold. However, in order to explore these future language extensions, Google Chrome added experimental support for the WebAssembly thread proposal in October 2018.
WebAssembly has been criticized for making it easier for malware writers, scammers, and phishing attackers to hide evidence; WebAssembly is only present on the user's machine in its compiled form, which makes detection difficult. The speed and concealability of WebAssembly have led to its use in hidden crypto mining on the website visitor's device. Coinhive, a now defunct service facilitating cryptocurrency mining in website visitors' browsers, claimed that their "miner uses WebAssembly and runs with about 65% of the performance of a native Miner." A June 2019 study from the Technische Universität Braunschweig analyzed the usage of WebAssembly in the Alexa top 1 million websites and found that the prevalent use was for malicious crypto mining, and that malware accounted for more than half of the WebAssembly-using websites studied.
The ability to effectively obfuscate large amounts of code can also be used to disable ad blocking and privacy tools that prevent web tracking like Privacy Badger.
As WebAssembly only supports structured control flow, it is amenable to security verification techniques including symbolic execution. Current efforts in this direction include the Manticore symbolic execution engine.

WASI

The WebAssembly System Interface (WASI) is a simple interface designed by Mozilla and intended to be portable to any platform. It provides POSIX-like features such as file I/O, constrained by capability-based security. There are also a few other proposed ABIs/APIs.
WASI was influenced by CloudABI and Capsicum.
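
The idea of file I/O constrained by capabilities can be illustrated with a small sketch. This is not the WASI API: the class `DirCapability` and its `open` method are hypothetical names chosen for illustration, and the check shown is only one simple way to confine access to a granted directory.

```python
import tempfile
from pathlib import Path


class DirCapability:
    """Hypothetical handle that grants access only to paths under one directory."""

    def __init__(self, root):
        self._root = Path(root).resolve()

    def open(self, relative, mode="r"):
        # Resolve the requested path and refuse anything outside the granted root.
        target = (self._root / relative).resolve()
        if target != self._root and self._root not in target.parents:
            raise PermissionError(f"{relative!r} escapes the granted directory")
        return open(target, mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as granted:
        cap = DirCapability(granted)
        with cap.open("notes.txt", "w") as f:
            f.write("hello")
        with cap.open("notes.txt") as f:
            print(f.read())
        try:
            cap.open("../outside.txt")
        except PermissionError as err:
            print("denied:", err)
```

In this sketch the program holding `cap` can only reach files beneath the directory it was handed, rather than any path the process could normally open.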

Specification

Host Environment

The general standard provides core specifications for the JavaScript API and details on embedding.

Virtual machine

Wasm code is intended to be run on a portable virtual stack machine (VM). The VM is designed to be faster to parse and execute than JavaScript and to have a compact code representation. The standard does not stipulate any external functionality that a Wasm binary may expect; instead, it provides a way for the host environment in which the VM implementation runs to deliver interfaces via modules.

Wasm program

A Wasm program is designed to be a separate module containing collections of various Wasm-defined values and program type definitions. These are expressed in either a binary or a textual format, both of which share a common structure.

Instruction set

The core standard for the binary format of a Wasm program defines an instruction set architecture consisting of specific binary encodings of types of operations which are executed by the VM. It does not, however, specify exactly how the VM must execute them. The list of instructions includes standard memory load/store instructions; numeric, parametric, and control flow instruction types; and Wasm-specific variable instructions.
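
To make the stack-machine model concrete, the following is a minimal sketch of an evaluator for a handful of numeric, variable, and control flow instructions. It is not how a real engine works: the tuple-based instruction representation, the function name `run`, and the restriction to a single function (index 0) are all assumptions made for illustration. The program evaluated is the factorial instruction sequence from the Code representation section.

```python
MASK64 = (1 << 64) - 1  # i64 arithmetic wraps modulo 2**64


def run(body, args):
    """Evaluate a nested instruction list on an operand stack (illustrative only)."""
    stack = []

    def execute(instrs):
        for ins in instrs:
            op = ins[0]
            if op == "local.get":
                stack.append(args[ins[1]])
            elif op == "i64.const":
                stack.append(ins[1] & MASK64)
            elif op == "i64.eqz":
                stack.append(1 if stack.pop() == 0 else 0)
            elif op == "i64.sub":
                rhs, lhs = stack.pop(), stack.pop()
                stack.append((lhs - rhs) & MASK64)
            elif op == "i64.mul":
                rhs, lhs = stack.pop(), stack.pop()
                stack.append((lhs * rhs) & MASK64)
            elif op == "call":
                # Assumption: only function 0 (this body) exists in the sketch.
                stack.append(run(body, [stack.pop()]))
            elif op == "if":
                # Structured control flow: pop the condition, run one branch.
                then_branch, else_branch = ins[1], ins[2]
                execute(then_branch if stack.pop() != 0 else else_branch)
            else:
                raise ValueError(f"unsupported instruction: {op}")

    execute(body)
    return stack.pop()


FACTORIAL = [
    ("local.get", 0),
    ("i64.eqz",),
    ("if",
     [("i64.const", 1)],
     [("local.get", 0), ("local.get", 0), ("i64.const", 1),
      ("i64.sub",), ("call", 0), ("i64.mul",)]),
]

if __name__ == "__main__":
    print(run(FACTORIAL, [5]))  # 120
```

Each instruction only pops its operands from the stack and pushes its result, which is why the instruction encoding needs no register or operand fields for arithmetic.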

Code representation

In March 2017, the WebAssembly Community Group reached consensus on the initial binary format, JavaScript API, and reference interpreter. The resulting specification defines a WebAssembly binary format, which is not designed to be used by humans, as well as a human-readable WebAssembly text format that resembles a cross between S-expressions and traditional assembly languages.
The table below shows three views of the same program: the C input source, its conversion to a Wasm intermediate representation, and the resulting Wasm binary instructions.

C input source

    int factorial(int n) {
      if (n == 0)
        return 1;
      else
        return n * factorial(n-1);
    }

Linear assembly bytecode        Wasm binary encoding

; magic number                  00 61 73 6D 01 00 00 00
; type for the function         01 00 01 60 01 73 01 73 06
; function section              03 00 01 00 02
; code section start            0A 00 01
                                00 00
local.get 0                     20 00
i64.eqz                         50
if (result i64)                 04 7E
  i64.const 1                   42 01
else                            05
  local.get 0                   20 00
  local.get 0                   20 00
  i64.const 1                   42 01
  i64.sub                       7D
  call 0                        10 00
  i64.mul                       7E
end                             0B
; module end, size fixups       0B 15 17
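
The first row of the binary encoding, 00 61 73 6D 01 00 00 00, is the module preamble: the four-byte magic number (the ASCII string "\0asm") followed by a 32-bit little-endian version field. A minimal sketch of checking it is given below; the function name `read_preamble` is a hypothetical helper, not part of any Wasm toolchain.

```python
import struct

WASM_MAGIC = b"\x00asm"  # bytes 00 61 73 6D


def read_preamble(data: bytes) -> int:
    """Return the version from an 8-byte Wasm preamble, or raise ValueError."""
    if len(data) < 8:
        raise ValueError("too short to contain a Wasm preamble")
    if data[:4] != WASM_MAGIC:
        raise ValueError("bad magic number")
    (version,) = struct.unpack("<I", data[4:8])  # little-endian u32
    return version


if __name__ == "__main__":
    header = bytes.fromhex("00 61 73 6D 01 00 00 00")
    print(read_preamble(header))  # 1
```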

All integer constants are encoded using a space-efficient, variable-length LEB128 encoding.
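
As an illustration of LEB128, the sketch below encodes unsigned and signed integers seven bits at a time, setting the high bit of every byte except the last. The function names are hypothetical helpers introduced for this example. For instance, the operand byte 01 that follows the i64.const opcode 42 in the listing above is the signed LEB128 encoding of 1.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_sleb128(value: int) -> bytes:
    """Encode an integer as signed LEB128 (two's complement, sign-extended)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # Python's >> is an arithmetic shift for negative ints
        sign_bit = bool(byte & 0x40)
        done = (value == 0 and not sign_bit) or (value == -1 and sign_bit)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def decode_uleb128(data: bytes) -> int:
    """Decode one unsigned LEB128 value from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result
    raise ValueError("truncated LEB128 value")


if __name__ == "__main__":
    print(encode_uleb128(624485).hex())  # e58e26
    print(encode_sleb128(1).hex())       # 01
```

Small values occupy a single byte, which is what makes the encoding space-efficient for the small indices and constants that dominate typical modules.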
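The WebAssembly text format is more canonically written in a folded format using S-expressions. For instructions and expressions, this format is purely syntactic sugar and has no behavioral differences from the linear format. When decompiled, the code above becomes:

    (module
      (type (func (param i64) (result i64)))
      (func (type 0) (param $p0 i64) (result i64)
        (if $I0 (result i64) ; $I0 is an unused label name
          (i64.eqz
            (local.get $p0)) ; the name $p0 is the same as 0 here
          (then
            (i64.const 1))
          (else
            (i64.mul
              (local.get $p0)
              (call 0
                (i64.sub
                  (local.get $p0)
                  (i64.const 1))))))))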

Note that a module is implicitly generated by the compiler. The function is actually referenced by an entry of the type table in the binary, hence the type section and the type emitted by the decompiler. The compiler and decompiler can be accessed online.

Literature