JetStream 3.0: A New Era for Browser Performance Benchmarking

In a joint announcement with Google and Mozilla, the WebKit team unveiled JetStream 3.0, a major update to the widely used cross-browser benchmark suite. This release reflects a fundamental shift in how browser engines measure performance, especially for WebAssembly and large-scale modern web applications. The collaborative effort behind JetStream 3.0 is detailed in the shared announcement; this article takes a deeper look at the challenges the WebKit team tackled and the engineering work within JavaScriptCore that drives these improvements.

Why Benchmarks Need Constant Evolution

Benchmarks are among the most powerful tools browser engine developers use to drive performance gains. However, the web ecosystem evolves rapidly, and any static benchmark inevitably becomes outdated as new best practices emerge. Once the low-hanging optimization opportunities within a benchmark are addressed, subsequent improvements tend to become less general and more workload-specific, diminishing their real-world relevance. JetStream 3.0 addresses this by not only refreshing the test suite but also rethinking the very methodology behind performance measurement. The goal is to reflect the scale and complexity of today's web applications, where WebAssembly and dynamic content are increasingly prevalent.

Source: webkit.org

Redefining WebAssembly Performance Measurement

One of the most significant changes in JetStream 3.0 is the overhaul of WebAssembly (Wasm) benchmarking. To understand why this was necessary, we must revisit the origins of Wasm in the browser. When JetStream 2 was released, WebAssembly was still in its infancy. Early adopters were primarily large C/C++ projects that had previously been compiled to asm.js. These applications, such as video games, accepted long startup times in exchange for high throughput. Consequently, JetStream 2 scored Wasm in two distinct phases: Startup and Runtime.
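The two-phase split can be illustrated with a minimal sketch. This is not JetStream 2's harness: the function name measureWasmPhases, the iteration count, and the tiny hand-encoded module (a single exported add function) are assumptions made for illustration. Only the standard WebAssembly.Module and WebAssembly.Instance constructors and Date.now() are used.

```javascript
// Hypothetical sketch, not JetStream code: time the Startup and Runtime
// phases of a tiny WebAssembly module separately.

// Hand-encoded module exporting add(a: i32, b: i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

function measureWasmPhases(bytes, iterations) {
  // Startup phase: compile and instantiate the module.
  const startupBegin = Date.now();
  const compiled = new WebAssembly.Module(bytes);
  const instance = new WebAssembly.Instance(compiled, {});
  const startupMs = Date.now() - startupBegin;

  // Runtime phase: run the workload itself (here, repeated calls to add).
  const runtimeBegin = Date.now();
  let sum = 0;
  for (let i = 0; i < iterations; i++) {
    sum = instance.exports.add(sum, 1);
  }
  const runtimeMs = Date.now() - runtimeBegin;

  return { startupMs, runtimeMs, sum };
}

const phases = measureWasmPhases(wasmBytes, 1000000);
console.log(phases);
```

For a module this small, a fast engine may well report a startup time of 0 milliseconds, because Date.now() only resolves whole milliseconds. A real startup workload would involve a far larger module.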

The Zero Startup Time Problem

Over the years, browser engines have become remarkably efficient at instantiating WebAssembly modules. As startup times shrank, even micro-optimizations began to yield measurable benefits. Shaving 0.1 milliseconds off a 100-millisecond workload is a negligible 0.1% gain, but once instantiation takes just 2 milliseconds, the same 0.1 milliseconds is a 5% gain. In WebKit, the startup path was optimized so aggressively that for certain smaller workloads the measured startup time reached zero. JetStream 2 computed each iteration's time using Date.now(), which rounds down to whole milliseconds, so any sub-millisecond time was recorded as zero milliseconds. That broke the benchmark's scoring formula, Score = 5000 / Time: when the time hit zero, the score became infinite. The team eventually patched this in JetStream 2.2 by clamping the score to 5,000, so that a zero-millisecond sub-score could not dominate all other scores.
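The arithmetic in this section can be checked with a short sketch. The function and constant names are illustrative and are not taken from the JetStream source. Math.floor stands in for the whole-millisecond rounding attributed to Date.now(), and Math.min is an assumed form of the clamp.

```javascript
// Hypothetical sketch of the scoring arithmetic described in this section.
// Names are illustrative, not taken from the JetStream source.

const SCORE_NUMERATOR = 5000;
const MAX_SCORE = 5000; // clamp value described for JetStream 2.2

// Model of a whole-millisecond timer: a sub-millisecond interval is seen as 0 ms.
function observedMs(actualMs) {
  return Math.floor(actualMs);
}

// Unclamped formula: Score = 5000 / Time.
// In JavaScript, 5000 / 0 evaluates to Infinity rather than throwing.
function rawScore(timeMs) {
  return SCORE_NUMERATOR / timeMs;
}

// Clamped variant (assumed form of the patch): cap the score at 5,000.
function clampedScore(timeMs) {
  return Math.min(rawScore(timeMs), MAX_SCORE);
}

// Fraction of a phase saved by shaving savedMs off a phase that took totalMs.
function relativeGain(savedMs, totalMs) {
  return savedMs / totalMs;
}

console.log(relativeGain(0.1, 100));        // 0.001, i.e. 0.1%
console.log(relativeGain(0.1, 2));          // 0.05, i.e. 5%
console.log(rawScore(observedMs(0.4)));     // Infinity
console.log(clampedScore(observedMs(0.4))); // 5000
console.log(clampedScore(2));               // 2500
```

The clamp keeps a single sub-score finite, but it also means every startup time below one millisecond earns the same score, so further startup improvements become invisible to the benchmark.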

While achieving an infinite score might seem like a victory, it was a clear indication that browser engines had outgrown JetStream 2's WebAssembly subtests. On the modern web, WebAssembly is often part of the critical path for page loads—used in libraries, image decoders, and UI frameworks. A “zero” startup time in a microbenchmark no longer reflects real-world conditions, where startup costs still matter, albeit at much smaller scales. JetStream 3.0 addresses this by introducing more representative Wasm workloads and a scoring model that better captures the nuances of modern application behavior.

Modern Web Applications at Scale

Beyond WebAssembly, JetStream 3.0 expands its scope to cover larger, more complex web applications. The suite now includes tests that simulate the behavior of contemporary frameworks, rendering engines, and data-intensive operations. This shift recognizes that modern websites are not just collections of simple scripts but intricate, stateful applications that push the boundaries of browser performance. By incorporating workloads that reflect real-world usage patterns, JetStream 3.0 gives a more accurate picture of how browser engines perform under realistic conditions. The suite is also designed with continued evolution in mind, so that it can be updated as the web changes.

Conclusion

JetStream 3.0 represents a collaborative effort to keep browser performance measurement aligned with the current state of the web. By addressing the quirks of WebAssembly startup time and scaling up workload complexity, the new benchmark suite offers a more robust and realistic evaluation tool for developers. For the WebKit team, the refinements in JavaScriptCore—such as faster parsing, compilation, and code generation—are directly informed by the challenges uncovered during JetStream 3.0's development. As the web continues to advance, benchmarks like this one will remain essential for driving innovation and ensuring that users enjoy fast, responsive browsing experiences.
