Module lowp

A low precision raster pipeline implementation.

A lowp pipeline uses u16 instead of f32 for math. Because of that, it doesn’t implement stages that require high precision. The pipeline compiler will automatically decide which one to use.
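As an illustration of what u16 math looks like in practice, here is a minimal sketch of multiplying two 8-bit values held in u16 lanes. The `(v + 255) >> 8` rounding is a common approximation of division by 255 and is assumed here for illustration; `mul_unorm8` is a hypothetical helper name, and the real `div255` in this module may differ.

```rust
/// Approximate `v / 255` using only u16 arithmetic.
/// Assumes `v <= 255 * 255`, so `v + 255` still fits in a u16.
fn div255(v: u16) -> u16 {
    (v + 255) >> 8
}

/// Multiply two values in `0..=255` stored as u16 (hypothetical helper).
/// The product is at most 65025, which does not overflow a u16.
fn mul_unorm8(a: u16, b: u16) -> u16 {
    div255(a * b)
}
```

The intermediate product of two 8-bit values needs 16 bits, which is why u16 lanes are enough for stages like blending but not for stages that need fractional coordinates or a wide dynamic range.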

Skia uses u16x8 (128-bit) types for a generic CPU and u16x16 (256-bit) types for modern x86 CPUs. Instead of explicit SIMD instructions, it mainly relies on clang’s vector extensions. Since those are unavailable in Rust, we have to do everything manually.

According to our benchmarks, a SIMD-accelerated u16x8 in Rust is almost 2x slower than in Skia, and we are not sure why. For example, there is no div instruction for u16x8, so we have to fall back to a basic scalar version, which means unnecessary loads and stores. We have no idea what clang does in this case. Surprisingly, a SIMD-accelerated u16x8 is even slower than a scalar one. Again, we are not sure why.
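The scalar division fallback mentioned above can be sketched roughly as follows. This is an illustration only: `div_lanes` is a hypothetical name, and plain arrays stand in for whatever vector type the real code uses.

```rust
/// Divide eight u16 lanes one at a time.
/// With a real SIMD type, the lanes would first be stored to memory,
/// divided here, and then loaded back into a vector register.
fn div_lanes(a: [u16; 8], b: [u16; 8]) -> [u16; 8] {
    let mut out = [0u16; 8];
    for i in 0..8 {
        // Integer division; panics if a divisor lane is zero.
        out[i] = a[i] / b[i];
    }
    out
}
```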

Therefore we use a scalar u16x16 by default and rely on rustc/LLVM auto-vectorization instead. When targeting a generic CPU, we are just 5-10% slower than Skia, while u16x8 is 30-40% slower. And while -C target-cpu=haswell boosts our performance by around 25%, we are still 40-60% behind Skia built for Haswell.
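A scalar u16x16 of this kind can be sketched as a thin wrapper around an array with per-lane loops that the compiler is free to vectorize. The type name `U16x16`, its `splat` constructor, and the wrapping arithmetic are assumptions made for this example, not the crate's actual definitions.

```rust
use std::ops::{Add, Mul};

/// Sixteen u16 lanes stored in a plain array; no SIMD intrinsics are used.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct U16x16(pub [u16; 16]);

impl U16x16 {
    /// Fill every lane with the same value.
    pub fn splat(v: u16) -> Self {
        U16x16([v; 16])
    }
}

impl Add for U16x16 {
    type Output = Self;

    #[inline(always)]
    fn add(self, rhs: Self) -> Self {
        let mut out = [0u16; 16];
        // A fixed-length loop over lanes; wraps on overflow like SIMD adds.
        for i in 0..16 {
            out[i] = self.0[i].wrapping_add(rhs.0[i]);
        }
        U16x16(out)
    }
}

impl Mul for U16x16 {
    type Output = Self;

    #[inline(always)]
    fn mul(self, rhs: Self) -> Self {
        let mut out = [0u16; 16];
        // Keeps the low 16 bits of each per-lane product.
        for i in 0..16 {
            out[i] = self.0[i].wrapping_mul(rhs.0[i]);
        }
        U16x16(out)
    }
}
```

Whether such loops actually compile to vector instructions depends on the optimization level and target features, which is consistent with the gain reported for -C target-cpu=haswell.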

On ARM AArch64 the story is different: explicit SIMD makes our code up to 2-3x faster.

Macros

blend_fn 🔒
blend_fn2 🔒

Structs

Pipeline

Constants

STAGES
STAGE_WIDTH

Functions

clear 🔒
darken 🔒
destination_atop 🔒
destination_in 🔒
destination_out 🔒
destination_over 🔒
difference 🔒
div255 🔒
evenly_spaced_2_stop_gradient 🔒
exclusion 🔒
fn_ptr
fn_ptr_eq
from_float 🔒
gradient 🔒
gradient_lookup 🔒
hard_light 🔒
inv 🔒
join 🔒
just_return
lerp 🔒
lerp_1_float 🔒
lerp_u8 🔒
lighten 🔒
load_8 🔒
load_8888 🔒
load_8888_tail 🔒
load_dst
load_dst_tail
load_dst_u8
load_dst_u8_tail
load_mask_u8 🔒
mad 🔒
mask_u8 🔒
modulate 🔒
move_destination_to_source 🔒
move_source_to_destination 🔒
multiply 🔒
null_fn
overlay 🔒
pad_x1 🔒
plus 🔒
premultiply 🔒
reflect_x1 🔒
repeat_x1 🔒
round_f32_to_u16 🔒
scale_1_float 🔒
scale_u8 🔒
screen 🔒
seed_shader 🔒
source_atop 🔒
source_in 🔒
source_out 🔒
source_over 🔒
source_over_rgba
source_over_rgba_tail
split 🔒
start
store
store_8888 🔒
store_8888_tail 🔒
store_tail
store_u8
store_u8_tail
transform 🔒
uniform_color 🔒
xor 🔒
xy_to_radius 🔒

Type Aliases

StageFn