A low precision raster pipeline implementation.
A lowp pipeline uses u16 instead of f32 for math. Because of that, it doesn't implement stages that require high precision. The pipeline compiler will automatically decide which one to use.
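To illustrate what u16 math can look like, here is a minimal sketch of multiplying two 8-bit channel values held in u16. The function names mirror `div255` from the function list, but the bodies are an assumption for illustration and are not taken from this module. The sketch assumes both inputs are in the range 0..=255, so every intermediate value fits in a u16.

```rust
/// Approximates `v / 255` using only u16 arithmetic.
///
/// Assumes `v <= 255 * 255` (65025), so `v + 255` cannot overflow a u16.
fn div255(v: u16) -> u16 {
    (v + 255) / 256
}

/// Multiplies two channel values where 255 stands for 1.0.
///
/// Assumes `a` and `b` are both in 0..=255 (hypothetical helper).
fn mul_channel(a: u16, b: u16) -> u16 {
    div255(a * b)
}
```

The result can differ from exact division by 255 by a small rounding error, which is the kind of precision trade-off a low precision pipeline accepts.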
Skia uses u16x8 (128bit) types for a generic CPU and u16x16 (256bit) for modern x86 CPUs. But instead of explicit SIMD instructions, it mainly relies on clang's vector extensions. And since they are unavailable in Rust, we have to do everything manually.
According to our benchmarks, a SIMD-accelerated u16x8 in Rust is almost 2x slower than in Skia. Not sure why. For example, there is no div instruction for u16x8, so we have to use a basic scalar version, which means unnecessary loads and stores. No idea what clang does in this case. Surprisingly, a SIMD-accelerated u16x8 is even slower than a scalar one. Again, not sure why.
Therefore we are using scalar u16x16 by default and relying on rustc/llvm auto vectorization instead.
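A scalar u16x16 can be sketched as a plain array wrapper whose operators are simple per-lane loops, leaving vectorization to the compiler. The type below is a hypothetical illustration, not this module's actual definition; whether the loops get vectorized depends on the optimization level and target CPU.

```rust
use std::ops::{Add, Div};

/// Sixteen u16 lanes stored as a plain array (illustrative type).
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct u16x16([u16; 16]);

impl u16x16 {
    /// Fills every lane with the same value.
    fn splat(v: u16) -> Self {
        u16x16([v; 16])
    }
}

impl Add for u16x16 {
    type Output = Self;

    /// Per-lane wrapping addition written as a simple loop.
    fn add(mut self, rhs: Self) -> Self {
        for (a, b) in self.0.iter_mut().zip(rhs.0.iter()) {
            *a = a.wrapping_add(*b);
        }
        self
    }
}

impl Div for u16x16 {
    type Output = Self;

    /// Per-lane integer division. Panics if any lane of `rhs` is zero.
    fn div(mut self, rhs: Self) -> Self {
        for (a, b) in self.0.iter_mut().zip(rhs.0.iter()) {
            *a /= *b;
        }
        self
    }
}
```

Because division is just another loop here, no separate scalar fallback path with extra loads and stores is needed for it.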
When targeting a generic CPU, we're just 5-10% slower than Skia, while u16x8 is 30-40% slower.
And while -C target-cpu=haswell boosts our performance by around 25%, we are still 40-60% behind Skia built for Haswell.
On ARM AArch64 the story is different and explicit SIMD makes our code up to 2-3x faster.
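As a rough sketch of what explicit SIMD on AArch64 can look like, the hypothetical `add_u16x8` helper below uses NEON intrinsics from `core::arch::aarch64` and falls back to a scalar loop on other targets. The helper name and shape are assumptions for illustration and do not reflect how this module is actually organized.

```rust
/// Adds eight u16 lanes with NEON intrinsics (wrapping on overflow).
#[cfg(target_arch = "aarch64")]
fn add_u16x8(a: [u16; 8], b: [u16; 8]) -> [u16; 8] {
    use core::arch::aarch64::{vaddq_u16, vld1q_u16, vst1q_u16};

    let mut out = [0u16; 8];
    // SAFETY: each pointer refers to an array of exactly eight u16 values,
    // and NEON is part of the baseline feature set on AArch64 targets.
    unsafe {
        let va = vld1q_u16(a.as_ptr());
        let vb = vld1q_u16(b.as_ptr());
        vst1q_u16(out.as_mut_ptr(), vaddq_u16(va, vb));
    }
    out
}

/// Scalar fallback with the same wrapping semantics for other targets.
#[cfg(not(target_arch = "aarch64"))]
fn add_u16x8(a: [u16; 8], b: [u16; 8]) -> [u16; 8] {
    let mut out = [0u16; 8];
    for i in 0..8 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}
```

Both variants return the same result, so callers do not need to know which one was compiled in.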
Macros
Structs
Constants
Functions
- clear 🔒
- darken 🔒
- destination_atop 🔒
- destination_in 🔒
- destination_out 🔒
- destination_over 🔒
- difference 🔒
- div255 🔒
- evenly_spaced_2_stop_gradient 🔒
- exclusion 🔒
- fn_ptr
- fn_ptr_eq
- from_float 🔒
- gradient 🔒
- gradient_lookup 🔒
- hard_light 🔒
- inv 🔒
- join 🔒
- just_return
- lerp 🔒
- lerp_1_float 🔒
- lerp_u8 🔒
- lighten 🔒
- load_8 🔒
- load_8888 🔒
- load_8888_tail 🔒
- load_dst
- load_dst_tail
- load_dst_u8
- load_dst_u8_tail
- load_mask_u8 🔒
- mad 🔒
- mask_u8 🔒
- modulate 🔒
- move_destination_to_source 🔒
- move_source_to_destination 🔒
- multiply 🔒
- null_fn
- overlay 🔒
- pad_x1 🔒
- plus 🔒
- premultiply 🔒
- reflect_x1 🔒
- repeat_x1 🔒
- round_f32_to_u16 🔒
- scale_1_float 🔒
- scale_u8 🔒
- screen 🔒
- seed_shader 🔒
- source_atop 🔒
- source_in 🔒
- source_out 🔒
- source_over 🔒
- source_over_rgba
- source_over_rgba_tail
- split 🔒
- start
- store
- store_8888 🔒
- store_8888_tail 🔒
- store_tail
- store_u8
- store_u8_tail
- transform 🔒
- uniform_color 🔒
- xor 🔒
- xy_to_radius 🔒