🔬 This is a nightly-only experimental API. (
This is supported on x86-64 and target feature
Load packed single-precision (32-bit) floating-point elements from memory into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
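The writemask behaviour described above can be sketched with a scalar model in plain Rust. This is only an illustration of the semantics, not the intrinsic itself: the function name `masked_loadu_model`, the const parameter `N`, and the `u16` mask type are all assumptions made for the example, since the vector width is not stated here. Bit `i` of `k` selects lane `i`.

```rust
/// Scalar model of a masked load (hypothetical helper, NOT the real intrinsic).
/// For each lane `i`: if bit `i` of `k` is set, the lane is taken from `mem`;
/// otherwise it is copied from `src`.
fn masked_loadu_model<const N: usize>(src: [f32; N], k: u16, mem: &[f32; N]) -> [f32; N] {
    // Assumption: a u16 mask covers at most 16 lanes.
    assert!(N <= 16, "a u16 mask can describe at most 16 lanes");

    // Start from `src` so that masked-off lanes keep their old values.
    let mut dst = src;
    for i in 0..N {
        if (k >> i) & 1 == 1 {
            // Mask bit set: load this lane from memory.
            dst[i] = mem[i];
        }
    }
    dst
}
```

For instance, with `src = [10.0, 20.0, 30.0, 40.0]`, `mem = [1.0, 2.0, 3.0, 4.0]` and `k = 0b0101`, the model returns `[1.0, 20.0, 3.0, 40.0]`: lanes 0 and 2 come from memory, lanes 1 and 3 from `src`. The model takes an ordinary array reference, so it does not exercise the relaxed alignment requirement on `mem_addr`.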