pub fn idct4x4(in_vector: &mut [i32; 64], out_vector: &mut [i16], stride: usize)
Inverse DCT (IDCT) of an 8x8 coefficient block, assuming only the upper-left 4x4 coefficients are non-zero.
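
As a rough illustration of the precondition, a caller could check that every coefficient outside the upper-left 4x4 corner is zero before choosing this routine. The helper below is hypothetical and not part of this API. It assumes the 64 coefficients are stored row-major in natural (de-zigzagged) order, which the documentation here does not confirm.

```rust
/// Hypothetical helper, not part of the documented API.
///
/// Returns `true` when every coefficient outside the upper-left 4x4 corner
/// of an 8x8 block is zero.
///
/// Assumption: `coeffs` is laid out row-major in natural (non-zigzag) order,
/// so index `i` maps to row `i / 8` and column `i % 8`.
pub fn only_upper_4x4_filled(coeffs: &[i32; 64]) -> bool {
    coeffs.iter().enumerate().all(|(i, &c)| {
        let (row, col) = (i / 8, i % 8);
        // Positions inside the 4x4 corner may hold any value;
        // everything else must be zero.
        (row < 4 && col < 4) || c == 0
    })
}
```

A decoder might use a check like this to pick between a full 8x8 IDCT and a reduced variant such as `idct4x4`. Real decoders often track the last non-zero coefficient while decoding instead of scanning the block afterwards.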