I tried. It's bad. It *is* faster than permutation tables though.
Bottom center panel is my attempt.
I guess I should bring out a profiler and actually see where the bottlenecks are.
Meanwhile, if someone knows a fast hash that takes 128 bits as input (x: i32, y: i32, seed: u64), works in AVX `__m256i` registers and has good entropy in the lower bits, I'm all ears.
@thomastc does xxhash fit your need?
@bitinn Thanks, that's the sort of thing I was looking for. XXH3 doesn't beat the permutation table for performance (38 ms total runtime instead of 34 ms), but the visual quality is good.
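For anyone following along, here is a minimal scalar sketch of the kind of hash being asked about above: it takes (x: i32, y: i32, seed: u64) and mixes them so that the low bits are usable. This is not the code from this thread and not XXH3. The name `hash2d` is hypothetical, and the mixing step is the well-known 64-bit finalizer from MurmurHash3 (`fmix64`), chosen only because it is short. I have not benchmarked it against the numbers quoted above.

```rust
/// Hypothetical helper for illustration only (not from the thread).
/// Mixes a 2D integer coordinate and a seed into a 64-bit hash.
fn hash2d(x: i32, y: i32, seed: u64) -> u64 {
    // Pack the two 32-bit coordinates into one 64-bit word.
    // Going through u32 first avoids sign extension of negative values.
    let packed = ((x as u32 as u64) << 32) | (y as u32 as u64);

    // Fold in the seed. For a fixed seed this keeps the mapping
    // from (x, y) to the mixer input one-to-one.
    let mut h = packed ^ seed;

    // MurmurHash3 fmix64 finalizer: xor-shifts and odd multiplies,
    // so high-bit differences get pushed down into the low bits.
    h ^= h >> 33;
    h = h.wrapping_mul(0xff51_afd7_ed55_8ccd);
    h ^= h >> 33;
    h = h.wrapping_mul(0xc4ce_b9fe_1a85_ec53);
    h ^= h >> 33;
    h
}
```

A gradient index could then be taken from the low bits, for example `hash2d(x, y, seed) & 7`. One caveat if you want this in `__m256i` registers: AVX2 has no single instruction for a full 64-bit lane multiply, so a vectorized version would probably need 32-bit multiplies or a different mixer.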