@aras Really enjoying the series - I started on a SIMD version of my own last night. One thing that I'm trying is replacing the 3 mults + 2 adds when computing dot products with 2 fmadd + 1 mult instructions. I haven't compared speed yet though - it might pipeline worse, cancelling out any advantage.
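
For reference, a minimal scalar sketch of the two variants being compared. The names `Vec3`, `dot_mul_add`, `dot_fma`, and `self_check` are made up for illustration, not taken from the series. It uses the portable `std::fma` rather than SIMD intrinsics; a SIMD version would presumably use something like `_mm_fmadd_ps`, which needs FMA-capable hardware.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3-component vector, just for this sketch.
struct Vec3 { float x, y, z; };

// Baseline: 3 multiplies + 2 adds.
inline float dot_mul_add(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Variant: 1 multiply + 2 fused multiply-adds.
// Each fma consumes the previous result, so the three operations run in sequence.
inline float dot_fma(const Vec3& a, const Vec3& b) {
    float r = a.x * b.x;        // plain multiply
    r = std::fma(a.y, b.y, r);  // r = a.y * b.y + r, rounded once
    r = std::fma(a.z, b.z, r);  // r = a.z * b.z + r, rounded once
    return r;
}

// Small sanity check with inputs that are exact in float arithmetic.
inline void self_check() {
    const Vec3 a{1.0f, 2.0f, 3.0f};
    const Vec3 b{4.0f, 5.0f, 6.0f};
    assert(dot_mul_add(a, b) == 32.0f);
    assert(dot_fma(a, b) == 32.0f);
}
```

Since an fma rounds only once per step, the two versions can differ in the last bits on general inputs, so comparing them in tests probably wants a tolerance rather than exact equality. Whether the fused version is actually faster depends on the CPU and the surrounding code, so it still needs measuring.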
