https://gcc.godbolt.org/z/YhrEcb7nz
Am I correct in assuming that the fmadd and vmul/vadd versions might not give me the exact same results (down in Test)? (Double checking...)
... thanks for the confirmations. Practice agrees as well, and sometimes the sort will explode :) (this was a copy-pasted bit of some of the <algorithm> sort code)
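A minimal sketch of why the two versions can disagree, in Rust, where `f64::mul_add` rounds once and `a * b + c` rounds twice. The inputs and the `key_unfused` / `key_fused` / `sample` names are made up for illustration and are not taken from the godbolt link.

```rust
/// Unfused: the product is rounded to f64 before the add (vmul + vadd).
fn key_unfused(a: f64, b: f64, c: f64) -> f64 {
    a * b + c
}

/// Fused: a single rounding at the end, like an fmadd instruction.
fn key_fused(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

/// Hypothetical inputs chosen so that the low bits of a * a matter.
fn sample() -> (f64, f64) {
    let eps = 1.0 / (1u64 << 30) as f64; // 2^-30, exact
    let a = 1.0 + eps; // a * a is exactly 1 + 2^-29 + 2^-60
    let c = -(1.0 + 2.0 * eps); // cancels everything except the 2^-60 term
    (a, c)
}
```

With `(a, c) = sample()`, the unfused key is `0.0` and the fused key is `2^-60`. If a comparator ends up computing the key one way for its left argument and the other way for its right, "x < x" can come out true, which is the kind of inconsistency a sort is not built to survive.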
@msinilo is fp:fast generally worth the trouble it causes? It could lead to even simpler cases, like the compiler turning (a+b)+c into a+(b+c), which is not the same in floating point. And yeah, sorting predicates can and will go wrong with things like that.
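A small sketch of the reassociation point, in Rust with strict IEEE semantics. The function names and the inputs are illustrative assumptions.

```rust
/// Sums in the order written: (a + b) + c.
fn left_first(a: f64, b: f64, c: f64) -> f64 {
    (a + b) + c
}

/// The reassociated form: a + (b + c).
fn right_first(a: f64, b: f64, c: f64) -> f64 {
    a + (b + c)
}
```

For `a = 1e17`, `b = -1e17`, `c = 1.0`, `left_first` returns `1.0` while `right_first` returns `0.0`: the `1.0` is smaller than the spacing between doubles near `1e17`, so it is rounded away when added to `b` first.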
@msinilo @TomF @aras @rygorous I argued for the inclusion of fast math in Rust for a while, but it's the wrong feature for the problem it's trying to solve; instead of a hammer across the whole codebase I think more localized solutions would be better. We currently run Rust as-is with strict math ops, and our shaders too. I suspect we take a hit for it, but I'll only disable it if we're ever /really/ desperate for performance. For now, sidestepping a whole class of bugs is far preferable.
@JasperBekkers @msinilo @aras @rygorous It might be interesting to have a compiler look at your code and tell you where it thinks something could be improved, and then you'd manually refactor things until it was happy that it couldn't improve it any more?
@TomF @JasperBekkers @msinilo @aras @rygorous my take on fast math is similar (evil!), but this proposed solution IMO would be strictly worse (for anything that is not write-once or for experts), leading to unreadable, unclear code. I want something like local fast math with local decorators.
Halide has some partial solutions for it, where you can ask some functions / expressions to be evaluated at a specific point and folded out.
@BartWronski @TomF @JasperBekkers @msinilo @aras I'd be happy for something local like function or even scope-level annotations of "feel free to reassociate this" but "fast math" is an incredibly blunt tool and violates most intuitive notions of what a function even is, in a way that no other optimizations do
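As a rough sketch of what a "feel free to reassociate this" permission would allow, here is a strict sum next to a hand-written four-accumulator version, which is roughly the shape a vectorizing compiler might produce. The function names and the test data are assumptions for illustration; stable Rust has no such annotation, so the reassociation is done by hand.

```rust
/// Strict left-to-right sum, the order the source code asks for.
fn sum_strict(xs: &[f32]) -> f32 {
    let mut acc = 0.0f32;
    for &x in xs {
        acc += x;
    }
    acc
}

/// Four independent accumulators, combined at the end.
/// This changes the order of the additions, so it may round differently.
fn sum_four_lanes(xs: &[f32]) -> f32 {
    let mut lanes = [0.0f32; 4];
    for chunk in xs.chunks(4) {
        for (lane, &x) in lanes.iter_mut().zip(chunk) {
            *lane += x;
        }
    }
    (lanes[0] + lanes[1]) + (lanes[2] + lanes[3])
}
```

On `[1.0e8, 1.0, 1.0, 1.0, -1.0e8, 1.0, 1.0, 1.0]` the strict sum gives `3.0` and the four-lane sum gives `6.0`. Here the reassociated result happens to be the more accurate one, but the point is only that the two differ, so the permission is something the author would want to grant on purpose.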
@rygorous @BartWronski @TomF @msinilo @aras Fully scoped fast math still has some issues; e.g. `#[fast_math]{ sqrt(1.0) }` would still be problematic in its results. Yes, you've been clear during the operation, but still... Scoping it down to specific optimizations (contraction/reassociation/auto reciprocals) may be useful but could be done manually or as a lint. (1/2)
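To illustrate the "could be done manually" part for reciprocals, a minimal Rust sketch: the division as written, and the multiply-by-reciprocal rewrite applied by hand. The function names and the choice of 49.0 are illustrative assumptions.

```rust
/// Division as written in the source.
fn div(x: f64, d: f64) -> f64 {
    x / d
}

/// The "auto reciprocal" rewrite, applied by hand: x * (1 / d).
/// 1.0 / d is rounded first, so the product can be off by a bit.
fn mul_recip(x: f64, d: f64) -> f64 {
    x * (1.0 / d)
}
```

`div(49.0, 49.0)` is exactly `1.0`, while `mul_recip(49.0, 49.0)` lands one step below `1.0`. Writing the reciprocal out makes that trade visible at the call site instead of hiding it behind a global flag.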
@JasperBekkers @BartWronski @TomF @msinilo @aras I was assuming you would specify what was allowed, fast math is too big an umbrella anyhow.
@rygorous @BartWronski @TomF @msinilo @aras My point exactly; I'm also trying to figure out a way to work with it. Fastmath's usefulness is in large part as a "code is slow, please make it fast with minimal investment on my part" kinda tool. Giving up some of that convenience might pave a way to also trade away some of its downsides.
@JasperBekkers @rygorous @BartWronski @msinilo @aras Hence my suggestion of "compiler-guided optimisations". It suggests stuff, you decide whether or not to add those annotations.
@JasperBekkers @rygorous @BartWronski @msinilo @aras We sort of have some of that already with the compiler saying "hey did you actually mean to use a double here?"
@TomF @rygorous @BartWronski @msinilo @aras I see where you're coming from, sort of, on a high level, but I'm not sure how it would actually turn out in practice. E.g. for reassociation, reciprocals or contraction maybe it can suggest a way to rewrite the equations for you, but after that it feels like this quickly falls apart. Would you want the compiler to suggest using approximations for functions whose input it likely can't know? (1/2)
@TomF @rygorous @BartWronski @msinilo @aras Can it know that if you tell it "please assume no NaNs for this code", things will get significantly faster? How would you prevent an overload of false positives or massive compiler spam? (2/2)
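For context on what a "no NaNs" assumption changes, a small Rust sketch of a comparison that a compiler could simplify only under that assumption. The function names are illustrative, and whether a particular compiler performs this exact rewrite is an assumption here.

```rust
/// `!(a < b)`, exactly as written.
fn not_less(a: f64, b: f64) -> bool {
    !(a < b)
}

/// What the expression above could be simplified to if NaNs are assumed away.
fn greater_or_equal(a: f64, b: f64) -> bool {
    a >= b
}
```

The two agree for ordinary numbers, but with `a = f64::NAN` the first returns `true` and the second returns `false`, because every ordered comparison against NaN is false. The assumption is only safe where NaNs really cannot reach the code.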
@JasperBekkers @rygorous @BartWronski @msinilo @aras I'm suggesting that anything the compiler would normally decide to do for you with fastmath, it instead suggests adding an annotation to allow. That's all.