Hi Nicolas,
Your suggestions are very insightful. Maybe we should add a way to collect, rate, and track this kind of feedback on a public forum. Do you think many developers would participate?
Some developers have asked that we make the instruction set completely orthogonal, so that gaps like the unsigned int to float conversion would not exist. There is a strong desire to enable vectorizing compilers (and I think that SSE4.1 and AVX both reflect this), but the process continues to involve a cost/benefit analysis. Given this, do you have an example of a common usage of the unsigned int to float conversion that, if accelerated, would represent a noticeable application-level performance improvement?
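For reference, here is a minimal scalar sketch of the kind of workaround this gap tends to force, assuming only a signed int to float conversion is available. The helper name `u32_to_float_via_signed` is hypothetical and not taken from any library. A vectorized version would apply the same steps to each lane.

```c
#include <stdint.h>

/* Hypothetical helper: convert an unsigned 32-bit value to float using only
   signed int-to-float conversions, as code must when no unsigned conversion
   exists. Assumes the default round-to-nearest mode. */
static float u32_to_float_via_signed(uint32_t u)
{
    int32_t hi = (int32_t)(u >> 16);      /* upper 16 bits, never negative */
    int32_t lo = (int32_t)(u & 0xFFFFu);  /* lower 16 bits, never negative */

    /* Both conversions and the multiply by 2^16 are exact in single
       precision, so only the final add rounds. */
    return (float)hi * 65536.0f + (float)lo;
}
```

The shift, mask, two conversions, multiply, and add replace what would otherwise be a single conversion, which is the overhead in question.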
Thank you,
Eric