This optimisation came up in a consulting job for CruxML. A common pattern in embedded programming or high-level synthesis is if-bit-then, i.e. test a bit and modify a variable depending on the result.
Here is an example in C:
unsigned a = 0x23, x = 0x86; ... if (a & 1) x = x op q;
(where op is some operation like
+ but could be more complicated).
The above code can be optimised in a number of ways. A common one is:
x = (a & 1) ? x op q : x;
An alternative, less well known way is:
x = x op (-(a & 1) & q);
The code relies on the fact that
a & 1 only has two possible outputs,
-1 = 0xffffffff and
-0 == 0
-(a & 1) & q either evaluates to
This identity may be advantageous depending on the situation, particularly for optimising inner loops of time-critical applications. For our specific problem, it significantly reduced resource utilisation in an OpenCL-based FPGA design.