Dude, where's my code?
From time to time, I have experienced what seems like vanishing code. I write out the source code, I think it is right, but then strange things start to happen when I compile it. In this article, I want to look at why it can appear that the compiler has thrown away my code, and how to manage the advantages and disadvantages of a powerful optimizing compiler.
Instruction set design
The modern world is full of RISC processors. A reduced instruction set typically requires the compiler to generate more instructions, with the payoff that the simpler instructions can be implemented using less time and less power in the processor. So we have learned to expect compiler-generated assembly code to be a bit wordy, using far more assembly instructions than there are lines of C source code.
Having said that, instruction set designers are always looking for instructions that are reasonably easy to implement and also allow frequently occurring code patterns to be expressed efficiently. A well-known example is count-leading-zeros. Once you have a barrel shifter, you can implement it with a modest number of additional gates, and it is a very popular instruction, found in the Arm, TriCore, PowerPC and RH850 instruction sets, not to mention x86. Counting leading zeros in C would require a loop, were it not for compiler intrinsics that generate the single assembly instruction.
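To make the contrast concrete, here is a sketch in C. The loop is my own illustration of the portable approach; __builtin_clz is a real GCC/Clang intrinsic, though its result is undefined for an input of zero, hence the guard.

#include <stdint.h>

/* Portable C: count leading zero bits one at a time. */
uint32_t clz_loop( uint32_t x )
{
    uint32_t count = 0u;
    while( ( count < 32u ) && ( ( x & 0x80000000u ) == 0u ) )
    {
        x <<= 1;
        count++;
    }
    return count;
}

/* With GCC or Clang, the intrinsic typically compiles to a single
   CLZ instruction on Arm. __builtin_clz(0) is undefined, so guard it. */
uint32_t clz_intrinsic( uint32_t x )
{
    return ( x != 0u ) ? (uint32_t)__builtin_clz( x ) : 32u;
}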
So we know that there can be cases where an algorithm can be more compactly represented in assembly code than in C code.
Example: setting negative values to zero
#include <stdint.h>

int32_t NegativeToZero( int32_t x )
{
    int32_t result;
    if( x < 0 )
    {
        result = 0;
    }
    else
    {
        result = x;
    }
    return result;
}
The assembly code generated by the free GNU Arm toolchain for the humble Arm Cortex-M processor is one instruction, plus one instruction to return. What happened to all the code? Where is the test for a negative value? Where is the conditional code? Arm has conditional instructions that can be used for branchless code, but BIC r0, r0, r0, ASR #31 is not a conditional instruction. What is going on?
The answer lies in the ASR #31. An arithmetic shift right by 31 bits turns the two’s complement sign bit into a mask of all ones (0xFFFFFFFF) for a negative number, or all zeros for a non-negative number. BIC then ANDs the value with the bitwise negation of this mask, changing any negative number to zero while leaving any non-negative number unchanged.
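For illustration, here is the same trick written back into C. This is my sketch rather than anything the compiler produced, and it assumes that right-shifting a signed value is an arithmetic shift (true for GCC and Clang, though implementation-defined in standard C).

#include <stdint.h>

/* Branchless equivalent of NegativeToZero, mirroring
   BIC r0, r0, r0, ASR #31. Assumes x >> 31 is an arithmetic shift. */
int32_t NegativeToZeroBranchless( int32_t x )
{
    int32_t mask = x >> 31;   /* 0xFFFFFFFF if x < 0, otherwise 0 */
    return x & ~mask;         /* AND with the inverted mask, like BIC */
}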
Confusion
Such optimizations provide excellent run-time performance, but they also cause some problems when profiling and debugging. I experienced this with exactly the example above. I wanted to see how often negative values were being produced and changed to zero, so I placed a breakpoint on the line result = 0; and was very frustrated to see that the breakpoint was hit for every value. In fact, I only ever observed positive values.

After some head-scratching, I switched the debugger from “source only” mode to “interleaved” mode, where you see both the source code and the assembly code. My frustration only grew when I discovered that the compiler appeared to have generated only one instruction, and that instruction did not seem like a plausible translation of my C source code.

Eventually, I puzzled out what was going on with the sign bit and the mask. I temporarily added __asm( "nop" ); next to result = 0; to force a conditional jump and got some real metrics about the frequency of negative values.
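As a sketch of that temporary instrumentation (assuming GCC-style inline assembly, where a basic asm statement is implicitly volatile, so the compiler cannot merge the branch away):

#include <stdint.h>

int32_t NegativeToZero( int32_t x )
{
    int32_t result;
    if( x < 0 )
    {
        __asm( "nop" );   /* breakpoint target: hit only when x < 0 */
        result = 0;
    }
    else
    {
        result = x;
    }
    return result;
}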
For non-embedded projects, you will typically disable optimization while debugging. In many embedded projects, disabling optimization makes the code too large for the program flash, and/or too slow to run in any realistic way. So we are forced to debug in the presence of a certain level of compiler optimization.
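If your toolchain happens to be GCC, one selective option (my suggestion, not something from the story above) is the GCC-specific optimize attribute, which lowers the optimization level for a single function while the rest of the build keeps its usual settings:

#include <stdint.h>

/* GCC-specific: compile only this function at -O0. */
__attribute__(( optimize( "O0" ) ))
int32_t NegativeToZeroDebug( int32_t x )
{
    int32_t result = x;
    if( x < 0 )
    {
        result = 0;   /* at -O0 the branch survives, so a breakpoint works */
    }
    return result;
}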
Conclusions
In the end, having very efficient code in thousands (or even millions) of final products is a benefit that far outweighs a bit of inconvenience when debugging. So such compiler optimization is something to celebrate and be grateful for. Even so, we want to minimize the confusion it can cause.
I hope that learning from my mistakes might save you a bit of time the next time you find yourself in this situation. Distilling the above into three points, we might say:
- Be aware that optimized code can be much smaller than you expect.
- If you start to suspect this kind of situation, take a look at the assembly code.
- Even if you cannot disable optimization for the entire project, you can temporarily disrupt the optimization enough to meet your profiling and debugging requirements.