C Portability Gotcha

June 4, 2012

This weekend I came across this quiz about integers in C by John Regehr, and it reminded me of a bug I encountered last week.

Take a minute to try the quiz and see how you do (I scored only 85% despite being experienced in C). As an aside, part of the reason I missed points is that experience tells me it’s OK to do things even though they may be “incorrect.” For example, take question 7, which asks what happens for (INT_MAX + 1). The answer is that the result is undefined, because the addition overflows a signed integer. Indeed, when I compile this with gcc it warns: integer overflow in expression. But I also know that on most processors the addition simply wraps to INT_MIN (running the expression and printing the result confirms this), so experience tainted my opinion of correctness.
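If a program should not rely on wrapping, the overflow can be tested for before the addition is performed. The following is a minimal sketch; the helper name checked_add is made up for this example and comes neither from the quiz nor from any library.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: adds a and b only if the result fits in an int.
 * Returns true and stores the sum in *out on success; returns false and
 * leaves *out untouched if the addition would overflow. */
bool checked_add(int a, int b, int *out) {
  assert(out != NULL);
  if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
    return false; /* a + b would be undefined behavior */
  }
  *out = a + b;
  return true;
}
```

The range checks themselves cannot overflow, because INT_MAX - b is only evaluated for positive b and INT_MIN - b only for negative b.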

Now on to the bug I found. Here’s the code with all but the essential part stripped, a function that at some point compares the value of a character to a value stored in an unsigned integer:

void some_func(const char *key, uint32_t val) {
  const char *cur_key = key;
  if (val >= *cur_key) {
    do_something();
  }
}

Can you see the bug? Well, the processor didn’t find a bug with this code either, because this function worked as intended when running on a 32-bit ARM processor. It wasn’t until I compiled and ran this same working code for an x86-64 processor that it failed.

So what happened? When *cur_key is compared to val, the character is first promoted to int and then converted to an unsigned integer for the comparison. The rule for this conversion is stated in section 6.3.1.8 of the C99 standard:

Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.

This rule holds on both the ARM and the x86 processors. The difference lies in the type being converted: C leaves the signedness of plain char up to the implementation, and the two platforms choose differently. On ARM, char is typically unsigned by default; on x86-64 it is typically signed. The bug occurred when the value of the character was 248. Since the character was part of a UTF-8 string rather than ASCII, values over 127 are expected and valid. On the ARM processor, the character held 248 and was still 248 after conversion to an unsigned integer. On the x86 processor, however, the same byte was interpreted as -8, which converts to an unsigned value of 4294967288! Not surprisingly, the comparison result differs and the program fails.
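The effect can be reproduced on a single machine by spelling out the signedness instead of relying on plain char. This is only a sketch: the helper names ge_as_signed and ge_as_unsigned are made up for illustration, and it assumes the usual two's-complement representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compares val with the first byte of key, reading
 * that byte as a signed char (how plain char behaved on x86-64). */
bool ge_as_signed(uint32_t val, const char *key) {
  assert(key != NULL);
  signed char c = *(const signed char *)key;
  /* A byte of 0xF8 is -8 here, and -8 converts to 4294967288. */
  return val >= (uint32_t)c;
}

/* Hypothetical helper: same comparison, reading the byte as unsigned
 * (how plain char behaved on ARM). */
bool ge_as_unsigned(uint32_t val, const char *key) {
  assert(key != NULL);
  unsigned char c = *(const unsigned char *)key;
  /* A byte of 0xF8 is 248 here and stays 248 after conversion. */
  return val >= (uint32_t)c;
}
```

With a key whose first byte is 0xF8 and val set to 300, ge_as_unsigned returns true while ge_as_signed returns false. The explicit (uint32_t) cast does not change either result; what matters is the signedness of the byte before it is widened.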

The code was fixed by reading the key through a pointer to unsigned 8-bit values, like so:

void some_func(const char *key, uint32_t val) {
  /* Read the key as unsigned bytes so values above 127 stay positive. */
  const uint8_t *cur_key = (const uint8_t *)key;
  if (val >= *cur_key) {
    do_something();
  }
}

Now it works as intended on both processors.

I like John’s quiz because it really drives home the need to be careful when comparing values of different types. Even heeding the compiler’s warnings isn’t enough, because with gcc -Wall both versions compile without warning. (Incidentally, clang, which is known for catching more errors, complained only about the comparison of val to *cur_key in the first example, because the operands differ in signedness. But casting at the comparison point to silence the warning doesn’t change the way the value is converted.)

Another lesson learned here is that just because C code runs fine on one platform doesn’t mean it will on another. You must test thoroughly on each targeted architecture.
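One way to make such a difference visible early is to ask the compiler directly how it treats plain char. The sketch below uses CHAR_MIN from <limits.h>; the function names are hypothetical, chosen for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* CHAR_MIN is 0 when plain char is unsigned and negative when it is signed. */
bool plain_char_is_signed(void) {
  return CHAR_MIN < 0;
}

/* Hypothetical helper: returns the byte at p as a value in 0..UCHAR_MAX,
 * whichever way plain char is signed on the target. */
unsigned int byte_value(const char *p) {
  assert(p != NULL);
  return (unsigned char)*p;
}
```

gcc also accepts -fsigned-char and -funsigned-char to force one behavior, which may help reproduce a bug from another target on a development machine.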

One Comment
  1. John Regehr
    June 4, 2012 1:40 PM

    I like this example and have made very similar mistakes myself!
