PICList Thread
'[EE] C integer promotions (was: Code packing)'
2008\07\09@132243 by Herbert Graf

On Wed, 2008-07-09 at 18:03 +0100, Alan B. Pearce wrote:
> >1) PORTA is an unsigned char. Therefore, before we can take
> >its complement, it must be promoted to signed int.
>
> What a load of rubbish. There is no 'must be promoted' at all. There is no
> reason to promote an unsigned char to signed int.

FWIW most of this thread has been over my head (never took a
course on compiler design, always been more into the hardware...).

That said, I am interested in the "integer promotion". What ARE the
rules for promotion? I have been hit a couple times with promotion
happening when I didn't think it would, and other times not happening
when I thought it should have.

What are common "gotchas" with regards to type promotion?

Since I'm not that familiar with the rules, I generally explicitly cast
so that I don't have to worry about them, but now I'm curious! :)

TTYL

2008\07\09@150601 by Gerhard Fiedler

Herbert Graf wrote:

> That said, I am interested in the "integer promotion". What ARE the rules
> for promotion? I have been hit a couple times with promotion happening
> when I didn't think it would, and other times not happening when I
> thought it should have.
>
> What are common "gotchas" with regards to type promotion?
>
> Since I'm not that familiar with the rules, I generally explicitly cast
> so that I don't have to worry about them, but now I'm curious! :)

From the C89 standard (there's a newer C99, but few embedded compilers
comply with it):

-------------------------------
3.2.1.1 Characters and integers

A char, a short int, or an int bit-field, or their signed or unsigned
varieties, or an object that has enumeration type, may be used in an
expression wherever an int or unsigned int may be used.  If an int can
represent all values of the original type, the value is converted to an
int; otherwise it is converted to an unsigned int. These are called the
integral promotions.

The integral promotions preserve value including sign.  As discussed
earlier, whether a ``plain'' char is treated as signed is
implementation-defined.
-------------------------------

Basically all arithmetic or bitwise operations are defined for (unsigned)
int, and (potentially) shorter types are converted to the appropriate int
variety before the operation execution. I think that a number of embedded
compilers for smaller processors don't do the standard promotion of char to
int, and rather do (non-standard) 8-bit arithmetic.
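A minimal sketch of what the quoted paragraph implies, assuming a compiler that follows the standard promotion rule. The helper name size_of_uchar_sum is made up for this example. Both operands of + are promoted before the addition, so the result has the size of int, not of unsigned char:

```c
#include <stddef.h>

/* Hypothetical helper: size of the type produced by adding two unsigned
   chars. Both operands undergo integral promotion first, so on a
   conforming compiler the sum has the size of int. */
size_t size_of_uchar_sum(void)
{
    unsigned char a = 1, b = 2;
    return sizeof(a + b);
}
```

A compiler that does non-standard 8-bit arithmetic might well report 1 here instead, which is one quick way to find out what a given toolchain does.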

A common gotcha is to assume 8-bit arithmetic with 8-bit types when integer
promotion occurs. This is of course only a problem when an overflow happens
in the 8-bit calculations (which usually doesn't happen in integer, or if
it does, its results are different). Another is about how literals are
handled, type-wise; for example, check out the end of this post by John
Payson
<http://www.htsoft.com/forum/all/showthreaded.php/Cat/0/Number/14264/page//vc/1>.

It's good to know the rules, but I agree with you: it's best to use
explicit type conversions wherever reasonably possible. E.g. if the
calculation doesn't fit in 8 bits and I want to do it in 16-bit math, I
explicitly cast to a 16-bit type or use a 16-bit temporary variable. Avoids
problems with specific implementations of promotions and documents the need
in the code.
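A small sketch of the overflow gotcha and of the explicit-cast habit described above. The function names are invented for illustration, and the wrapped result assumes CHAR_BIT is 8:

```c
/* Sum as the language computes it: both operands are promoted to int,
   so 200 + 100 gives 300, not an 8-bit wrap-around. */
int sum_promoted(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Sum deliberately truncated back to 8 bits: 300 modulo 256 is 44
   (assumes CHAR_BIT == 8). */
unsigned char sum_wrapped(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}

/* Sum with the wider arithmetic spelled out, which documents the intent
   and does not rely on the compiler promoting for you. */
unsigned int sum_explicit(unsigned char a, unsigned char b)
{
    return (unsigned int)a + (unsigned int)b;
}
```

With a = 200 and b = 100, sum_promoted and sum_explicit give 300 while sum_wrapped gives 44.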

Gerhard

2008\07\09@173028 by Tomás Ó hÉilidhe


Herbert Graf wrote:
> That said, I am interested in the "integer promotion". What ARE the
> rules for promotion? I have been hit a couple times with promotion
> happening when I didn't think it would, and other times not happening
> when I thought it should have.
>
> What are common "gotchas" with regards to type promotion?
>
> Since I'm not that familiar with the rules, I generally explicitly cast
> so that I don't have to worry about them, but now I'm curious! :)


OK, here goes: I'll try to give an exhaustive yet finite explanation of
integer promotion. You should probably put the kettle on.

Here's the integer types as defined by the 1989 Standard of C:

char
short int
int
long int

In the C language, "char" is synonymous with "byte". The "sizeof"
operator tells you how many bytes there are in a particular type, so
sizeof(char) will always be 1. If it's not 1, then it's not Standard C.

Today we're all familiar with a byte being 8 bits, but the C Standard
doesn't restrict the implementation to that. The only restriction it
imposes is that a byte must be *at least* 8 bits. There's a macro called
CHAR_BIT defined in <limits.h> which tells you how many bits you have in
a byte.
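For illustration, a short sketch of how CHAR_BIT and sizeof combine to give a width in bits. BITS_OF and bits_in_char are hypothetical names, not anything from a standard header:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical macro: number of bits occupied by a type, computed as
   its size in bytes times the number of bits per byte. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* sizeof(char) is 1 by definition, so this is simply CHAR_BIT. */
size_t bits_in_char(void)
{
    return BITS_OF(char);
}
```

BITS_OF(int) or BITS_OF(long) can be used the same way to see what a particular compiler provides.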

As regards the other types, the Standard doesn't limit their size or
range but it *does* state minimums:

char  >=  8 bits
short int  >=  16 bits
int  >=  16 bits
long int  >=  32 bits

Another restriction is as follows:

   sizeof(char)  <=  sizeof(short)  <=  sizeof(int)  <= sizeof(long)

which basically just means that the types must make sense, i.e. a long
can't be smaller than a short. Note though that it's possible, for
instance, for "int" and "long int" to be the same.
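The minimums and the ordering above can be checked against a given compiler with something like the sketch below. It reads the minimums as value ranges from <limits.h>, and sizes_are_ordered is a made-up name:

```c
#include <limits.h>

/* Compile-time check of the minimum ranges listed above. */
#if UCHAR_MAX < 255 || USHRT_MAX < 65535 || UINT_MAX < 65535 || ULONG_MAX < 4294967295UL
#error "limits.h reports a range below the minimum the Standard requires"
#endif

/* Hypothetical helper: returns 1 if the sizes are ordered as described,
   char <= short <= int <= long. */
int sizes_are_ordered(void)
{
    return sizeof(char) <= sizeof(short)
        && sizeof(short) <= sizeof(int)
        && sizeof(int) <= sizeof(long);
}
```

On a compiler that claims Standard conformance the #error should never fire and the function should return 1.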

In C, you can do stuff like arithmetic operations and bitwise operations
on integer expressions, but you can't do operations on anything smaller
than an int. Therefore, before you can play with a char or a short, it
has to undergo "promotion".

For some reason, the creators of C favoured signed integer types. (I
myself hate signed types and would've gone for unsigned integer types
but what can you do). Anyway, the idea was that everything smaller than
an int would be promoted to a "signed int". Disgusting, I know.

However... somebody noticed something. If you look at the minimum
bitnesses I mentioned above, it's perfectly legal to have the following
machine:

char = 64-Bit
short = 64-Bit
int = 64-Bit
long = 64-Bit

And in fact there was at least one supercomputer that came out with this
configuration. Somebody noticed that on this particular supercomputer,
"unsigned short" and "signed int" had the following ranges:

   unsigned short = 0 to 18446744073709551615

   signed int = -9223372036854775807 to 9223372036854775807

Now as you can see, it's possible to have a value that will fit in an
unsigned short that *won't* fit into a signed int (for example take the
number 13835058055282163710). The idea behind integer promotion was that
you'd still be left with the original value, so they couldn't promote an
unsigned short to signed int on this system. So they came out with the
following:

   If  INT_MAX  is greater than or equal to USHRT_MAX,  then unsigned
short promotes to signed int. Otherwise, it promotes to *unsigned* int.

Similarly for char:

   If INT_MAX is greater than or equal to UCHAR_MAX, then unsigned char
promotes to signed int. Otherwise, it promotes to *unsigned* int.

So basically it can be summed up as follows:
   Before you can touch a char or a short, it has to be promoted to
either a signed int or unsigned int. Whether it promotes to signed or
unsigned depends on whether INT_MAX is greater than or equal to
UCHAR_MAX / USHRT_MAX.
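As a rough sketch of that rule, the preprocessor can make the same INT_MAX versus USHRT_MAX comparison, and the outcome can be cross-checked at run time. Both function names are invented for this example:

```c
#include <limits.h>

/* Returns 1 if unsigned short promotes to signed int on this
   implementation, 0 if it promotes to unsigned int, following the
   rule stated above. */
int ushort_promotes_to_signed(void)
{
#if INT_MAX >= USHRT_MAX
    return 1;
#else
    return 0;
#endif
}

/* Observes the same thing at run time: after promotion, 0 - 1 is
   negative only if the promoted type is signed. */
int ushort_difference_is_negative(void)
{
    unsigned short zero = 0, one = 1;
    return (zero - one) < 0;
}
```

The two functions should always agree. On most systems both return 1.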

OK so that's the most basic stage of promotion, a "char" or a "short"
has to become an int. If you're dealing with a unary operator (i.e. an
operator that only takes one operand), then that's all that happens.
Examples of unary operators are the complement ~, or the negation -.
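A sketch of the unary case with the complement operator. It assumes the usual situation where unsigned char promotes to signed int, with 8-bit chars and two's complement, and the function names are made up:

```c
/* ~ on an unsigned char: the operand is promoted to int first, so every
   bit of the int is flipped. For x = 0x0F this gives -16, not 0xF0. */
int complement_promoted(unsigned char x)
{
    return ~x;
}

/* Casting the result back keeps only the low 8 bits: 0xF0 for x = 0x0F. */
unsigned char complement_8bit(unsigned char x)
{
    return (unsigned char)~x;
}

/* The classic trap: comparing ~x with an 8-bit constant. For x = 0x0F
   this is -16 == 240, which is false. */
int naive_compare(unsigned char x)
{
    return ~x == 0xF0;
}
```

This is the same effect that shows up when complementing an 8-bit port register and comparing the result against a byte constant.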

However, if you're dealing with a binary operator (i.e. one that takes
two operands), then there may be a need for further promotion. Here's an
example:

   unsigned int i = 78;

   signed long j = 56;

Now let's say you had the following expression:

   i + j

In order to perform a binary operation that involves two integer
expressions, the two operands must be of the exact same type. They must
be the same size and the same signedness. If they're not, then we need
to do further promotion.

First thing you need to do is decide which one is bigger (forget about
the signedness for now). In this case, "j" is bigger because it's
"long". So what we have to do is bump "i" up to long. Now as I've
mentioned, the creators of C favoured signed types for some horrid
reason, so they wanted "unsigned int" to promote to "signed long". But
what happens if LONG_MAX is less than UINT_MAX... ? You guessed it, in
that case it'll promote to "unsigned long" instead. So let's say we're
working on a system where both int and long are 32-Bit: On such a
system, LONG_MAX will indeed be smaller than UINT_MAX, so "i" will
promote to "unsigned long". Therefore, to make them the same size, the
compiler does the following:

   (unsigned long)i   +   j

OK so let's see what we have now: On the left-hand side we have an
"unsigned long", and on the right we have a "signed long".

Next we have to make the signs the same. If the signs are different,
then the signed one becomes unsigned. Therefore:

   (unsigned long)i + (unsigned long)j

Now the operation can be carried out, and the result will be an
"unsigned long".
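The i + j case can be probed with a sketch like the following, which works out from <limits.h> which way the conversion should go and compares that with what the expression actually does. The function names are hypothetical:

```c
#include <limits.h>

/* Returns 1 if long can hold every unsigned int value, in which case
   unsigned int is converted to signed long. */
int long_holds_all_uint(void)
{
#if LONG_MAX >= UINT_MAX
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if (unsigned int) + (signed long) is carried out in a signed
   type. 1 + (-2) is -1 in signed long, but a huge positive number in
   unsigned long. */
int uint_plus_long_is_signed(void)
{
    unsigned int i = 1;
    signed long j = -2;
    return (i + j) < 0;
}
```

Where int and long are both 32-Bit, both functions return 0, matching the "unsigned long" result described above. Where long is 64-Bit and int is 32-Bit, both return 1.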

So integer promotion can have three stages:

1) Stuff smaller than int becomes either signed int or unsigned int. (On
the vast majority of systems it's signed int)
2) The two types have to match in size. (The smaller one becomes bigger
and, if possible, signed)
3) The two types have to match in sign. (If one of them is signed and
the other is unsigned, then the signed one becomes unsigned).
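Stage 3 is the one that tends to bite in comparisons. A minimal sketch, with invented function names:

```c
/* signed int compared with unsigned int: the signed operand becomes
   unsigned, so -1 turns into UINT_MAX and the comparison is false. */
int minus_one_less_than_one_mixed(void)
{
    int s = -1;
    unsigned int u = 1;
    return s < u;
}

/* The same comparison with the unsigned operand cast to int first,
   so both sides are signed and the comparison is true. */
int minus_one_less_than_one_cast(void)
{
    int s = -1;
    unsigned int u = 1;
    return s < (int)u;
}
```

The first function returns 0 and the second returns 1. Many compilers warn about the first comparison if sign-compare warnings are switched on.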

I myself work with unsigned integer types 95% of the time. In order to
ensure that I don't end up with a signed type somewhere along the line,
I observe the following:
   1) Use unsigned literals, e.g. "7889u" instead of "7889".
   2) Always cast char and short before doing an operation on them.
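As an illustration of those two habits, here is a sketch that builds a 16-bit value from two bytes. make_word is a made-up name:

```c
/* Hypothetical helper: combines a high and a low byte into one word.
   Casting each byte first keeps the arithmetic unsigned. Without the
   cast, hi would be promoted to signed int, and hi << 8 could overflow
   on a system where int is 16 bits wide. The shift count uses an
   unsigned literal, as suggested above. */
unsigned int make_word(unsigned char hi, unsigned char lo)
{
    return ((unsigned int)hi << 8u) | (unsigned int)lo;
}
```

For example, make_word(0x80, 0x01) gives 0x8001.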

I believe the whole idea behind integer promotion was that it was meant
to accurately portray the way in which computers work. For instance, on
a truly 32-Bit processor, you can't do operations on a 16-Bit integer,
only 32-Bit. Unfortunately though, integer promotion isn't very fitting
for systems whose native word is smaller than 16-Bit (int itself can
never be less than 16-Bit), but the PIC C compiler handles it flawlessly!

As for why they decided they wanted everything to promote to signed,
well I think it was a horrible idea.

2008\07\09@200014 by M. Adam Davis

Herbert Graf wrote:
> That said, I am interested in the "integer promotion". What ARE the rules
> for promotion? I have been hit a couple times with promotion happening
> when I didn't think it would, and other times not happening when I
> thought it should have.

There are rules for promotion, but they can't be counted on as every
compiler implements them a little differently, and different
processors implement them differently.  Further, few compilers
(especially in the PIC world) claim full ANSI compliance so even
though there are rules you have to check every single compiler that
you use, and then look to make sure that the command line options
aren't changing the rules on you.

It's annoying.

Which is why EVERY single operation that involves different types
SHOULD be explicitly cast, and in the automotive industry (as in many
other high reliability industries) this is a requirement.

This is especially true if you expect the code to be portable or
reusable.  Do you really want to debug your debounce routine every
time you try and use it on a new compiler or processor?

> What are common "gotchas" with regards to type promotion?

The older, more common ANSI C standard does not define a char as
signed or unsigned - your compiler is allowed to treat a char as
either signed or unsigned, and every good compiler has a separate type
that means unsigned 8 bit or signed 8 bit.

This bites a lot of people who assume that it's one or the other, then
it gets promoted to an integer (sign extended or not, depending on the
compiler's view of char) and can have unintended effects based on
whether it's sign extended or not.
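A sketch of how to see which way a compiler treats plain char, and what that does when the value is widened. It assumes 8-bit chars and the usual two's complement conversion, and the function names are invented:

```c
#include <limits.h>

/* Returns 1 if plain char is signed on this compiler, 0 if unsigned. */
int plain_char_is_signed(void)
{
#if CHAR_MIN < 0
    return 1;
#else
    return 0;
#endif
}

/* Widens the bit pattern 0xFF from a plain char to int. The result is
   -1 if char is signed (sign extended) and 255 if it is unsigned. */
int widen_plain(void)
{
    char c = (char)0xFF;
    return c;
}

/* With the signedness spelled out, the result no longer depends on the
   compiler's choice for plain char. */
int widen_signed(void)   { signed char c = -1;     return c; }
int widen_unsigned(void) { unsigned char c = 0xFF; return c; }
```

widen_signed always gives -1 and widen_unsigned always gives 255, while widen_plain follows whatever plain_char_is_signed reports.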

Further, an integer may be (commonly) 16 bit or 32 bit (or even 64
bit) - this was left as an implementation detail to generically mean
the processor's native integer format.  You might be surprised moving
a PC program (32 bit) to a limited processor and seeing your variables
roll over at 32,767.
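A small sketch of guarding against that rollover. The names and the value 60000 are only an example, not taken from any particular program:

```c
#include <limits.h>

/* Milliseconds per minute. 60 * 1000 is 60000, which does not fit in a
   16-bit int, so the arithmetic is done in long, which is guaranteed to
   be at least 32 bits. */
long ms_per_minute(void)
{
    return 60L * 1000L;
}

/* Returns 1 if a plain int could hold the same value on this compiler. */
int int_can_hold_ms_per_minute(void)
{
#if INT_MAX >= 60000
    return 1;
#else
    return 0;
#endif
}
```

On a PC the second function returns 1. On a compiler with a 16-bit int it returns 0, and that is where the same constant written as 60 * 1000 would overflow.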

> Since I'm not that familiar with the rules, I generally explicitly cast
> so that I don't have to worry about them, but now I'm curious! :)

This is the most reliable way to deal with it.  In fact, automotive
companies (and many companies, I've found) don't even use the standard
names because it affects more than just the char.  A types.h file is
used to typedefine every type for a particular compiler (using #ifdef
COMPILER type statements) so that, for instance, you'll see T_u8 for
unsigned 8 bit variables, T_s16 for signed 16 bits, T_bit for
TRUE/FALSE boolean (which usually goes to the processor's native int
for speed and to avoid bit packing) etc.

When a new compiler is used, the types.h file is updated to include
the correct mapping for types so the program is still portable.
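A rough sketch of what such a types.h might contain. T_u8, T_s16 and T_bit are the names mentioned above. T_s8 and T_u16 are assumed by analogy. The sketch selects types from the ranges in <limits.h> rather than from #ifdef COMPILER blocks, which is a substitution made so it compiles without naming a real compiler macro:

```c
#include <limits.h>

/* 8-bit types (assumes CHAR_BIT == 8, checked below). */
typedef unsigned char T_u8;
typedef signed char   T_s8;

/* 16-bit types: pick whichever basic type has the 16-bit range. */
#if SHRT_MAX == 32767
typedef signed short   T_s16;
typedef unsigned short T_u16;
#elif INT_MAX == 32767
typedef signed int     T_s16;
typedef unsigned int   T_u16;
#else
#error "no 16-bit type known for this compiler; extend types.h"
#endif

/* Boolean held in the native int for speed, as described above. */
typedef int T_bit;

/* Compile-time size checks: the array size is -1, and compilation
   fails, if a type does not have the expected width. */
typedef char T_u8_size_check[(sizeof(T_u8) * CHAR_BIT == 8) ? 1 : -1];
typedef char T_s16_size_check[(sizeof(T_s16) * CHAR_BIT == 16) ? 1 : -1];
```

A real project would more likely key the selection on the compiler's own predefined macros, as the post describes, and keep the size checks as a safety net.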

It's also extremely handy when testing your software on the PC - you
can compile it in both Visual Studio and your micro's compiler and
know the types and casting will happen correctly.

If you're really interested in other C gotchas, check out MISRA
online (often mis-pronounced misery - it's a pain to comply with most
of it).  If I understand correctly a lot of these stem from the book
"C Traps and Pitfalls"
http://www.amazon.com/dp/0201179288?tag=adamdaviscollect

-Adam

--
EARTH DAY 2008
Tuesday April 22
Save Money * Save Oil * Save Lives * Save the Planet
http://www.driveslowly.org
