PICList Thread
'[OT] Code packing'
2008\07\09@052250 by Tamas Rudnai

> Telling me that MCC18 is C is like saying a wolf is a dog.

You do not have to search for an excuse if you do not know all the C
implementations - but do not claim that everything you do not know is not C.

> Fair enough, if that's what happens in MCC18, as distinct from Standard
> C of course.

When you are talking about platform-independent development, sometimes you
have to deal with C implementations other than your favourites, even if they
are not ANSI. Gcc does not exist on all platforms.

> I don't go in to such details, because I don't have to go into such

I see - so if you do not go into the details, how do you know that what you
are saying is correct?

> Take the following bit pattern for an 8-Bit signed integer type:
>
> 11011010
>
> Tell me what value that is. You can't. Why? Because you don't know what
> number system is being used.

That's exactly what I am talking about! When you set the bits, you do not
have to care what the number is or how it is coded.

> The highest bit in an unsigned type corresponds to the "sign bit" in a
> signed type. If you set the signed bit, you've got a negative number

In 2's complement that's not a separate sign bit. But you can tell whether
the number is negative by looking at that bit - I get what you mean.

> (assuming of course that the bit pattern doesn't result in a "trap
> representation"). This negative number will be different on different
> systems depending on the number system in use.

Once again, which system are you talking about? Or which compiler? I was
developing commercial software in multiplatform environments for loads of
very different systems, but all of them used 2's complement, no matter
whether they were big-endian, little-endian or VAX-style middle-endian. The
reason is that 2's complement is very convenient, and all the other
representations used by early computers were dropped - as far as I remember
even the PDP-11 used 2's complement with RSX-11, though maybe some earlier
machines, like some IBMs, did not.

>   int a;
>  double b;
>
>   int *p = (int*)&b;   /* This won't work without a cast */

For god's sake, double is a floating-point number. It has a completely
different representation from any integer type. Of course you need
something, but it's not a cast. You need a converter function - and THAT
is a conversion, so here we go.

> An "auto-cast" tends to be called a conversion. There's no difference
> other than that sometimes you need an explicit cast because there's no
> implicit conversion.

Nope. There is always implicit or automatic casting. You need an explicit
cast when mr. compiler does not know how you would like it done. Like with
MCC18 (like it or not, you will see many MCC18 examples in your life if you
are going to deal with the 18F series). With MCC18 you often need the 'L'
suffix on literals, because it is much safer than relying on implicit
casting: it tells the compiler that the literal should be treated as a long
rather than a char. '1L' is the same as '(long)1' - both of them are casts,
as you probably know. You can enable integer promotion in the project
settings or with a pragma, but the project settings are not safe in my
opinion.

> No it's not OK, and this topic was discussed in detail on comp.lang.c a
> few months ago. The conclusion was that the following code is
> unpredictable:
>
>   unsigned char a = 7, b = 4;
>
>   a = ~b;

All right, tell me please where that is illegal. If a compiler gives a bad
result on this then that is a bad compiler - and you would probably use
unsigned int instead of char on that compiler, which is better than casting.
If you deal with multiplatform development you are not going to use the
native type definitions anyway, so using different types on different
platforms is quite possible.

>   short i = 7;
>   int j = i;
>
> There is never a problem when going from a smaller signed type to a

No, the problem is when you have unsigned j = i; and i is negative. And of
course when the type definition of short is shorter than int...

>   char unsigned a = 4, b = 7;
>   a &= ~b;
>
> then the expression on the right-hand side of the assignment is of type
> either "signed int" or "unsigned int" depending on whether INT_MAX is

Ok, 'b' is an unsigned char, therefore ~b is unsigned char. 'a' is also
unsigned, therefore no cast is needed. There could be some platform where
there is no way to read a single byte out of memory. I could imagine that
on those systems every operation on unsigned char is painful for the
compiler. Then I guess you should avoid using it. I am trying to remember
which Unix that was - I think it was a DEC system with an Alpha processor,
but there the compiler did all the tricks for you; you only had to remember
it when using a pointer, because reading an address not divisible by 4
caused a dump. Anyway, good old times :-)

> greater than UCHAR_MAX. On the vast majority of systems, it's "signed
> int". When you try to assign the "signed int" to an unsigned char, it
> needs to be converted.

So how do you convert a number bigger than you can represent into a smaller
storage? You cannot. All you can do is cast it - which means no conversion,
only picking out the part that can fit into the place it has. If you have a
16-bit int i and an 8-bit char c, then c = i means i will be auto-cast to c,
so the least significant 8 bits will be copied into c.

> An integer expression must undergo promotion before any arithmetic or
> bitwise operation can be performed on it.

Who told you that? It is not a must, it may be how it works - maybe the
compiler design is simpler, maybe the code runs faster on a given platform
that way, but it is not a must. On many platforms the size of int is chosen
so that it is supported natively. Not so in an 8-bit environment - like the
8-bit PICs. You can imagine what would happen if you had to promote every
time you did a simple operation like PORTA |= 1<<3.

Gcc produces the following code on an Intel Pentium (without optimization,
of course) from your example:

   movzbl    -6(%ebp), %eax
   movl    %eax, %edx
   notl    %edx
   movzbl    -5(%ebp), %eax
   andl    %edx, %eax
   movb    %al, -5(%ebp)

MOVZBL, so no signed/unsigned replacement is made... It stays unsigned, so
what happened is what you expected: the compiler uses 32-bit numbers and
then keeps only the least significant 8 bits - and most importantly the
cast (or integer promotion) is done BEFORE the NOT operation - exactly the
same as if you had cast explicitly. The cast is an absolutely unnecessary
step, and not what the programmer meant, but it behaves exactly the same so
it goes unnoticed - thank god! Especially on Intel, where you still have
8- and 16-bit access to the registers. This is a kind of bad compiler
behaviour, I would say. Anyway, if you change the 'unsigned' to 'signed'
then it uses MOVSBL instead, so the signed char will be promoted to a
signed int - no sign change.

Tamas

2008\07\09@053910 by Tamas Rudnai

> You'd be surprised how intuitive a compiler can be. The PIC C one is
> particularly good. It can definitely produce better assembler than I can.

Tomás, please do not start a religious war :-) It has been discussed here
many times - and I believe on most forums already. There was never a
satisfactory conclusion. C developers think that compiler optimization does
a better job than any ASM developer could achieve; ASM developers say
that's not true and not even close.

BTW my job is to reverse engineer applications on a daily basis. I have
seen so much disassembled code in my life, from many compilers (not only
C), that I am an ASM guy for good reason. But I agree that a compiler can
produce fair code, and during the development phase the developer can think
at a higher abstraction level, forgetting the underlying architecture
partly or sometimes even completely. I think neither ASM nor C nor any
other language is good or bad - it's only a tool that we use to create
something, and it's your freedom to choose the one that best fits your
knowledge, style or liking.

Tamas


On Wed, Jul 9, 2008 at 10:22 AM, Tamas Rudnai <spam_OUTtamas.rudnaiTakeThisOuTspamgmail.com>
wrote:

{Quote hidden}

--
Rudonix DoubleSaver
http://www.rudonix.com

2008\07\09@082602 by Tomás Ó hÉilidhe



Tamas Rudnai wrote:
>> Take the following bit pattern for an 8-Bit signed integer type:
>>
>> 11011010
>>
>> Tell me what value that is. You can't. Why? Because you don't know what
>> number system is being used.
>>    
>
> That's exactly what I am talking about! As you set the bits, you do not have
> to care what the number is and how is coded.
>  


But then what will you do with that number? Assign it to a port maybe?
Everything will be fine and dandy until you find a chip that doesn't use
two's complement.


> In 2's complement that's not a sign bit. But you can tell if the number is
> signed looking at that bit, I've got what you mean.
>  


The C Standard calls it the sign bit. (Not being snotty, just making the
point).


> Once again, in which system you are talking about? Or which compiler? I was
> developing commercial software on multiplatform environment for loads of
> very different systems, but all of them was using 2's complement no matter
> if it was big or little endian or the Vax style dec-endian. The reason is
> that 2's complement is very convenient, and all the other types were dropped
> that was used by early computers - As far as I remember even PDP-11 used 2's
> complement with the RSX-11 but maybe earlier ones like some IBM?
>  


I don't know the actual statistics on what proportion of machines don't
use 2's complement. I do know that the C Standard can be implemented on
non-2's complement machines though, and so when I'm writing
fully-portable code I take this into account.


{Quote hidden}

I haven't touched the value, I'm simply showing you an instance of where
a cast can do what a conversion cannot do. The following won't compile:

   int *p = &b;


>> An "auto-cast" tends to be called a conversion. There's no difference
>> other than that sometimes you need an explicit cast because there's no
>> implicit conversion.
>>    
>
> Nope. There is always implicit or auto casting. You need explicit when mr.
> compiler does not know how would you like it.


Sorry but is this not what I said?


>>   unsigned char a = 7, b = 4;
>>
>>   a = ~b;
>>    
>
> All right, tell me please were is that illegal.


It's not illegal, it's just that it will give a different result
depending on the number system. Presumably, if you're writing an
algorithm, you want it to perform the same way on all systems.


>  If a compiler gives a bad
> results on this then that is a bad compiler - and you probably would use
> unsigned int instead of char on that compiler - better than casting.


It's not the compiler's fault if the machine's number system isn't Two's.


>  If you
> deal with multiplatform development you are not going to use the native type
> definitions anyway, so use different types on different platforms is quite
> possible.
>  


Personally I use types such as uint_fast16_t to get optimal performance
on everything from an 8-Bit PIC to a 64-Bit super computer.


{Quote hidden}

You originally said that there was a problem in going from *signed* to
unsigned, and I remember reading over it twice to be sure. If you have
the following:

   unsigned j = -5;

then it's the same as writing:

   unsigned j = UINT_MAX - 5 + 1;

Similarly if you have:

   unsigned j = -27;

then it's the same as writing:

   unsigned j = UINT_MAX - 27 + 1;


>>   char unsigned a = 4, b = 7;
>>   a &= ~b;
>>
>> then the expression on the right-hand side of the assignment is of type
>> either "signed int" or "unsigned int" depending on whether INT_MAX is
>>    
>
> Ok, 'b' is an unsigned char, therefore the ~b is unsigned char.


No, the ~b is either "signed int" or "unsigned int", depending on
whether INT_MAX is greater than UCHAR_MAX.


>  'a' is also
> unsigned, therefore no casting needed. There could be some platform where
> there is no way to read a single byte out of the memory. I could imagine,
> that on those system every operation on unsigned char is painful for the
> compiler.


Are today's PCs not like that? I've heard they take an entire word from
memory and then do bitwise manipulation to give you a byte? I'm open to
correction!

A good indicator of whether this is true is to see if sizeof(char*) is
greater than sizeof(int*). If it is indeed greater, then it implies that
a char* needs more information than just the address, it needs the byte
index as well.


>  Then I guess you should avoid using that. Trying to remember which
> unix was that, I think it was a DEC system with Alpha processor, but then
> the compiler did all the trick for you, only had to remember when using a
> pointer - reading a non 4 dividable address caused a dump. Anyway, good old
> times :-)
>  


The C Standard mentions alignment also. The following, for example, has
undefined behaviour:

   char whatever[sizeof(int)];

   *(int*)whatever = 5;

If "whatever" had had suitable alignment then it would have been OK.


>> greater than UCHAR_MAX. On the vast majority of systems, it's "signed
>> int". When you try to assign the "signed int" to an unsigned char, it
>> needs to be converted.
>>    
>
> So how do you convert a bigger number than you can represent on a smaller
> storage? You cannot.


unsigned i = 78;

short unsigned j = i;   /* No problem, this will  be 78 */


>  All you can do is to cast it - which means no
> conversion, only picking up the part that can fit into the place it has.
> Like if you have 16 bit int i and 8 bit char c, then c=i means i will be
> atuo casted to c, so the least significant 8 bit will be copied into c.
>  


Provided both types are unsigned, this is correct. The C Standard talks
about this in terms of "modulo addition".

For instance, if you have the following on an 8-Bit system:

   char unsigned i = 7888u;

then it's the same as writing:

   char unsigned i = 7888u % 256;  /* where 256 is one more than the
maximum value of a byte */

Conveniently, this is the same as just giving you the lower 8 bits.


>> An integer expression must undergo promotion before any arithmetic or
>> bitwise operation can be performed on it.
>>    
>
> Who told you that?


The C Standard of 1989, and also of 1999.


>  It is not a must, it may how it works - maybe the
> compiler design is simpler, maybe the code works faster on a given platform
> like that, but it is not a must.


It must behave *as if* the promotion were applied. Under the bonnet
though it can do whatever it wants.


>  On many platform the size of int chosen to
> to be supported natively. Not like on 8 bit environment - like 8 bit PICs.
> You can imagine what would happen if every time you deal with a simple
> operation like PORTA |= 1<<3?
>  


In this case, the PIC C compiler is smart in that it knows it can do the
job without having to use 16-Bit numbers. I've actually tried this out.
And if only for the sake of habit, I'd still change that to:

   1u << 3


{Quote hidden}

Now try your code on a one's complement or sign-magnitude machine and
see what you get.

The CPU can do unsigned arithmetic so long as the program behaves as
though it were signed. In the case of your code snippet, there's no
difference in the behaviour even if unsigned types are used, so the
compiler went with the more efficient solution.

2008\07\09@113317 by Alan B. Pearce

>The C Standard calls it the sign bit.
>(Not being snotty, just making the point).

You seem to have made a rod for your own back, by arguing a non-existent
case. If I set up a binary number to set port direction bits, why would it
automatically be signed ?????????

2008\07\09@121833 by Tomás Ó hÉilidhe



Alan B. Pearce wrote:
>> The C Standard calls it the sign bit.
>> (Not being snotty, just making the point).
>>    
>
> You seem to have made a rod for your own back, by arguing a non-existent
> case. If I set up a binary number to set port direction bits, why would it
> automatically be signed ?????????


Let's say you have two ports that each have 8 pins, Port A and Port B.

PORTA is an unsigned char, and PORTB is an unsigned char.

You want to write a piece of code that reads from Port A, and then
produces the complement of Port A on Port B. So if Port A had all pins
high, then Port B should have all pins low.

The less-than-perfect solution is:

   PORTB = ~PORTA;

Now before I go any further, I'm going to introduce the following
hypothetical, but reasonable, machine:

* Byte = 8 bits
* Char = 1 byte
* Int = 2 bytes
* System for storing negative numbers = One's complement

Using these four pieces of information, we can calculate the following:
UCHAR_MAX  ==  255
INT_MAX  ==  32767

Because INT_MAX is greater than UCHAR_MAX, we know that any integral
type smaller than "int" will promote to a "signed int".

Let's say that the value of PORTA is 00001111 (0x0F in hex, 15 in
decimal), and therefore we want PORTB to become 11110000 (i.e. 0xF0 in
hex, 240 in decimal). OK so let's take a look at what happens with the
following statement:

   PORTB = ~PORTA;

1) PORTA is an unsigned char. Therefore, before we can take its
complement, it must be promoted to signed int.
2) After the promotion, we have a 16-Bit signed int whose value is 15 in
decimal.
3) We then take the complement of this 16-Bit signed int, yielding a bit
pattern of 0xFFF0.
4) On this one's complement machine, an int that has a bit pattern of
0xFFF0 has a decimal value of -15.
5) Now, with the assignment to PORTB, we have to convert the int value
-15 to an unsigned char. This conversion happens as follows:

   unsigned char i = UCHAR_MAX - 15 + 1

This yields a value of 241. The bit-pattern for 241 is  1111 0001.

We wanted a bit pattern of 1111 0000, but we got 1111 0001.

Therefore, if you want the code to work perfectly with One's complement,
Two's complement and Sign-magnitude, then you change it to:

   PORTB = ~(unsigned)PORTA;





2008\07\09@130346 by Alan B. Pearce

>1) PORTA is an unsigned char. Therefore, before we can take
>its complement, it must be promoted to signed int.

What a load of rubbish. There is no 'must be promoted' at all. There is no
reason to promote an unsigned char to signed int.

2008\07\09@163906 by Tomás Ó hÉilidhe



Alan B. Pearce wrote:
>> 1) PORTA is an unsigned char. Therefore, before we can take
>> its complement, it must be promoted to signed int.
>>    
>
> What a load of rubbish. There is no 'must be promoted' at all. There is no
> reason to promote an unsigned char to signed int.


I don't mind people getting things wrong, but it's a different kettle of
fish when they burst in with "What a load of rubbish".

There's a certain integer type called "int". It comes in a signed
flavour and an unsigned flavour. There exist integral types smaller than
int, e.g. "char" and "short", but these types are somewhat "second class
citizens". Before you can perform any sort of arithmetic or bitwise
operation on one of these smaller types, it *must*, repeat MUST, repeat
*MUST* be promoted.

2008\07\09@181426 by Rolf

Tomás Ó hÉilidhe wrote:
{Quote hidden}

Unless of course you are using MCC18, in which case it does not need to
be promoted... just run the compiler and check the disassembly. Should
be easy to prove.

Additionally, this is PICList, and portability of code to other
architectures is a low-priority requirement, where the integer promotion
aspects are likely to be the least of your concerns.

Basically, if *you* are a C guru, that's great, but, for pragmatic
reasons, no-one else really cares whether MCC18 is compliant with the
'normal' integer promotion standards; as was mentioned before, there is
a flag to make it compliant if you want inefficient code that's more
compatible.

Just RTFM! In this case the MCC18 User guide....

> 2.7.1 Integer Promotions
> ISO mandates that all arithmetic be performed at int precision or greater.
> By default, MPLAB C18 will perform arithmetic at the size of the largest
> operand, even if both operands are smaller than an int. The ISO mandated
> behavior can be instated via the -Oi command-line option.
> For example:
> unsigned char a, b;
> unsigned i;
> a = b = 0x80;
> i = a + b; /* ISO requires that i == 0x100, but in C18 i == 0 */
> Note that this divergence also applies to constant literals. The chosen
> type for constant literals is the first one from the appropriate group
> that can represent the value of the constant without overflow.


Rolf

2008\07\09@183603 by Tomás Ó hÉilidhe


Rolf wrote:
{Quote hidden}

I'm actually pleasantly surprised that the MCC18 compiler has this in
its manual. I mean the next best thing to a Standard-compliant compiler
is one that makes a finite list of where it diverges from the Standard.

As for all my talk about integer promotion, well naturally it only
applies to standardised C.

2008\07\09@193249 by sergio masci




On Wed, 9 Jul 2008, Rolf wrote:

{Quote hidden}

Actually this doesn't "prove" anything. It is possible for the compiler
to promote a "short int" to an "int" yet optimise the generated code
such that the redundant code is removed.

> Additionally, this is PICList, and portability of code to other
> architectures is a low-priority requirement

If that's the case, why on earth do I keep hearing "must use C because
it is portable" in connection with PICs?

Regards
Sergio Masci


2008\07\09@195239 by Rolf


sergio masci wrote:
{Quote hidden}

The 'high priority' requirement is that it is often easier to fulfil
requirements in C than in assembler (especially for people familiar with
C). Also, it is easier to port code from other places to C18 than to
rewrite it in assembler. Also, it is often simpler to write in C than in
assembler. And it is easier to port a C18 program to another
architecture than to port a PIC assembler routine.

Regardless, everyone who has used C in multiple environments knows that
portability is a relative thing. Compared to other languages, C is more
portable than most (i.e. requires fewer changes to work). Hell, it is a
major undertaking to 'port' an application written for Linux to Windows
(or vice versa) even though the actual physical machine may be identical...

Anyway, I presume you were aware of this and were just looking for a
rise ... ;-)

Rolf

2008\07\09@201520 by Tomás Ó hÉilidhe



Rolf wrote:
>
> Regardless, every person who's used C in multiple environment knows that
> portabolity is a relative thing.


If you're writing algorithms (as opposed to stuff like graphical user
interface code), then a lot of the time it's not a heavy task at all to
make them fully portable. In the code for my Connect4 game, I ended up
writing a module that uses as little memory as possible for storing the
states of the LEDs (I had 43 LEDs that each had three different states).
Obviously if I'd used an entire byte for every LED, that would have been
43 bytes. Instead I decided to use 2 bits per LED, which came out at 11
bytes. I made a module called "MemoryAccess" to do this, and it had
getter and setter functions such as "GetSegment" and "SetSegment".
Because I made use of macros such as CHAR_BIT in my module, I was able
to make it fully portable so that it will work on *any* compliant
implementation of the C Standard regardless of things like endianness,
size of byte, size of int, or character set.


>  Compared to other languages, C is more
> portable than most (i.e. requires fewer changes to work). Hell, it is a
> major undertaking to 'port' an application written for linux to windows
> (or visa-versa) even though the actual physical machine may be identical...
>  


If you're using the native operating system's API, then yes you
basically have to re-write the whole thing. The trend is toward
"cross-platform" programming a lot though nowadays, e.g. people are
using a cross-platform library for the application's GUI so that it will
compile for Linux, Unix, Mac and Windows.

2008\07\09@225602 by Rolf

Tomás Ó hÉilidhe wrote:
{Quote hidden}

I think that's what I said... portability is a relative thing... ;-)

Rolf

2008\07\10@091227 by sergio masci



On Wed, 9 Jul 2008, Rolf wrote:

{Quote hidden}

Maybe ;-)

Regards
Sergio
