I am using the CCS PCM C compiler. I looked at the assembly it generated
and did not understand something; maybe you can help me with it.
When I wrote :
i++;
It compiled it into :
movf i, W
incf i, F
And when I wrote:
i--;
It compiled it into :
movf i, W
decf i, F
The movf, incf and decf instructions all modify only the Z flag, and the
incf/decf that follows overwrites whatever Z the movf produced. So the movf
adds no extra information to the status register for a later branch test;
its only effect is that W now holds the previous value of i, which is not
used at all.
Any idea why the MOVF is in front of the incf/decf?
> I am using CCS PCM C compiler. I looked at the assembly that it generated
> and did not understand something, maybe you can help me with it.
>
> When I wrote :
>
> i++;
>
> It compiled it into :
>
> movf i, W
> incf i, F
>
>
> And when I wrote:
>
> i--;
>
> It compiled it into :
>
> movf i, W
> decf i, F
Just out of curiosity, what assembler code is produced for:
++i;
and:
--i;
The post-increment/decrement will save the current value of the variable
*before* incrementing/decrementing it. Some compilers will generate the
same code for post increment/decrement as for pre-increment/decrement if
they see that you are not using the former value (as in your example); others
may NOT do this.
>
> Dear PIC experts,
>
> I am using CCS PCM C compiler. I looked at the assembly that it generated
> and did not understand something, maybe you can help me with it.
>
> When I wrote :
>
> i++;
>
> It compiled it into :
>
> movf i, W
> incf i, F
>
> And when I wrote:
>
> i--;
>
> It compiled it into :
>
> movf i, W
> decf i, F
>
> All instructions movf/decf/incf modify only the Z flag, so it can be used
> the same way later as a test for branching, there is no extra information
> on the status register supplied by the movf instructions, only W register
> now holds the previous data of i, which was not used at all.
>
> Any idea why is the MOVF in front of incf/decf ?
>
> Thanks Chaipi
Perhaps because you're doing a "post-increment". The compiler preserves
the old value of the variable in case you write:
a = i++;
in which case the old value of "i" is stored in "a".
Try it with "++i", i.e. "pre-increment", and see what happens.
Does it generate a different sequence? It should!
> I am using CCS PCM C compiler. I looked at the assembly that it generated
> and did not understand something, maybe you can help me with it.
>
> When I wrote :
>
> i++;
>
> It compiled it into :
>
> movf i, W
> incf i, F
The semantics of post increment (decrement) are that the variable is
incremented (decremented), and the original value is the value of the
post increment (decrement) expression. So this code would be necessary
in the general case, where the old value of i would be needed, perhaps
for use as an index as in
array[i++] = 0;
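To make the difference concrete, here is a small C sketch that spells out
the two operators step by step. The function names are mine, purely for
illustration, and the pairing with the movf/incf instructions in the
comments is only my reading of the listing quoted above, not a statement
about how CCS works internally.

```c
/* Hypothetical helpers: each one spells out what the operator means. */

/* i++ : the value of the expression is the OLD value of the variable. */
int post_increment(int *i)
{
    int old = *i;   /* the copy (compare "movf i, W" in the listing)   */
    *i = *i + 1;    /* the update (compare "incf i, F")                */
    return old;     /* value of the whole expression                   */
}

/* ++i : the value of the expression is the variable itself, afterwards. */
int pre_increment(int *i)
{
    *i = *i + 1;    /* the update is all that is needed                */
    return *i;      /* no separate copy of the old value is required   */
}

/* The indexing example: the slot at the OLD index is cleared,
   and the advanced index is returned. */
int clear_and_advance(int *array, int i)
{
    array[i++] = 0;
    return i;
}
```

Starting from i = 5, post_increment yields 5 and pre_increment yields 6,
while i ends up as 6 in both cases.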
Now, any optimizing compiler, and even many merely opportunistic ones,
might notice that the value of the expression "i++" isn't used in the
case you mentioned, and would thus treat it less generally. But a less
evolved C compiler - and from what I've seen, C compilers for embedded
processors are usually rather primitive - might very well choose to leave
it up to the programmer to write "++i" instead when the more efficient
code generated for that form will suffice. This sort of thing, where you
had to learn which source constructs produced tighter assembler output,
used to be fairly common, but the PC and workstation (etc.) compilers
these days are more sophisticated. Why, it's probably been a decade since
I thought about using a "register" storage class explicitly! :-)
> > When I wrote :
> >
> > i++;
> >
> > It compiled it into :
> >
> > movf i, W
> > incf i, F
>
> Now, any optimizing compiler and even many merely opportunistic ones,
> might notice that the value of the expression "i++" isn't used in the
> case you mentioned, and would thus treat it less generally. But a less
> evolved C compiler - and from what I've seen, C compilers for embedded
> processors are usually rather primitive - might very well choose to leave
> it up to the programmer to write instead "++i" when the more efficient
> code generated for that form will suffice.
You mean:
incf i,F
movf i,W
or somesuch? I don't know if the CCS compiler would generate that but it
would seem the logical counterpart to i++; the movf is superfluous in this
case, of course, but how is the compiler to know that if it can't figure it
out in the former case?
> > it up to the programmer to write instead "++i" when the more efficient
> > code generated for that form will suffice.
>
> You mean:
>
> incf i,F
> movf i,W
>
> or somesuch? I don't know if the CCS compiler would generate that but it
> would seem the logical counterpart to i++; the movf is superfluous in this
> case, of course, but how is the compiler to know that if it can't figure it
> out in the former case?
No, I wouldn't expect it to make a copy at all in this case. The value
of the expression IS the value in i (after the preincrement), so no copy
is needed, ever. For the postincrement, you need a copy of the original
value to use as the value of the whole expression after the increment has
been performed. So I would be happy to call any PIC C compiler that
performed such a copy for ++i a completely brain-dead compiler, whereas
one that does so only for i++ is merely not very sophisticated.
++i should generate just "incf i, F".
Similar situations may well arise for things like using the
assignment-ops in place of an assignment and the equivalent expression -
this too is traditional in primitive C compilers, so that
i += j;
may very well produce better code than
i = i + j;
I would guess the PIC code for these might be
movf j, W
addwf i, F
versus
movf i, W
addwf j, W
movwf i
Same reasoning: without some understanding of the whole expression the
compiler _can't_ generate the more efficient code for "i = i + j", and even
this very simple degree of understanding is not so simple to teach to a
compiler. A modern optimizing compiler would be thought deficient if it
didn't generate the same (efficient) code for both of these, but it's not
been so long since such compilers were pretty rare, and mostly restricted
to mainframes.
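For anyone who wants to check that the two guessed sequences really do the
same job, here is a toy C model of them. It is only a sketch: W is a
global byte, the helper names are made up, status flags are ignored, and
the instruction counts are simply the lengths of the two sequences guessed
above, not anything measured from a real compiler.

```c
#include <stdint.h>

/* Toy model: one working register W; file registers are passed by
   pointer. Only data movement is modeled, not the status flags. */
static uint8_t W;

static void movf_w(const uint8_t *f)  { W = *f; }                 /* movf  f, W */
static void addwf_f(uint8_t *f)       { *f = (uint8_t)(*f + W); } /* addwf f, F */
static void addwf_w(const uint8_t *f) { W = (uint8_t)(*f + W); }  /* addwf f, W */
static void movwf(uint8_t *f)         { *f = W; }                 /* movwf f    */

/* i += j : returns the number of instructions used (two). */
int add_assign(uint8_t *i, const uint8_t *j)
{
    movf_w(j);
    addwf_f(i);
    return 2;
}

/* i = i + j : returns the number of instructions used (three). */
int add_then_store(uint8_t *i, const uint8_t *j)
{
    movf_w(i);
    addwf_w(j);
    movwf(i);
    return 3;
}
```

Both routines leave the same 8-bit sum in i; the second just takes the
long way round through W.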
> On Fri, 17 May 1996, John Payson wrote:
[suggested ++i would generate < incf i,F / movf i,W > on a PIC compiler]
> No, I wouldn't expect it to make a copy at all in this case. The value
> of the expression IS the value in i (after the preincrement), so no copy
> is needed, ever. For the postincrement, you need a copy of the original
> value to use as the value of the whole expression after the increment has
> been performed. So I would be happy to call any PIC C compiler that
> performed such a copy for ++i a completely brain-dead compiler, whereas
> one that does so only for i++ is merely not very sophisticated.
The normal mode of operation for a simple compiler on an accumulator-based
machine is to have the code for evaluating any subexpression leave the value
of that subexpression in the accumulator.
Thus, if I say (a= ++b - 6) the compiler should respond by generating code
to evaluate ++b, leaving the result in the accumulator, followed by code to
subtract six from the accumulator, and code to store the result into a.
In the absence of any optimization, the code would be:
Clearly, the handling of the "=" and subtract operators should be
improved for the case where they're using constants; in a tree-based parser
this is not difficult, however, since the parser knows the attributes of
each operation's operands when it evaluates that operation.
Much better. Note that the second instruction of "++b" above is very
much necessary, since the value of b has to be in the accumulator when
the "-6" is evaluated. While the "movf" would not be needed if the value
of ++b were not used for anything, the compiler can't know that unless it
down-propagates information about operators to the routines that are
processing the operands. Such down-propagation can be extremely helpful
in more significant contexts than the inc/dec operators. For example,
consider the code:
{
signed char a,b,c;
int i;
...
a=b+c;
...
i=b+c;
...
if (a > b+c) /* Whatever */
}
For the first statement (a=b+c), the obvious best code is:
; b
movf b,w
; +c
addwf c,w
; a=
movwf a
but this code cannot be generated unless the compiler knows during its
evaluation of the "b+c" subexpression that the result is going to be
coerced into type char. In the second case (i=b+c), the generated "b+c"
code really does have to produce a full-width integer result; if the
compiler can optimize for (signed byte + signed byte), it might generate
something like this:
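movf b,w ; w = b
addwf c,w ; w = low byte of b+c, C = carry out
clrf res_high ; clrf leaves C alone
rlf res_high,f ; res_high = carry
btfsc b,7 ; if b is negative...
decf res_high,f ; ...take 1 off the high byte
btfsc c,7 ; same for c
decf res_high,f ; b+c is now in res_high:w
movwf i_low
movf res_high,w
movwf i_high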
If the carry handling is merged into the assignment, the last two
instructions are unnecessary, but otherwise the above code would be
reasonably appropriate for i=b+c. If the compiler can't tell, however, when
it's generating the code for b+c, whether the result is going to be taken as
an integer or a char, it would have to generate about ten instructions
rather than three.
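The sign-extension trick in that listing is easy to get wrong, so here is
a C sketch that follows it step by step using only 8-bit quantities and
then compares the result with an ordinary promoted addition. The function
names are mine, and this models the arithmetic only, not the actual PIC
instruction timing or flags.

```c
#include <stdint.h>

/* Hypothetical model of the listing: computes b + c as a 16-bit signed
   value from 8-bit pieces (low byte, carry, and the two sign bits). */
int16_t add_schar_to_int16(int8_t b, int8_t c)
{
    uint8_t ub = (uint8_t)b;
    uint8_t uc = (uint8_t)c;
    unsigned sum = (unsigned)ub + uc;

    uint8_t w = (uint8_t)sum;             /* movf b,w / addwf c,w        */
    uint8_t carry = (uint8_t)(sum >> 8);  /* the C flag after the add    */
    uint8_t res_high = 0;                 /* clrf res_high               */

    res_high = (uint8_t)((res_high << 1) | carry);  /* rlf res_high      */
    if (ub & 0x80) res_high--;            /* btfsc b,7 / decf res_high   */
    if (uc & 0x80) res_high--;            /* btfsc c,7 / decf res_high   */

    uint16_t bits = (uint16_t)(((unsigned)res_high << 8) | w);

    /* Reinterpret res_high:w as signed without relying on an
       implementation-defined conversion. */
    return (bits < 0x8000u) ? (int16_t)bits
                            : (int16_t)((int)bits - 65536);
}

/* Checks every pair of signed bytes against plain promoted addition.
   Returns 1 if all 65536 cases agree, 0 otherwise. */
int matches_promoted_sum_everywhere(void)
{
    for (int b = -128; b <= 127; b++)
        for (int c = -128; c <= 127; c++)
            if (add_schar_to_int16((int8_t)b, (int8_t)c) != b + c)
                return 0;
    return 1;
}
```

The high byte works out as carry minus one for each negative operand,
which is exactly what the rlf and the two conditional decf instructions
compute.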
The last case I gave above (checking a>b+c) is the bane of many compilers.
Leaving all results as type char will violate the ANSI spec for C, since
(-50) > (-100) + (-100) but (-50)<(signed char)(-100 + -100). On the other
hand, casting to integer will be very expensive especially since comparisons
happen often inside loops.
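The numbers above can be checked with a few lines of C. This is only a
sketch with made-up function names. Note that converting an out-of-range
value such as -200 to signed char is implementation-defined; the second
function assumes the usual two's complement wraparound (so -200 becomes
56), which is what common compilers do.

```c
/* Comparison as ANSI C defines it: b + c is evaluated as an int. */
int greater_promoted(signed char a, signed char b, signed char c)
{
    return a > b + c;
}

/* Comparison with the sum squeezed back into a signed char first.
   ASSUMPTION: out-of-range values wrap modulo 256 (implementation-
   defined behaviour, but the common one). */
int greater_narrowed(signed char a, signed char b, signed char c)
{
    return a > (signed char)(b + c);
}
```

With a = -50 and b = c = -100, the promoted comparison is true (-50 is
greater than -200) while the narrowed one is false on such a compiler
(-50 is not greater than 56). When b+c stays within the range of a char,
the two agree.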
In a case like this where tight generated code is essential, the code
writer can sometimes help the compiler by coding:
/* if (a > (signed char)(b+c)) */
if the author knows that b+c will not overflow the size of a char.
Otherwise, the compiler must either sign-extend b+c (painful) or else have
special code to deal with this construct (also painful).
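As a rough illustration of the down-propagation idea from earlier in this
message, here is a C sketch of a code-generator routine for "++var" that
is told by its caller whether the value of the expression is going to be
used. Everything here (the function name, the value_needed flag, the text
buffer interface) is hypothetical; the two instruction sequences are the
ones already discussed in this thread.

```c
#include <stdio.h>

/* Hypothetical code-generator routine for "++var".
   value_needed is passed DOWN by whoever is processing the enclosing
   operator. Writes assembler text into out and returns the number of
   instructions emitted. */
int gen_preinc(char *out, size_t outsize, const char *var, int value_needed)
{
    if (value_needed) {
        /* The enclosing expression wants the result in W. */
        snprintf(out, outsize, "incf %s,F\nmovf %s,W\n", var, var);
        return 2;
    }
    /* Expression statement: nobody looks at the value, so skip the movf. */
    snprintf(out, outsize, "incf %s,F\n", var);
    return 1;
}
```

For (a= ++b - 6) the caller would pass value_needed = 1 and get both
instructions; for a bare "++b;" statement it would pass 0 and get only the
incf. A compiler without the flag has to assume the first case every time.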
> from what I've seen, C compilers for embedded
> processors are usually rather primitive
You must have only been looking at low end stuff then. There are plenty
of compilers for embedded processors that are every bit as sophisticated
in their code generation as any hosted compiler. But as always, you get
what you pay for. What does CCS cost?
> The normal mode of operation for a simple compiler on an accumulator-based
> machine is to have the code for evaluating any subexpression leave the value
> of that subexpression in the accumulator.
Yes, so?
No, seriously - I don't mean to be snide, but this seems to be the crux of
the matter. Yes, if you assume the code generator is stuck in the rut of
"everything interesting happens in The Accumulator, so I can just assume
results show up there just as if this were an 8080 I'm generating code for"
then what you say is true. I think we disagree about whether such
compilers are worth discussing.
Which isn't to say that I found your discussion about generating code for
the PIC uninteresting; in fact, I'll be saving it in case I should be so
foolish as to pursue this any further. :-)