PICList Thread
'[Tech] What to do about compiler bug and source co'
2008\07\22@135114 by William Couture

OK, I've run into a situation at work, and would like some feedback from the
community.

I've found a nasty bug in a C compiler (IAR for Atmel ATMega,
versions 4.21A and 5.11B). Using the ATMega644 with optimization cranked
up to max, 32-bit locals that are put in registers are not properly
handled (only the lower 2 bytes out of 4 are set, or in one case the
registers are not set at all!).

I'm going to contact IAR shortly, but since I can't reproduce it in a
sample fragment, I'm sure that IAR will want the entire source.  But,
since it has our proprietary code, work does not want the source to be
sent out.

How would you handle this situation?

Thanks,
  Bill

--
Psst... Hey, you... Buddy... Want a kitten? straycatblues.petfinder.org

2008\07\22@143706 by William "Chops" Westfield


On Jul 22, 2008, at 10:45 AM, William Couture wrote:

>
> I'm going to contact IAR shortly, but since I can't reproduce it in  
> a sample fragment, I'm sure that IAR will want the entire source.  
> But, since it has our proprietary code, work does not want the  
> source to be sent out.
>
> How would you handle this situation?

1) Report the bug without source code.  Perhaps it is a known issue.  
Perhaps there is already a fix available.

2) Meanwhile, begin editing your example to make it as small and
self-contained as possible.  Manually include headers.  Delete irrelevant  
code.  Etc.  If you can't create a simple non-proprietary example,  
you should be able to at least create a pretty useless fragment  
starting from the proprietary function.

3) In the end, consider that a well-established high-end compiler  
company can probably be trusted with moderate amounts of proprietary  
code.  Make sure that they understand that that's what they're  
getting ("NOT for publication as an example of the bug!") and proceed  
anyway unless the code fragment (from 2) in question REALLY contains  
key confidential intellectual property...
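
A minimal sketch of what such a stripped-down fragment might look like,
assuming the bug involves a 32-bit local computed from 16-bit inputs. The
names scale, in_raw, in_gain and check_scale are hypothetical stand-ins,
not taken from the original project, and there is no guarantee that this
particular shape triggers the IAR bug. The point is only that the fragment
checks itself, so a wrong upper word shows up as a failed comparison:

```c
#include <stdint.h>

/* Hypothetical stand-in for the proprietary routine: it only needs to
   keep a 32-bit local live across some arithmetic. */
static uint32_t scale(uint16_t raw, uint16_t gain)
{
    uint32_t acc = (uint32_t)raw * gain;   /* needs all four bytes */
    acc += 0x00010000UL;                   /* touches the upper word */
    return acc;
}

/* volatile inputs stop the optimiser from folding the test away */
static volatile uint16_t in_raw  = 0x1234;
static volatile uint16_t in_gain = 0x0100;

/* Returns 1 when all four bytes of the result are correct. */
int check_scale(void)
{
    uint32_t got = scale(in_raw, in_gain);
    return got == 0x00133400UL;
}
```

If check_scale() returns 0 only at the highest optimisation level, the
fragment is small enough to send to a vendor without exposing anything
proprietary.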

BillW

2008\07\22@144502 by sergio masci



On Tue, 22 Jul 2008, William Couture wrote:


Strip out as much source code as possible such that the compiler still
produces the bug.

Start off with a copy of the project. Systematically delete the contents
of functions which are unrelated to the function where you are
experiencing the problem (leaving just the empty function). Keep doing
this until the compiler starts behaving, then undo until the compiler
starts misbehaving again. Make a note of which functions are vital to
the bug and then try again with functions you haven't touched yet.

You will end up with some functions and globals which are not referenced
anywhere. Try removing them completely, a few at a time, until the compiler
starts behaving itself again, then undo the changes until the compiler
starts misbehaving again. You should be able to drastically reduce the
code necessary to reliably reproduce the bug.
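
One way to make the delete-and-undo cycle cheap is to guard each function
body with a preprocessor switch instead of actually deleting it. This is
only a sketch of that idea: the function names (logger, filter, suspect)
and the KEEP_* switches are hypothetical, and the bodies are placeholders
for real project code.

```c
#include <stdint.h>

/* Flip these to 0 one at a time, rebuild, and re-check the output. */
#define KEEP_FILTER_BODY 1
#define KEEP_LOGGER_BODY 0   /* already emptied: bug still showed up */

#if KEEP_LOGGER_BODY
static uint16_t log_count;
#endif

void logger(uint16_t code)
{
#if KEEP_LOGGER_BODY
    log_count += code;
#else
    (void)code;              /* empty shell keeps the callers compiling */
#endif
}

uint16_t filter(uint16_t sample)
{
#if KEEP_FILTER_BODY
    return (uint16_t)((sample * 3u) / 4u);
#else
    return sample;
#endif
}

/* The function where the problem is seen stays untouched. */
uint32_t suspect(uint16_t sample)
{
    uint32_t total = (uint32_t)filter(sample) << 8;
    logger(1);
    return total;
}
```

Undoing a step is then a one-character change, and the final set of
switches documents exactly which functions are vital to the bug.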

Regards
Sergio Masci

2008\07\22@181738 by Tomás Ó hÉilidhe



William Couture wrote:
> OK, I've run into a situation at work, and would like some feedback from the
> community.
>
> I've found a nasty bug in a C compiler (IAR for Atmel (ATMega),
> version 4.21A and 5.11B)
> (using the ATMega644 and cranking optimization up to max, 32-bit locals that are
> put in registers are not properly handled (only sets lower 2 bytes out
> of 4, or in one case
> does not set registers at all!)).
>  


I had a program before that worked perfectly on several different
platforms (Windows, Mac, Linux) with several different compilers until I
compiled it with the highest optimiser turned on for a particular
compiler ( -O3 with gcc ). Since the code worked perfectly on so many
different systems up until that point, my initial reaction was that
there must be a bug in the compiler. Anyway, the offending code turned
out to be:

   void StrToLower(char *p)
   {
       while (   *p++ = tolower( (char unsigned)*p )   );
   }

The problem here is that the behaviour is "unspecified" when it comes to
whether "*p++" or "*p" is evaluated first. For some reason, when I upped
the optimisation, "*p" became the expression that was evaluated first.

I'm not saying there's definitely not a bug in the compiler, but I'd
definitely scrutinise the code for sequence point violations and also
for unspecified behaviour. If you forced me to bet my house on what was
the problem, I'd reluctantly have to go with the code.
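
For comparison, here is a sketch of the same function written so that the
read of *p and the increment of p happen in separate expressions.
Whichever label one attaches to the original line (unspecified order, or
undefined because p is modified and read with no sequence point between),
separating the two removes the question. The function name is kept from
the post; the loop shape is an assumption about what was intended.

```c
#include <ctype.h>

/* Lower-cases a NUL-terminated string in place. */
void StrToLower(char *p)
{
    for (; *p != '\0'; ++p)                     /* p changes only here */
        *p = (char)tolower((unsigned char)*p);  /* p is only read here */
}
```

The cast to unsigned char is kept because tolower() expects a value in
the range of unsigned char (or EOF).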


> I'm going to contact IAR shortly, but since I can't reproduce it in a
> sample fragment, I'm
> sure that IAR will want the entire source.  But, since it has our
> proprietary code, work
> does not want the source to be sent out.
>
> How would you handle this situation?


I'd be interested in glancing over the code if you wouldn't mind sending
me a snippet. I'd pay particular attention to lines of code that mention
the same variable name more than once.

2008\07\23@000332 by Richard Prosser

Bill

Can you just remove the parts of the code that are sensitive and still
replicate the problem? Maybe fill out the gaps with "rubbish" or a big
array/lookup table or something.

RP


2008\07\23@010019 by Vitaliy

William Couture wrote:
> I'm going to contact IAR shortly, but since I can't reproduce it in a
> sample fragment, I'm
> sure that IAR will want the entire source.  But, since it has our
> proprietary code, work
> does not want the source to be sent out.
>
> How would you handle this situation?

How about using GoToMeeting.com or similar, and let IAR compile/test the
code on your machine? This way, they can see exactly what's happening, yet
you don't have to send them a single line of code.


2008\07\23@131017 by Martin

Tomás Ó hÉilidhe wrote:

>
> I had a program before that worked perfectly on several different
> platforms (Windows, Mac, Linux) with several different compilers until I
> compiled it with the highest optimiser turned on for a particular
> compiler ( -O3 with gcc ). Since the code worked perfectly on so many
> different systems up until that point, my initial reaction was that
> there must be a bug in the compiler. Anyway, the offending code turned
> out to be:
>
>     void StrToLower(char *p)
>     {
>         while (   *p++ = tolower( (char unsigned)*p )   );
>     }
>
> The problem here is that the behaviour is "unspecified" when it comes to
> whether "*p++" or "*p" is evaluated first. For some reason, when I upped
> the optimisation, "*p" became the expression that was evaluated first.
>

Postfix increment (n++) occurs after the value is used.
i.e:

b=0;
a = b++ + 4;

results in:
b==1
a==4


So I don't understand how it wouldn't be a bug in the compiler. I have
code that does something similar:

while(*p)
{
       foo(*p++);
}

Starts at p[0] - if it were an undefined OOP there should be a major
problem.
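
Both snippets above are well defined and can be checked directly. In each
of them the variable being incremented appears only once in the
expression, which is what separates them from the StrToLower line. A
minimal sketch follows; foo's body, postfix_sum and walk are hypothetical
names and helpers added for illustration.

```c
#include <stddef.h>

static char   seen[16];
static size_t n_seen;

/* Hypothetical stand-in for foo(): records each character it is handed. */
static void foo(char c)
{
    if (n_seen < sizeof seen - 1)
        seen[n_seen++] = c;
    seen[n_seen] = '\0';
}

/* b++ yields the old value of b; b itself is 1 afterwards. */
int postfix_sum(int *b_out)
{
    int b = 0;
    int a = b++ + 4;
    *b_out = b;
    return a;
}

/* p appears once per expression, so there is nothing to reorder. */
const char *walk(const char *p)
{
    n_seen = 0;
    seen[0] = '\0';
    while (*p)
    {
        foo(*p++);
    }
    return seen;
}
```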
-
Martin

2008\07\23@131915 by Herbert Graf

On Tue, 2008-07-22 at 13:45 -0400, William Couture wrote:
> OK, I've run into a situation at work, and would like some feedback from the
> community.
>
> I've found a nasty bug in a C compiler (IAR for Atmel (ATMega),
> version 4.21A and 5.11B)
> (using the ATMega644 and cranking optimization up to max, 32-bit locals that are
> put in registers are not properly handled (only sets lower 2 bytes out
> of 4, or in one case
> does not set registers at all!)).
>
> I'm going to contact IAR shortly, but since I can't reproduce it in a
> sample fragment, I'm
> sure that IAR will want the entire source.  But, since it has our
> proprietary code, work
> does not want the source to be sent out.
>
> How would you handle this situation?

I deal with this sort of thing all the time at work. You have a few
options:

1. Create a test case that doesn't use your proprietary code, but
replicates the problem. You mention you haven't been able to reproduce
it, so it looks like this isn't an option.

2. Get an NDA signed so that your code can go to them. This
unfortunately isn't always acceptable to both sides.

3. Have them give you a "special" build of their compiler with debugging
enabled, and then send them the debug info. Oftentimes this is enough to
figure out what's going on. THEY might not want to give you the debug
enabled version of the compiler.

4. Have them send an FAE to your site (or alternatively have them
remotely log into one of your machines) to debug the issue on your
machine.

5. Find a workaround, note the problem exists and ignore the issue. This
last one REALLY sucks, but sometimes legalities make it the only option.

TTYL

2008\07\23@141909 by Tomás Ó hÉilidhe

Martin wrote:
> Postfix increment (n++) occurs after the value is used.
> i.e:
>
> b=0;
> a = b++ + 4;
>
> results in:
> b==1
> a==4
>  


Indeed.




Your code is fine. I'll explain the problem I had:

Let's say we have a function that returns a pointer, something like:

   int *Func1(void)
   {
       static int i;

       return &i;
   }

And let's say we invoke this function as follows:

   *Func1() = 55;

Now, instead of setting its value to 55, let's set it to the return
value of a function called "Func2":

   *Func1() = Func2();

OK now, looking at the above statement, I ask you: Which function gets
called first, Func1 or Func2? The C Standard says that this behaviour is
"unspecified", which means that the compiler can evaluate either one of
them first, and it doesn't even have to document in its manual which one
of them gets evaluated first.
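
The call order can be made visible by having each function leave a mark
in a log. This is only a sketch: the logging helpers (note, last_log) and
the two wrapper functions are hypothetical additions, and Func2's return
value of 55 is chosen arbitrarily.

```c
static char call_log[8];
static int  call_n;

/* Hypothetical helper: appends one character to the call log. */
static void note(char c)
{
    if (call_n < 7)
        call_log[call_n++] = c;
}

int *Func1(void)
{
    static int i;
    note('1');
    return &i;
}

int Func2(void)
{
    note('2');
    return 55;
}

/* Order left to the compiler: the log may read "12" or "21". */
void store_unordered(void)
{
    call_n = 0;
    *Func1() = Func2();
}

/* Order pinned down by a sequence point: the log always reads "21". */
void store_ordered(void)
{
    int value;
    call_n = 0;
    value = Func2();      /* the full expression ends at the ';' */
    *Func1() = value;
}

const char *last_log(void)
{
    call_log[call_n] = '\0';
    return call_log;
}
```

Running store_unordered() at different optimisation levels, or on
different compilers, is a quick way to see whether a given toolchain
changes its mind.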

So now if we look at the original code that was something like:

   *p++ = tolower(*p);

If the right-hand side gets evaluated first, then it will behave as
though it were:

   *p = tolower(*p);
    p = p + 1;

However, if the left-hand side gets evaluated first and its increment
takes effect before the right-hand side is read, then it will behave
as though it were:

   char *q = p;
   p = p + 1;
   *q = tolower(*p);

The best way to avoid these problems is to shy away from reading and
writing the same variable's value in one line of code.

Note that this problem is very similar to, but distinct from, the
undefined behaviour that results from reading and writing a variable's
value without a sequence point in between. An example of a sequence
point violation would be:   j = i++  *   i++;
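
A sketch of how such an expression can be rewritten so that every
modification of i is separated by a sequence point. It assumes the intent
was "multiply the current value by the next one and advance i twice",
which is only one possible reading; the function name is hypothetical.

```c
/* Well-defined replacement for  j = i++ * i++;  under the stated
   assumption about what was meant. */
int product_of_next_two(int *i)
{
    int first  = (*i)++;   /* sequence point at the end of this statement */
    int second = (*i)++;   /* so this reads the already-updated value */
    return first * second;
}
```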

2008\07\23@161725 by William "Chops" Westfield


On Jul 23, 2008, at 10:09 AM, Martin wrote:

>>         while (   *p++ = tolower( (char unsigned)*p )   );
>>
>> The problem here is that the behaviour is "unspecified" when it comes to
>> whether "*p++" or "*p" is evaluated first. For some reason, when I upped
>> the optimisation, "*p" became the expression that was evaluated first.
>>
> Postfix increment (n++) occurs after the value is used.

That's not the problem.  The problem is whether the left side or  
right side of the assignment is evaluated first.  (actually, I think  
it would be a problem even if it was a comparison rather than an  
assignment.)

It tends to annoy me that compiler standards writers can leave  
something relatively important like this out of their specs, while  
getting all pissy about some obscure corner.

It REALLY annoys me when compiler writers use the lack of  
specification to arbitrarily change behavior from one version to the  
next.  (I didn't see that with this particular issue, but we got  
bitten pretty hard on a change in pre-processor operation ordering  
between gcc2.95 and gcc3.4...)  Grr.

BillW

2008\07\23@165159 by Tomás Ó hÉilidhe


William "Chops" Westfield wrote:
> That's not the problem.  The problem is whether the left side or  
> right side of the assignment is evaluated first.  (actually, I think  
> it would be a problem even if it was a comparison rather than an  
> assignment.)
>  


Yes you're right, you'll get this problem in places where the following
two criteria are satisfied:
   * There is no sequence point between the two things that need to be
     evaluated
   * The order of evaluation is "unspecified"


> It tends to annoy me that compiler standards writers can leave  
> something relatively important like this out of their specs, while  
> getting all pissy about some obscure corner.
>  


I believe it was deliberately left out to aid in code optimisation. For
instance, if you had a line of code such as the following:

   some_variable = (Func1() + Func2()) * Func3() / Func4();

then the compiler can call the functions in whatever order it pleases.
Supposedly this gives the compiler the freedom to produce the most
efficient machine code it can. If you wanted a definite order, you'd
need to introduce a "sequence point", maybe as follows:

   some_variable = Func1();
   some_variable += Func2();
   some_variable *= Func3();
   some_variable /= Func4();

For those who don't know and are wondering, a sequence point is sort of
like a "fullstop". When you reach a sequence point, everything before it
must finish being evaluated before anything after it can be evaluated.
The Standard gives a finite list of when and where a sequence point
occurs; the most common one you'll be familiar with is the semi-colon at
the end of a statement.
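
Besides the end of a full expression, the list includes the && and ||
operators, the comma operator, and the ?: operator (after its first
operand). A short sketch of three of these; all function names here are
hypothetical.

```c
#include <stddef.h>

/* '&&': the left operand is completely evaluated before the right one,
   so the NULL check always protects the dereference. */
int first_is(const char *s, char c)
{
    return s != NULL && *s == c;
}

/* ',' operator: i is incremented, then read. */
int bump_then_read(int i)
{
    return (i++, i);
}

/* '?:': the condition, including its side effect, is finished before
   the selected branch is evaluated. */
int pick(int *n)
{
    return (*n)++ > 0 ? *n : -*n;
}
```

Note that the comma separating function arguments is not the comma
operator and does not introduce a sequence point.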


> It REALLY annoys me when compiler writers use the lack of  
> specification to arbitrarily change behavior from one version to the  
> next.  (I didn't see that with this particular issue, but we got  
> bitten pretty hard on a change in pre-processor operation ordering  
> between gcc2.95 and gcc3.4...)  Grr.


I suppose the only thing you can say about it is that people should
learn this stuff from the very start. The book I used for learning C was
"C++ for Dummies" (I actually started out learning C++ before I delved
into C), and very early on in the book it explained about how the order
of evaluation can be arbitrarily chosen by the compiler.

Actually this makes me think of something... it would be great to have a
tool that would scan through code looking for instances in which code
will behave differently on different systems because of unspecified
behaviour. It wouldn't even need to compile the code, it would just spit
out something like:
   Warning: The statement on line 3 can have more than one effect on
different platforms

I've heard of something called Lint, only heard of it, never used it,
but I don't think it goes into this detail. I might start a thread on
comp.lang.c and see what the lads think.

2008\07\24@031101 by William "Chops" Westfield


On Jul 23, 2008, at 1:51 PM, Tomás Ó hÉilidhe wrote:

>> we got bitten pretty hard on a change in pre-processor operation  
>> ordering
>> between gcc2.95 and gcc3.4...  Grr.
>
> I suppose the only thing you can say about it is that people should
> learn this stuff from the very start.

Not all the undefined issues are as obvious.  Consider:

  #define errno retrieve_errno_func()

  #define SUBSYSTEM_INCLUDE(subsystem, file) <subsystem/include/file>

  #include SUBSYSTEM_INCLUDE(posix, errno.h)

Each piece looks OK.  gcc2.95 pre-processes as intended, to:
   #include <posix/include/errno.h>
gcc3.4 preprocessed to:
   #include <posix/include/retrieve_errno_func()>
Oops.
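
One plausible reading of the gcc3.4 result is the ordinary rule that a
macro argument is fully macro-expanded before it is substituted, unless
it is an operand of # or ##. That rule can be shown without touching the
real errno. In this sketch errno_like, NAME_RAW and NAME_EXPANDED are
hypothetical names, and stringification is used instead of #include so
the result can be inspected at run time.

```c
#include <string.h>

/* Hypothetical stand-in for the errno macro in the post. */
#define errno_like retrieve_errno_func()

#define NAME_RAW(file)      #file           /* '#' blocks expansion of the argument */
#define NAME_EXPANDED(file) NAME_RAW(file)  /* argument is expanded before substitution */

static const char raw_name[]      = NAME_RAW(errno_like.h);
static const char expanded_name[] = NAME_EXPANDED(errno_like.h);

/* 1 if the argument came through exactly as written */
int raw_is_untouched(void)
{
    return strcmp(raw_name, "errno_like.h") == 0;
}

/* 1 if the macro inside the argument was expanded on the way through */
int expanded_was_rewritten(void)
{
    return strstr(expanded_name, "retrieve_errno_func") != NULL;
}
```

An argument that must survive as written therefore has to be kept away
from any macro with the same name, for example by building the include
line before the macro is defined.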

> it would be great to have a tool that would scan through code  
> looking for instances in which code will behave differently on  
> different systems because of unspecified behaviour. It wouldn't even
> need to compile the code, it would just spit out something like:
>     Warning: The statement on line 3 can have more than one effect
>     on different platforms

There's a whole class of tools called "static analysis" tools that do  
this sort of thing.  And for that matter, compilers themselves keep  
getting fussier as well; what WAS  accepted and compiled to correct  
code in one version may get warning messages from a later compiler.  
When we upgrade compilers, it's always a major effort to go through  
and address all the new warnings that get printed (some are actual  
errors, some are spurious, some are due to a change in features, etc.  
Policy is that no warning messages are allowed...)  Some things get  
changed, sometimes obscure compiler switches are invoked to change  
behavior, sometimes we go and get the compiler changed to accept  
things the way we'd like them to be... ("It's all very nice that you check the  
printf arguments against the format descriptors, but we have our own  
version of printf with DIFFERENT meanings for %e and such, so you've  
got to at least have a way to turn it off!")

What you currently do (-pedantic -ansi -Wall) is a step in the right  
direction.

>
> I've heard of something called Lint ...
That's one of them.  See
  http://web.mit.edu/sunsoft_v5.1/www/c-compiler/user_guide/lint.doc.html

There's also FlexeLint, Klocwork, PREfix, Coverity, and many others.  
Some cost big bucks.  Some are worth it (so we believe.)  Often one  
gets very frustrated with the false positives...

BillW

2008\07\24@090430 by Tomás Ó hÉilidhe



William "Chops" Westfield wrote:
> Not all the undefined issues are as obvious.  Consider:
>
>    #define errno retrieve_errno_func()
>
>    #define SUBSYSTEM_INCLUDE(subsystem, file) <subsystem/include/file>
>
>    #include SUBSYSTEM_INCLUDE(posix, errno.h)
>
> Each piece looks OK.  gcc2.95 pre-processes as intended, to:
>     #include <posix/include/errno.h>
> gcc3.4 preprocessed to:
>     #include <posix/include/retrieve_errno_func()>
> Oops.


I wonder if the Standard says anything about which one of them is
correct. My guess is that gcc3.4 is correct (even though it didn't
behave as intended!). I'll look into it and get back to you.
