'Unusual ARM WDT problem'
I am casting a wide net for any assistance on this issue. I've spent
several days at Atmel HQ with their people on this, and we still have
I've also posted this on a couple of ARM-specific forums.
We are using an AT91SAM7S256, setting up the WDT to fire an internal
and external reset.
The problem is that the chip gets stuck in a reset loop, resetting
approximately every 20 ms. The WDT can be set for any timeout up to the
16-second maximum, and we get the same 20 ms loop.
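
For anyone wanting to compare register values, here is a minimal sketch of how a timeout in milliseconds maps to a WDT_MR value. It assumes the bit layout as I read the AT91SAM7S datasheet (WDV in bits 0-11, WDRSTEN at bit 13, WDRPROC at bit 14, WDD in bits 16-27) and a nominal 32768 Hz slow clock divided by 128, which is where the 16-second maximum comes from. The function names are made up for illustration, and this is not the code we actually run.

```c
#include <stdint.h>

/* Assumed WDT_MR layout (check against the datasheet before relying on it). */
#define WDT_WDV_MASK   0xFFFu        /* 12-bit counter reload value          */
#define WDT_WDRSTEN    (1u << 13)    /* watchdog fault triggers a reset      */
#define WDT_WDRPROC    (1u << 14)    /* 1 = processor-only reset, 0 = all    */
#define WDT_WDD_SHIFT  16            /* window (delta) value, bits 16-27     */

#define SLOW_CLOCK_HZ  32768u        /* nominal; the real RC clock varies    */
#define WDT_PRESCALE   128u          /* counter ticks at SLOW_CLOCK_HZ / 128 */

/* Convert a timeout in ms to a WDV count, clamped to the 12-bit maximum. */
uint32_t wdt_ms_to_wdv(uint32_t timeout_ms)
{
    uint64_t ticks = (uint64_t)timeout_ms * (SLOW_CLOCK_HZ / WDT_PRESCALE) / 1000u;
    return ticks > WDT_WDV_MASK ? WDT_WDV_MASK : (uint32_t)ticks;
}

/* Mode value for: reset enabled, all resets (WDRPROC = 0), and
 * WDD = 0xFFF so that WDD >= WDV, i.e. the window is effectively disabled. */
uint32_t wdt_mr_value(uint32_t timeout_ms)
{
    return wdt_ms_to_wdv(timeout_ms)
         | WDT_WDRSTEN
         | (WDT_WDV_MASK << WDT_WDD_SHIFT);
}
```

With these assumptions a 1-second timeout gives WDV = 256, and anything from about 16 seconds up clamps to WDV = 0xFFF.
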
At times the reset loop will happen "forever", or it will happen
several times then boot properly.
We have considered a bad PCB (it happens on Atmel's dev kit as well),
a PLL issue (it happens even when the PLL is not used), a crystal
startup issue (the problem also happens with other types of crystals,
and oscillator startup looks good), a VDD rise-time issue (verified rise
time significantly faster than required), and a bypassing issue (the PCB
runs fine with almost all bypass caps removed, and adding more does not affect the
Interestingly, the WDT values that are listed as "working" values for
the rev A silicon seem to cause the problem to appear less often than
We initially encountered the problem on rev B chips, and have
verified that it also happens on rev C.
We have seen this problem with the window enabled or disabled, and we
intend to run with the window disabled. If we are having a problem
with a given board, we can cause the chip to boot up properly by
either heating or cooling by a fraction of a degree, or by applying a
slight torque/twist to the PCB. The direction of the force is
important, but it differs from board to board. Once a given board has
booted properly, it is extremely robust. Such boards survive power
line disturbances at 2.5 kV with 10 ns rise time, EMI at >190 V/m, and
ESD events at 16 kV while operating.
While the mechanical sensitivity would seem to indicate a PCB problem,
we have replicated this on an Atmel evaluation kit board. The amount
of flex required varies; as little as 1/16 inch over 8" of board length
can take a board into or out of failure. The thermal sensitivity is also
extreme: heat from a light touch of a finger on the CPU for only a
couple of seconds is enough to induce or stop the rebooting. I can't
imagine the die temperature is changing by more than a fraction of a degree.
We implemented an extremely stripped down version of the code that
only flashes some LEDs to indicate that the application is running,
and this code also exhibits the problem.
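
A stripped-down image like this could also report why the chip last reset, to confirm that every pass through the loop is really a watchdog reset. The sketch below decodes the RSTTYP field of the reset controller status register (RSTC_SR). The field position (bits 8-10) and the type codes are my reading of the AT91SAM7S datasheet and should be checked against it; the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed RSTTYP codes from RSTC_SR bits 8-10 (verify in the datasheet). */
enum {
    RST_POWERUP  = 0,
    RST_WATCHDOG = 2,
    RST_SOFTWARE = 3,
    RST_USER     = 4,
    RST_BROWNOUT = 5
};

/* Extract the 3-bit reset type field from a raw RSTC_SR value. */
unsigned rstc_reset_type(uint32_t rstc_sr)
{
    return (unsigned)((rstc_sr >> 8) & 0x7u);
}

/* Map the reset type to a short name, e.g. for a debug UART or LED code. */
const char *rstc_reset_name(uint32_t rstc_sr)
{
    switch (rstc_reset_type(rstc_sr)) {
    case RST_POWERUP:  return "power-up";
    case RST_WATCHDOG: return "watchdog";
    case RST_SOFTWARE: return "software";
    case RST_USER:     return "user (NRST pin)";
    case RST_BROWNOUT: return "brownout";
    default:           return "unknown";
    }
}
```

On the target the raw value would come from reading RSTC_SR early in startup, before anything else has a chance to trigger another reset.
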
The best that we have been able to do so far is to isolate that it
seems to have to do with when (relative to the rise of /RESET) the WDT
is initialized. For a given chip, if we change when the WDT is configured
by adding or removing NOPs, we can create or eliminate the problem on
that board. Some chips don't seem to exhibit the problem, but since
temperature, timing, and PCB flex all seem to be part of the equation we are only
comfortable saying that a given system "has not been observed to have
There may be some relationship with the phase or timing of the slow
clock at the moment the WDT is configured, but we have not been
able to find anything yet.
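
To make the NOP experiment concrete, here is a rough sketch of the kind of delayed write I mean. It is illustrative only, not our production startup code: the function name is made up, the inline asm assumes GCC or Clang, and the register pointer is passed in so the same routine can be exercised off-target. Keep in mind that, as I understand the datasheet, WDT_MR can only be written once after a processor reset, so there is exactly one chance per boot.

```c
#include <stdint.h>

/* Write the watchdog mode register after a tunable number of NOPs.
 * The loop itself adds a few cycles per iteration, so the real delay is
 * somewhat longer than nop_count instructions; treat nop_count as a
 * relative knob for shifting the write in time, not an exact cycle count. */
void wdt_write_mode_delayed(volatile uint32_t *wdt_mr,
                            uint32_t mode,
                            unsigned nop_count)
{
    for (unsigned i = 0; i < nop_count; i++) {
        __asm__ volatile ("nop");
    }
    *wdt_mr = mode;   /* single write; later writes are assumed to be ignored */
}
```

On the target, wdt_mr would point at WDT_MR (0xFFFFFD44 on the SAM7S as far as I can tell from the datasheet), and nop_count is the value we vary between builds.
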
Out of a couple hundred of our boards, roughly 10% exhibit the problem
with a given code set. If three boards, say "A", "B", and "C", fail
from a batch, and we then change the delay before the WDT is
initialized and re-program the batch, a different set, maybe "A", "D",
and "W", would fail. We have a code set that has never been observed to
fail, but given the nature of the problem, we are extremely nervous.
We have been through the errata and the data sheet extensively both by
ourselves and at Atmel in San Jose, with their technical people. So
far nobody can explain why we are seeing this.
Has anyone here seen this problem? Solved it?