> I don't think of the WDT as a way of handling software bugs, but rather as a way
> to get the microcontroller back into a known state after an external event (EMI,
> ESD, surge...) has put it into an abnormal state, such as a malfunctioning
> program or peripheral that can't be restored without a reset. It is just like
> the BOR, which handles voltage dropouts below a critical value.
>
> And then, as you say, it is important to clear the WDT only when you know that
> the program is acting as intended, which can be quite difficult to know.
> Clearing it in a timer interrupt is not a good idea.
>
> /Ruben
>
>
>
>> I just found the post below in my Drafts folder in Outlook Express. I don't
>> normally use drafts, so didn't notice this post was stuck there since early
>> September. Since I have no recollection of what this thread was originally about,
>> I changed the subject and topic tag. The points about watchdogs are still
>> relevant though, and I think often not considered.
>>
>> ------------------------------------------------------------------------
>>
>> Bob Axtell wrote:
>>
>>> I, too, use the PIC WDT always, and disable the interrupts
>>> infrequently. The only place that clears the WDT (CLRWDT) is in the
>>> timer interrupt, which occurs at intervals of less than 15 ms.
>>>
>> I think watchdogs, particularly internal ones, are way overrated. All they are
>> going to do is reset the part when a particular kind of software bug occurs.
>> There are several problems with this:
>>
>> 1 - They detect only a rare kind of software bug that made it into
>> production. The system wedging is the kind of blatant symptom that is quite
>> unlikely to remain in final code.
>>
>> 2 - A hard reset may not be the best way to recover from a software bug. In
>> certain systems this may be worse than wedging.
>>
>> 3 - There is the very real chance that new bugs are introduced because the
>> watchdog isn't kicked often enough. This kind of timing related bug is more
>> likely to make it into production than the obvious wedging. I think in most
>> cases the cost of 3 outweighs the benefits of 1.
>>
>> 4 - They give a false sense of security. "I'm using the watchdog so
>> everything is safe."
>>
>> 5 - All too often people don't think about using the watchdog effectively
>> when they do use it. The worst thing is to kick the dog in a periodic
>> interrupt routine. Think about it. The interrupt routine is probably
>> small, with few branches, and well tested. The chances of it wedging are
>> very small. If something goes wrong, it will probably be in the foreground
>> code. However, with the interrupt routine kicking the dog, the foreground code
>> can do all manner of bad things and the system will keep right on running. What
>> you need is a mechanism that guarantees both parts of the code are running.
>> Have the interrupt routine set a flag to kick the dog and have this be one event
>> the main event loop checks. Now both have to be running for the dog not to
>> bite.
>>
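The flag mechanism in point 5 can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the original post: `clear_wdt()` is a hypothetical stand-in for the CLRWDT instruction (here it just counts kicks so the logic can be run on a PC), and `timer_isr()` and `main_loop_pass()` are made-up names for the periodic interrupt and one pass of the main event loop.

```c
#include <stdbool.h>

/* Hypothetical stand-in for CLRWDT. On a host build it only counts
   kicks so the behavior can be checked without hardware. */
static unsigned wdt_kicks = 0;
static void clear_wdt(void) { wdt_kicks++; }

/* Set by the periodic interrupt, consumed by the foreground loop. */
static volatile bool wdt_kick_request = false;

/* Periodic timer interrupt: requests a kick but never kicks directly. */
void timer_isr(void)
{
    wdt_kick_request = true;
}

/* One pass of the main event loop. The dog is kicked only if the
   interrupt has run since the last kick AND the foreground got here. */
void main_loop_pass(void)
{
    if (wdt_kick_request) {
        wdt_kick_request = false;
        clear_wdt();
    }
    /* ... other event handling would go here ... */
}
```

If the interrupt stops, the flag is never set. If the foreground wedges, the flag is never consumed. Either way the kicks stop and the watchdog times out.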
>> If you've got a complicated system with a lot of asynchronous events where
>> things could get clogged up or you're using code from elsewhere you don't
>> really trust (TCP/IP stack for example), then using a watchdog can be
>> reasonable. But if it's that important, you can't really trust the internal
>> watchdog because it's too tightly coupled to the same firmware it's trying to
>> guard against failures in. In one case like that I used a 10F200 as a watchdog.
>> What's nice about that is you can program it for particular characteristics.
>> In that case, for example, it let the main processor do whatever it wanted for
>> the first 3 seconds, since it had a sort of bootup procedure before it ran its
>> normal code. After the initial period, it required a heartbeat line to be
>> toggled every 500 ms +/- 75 ms. The heartbeat was toggled by one of the tasks in
>> the cooperative tasking system of the main processor, but its timing was derived
>> from state left around by the periodic interrupt. In a simple round robin task
>> scheduler, you know all tasks are running if one of them runs periodically. You
>> can't detect a task stuck in a loop calling TASK_YIELD while waiting for
>> something that will never happen, but if it completely wedged, all other tasks
>> would stop too.
>>
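The timing rule of the external watchdog described above could look something like the following. This is a host-side sketch in C, not the actual 10F200 firmware, and all names (`ext_wdt_t`, `ext_wdt_init`, `ext_wdt_poll`) are hypothetical. The 3 second grace period and the 500 ms +/- 75 ms window come from the post. How the very first toggle after the grace period was judged is not stated, so this sketch assumes it only has to arrive within 575 ms of the end of the grace period, with no lower bound.

```c
#include <stdbool.h>
#include <stdint.h>

#define GRACE_MS     3000u  /* main processor may do anything during bootup */
#define PERIOD_MS     500u  /* nominal time between heartbeat toggles */
#define TOLERANCE_MS   75u  /* allowed deviation either way */

typedef struct {
    uint32_t ref_ms;      /* time of last accepted toggle, or end of grace */
    bool     last_level;  /* heartbeat level seen on the previous poll */
    bool     synced;      /* true once a toggle was seen after the grace period */
} ext_wdt_t;

void ext_wdt_init(ext_wdt_t *w, bool level)
{
    w->ref_ms = GRACE_MS;
    w->last_level = level;
    w->synced = false;
}

/* Poll with the time in ms since the main processor left reset and the
   current heartbeat level. Returns true when the main processor should
   be reset. The caller is expected to re-init after a reset. */
bool ext_wdt_poll(ext_wdt_t *w, uint32_t now_ms, bool level)
{
    bool edge = (level != w->last_level);
    w->last_level = level;

    if (now_ms < GRACE_MS)
        return false;                 /* toggles during bootup are ignored */

    uint32_t dt = now_ms - w->ref_ms;
    if (dt > PERIOD_MS + TOLERANCE_MS)
        return true;                  /* heartbeat is late or missing */

    if (edge) {
        if (w->synced && dt < PERIOD_MS - TOLERANCE_MS)
            return true;              /* heartbeat came too early */
        w->ref_ms = now_ms;
        w->synced = true;
    }
    return false;
}
```

Checking the lower bound as well as the upper one means a main processor that toggles the line too fast, for example from a runaway loop, is treated the same as one that stopped toggling.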
>>
>> ********************************************************************
>> Embed Inc, Littleton Massachusetts, http://www.embedinc.com/products
>> (978) 742-9014. Gold level PIC consultants since 2000.