PICList Thread
'[OT]Re: Failure modes (was: wierd jump)'
1999\02\09@014358 by Alan King

But that's the point.  Complete shut down and relying on a fail safe is
a poor choice in any safety related case.

 Let's test your system and mine:

 1000 people jump to test mine.  1000 more people jump to test yours.

 Just to give a good test, the independent testing lab puts in a known
bug to make ALL fail.

 My software opens chutes a little early or late.  The 1 in 1000 guy
gets down and finds out his fail-safe wasn't working.  He feels lucky
and sings my praises that even when the software was intentionally
screwed up and the fail-safe malfunctioned, he got down ok, if not as
planned.

 Your 1 in 1000 fail-safe failure guy becomes Purina Earthworm Chow.
No matter which 1 in 1000 it is, because all of your software shuts down
so as not to have a 'subtle malfunction'.  I attest that my method is better
IMHO.  You attest to YHO.  My alive guy agrees with me.  Your dead guy's
family agrees with me.  You get sued out of business and I don't.

 My opinion of which way is better overall doesn't count.  Neither does
yours.  The first way FITS this problem with a better outcome.  Your
prejudice against 'subtle malfunctions' keeps you from even seeing why
someone would do it that way, much less why it's better in some
instances.  In my STRONG opinion, Pics are the best thing since sliced
bread, they are easy to use and versatile.  But regardless of my
opinion, when I first look at a project the first thought is 'Should it
be microprocessor controlled?' and the second is 'Is a Pic the right one
for the job?'  You still see dogged opinions as a plus to engineering
skills.  I've been bitten in the ass enough by that dog (my opinion and
other people's) to realize they're a minus.  No one method of handling
errors can be the right one in *all* cases, and you aren't pulling back
far enough to see when other ways may be better..

Alan


Gerhard Fiedler wrote:

> actually, from a parachute opener i'd very much like to have a strongly
> audible beep -- and NOTHING else -- if =anything= goes wrong with it. and
> if it's a situation where the beep is not enough, maybe a complete shut
> down with of course the following fail-safe parachute opening. but not any
> kind of "subtle malfunctioning"...
>
> ge

1999\02\09@035702 by Michael Rigby-Jones

No it isn't.  I used to work for a company making railway signalling
equipment.  This stuff had to be totally fail-safe.  All microprocessor
equipment was duplicated, each side cross-checking the other.  The micros
continuously ran RAM tests and ROM CRC tests.  A hardware watchdog would
blow the main supply fuse to the micros if anything went wrong.  There was
even circuitry to make sure the PSUs could supply enough current to blow
the fuses and that the fuse-blowing transistors were functional.  These
things were just looking for any excuse to shut themselves down, in which
case the signals would default to red.  Would you rather have your train
late, or end up as the filling in a train sandwich?
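
The cross-checking arrangement described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the actual signalling design: the type names, the three aspects and the `vote` function are all hypothetical, and the hardware side (watchdog, fuse blowing) is not modelled at all. The only point it shows is that every path except "both channels healthy and in agreement" ends at red.

```c
#include <stdbool.h>

/* Hypothetical signal aspects; RED is the safe default. */
typedef enum { ASPECT_RED = 0, ASPECT_YELLOW, ASPECT_GREEN } aspect_t;

/* Result reported by one of the two duplicated processing channels. */
typedef struct {
    bool     self_test_ok;  /* assumed: RAM test and ROM CRC both passed */
    aspect_t requested;     /* aspect this channel computed */
} channel_t;

/*
 * Cross-check the two channels. Anything other than "both healthy and
 * in full agreement" falls back to RED.
 */
aspect_t vote(channel_t a, channel_t b)
{
    if (!a.self_test_ok || !b.self_test_ok)
        return ASPECT_RED;          /* a failed self-test forces the safe state */
    if (a.requested != b.requested)
        return ASPECT_RED;          /* disagreement is treated as a fault */
    return a.requested;
}
```

With two healthy channels that both request green, `vote` returns green; if they disagree, or either one reports a failed self-test, it returns red.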

The 'chute opener should have similar fail-safe capability.  It seems to me
that it needs to signal the user that it has shut down and that manual
operation is required.  In any case, I really don't think a PIC is suitable
for this kind of thing, Microchip definitely say PICs are not to be used in
safety critical applications.

On the other hand, a micro controlling your TV remote, or the tuning in your
car radio can quite safely go loopy without the possibility of hurting
anyone.  The trouble is that faulty code is unpredictable, you don't know
exactly what problems it could cause until it's too late.  You only have to
look at the numerous Y2K software issues to see this.

Regards


Mike Rigby-Jones
spam_OUTmrjonesTakeThisOuTspamnortelnetworks.com


> ----------
> From:         Alan King[SMTP:.....shadedemonKILLspamspam@spam@MINDSPRING.COM]
> Sent:         09 February 1999 06:43
> To:   PICLISTspamKILLspamMITVMA.MIT.EDU
> Subject:      [OT]Re: Failure modes (was: wierd jump)
>
> But that's the point.  Complete shut down and relying on a fail safe is
> a poor choice in any safety related case.
>
<snip>



1999\02\09@042617 by Gerhard Fiedler

At 01:43 02/09/99 -0500, Alan King wrote:
>  My software opens chutes a little early or late.  The 1 in 1000 guy
>gets down and finds out his fail-safe wasn't working.  He feels lucky
>and sings my praises

your assumption: the failure leads only to a little deviation in time (or
height). if your micro can determine that with certainty, it's not failing.
it might be more than "a little" late for your guy to be able to feel lucky
and sing your praises. how do you know it's gonna be just "a little" early
or late, if the thing isn't working properly? we're not talking here about
a measurement out of range, which you can assume to have been false and
safely ignore in the light of the following measurements, we're talking
about a controller maybe jumping to unpredictable code areas. so you know
that something like this will lead to only "a little early or late"? maybe
we're talking about very different types of failures here.

>  Your 1 in 1000 fail-safe failure guy becomes Purina Earthworm Chow.

did you actually read my message? how do you come to the conclusion that
"my" guy falls straight down? after either my device beeps loud enough to
tell him not to rely on it anymore and to pull the string, or, if that's
not possible (i'm no skydiver), the device recognizes something's wrong
with it and shuts down and is of course designed so that when it shuts
down, the string gets pulled? that's called "fail-safe." so how do you
figure my guy is dead? his string might have been pulled too early, but
from all i know, this is probably better than too late. (and if there's a
better "fail-safe" reaction -- as i told you, i'm no skydiver --, the
system should of course be designed to use that better reaction.)


>Your
>prejudice against 'subtle malfunctions' keeps you from even seeing why
>someone would do it that way, much less why it's better in some
>instances.

do you know me well enough to be able to pass such a strong judgement?

>In my STRONG opinion, Pics are the best thing since sliced
>bread, they are easy to use and versatile.

i know better things for certain situations, but that's another matter
(ever chewed on a pic? :). i don't see how this relates to the question in
question.

>You still see dogged opinions as a plus to engineering skills.

again, do i know you? (it seems if you know me so well that you can say "i
=still= see" whatever, you must know me for some time, and i should at
least have a clue from when and where...)


>No one method of handling errors can be the right one in *all* cases,

that's pretty much my point. but in any case i'd like to be on the safe
side. i actually think that's right in =all= cases.

ge



i leave this quote here, right out of your message, so you can look it up
and see that the parachute actually opens in my scenario... :)

>Gerhard Fiedler wrote:
>
>> actually, from a parachute opener i'd very much like to have a strongly
>> audible beep -- and NOTHING else -- if =anything= goes wrong with it. and
>> if it's a situation where the beep is not enough, maybe a complete shut
>> down with of course the following fail-safe parachute opening. but not any
>> kind of "subtle malfunctioning"...

1999\02\09@063726 by Russell McMahon

A parachute sounds like a "life support system" to me :-)

I'd be about as happy with a PIC as any other micro I've seen (ie -
grudgingly accept it :-)) as long as it didn't have to power up, the
circuit was genuinely EMC compliant,  and the system was self
contained.

One just possibly useful comment (and not a PIC in sight, almost)..
Fail safe is meant to mean just that - the failure mode is SAFE.
You need a total "map" of the activity (or as total as you can get)
to know what safe is throughout the process.
Safe will sometimes/often change with circumstance.

In this case, if a parachute opener went "bleep" and THEN opened the
chute, just as you jumped from the aeroplane, the results would very
probably be disastrous. You don't have to be a sky-diver to build
fail-safe equipment for them but you DO have to know everything the
skydiver needs to know which may be even slightly pertinent, at all
stages of the "process" if you are going to have a chance of making a
truly failsafe device.

Qualifier: I suspect that I have never built a truly failsafe product
in my life but common sense idiot-proofing-engineering goes a long
way down the road :-).

regards

       Russell McMahon

From: Michael Rigby-Jones <.....mrjonesKILLspamspam.....NORTELNETWORKS.COM>

>No it isn't.  I used to work for a company making railway signalling
>equipment.  ....

1999\02\09@084920 by Andy Stephenson

At 08:42 09/02/99 +0000, you wrote:
>for this kind of thing, Microchip definitely say PICs are not to be used in
>safety critical applications.
>

Can you point me to your source on this one?

Rgds...

...Andy

1999\02\09@100359 by Michael Rigby-Jones

I may have slightly exaggerated here.  If you look at the footnotes on the
last page of a data sheet it says "Use of Microchip's products as critical
components in life support systems is not authorised except with express
written approval by Microchip".

Most PICs are unable to access code memory through instructions, and
therefore cannot perform a CRC check on their own code, a function that is
most certainly required for safety critical applications.  I think I have
seen it written (on the PIC list) that some newer devices may be able to do
this.
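
For readers who have not met the idea, a code-memory CRC check might look something like the C sketch below. Everything specific in it is an assumption for illustration: the post does not say which CRC was used (CRC-16/CCITT-FALSE is picked here only because it is common), the names `crc16_ccitt` and `code_image_ok` are made up, and the byte array simply stands in for a code image that the processor is able to read back, which is exactly the ability most PICs are said to lack.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);   /* feed next byte into the top */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/*
 * Compare the CRC of the (readable) code image with the value stored at
 * build time. A false result would be the trigger for the safe state.
 */
bool code_image_ok(const uint8_t *image, size_t len, uint16_t expected)
{
    return crc16_ccitt(image, len) == expected;
}
```

In a real device the check would be run repeatedly in the background and a mismatch would lead to whatever the safe state is for that product; here it only returns a flag.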

Regards

Mike Rigby-Jones
EraseMEmrjonesspam_OUTspamTakeThisOuTnortelnetworks.com



1999\02\09@112811 by dave vanhorn

>>for this kind of thing, Microchip definitely say PICs are not to be used in
>>safety critical applications.
>>
>
>Can you point me to your source on this one?


RTFM.  I've never seen a processor vendor that didn't have that exclusion.

1999\02\09@113148 by Alan King

Gerhard Fiedler wrote:
>
> At 01:43 02/09/99 -0500, Alan King wrote:
> >  My software opens chutes a little early or late.  The 1 in 1000 guy
> >gets down and finds out his fail-safe wasn't working.  He feels lucky
> >and sings my praises
>
> your assumption: the failure leads only to a little deviation in time (or
> height). if your micro can determine that with certainty, it's not failing.
> it might be more than "a little" late for your guy to be able to feel lucky
> and sing your praises. how do you know it's gonna be just "a little" early
> or late, if the thing isn't working properly? we're not talking here about
> a measurement out of range, which you can assume to have been false and
> safely ignore in the light of the following measurements, we're talking
> about a controller maybe jumping to unpredictable code areas. so you know
> that something like this will lead to only "a little early or late"? maybe
> we're talking about very different types of failures here.

***
NO, the discussion was about masking the argument to make sure and at
least stay in the table vs randomly hitting other code.  I was going FOR
masking, at least I hit a timing value and come back, so it will be
early or late.  If you don't mask, you don't know WHAT your code is
doing..
***




***
 I do not see how you expect your code to fall off the end of a table,
but somehow magically still be running like you expect, and see
something is wrong to do the beep or shut down and execute your fail
safe.  You're arguing for having your code make hard errors, but making
your case like you're still executing your code..

1999\02\09@113357 by Keith Causey

-----Original Message-----
From: Michael Rigby-Jones <mrjonesspamspam_OUTNORTELNETWORKS.COM>
To: @spam@PICLISTKILLspamspamMITVMA.MIT.EDU <KILLspamPICLISTKILLspamspamMITVMA.MIT.EDU>
Date: Tuesday, February 09, 1999 8:06 AM
Subject: Re: [OT]Re: Failure modes (was: wierd jump)



The entire line of Scenix micro controllers can do this with the IREAD
command. The low 8 bits of the instruction are returned in W and the high
four are returned in the MODE register.
                                   Keith Causey

1999\02\09@155128 by Gerhard Fiedler

At 11:32 02/09/99 -0500, Alan King wrote:
>NO, the discussion was about masking the argument to make sure and at
>least stay in the table vs randomly hitting other code.  I was going FOR
>masking, at least I hit a timing value and come back, so it will be
>early or late.  If you don't mask, you don't know WHAT your code is
>doing..

ok, but i assume that if you have to mask the jumps (that is, if the mask
has any effect), you don't know either what your code is doing, because you
don't know why you get that index out of bounds. (usually one makes sure
that in =no= case one can think of the index gets out of bounds. so this
clearly is a case one =didn't= think of...)

if you correctly mask all table reads, and make sure that all possible
reads don't jump outside of the table and really get results only
=slightly= off, you might have a point. (which i guess is impossible,
because what keeps your device from giving the result "3000 feet above
ground" when it's only 3 feet and the device is not working properly,
albeit not jumping out of the table because of the mask? usually masking a
value, especially the high bits, does not result in "slight" changes, it
results in pretty strong changes. there are of course exceptions, like
linearization tables which only add a small amount to a linear function.
but especially in these cases masking the index is particularly prone to
leaving a bug in the linearization algorithm undetected, because the wrong
results are only slightly wrong. during testing i'd rather have it jump
into nowhere or stop at the mask than go on.)

so the point is: =if= your code gets into a table with an index out of
bounds, this is a serious bug and should get caught at design (or
debugging) time. and there the mask might not be very helpful, since your
code happily runs along, possibly without you realizing that there is a
problem. (it of course may be helpful, if you do something other than going
along when it hits the mask.)
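
The two approaches being argued over can be put side by side in a short C sketch. It is only an illustration under assumptions: the table contents, its size and the function names are invented, and real PIC code of the time would have been a computed jump into a RETLW table rather than a C array. `lookup_masked` is the "mask and carry on" style, `lookup_checked` is the "notice the bad index and do something else" style.

```c
#include <stdbool.h>
#include <stdint.h>

#define TABLE_SIZE 8u                  /* masking needs a power-of-two size */
#define INDEX_MASK (TABLE_SIZE - 1u)

/* Hypothetical timing table; the values are made up for illustration. */
static const uint16_t delay_table[TABLE_SIZE] = {
    10, 20, 30, 40, 50, 60, 70, 80
};

/* Masked lookup: never leaves the table, but hides a bad index. */
uint16_t lookup_masked(uint8_t index)
{
    return delay_table[index & INDEX_MASK];
}

/*
 * Checked lookup: returns false for a bad index and leaves *out alone,
 * so the caller can beep, shut down safely, or flag the unit unreliable.
 */
bool lookup_checked(uint8_t index, uint16_t *out)
{
    if (index >= TABLE_SIZE)
        return false;
    *out = delay_table[index];
    return true;
}
```

With this table an index of 8, one past the end, is masked to 0 and returns 10, the entry at the opposite end of the table from its neighbour 7 (which returns 80). The checked version reports the same index as a fault instead.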


>  I do not see how you expect your code to fall off the end of a table,
>but somehow magically still be running like you expect, and see
>something is wrong to do the beep or shut down and execute your fail
>safe.  You're arguing for having your code make hard errors, but making
>your case like you're still executing your code..

depending on the design, i assume that the probability for eg. a watchdog
to hit is bigger if you let the code jump out of the table. but in this
case (like life-support systems), i probably would say the mask is in
order; but the device should probably rather shut down safely (whatever
this means in this circumstance -- eg. it could mean "continue to work, but
let the user know that it's possibly not reliable anymore") than simply go
along with a masked (and therefore arbitrary) index if the index is out of
bounds.

this, again, unless you can make sure that in no case of an index out of
bounds getting masked (and therefore arbitrary) the result is in unsafe
regions. but this of course depends on a lot of details and is not
generally so.

ge

1999\02\09@174729 by Andy Stephenson

At 11:26 09/02/99 -0500, you wrote:
>>>for this kind of thing, Microchip definitely say PICs are not to be used in
>>>safety critical applications.
>>>
>>
>>Can you point me to your source on this one?
>
>
>RTFM.  I've never seen a processor vendor that didn't have that exclusion.
>
Yes I did RTFM thanks. The phrase 'Microchip definitely say PICs are not to
be used' does not exist - or at least not on my pass of RTFingM. No
manufacturer excludes themselves from these market places. They add
WARNINGS - not exclusions.

Now I'm after this exclusion on the C558, what page in the manual would it
be on? I am a slow reader, there are lots of pages, so your help is
appreciated. Even if you tell me which PIC you came across the exclusion, I
will cross ref the section to my data sheet.

Rgds...

...Andy

1999\02\09@182519 by dave vanhorn

>Now I'm after this exclusion on the C558, what page in the manual would it
>be on? I am a slow reader, there are lots of pages, so your help is
>appreciated. Even if you tell me which PIC you came across the exclusion, I
>will cross ref the section to my data sheet.
>
>Rgds...
>
>...Andy


Last page in the book, (or the PDF file)  at the bottom.

"Use of Microchip's products as critical components in life support systems
is not authorized, except with express written approval by Microchip."

Got Letter?

1999\02\09@183957 by dave vanhorn

From Natsemi:
"National's products are not authorized for use as critical components in
life support devices or systems without the express written approval of the
president of national semiconductor corporation."

From Maxim:
"Maxim does not authorize any Maxim product for use in life support devices
and/or systems without the express written approval of an officer of Maxim
Integrated Products Inc."

From Dallas:
"Dallas Semiconductor products are not designed, intended, or authorized
for use as components in systems intended for surgical implant into the
body, or other applications intended to support or sustain life, or for any
other application in which the failure of the Dallas Semiconductor product
could create a situation where personal injury or death could occur."


What these statements really mean is, "You're on your own pal."
Nobody's going to stop you, unless you tell them what you're doing.
If you end up in a lawsuit, the uChip lawyer will be a witness for the
prosecution.

1999\02\09@194925 by Wagner Lipnharski

What they are saying is:

  "Our product works nicely according to the technology
   available for mass production, at the best quality we
   can provide, but it can fail without further notice,
   because the actual low cost technology used for
   mass production is not error free".

There is top quality technology that can be used for highly
critical circuits, including human life and life support
systems; it will not cost only $5 a piece, it is not
produced in massive quantities like pizza, and probably we,
simple mortals, are not allowed to buy it.

You can ask yourself a question - Why a space shuttle, or a
missile guidance system, or the F22 flight computers cost so
much?  They don't buy components at Digikey, Marshal, NetBuy,
Radio Shack or other mass production distributors.  Nothing
wrong with them, I like them, but you can't find those six
sigma components in their inventory data base.

I worked 19 years at IBM as a mainframe engineer, and I was always
surprised by the cost of a single microprocessor for
the series 4381 (old), it was around $18,000 - they had some
reason for it, at least the computer simply can't afford to have
that part fail once a year...

----------------------------------
Wagner Lipnharski - Orlando, FL
UST Research Inc. - http://ustr.net
----------------------------------

1999\02\09@211711 by Keith M. Wheeler

At 06:24 PM 2/9/99 -0500, you wrote:
>>Now I'm after this exclusion on the C558, what page in the manual would it
>>be on? I am a slow reader, there are lots of pages, so your help is
>>appreciated. Even if you tell me which PIC you came across the exclusion, I
>>will cross ref the section to my data sheet.
>>
>>Rgds...
>>
>>...Andy
>
>
>Last page in the book, (or the PDF file)  at the bottom.
>
>"Use of Microchip's products as critical components in life support systems
>is not authorized, except with express written approval by Microchip."
>
>Got Letter?


Life support systems and safety critical are not the same thing.

-Keith Wheeler
ARMA Design                             http://www.ARMAnet.com/

1999\02\10@042739 by Nigel Orr

At 11:55 09/02/99 +0000, you wrote:
>At 08:42 09/02/99 +0000, you wrote:
>>for this kind of thing, Microchip definitely say PICs are not to be used in
>>safety critical applications.
>>
>Can you point me to your source on this one?

At the back of most of the data books, just below the Worldwide Sales and
Service lists

"Use of Microchip's products as critical components in life support systems
is not authorized except with express written approval by Microchip"

Nigel
--
Nigel Orr                  Research Associate   O   ______
       Underwater Acoustics Group,              o / o    \_/(
Dept of Electrical and Electronic Engineering     (_   <   _ (
    University of Newcastle Upon Tyne             \______/ \(

1999\02\10@050318 by Gerhard Fiedler

At 17:23 02/09/99 -0800, Keith M. Wheeler wrote:
>>"Use of Microchip's products as critical components in life support systems
>>is not authorized, except with express written approval by Microchip."
>
>Life support systems and safety critical are not the same thing.

as i understand it, "life support systems" is a well defined legal
term, that's probably why they use it there. is this true for "safety
critical" also?

ge

1999\02\10@101638 by goflo

Wagner Lipnharski wrote:

> You can ask yourself a question - Why a space shuttle, or a
> missile guidance system, or the F22 flight computers cost so
> much?  They don't buy components at Digikey, Marshal, NetBuy,
> Radio Shack or other mass production distributors.

Seems the last Mars lander project did just that.
Of course, having skirted a Prime Directive
      ( IF IN_$10K THEN OUT_$1 )
and then succeeded will likely not enhance the careers
of those folks...

Jack

1999\02\10@122121 by Andy Kunz

>Of course, having skirted a Prime Directive

Current operating orders are "better cheaper faster" and COTS.

Which Prime Directive did they ignore?

Andy


  \-----------------/
   \     /---\     /
    \    |   |    /          Andy Kunz
     \   /---\   /           Montana Design
/---------+   +---------\     http://www.montanadesign.com
| /  |----|___|----|  \ |
\/___|      *      |___\/     Go fast, turn right,
                              and keep the wet side down!

1999\02\10@151508 by John Payson

> You can ask yourself a question - Why a space shuttle, or a
> missile guidance system, or the F22 flight computers cost so
> much?  They don't buy components at Digikey, Marshal, NetBuy,
> Radio Shack or other mass production distributors.

|Seems the last Mars lander project did just that.
|Of course, having skirted a Prime Directive
|       ( IF IN_$10K THEN OUT_$1 )
|and then succeeded will likely not enhance the careers
|of those folks...

When designing any tricky system, two questions need to be
answered for each possible failure mode:

[1] How much will this failure mode cost, if it occurs?

[2] How much does it cost to reduce the probability of that failure
   mode occurring below a certain level?

The idea with the Mars project is that given a choice between two
designs...

[1] Design #1 has a 90% chance of success, and costs $X; the cost
   of failure equals the entire cost of the project.

[2] Design #2 has a 99% chance of success, and costs $5X; the
   cost of failure equals the entire cost of the project.

If one builds two of the #1 units, the total cost is $2X (well under
that of one design #2 unit) and one has the same probability of
success as with the #2 design (there's a 1% chance both will fail).
One also has the advantage, however, that if the first one happens
to work one has a second one which can be used for further research.
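
The arithmetic behind the comparison can be checked with a few lines of C. This is a sketch that assumes the units fail independently of one another, which the post implies but does not state (two copies of one design could share a common flaw); the function name `p_at_least_one` is made up for the example.

```c
/*
 * Probability that at least one of n attempts succeeds, when each attempt
 * succeeds with probability p and the attempts are assumed independent.
 */
double p_at_least_one(double p, unsigned n)
{
    double all_fail = 1.0;
    for (unsigned i = 0; i < n; i++)
        all_fail *= (1.0 - p);      /* chance that every attempt so far failed */
    return 1.0 - all_fail;
}
```

For two design #1 units, `p_at_least_one(0.9, 2)` works out to 1 - 0.1 * 0.1 = 0.99 at a cost of $2X, which matches the 99% of a single design #2 unit costing $5X.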

1999\02\10@180450 by Alan King

John Payson wrote:

> The idea with the Mars project is that given a choice between two
> designs...
>
> [1] Design #1 has a 90% chance of success, and costs $X; the cost
>     of failure equals the entire cost of the project.
>
> [2] Design #2 has a 99% chance of success, and costs $5X; the
>     cost of failure equals the entire cost of the project.
>
> If one builds two of the #1 units, the total cost is $2X (well under
> that of one design #2 unit) and one has the same probability of
> success as with the #2 design (there's a 1% chance both will fail).
> One also has the advantage, however, that if the first one happens
> to work one has a second one which can be used for further research.

And even if the first fails and you use the second, you still have $3X
left to do other things..  Of course an astronaut should run like hell
from this kind of thinking, but for unmanned stuff it's great!

1999\02\11@045814 by Russell McMahon

But, unfortunately, Mars allows minimum energy transfer orbits
only every so often, so a blown landing opportunity also costs
you some years' wait (which, of course, is not at all what this
thread is about :-)

   Russell

-----Original Message-----
From: John Payson <spamBeGonesupercatspamBeGonespamCIRCAD.COM>


>|Seems the last Mars lander project did just that.


>If one builds two of the #1 units, the total cost is $2X (well under
>that of one design #2 unit) and one has the same probability of
>success as with the #2 design (there's a 1% chance both will fail).
>One also has the advantage, however, that if the first one happens
>to work one has a second one which can be used for further research.
>

1999\02\11@112750 by John Payson

|But, unfortunately, Mars allows minimum energy transfer orbits
|only every so often, so a blown landing opportunity also costs
|you some years' wait (which, of course, is not at all what this
|thread is about :-)

Though now that the concept is validated, it should be
possible to select, say, three landing sites and send over
three spacecraft next time.  Just as cheap as sending one
super-reliable one, we'll almost certainly get at least as
much data as from the super-reliable one, and we may quite
likely get three times as much.

1999\02\11@115842 by goflo

Yup. John is quite right, too, but it's clear that the Mars Lander
project evolved the way it did because funding was not available
for the usual approach - One-off everything, and so on...
There was no prospect of a second effort. No money.

Jack

Russell McMahon wrote:
