PICList Thread
'[EE] Line-oriented text based protocols'
2007\09\30@113936 by Peter Todd

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Practically all commonly used protocols on the internet, at least the
ones that use TCP as a transport, are line oriented, text based and
human readable and writable. Is there an actual RFC (or similar
standards document) describing this practice as a "best practice"?

I ask because, at a job I have, I wrote a little server app with a
protocol along these lines:

beat X Y V\n
beat X Y V\n
beat X Y V\n

where X and Y are indexes into an array and V is the value to put in the
array. The application needs essentially no extra functionality and is
without a doubt a prototype that will be thrown out before the next
revision. The whole reason the protocol even exists is a temporary
architectural hack. I did my server part of the project weeks ago, and
it was quite nice to be able to simply telnet in and type the commands
manually.

The other engineer on the project, who is working on the client, wants
me to change it to essentially this:

SetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueV

There really are supposed to be no spaces, newlines, or nulls, between
the command packets. That's exactly what he wrote his app to produce.
The last time we met to integrate the two halves, he cryptically said
there was a deep reason for the above that I, without 4 years of
computer science, was never going to understand. If he's thinking job
security, I think he is wrong, because the whole app would take about a
weekend to write, and his client part of it is already 4 weeks late. I
figure I could get it done in a week, including learning Java on mobile
devices...

Anyway, if I'm not going to convince him, at least I might convince our
bosses why I think he is completely insane. RFCs just might help build
my case. I've got a nice list of every internet protocol I could find
using this design, but a meta-document would be even nicer.

- --
http://petertodd.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFG/8In3bMhDbI9xWQRArbnAJ971WGb+TXemCbvgFtT0oM6gt7QtACfSd2H
mqOVw9Kjjoa+hNjGDoy8yls=
=tdT/
-----END PGP SIGNATURE-----

2007\09\30@120450 by wouter van ooijen

> Practically all commonly used protocols on the internet, at
> least the ones that use TCP as a transport, are line
> oriented, text based and human readable and writable. Is
> there an actual RFC (or similar standards document)
> describing this practice as a "best practice"?

I would not consider this a "best" practice (common: yes, best: no), and
I am sure there are plenty of protocols that do not follow this "line".
(Maybe remote file systems? Any other protocol that must pass data
"transparently"?)

But in most cases it does not harm either. If you want to aim your
arrows, target them at anyone who wants to spend time fussing over this.

> The last time we met to integrate
> the two halves, he crypticly said there was a deep reason for
> the above that I, without 4 years of computer science, was
> never going to understand.

After 4 y CS, 14 y working in the field, a few years as an independent,
and 4 y teaching, I can't think of any reason why this would really
matter. If performance were the issue, any ASCII encoding would be stupid.

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\09\30@130500 by Mark Rages

On 9/30/07, Peter Todd <pete@petertodd.ca> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Practically all commonly used protocols on the internet, at least the
> ones that use TCP as a transport, are line oriented, text based and
> human readable and writable. Is there an actual RFC (or similar
> standards document) describing this practice as a "best practice"?

...

>
> Anyway, if I'm not going to convince him, at least I might convince our
> bosses why I think he is completely insane. RFC's just might help build
> my case. I've got a nice list of every internet protocol I could find
> using this design, but a meta-document would be even nicer.
>

The Art of Unix Programming (google for it) probably has a few words
to say about this.

Anyone who has used Unix for a while will appreciate the simple
usefulness of human-readable, line oriented formats.   In fact, if you
are on a Unix system, you will find a whole set of tools designed to
work on your format.

Regards,
Mark
markrages@gmail
--
Mark Rages, Engineer
Midwest Telecine LLC
markrages@midwesttelecine.com

2007\09\30@171934 by William "Chops" Westfield


On Sep 30, 2007, at 8:35 AM, Peter Todd wrote:

> Practically all commonly used protocols on the internet, at least the
> ones that use TCP as a transport, are line oriented, text based and
> human readable and writable. Is there an actual RFC (or similar
> standards document) describing this practice as a "best practice"?

Not that I know of.  I'm not sure it's even true.  "Line oriented"
has caused problems even on PICList, for example, and differences in
the value of "newline" have caused issues for as long as I can remember.
(ok, everything  file-transfer-like (FTP, MAIL, HTTP) is a copy of the
original NCP FTP protocol with text commands and responses.  But as a
counter example, Telnet negotiations are all binary...)

His scheme is somewhat more error-correcting in that each numeric value
is labeled with its type, but that doesn't seem to be a good reason for
omitting "field delimiters" like space and "record delimiters" like
newline.

On the third hand, without clear reasons, I would say he should change
HIS code because it's easier to change output code than parsing code.
Especially if the project is already 4 weeks late.

BillW

2007\09\30@175950 by Gerhard Fiedler

Peter Todd wrote:

> Practically all commonly used protocols on the internet, at least the
> ones that use TCP as a transport, are line oriented, text based and
> human readable and writable. Is there an actual RFC (or similar
> standards document) describing this practice as a "best practice"?

As others said, there probably isn't anything like this. Also consider that
there is a plethora of protocols that are not ASCII and therefore by
definition can't be line oriented (CAN for example, or I'm sure pretty much
all the streaming protocols).

What is "human readable" depends a lot on your tools. A CAN bus protocol
that consists of series of integers (raw, that is, binary encoded) in the
CAN bus packet payload can very well be considered "human readable" if you
have a CAN bus monitor that can display such packets. Of course, you can't
feed that bitstream into a normal terminal program.


> beat X Y V\n
> beat X Y V\n
> beat X Y V\n

> SetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueV

> The last time we met to integrate the two halves, [...]

Such a protocol is typically something that would (should? :) be defined
before either of the two halves is implemented. Normally I wouldn't go ahead
with such an implementation until I had sent out a message (to the bosses and
the involved developers) with the protocol definition and a phrase like "if
I don't hear from you by <date>, this is what will be used by all parties
going forward" -- and then I wait for either feedback or the timeout :)

> [...] he crypticly said there was a deep reason for the above that I,
> without 4 years of computer science, was never going to understand.

I'd say whether you understand the reason or not is not his business :) --
and he should be able to provide it. OTOH, it seems to me that parsing your
protocol and parsing his protocol is not that much different. So whether
that is worth fighting over depends on other criteria, not technical ones.

The one thing that his "protocol" doesn't provide is a packet end marker.
To me, that seems to be the only technical difference (and it doesn't
matter whether that end marker is an end-of-line or 'End' or whatever).
Whether that is a serious flaw depends on criteria I mostly don't know
(among them the format of the numbers, especially V).

> Anyway, if I'm not going to convince him, at least I might convince our
> bosses why I think he is completely insane.

He's probably not insane, just lazy. And dishonest.

Gerhard


2007\10\01@061327 by Peter Todd
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sun, Sep 30, 2007 at 06:57:12PM -0300, Gerhard Fiedler wrote:
{Quote hidden}

Of course. I mean, I am restricting my examples to protocols intended to
be used where bandwidth and processor power are plentiful. My server bit
talks to hardware over a serial line, and I'm using a binary protocol
for that, for instance.

Anything UDP seems to be binary, for instance, just to be able to keep
packet sizes low. DNS even has a complex scheme that compresses domain
names by sharing common suffixes. All binary.

> > The last time we met to integrate the two halves, [...]
>
> Such a protocol is typically something that would (should? :) be defined
> before any of the two halves is implemented. Normally I wouldn't go ahead
> with such an implementation until I sent out a message (to the bosses and
> the involved developers) with the protocol definition and a phrase like "if
> I don't hear from you until <date>, that's what will be used by all parties
> going forward" -- and then I wait for either a feedback or the timeout :)

Yeah... well, we did have that discussion, and I did almost that, and he
timed out... :)

Good idea on the "if I don't hear from you" bit. I'll do that next time.

> > [...] he crypticly said there was a deep reason for the above that I,
> > without 4 years of computer science, was never going to understand.
>
> I'd say whether you understand the reason or not is not his business :) --
> and he should be able to provide it. OTOH, it seems to me that parsing your
> protocol and parsing his protocol is not that much different. So whether
> that is worth fighting over depends on other criteria, not technical ones.

Agreed, I've definitely spent more time arguing it than it'd take to
just implement it. But our bosses seem to think he's the one in the
wrong, so I'm not giving in easily only to have to deal with all this
again next time. This particular project isn't my top priority on that
contract anyway, so it can wait.

> The one thing that his "protocol" doesn't provide is a packet end marker.
> To me, that seems to be the only technical difference (and it doesn't
> matter whether that end marker is an end-of-line or 'End' or whatever).
> Whether that is a serious flaw depends on criteria I mostly don't know
> (among them the format of the numbers, especially V).

Admittedly V is a 1 or a 0...

I like my format because the whole thing is a simple "break input into
lines" then scanf() combo. In perl or python you get similar quick
mechanisms. 5 minutes of work vs. an hour of custom parsing code looking
for 'End'.

> > Anyway, if I'm not going to convince him, at least I might convince our
> > bosses why I think he is completely insane.
>
> He's probably not insane, just lazy. And dishonest.

I kinda hope so, although, it's making me very suspicious of his
intentions, and perhaps his timesheets.


But this is a public forum, so I'll remind people I do contract work with
tonnes of clients. :)

- --
http://petertodd.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHAMRV3bMhDbI9xWQRAsdpAJ9rzveNCjGsBBVN1K4wVVcFsHz28gCcDx4n
FIUxNNQIzrte8TrSoPTQNyg=
=P8KO
-----END PGP SIGNATURE-----

2007\10\01@061330 by Peter Todd

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sun, Sep 30, 2007 at 12:04:58PM -0500, Mark Rages wrote:
> > Anyway, if I'm not going to convince him, at least I might convince our
> > bosses why I think he is completely insane. RFC's just might help build
> > my case. I've got a nice list of every internet protocol I could find
> > using this design, but a meta-document would be even nicer.
> >
>
> The Art of Unix Programming (google for it) probably has a few words
> to say about this.

Ahh, good idea. I'm a big fan of that book, first read it years ago, but
it's been long enough I didn't even think of it.

> Anyone who has used Unix for a while will appreciate the simple
> usefulness of human-readable, line oriented formats.   In fact, if you
> are on a Unix system, you will find a whole set of tools designed to
> work on your format.

Exactly. Like when I got a new laptop and was able to make sure I had
all my old programs available, so I could set them up properly off-line,
with a simple:

cat /var/lib/something/packages | cut -d " " -f 1 > packages
scp ... ...
cat packages-installed-on-old-laptop | xargs -n 1 apt-get install

(from memory, obviously)

5 minutes of work to figure it out vs. 30 minutes trying to figure out
which of the dozen "system configuration mover" utilities to even try to
use. :)

- --
http://petertodd.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHAL5y3bMhDbI9xWQRAp1bAJ9N8HMtYPxKFr8rM7JB+yURECEI+gCfTaUy
h2/Vw+cY2ydMuhcEb4ArQ3Q=
=giPu
-----END PGP SIGNATURE-----

2007\10\01@061336 by Peter Todd

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sun, Sep 30, 2007 at 06:03:33PM +0100, wouter van ooijen wrote:
{Quote hidden}

Actually, that touches on an issue that I'd be very interested to see a
good discussion of. There are a number of ways to pass data: HTTP, for
instance, uses Content-Length headers, or simply defines the end of the
data to be when the connection is closed (oversimplifying here...).
SMTP uses a single .\n by itself as the end of the message; if the
message has a .\n in it, it's sent as ..\n instead (although there is an
extension to simply pass stuff 8-bit clean, not quite sure of the details
there). In researching this I did find a site that talked about those
various methods, but not in the context of a "universally" agreed upon
best practices document.
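That SMTP trick ("dot-stuffing") is easy to sketch; a minimal illustration in Python, not production mail code:

```python
# Minimal sketch of SMTP-style dot-stuffing: a lone "." line ends the
# message, so any data line starting with "." gets an extra "." prepended.

def dot_stuff(lines):
    """Escape message lines for the wire and append the terminator."""
    return ["." + l if l.startswith(".") else l for l in lines] + ["."]

def dot_unstuff(wire_lines):
    """Recover the original message; stops at the bare '.' terminator."""
    out = []
    for l in wire_lines:
        if l == ".":
            break
        out.append(l[1:] if l.startswith(".") else l)
    return out

msg = ["hello", ".hidden dot line", "bye"]
assert dot_unstuff(dot_stuff(msg)) == msg
```

The nice property is that the escape only costs anything on lines that actually start with a dot; the cost of a content-length scheme is paid up front instead, since the sender must know the size before the first byte goes out.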

Of course, in addition to that you get the design pattern of how so many
programs store XML now. The protocol/file format is plain old XML, which
is then wrapped in a compression format. OpenOffice's ODF, for instance,
is XML inside a ZIP archive (so images and the like can be in their own
files within the archive). Other formats simply put everything in a
gzip/bzip2 box. Compression technology is so good now that many people
report such approaches yield smaller files than a binary format. The
main problem with that approach is that file identification gets messed
up; the file command (unix) reports the wrapper rather than what's
inside, but there may be better approaches there too...

> But in most cases it does not harm either. If you want to aim your
> arrows target them to anyone who wants to spent time fussing over this.

My thoughts exactly...

> > The last time we met to integrate
> > the two halves, he crypticly said there was a deep reason for
> > the above that I, without 4 years of computer science, was
> > never going to understand.
>
> After 4 y CS, 14 y working in the field, a few years as an independent,
> and 4 y teaching I can't think of any reason why this would realy
> matter. If performance is the issue any ASCII endoing would be stupid.

Performance is definitely not it; messages will come down the line no
faster than a user can click a button. If anything, his approach is
going to be much trickier to parse.

- --
http://petertodd.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFG/8+E3bMhDbI9xWQRAnwdAKCFePoKPnl7TS+BWSFuQuoA9UGjfwCgpth2
jbUo4Xu1xwnasD1CegUaaJo=
=67TZ
-----END PGP SIGNATURE-----

2007\10\01@061342 by Peter Todd

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sun, Sep 30, 2007 at 02:19:31PM -0700, William Chops Westfield wrote:
{Quote hidden}

Quite true. As I mentioned to another poster, there is the issue of the
right way to pass files as well, content-length vs. escape characters.
Which, it also occurs to me, makes a big difference in efficiency
sometimes, if you are using the ultra-optimized sendfile syscall on
Linux and the like.

Though even Windows telnet sends \n as the newline by default, if I'm
not mistaken. My code as implemented wouldn't care anyway.

> (ok, everything  file-transfer-like (FTP, MAIL, HTTP) is a copy of the
> original NCP FTP protocol with text commands and responses.  But as a
> counter example, Telnet negotiations are all binary...)
>
> His scheme is somewhat more error-correcting in that each numeric value
> is labeled with its type, but that doesn't seem to be a good reason for
> omitting "field delimiters" like space and "record delimiters" like
> newline.

Yeah, and actually my code will include what I think is a much better
error checking system. Namely, the client acts like a keyboard in that
it sends "key presses", or values being changed, but the indicators
showing the user what the state is are changed by the server, like the
caps lock status. If the user issues a command and doesn't see a
response, I'll assume they'll try again. It makes it possible to have
multiple users see the same state for free as well.

Kinda politically charged though, as it means his client stays very
simple and dumb, while all the logic is in the server. He wanted to push
the application code all the way to the client...

> On the third hand, without clear reasons, I would say he should change
> HIS code because it's easier to change output code than parsing code.
> Especially if the project is already 4 weeks late.

Yup, especially when it'd take him (or even me) all of 15 minutes to do.
Something I'm going to be pointing out to our increasingly annoyed
bosses.

- --
http://petertodd.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHAMcq3bMhDbI9xWQRAivYAKCGaS+DYQtAKsyoWXNwLwaBUvJ11ACcC8v6
kIrBBytlL7IjL8RU3HEfSuk=
=Nskn
-----END PGP SIGNATURE-----

2007\10\01@061931 by Alan B. Pearce


>The one thing that his "protocol" doesn't provide is a packet end
>marker. To me, that seems to be the only technical difference (and
>it doesn't matter whether that end marker is an end-of-line or 'End'
>or whatever). Whether that is a serious flaw depends on criteria I
>mostly don't know (among them the format of the numbers, especially V).

To me not having an end of packet marker is dangerous, as it limits the
ability of the system to re-sync the data stream on an error or missed
block.

2007\10\01@064314 by wouter van ooijen

> Actually, that touches on an issue that I'd be very
> interested to see a good discussion of. There are a number of
> ways to pass data, http for instance uses content-length
> headers or simply defining the end of the data to be when the
> connection is closed. (over simplifying here...) SMTP uses
> the a single .\n by itself as the end of the message, if the
> message has a .\n in it, have ..\n instead. (although there
> is an extension to simply pass stuff 8-bit clean, not quite
> sure the details
> there) In researching this I did find a site that talked about those
> various methods, but not in the context of a "universally"
> agreed upon
> best practices document.

IMHO this is an indication that there is not a lot of difference in
'bestness' between the various methods used.

But note that not all transport level protocols are good at defining the
end of a connection. IIRC the now almost forgotten OSI TP4 had a
reliable (two or three way handshake) way to close a connection; TCP
does not.

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\10\01@092139 by Gerhard Fiedler

Peter Todd wrote:

>>> beat X Y V\n
>>> beat X Y V\n
>>> beat X Y V\n
>>
>>> SetBeatXStepYValueVSetBeatXStepYValueVSetBeatXStepYValueV

> I like my format because the whole thing is a simple "break input into
> lines" then scanf() combo. In perl or python you get similar quick
> mechanisms. 5 minutes of work vs 1 hour of custom parsing code looking
> for 'End'

Now come on... if you can write a parser for your protocol in 5 min, you
should be able to write one for his in 5 min too (at least for the part
that you showed us). In both cases the critical part is what to do in
error situations (unexpected characters), and scanf doesn't help you much
with that anyway. So if you can use scanf, you can use a very simple
scheme for his format also.
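For comparison, a sketch of parsing the undelimited format (regex-based; the plain-decimal field format is an assumption, since the thread only says V is 0 or 1):

```python
import re

# Sketch: parsing the concatenated 'SetBeat<x>Step<y>Value<v>' stream.
PACKET = re.compile(r"SetBeat(\d+)Step(\d+)Value(\d+)")

def parse_stream(buf):
    """Return (commands, leftover): a trailing partial packet waits for more bytes."""
    commands, last = [], 0
    for m in PACKET.finditer(buf):
        commands.append(tuple(int(g) for g in m.groups()))
        last = m.end()
    return commands, buf[last:]

cmds, rest = parse_stream("SetBeat1Step2Value1SetBeat3Step4Value0SetBe")
# cmds == [(1, 2, 1), (3, 4, 0)]; rest == "SetBe" (incomplete packet kept)
```

One catch this sketch exposes: a packet sitting at the very end of the buffer is ambiguous, since without a delimiter "Value1" could be the start of "Value10". That is exactly the missing-end-marker issue raised elsewhere in the thread; it only happens to be safe here because V is a single digit.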

Gerhard

2007\10\01@093356 by Gerhard Fiedler

Alan B. Pearce wrote:

>> The one thing that his "protocol" doesn't provide is a packet end
>> marker. To me, that seems to be the only technical difference (and it
>> doesn't matter whether that end marker is an end-of-line or 'End' or
>> whatever). Whether that is a serious flaw depends on criteria I mostly
>> don't know (among them the format of the numbers, especially V).
>
> To me not having an end of packet marker is dangerous, as it limits the
> ability of the system to re-sync the data stream on an error or missed
> block.

Well, there seems to be only one type of packet, and it has a start marker
('Set'), so the end marker may not be necessary. Resyncing is probably
not any easier on

 SetBeatXStepYValueVEndSetBeatXStepYValueVEndSet...

than it is on

 SetBeatXStepYValueVSetBeatXStepYValueVSet...

Gerhard

2007\10\01@094228 by M. Adam Davis

The protocol you suggest is distinctly human readable ASCII.  The
protocol he is suggesting is a poor compromise between ASCII and
binary, and has the following difficulties to boot:

* More difficult to parse as a human
* More difficult to parse as a machine
* More susceptible to line noise
* Much more bandwidth required (don't use it over cellular!)

There is a difference between stream encoding and discrete encoding,
and your particular application should suggest one approach over the
other.  His is more stream based, yours is discrete.

I'd be asking him: "Why are you trying to marry, and poorly so, binary
stream communications with ascii record communications?  What is the
advantage?  Despite my obvious lack of computer science, I need you to
explain this in a way that I can understand and explain it to others.
If you can't teach it, then you obviously don't understand it well
enough to safely implement it, and we prefer you use a method we can
prove is reliable."  Of course, you'll have to say that in a more
tactful manner.  I suspect he made a choice, and wants to stick with
it, though there's no good reason to do so.  Sounds like a bad worker
though - I bet you've had to hold his nose to the grindstone more
often than you'd like.

In terms of time/value, it is easier to change the output than the
input.  However, if he is a consultant or separate entity then it may
be cheaper/faster for you to change the input.

Lastly, sounds like a turf war.  "Best Practices" are all well and
good, but who is the real customer, and how is this going to affect
them?  Unless otherwise specified, fix it in a way that meets
requirements, and make a mental note about how this particular person
has to be dealt with in the future.  It's annoying, but it smacks of
"not invented here" and all the extra email, discussions, and lateness
aren't going to help your position even though you may be able to show
that it's not your fault.  If you had the power to fix it silently,
you may still be considered part of the problem.

Plus it's fun - "Yeah, he mis-implemented the protocol that I emailed
everyone at the beginning of the project, but I coded the parsing
scheme to accept both his bastard protocol and the correct protocol.
Hopefully he'll fall in line, but if not at least we can move
forward."  Make sure to CC him on any such communications.  Doesn't
work in environments where the spec is document controlled and signed
off, but if that were the case then you wouldn't be having this
problem...

-Adam

On 9/30/07, Peter Todd <pete@petertodd.ca> wrote:
{Quote hidden}


2007\10\01@113633 by Chris McSweeny

Now, I'm lazy -- but he just sounds incompetent, whilst you're simply
being pessimistic about the timescale required. I reckon I could change
any code in pretty much any language that produces his preferred format
into something which produces your preferred format in 5 minutes. Heck,
the keypress sequence in VS goes something like:
^H
SetBeat%dStep%dValue%d
<TAB>
beat %d %d %d\n
<alt>F
<alt>R

On 10/1/07, Peter Todd <pete@petertodd.ca> wrote:
>
> > On the third hand, without clear reasons, I would say he should change
> > HIS code because it's easier to change output code than parsing code.
> > Especially if the project is already 4 weeks late.
>
> Yup, especially when it'd take him (or even me) all of 15 minutes to do.
> Something I'm going to be pointing out to our increasingly annoyed
> bosses.
>

2007\10\02@022551 by Nate Duehr


On Oct 1, 2007, at 4:19 AM, Alan B. Pearce wrote:

>
>> The one thing that his "protocol" doesn't provide is a packet end
>> marker. To me, that seems to be the only technical difference (and
>> it doesn't matter whether that end marker is an end-of-line or 'End'
>> or whatever). Whether that is a serious flaw depends on criteria I
>> mostly don't know (among them the format of the numbers,  
>> especially V).
>
> To me not having an end of packet marker is dangerous, as it limits  
> the
> ability of the system to re-sync the data stream on an error or missed
> block.

As an "engineering support" guy for most of my career, I can "ditto"
this sentiment.

Virtually EVERY screw-up in code by software engineers (especially
those writing code to talk on busses/backplanes/networks they didn't
design) that I've worked with over the years has almost always come
back to one of the most basic premises that all engineers writing
code to talk over any kind of bus, network, or ANYTHING should know...

"What goes in isn't always what comes out, in the real world."

Any (good) comm protocol, even on INTERNAL busses, should always have
start/finish indicators and a numbering scheme, so the receiver can
ask for a re-send.

Been there, done this, too many times...

"Lost" packets of information on busses, backplanes, and other crap  
make for a really long workday for the guys trying to PROVE to the  
engineer that wrote it (sometimes YEARS ago) that there really is a  
problem with the hardware.

There are almost never low-level tools to look at the data streams, and
when there are, they usually don't give much information, since they're
usually built on the same "receive" code used by the receiver that
isn't getting what it wants.  (Hint #2: never let the same guy who
writes the comm protocol also write the diagnostic tools.)

Anyway... if you're not the "hardware guy"/hardware engineer and  
you're writing code to talk on someone else's hardware design, hell  
even on your own... overbuild it please.  :-)

Of course, if you do, you'll put me out of a job, I guess -- but  
there's always new/different bugs and problems to hunt down.    :-)

--
Nate Duehr
nate@natetech.com



2007\10\02@100628 by David VanHorn

:)

Fun with protocols.

Yup, it's easy to make interesting mistakes in protocols.

Picture a packet structure like this (<CHAR> is a single ASCII control byte):

<STX>Biglongpacketwithfixedlengthfieldsandnoseparators<ETX><LRC>

That's more or less the Visa "second generation" packet structure.
The 1G structure had field separators, and was IMHO more robust, but
we're already talking about people who use 7 bits instead of 8 because
of the time saved in a 100 byte packet.

So I had this terminal from a particular bank, and for some reason
their terminals were losing their minds (and all the stored
transactions.. $$) every so often, for no apparent reason.  Already,
many of the programmers who wrote these routines had been all through
it, but it wasn't fixed yet.  The bank was getting mad because when
this happens (rare) the merchant loses all the transactions in the
terminal, and that is a big PITA to straighten out.

In going through the memory of the terminal, pulling out the stack and
figuring out what it was doing when it died, I could see that it was
in the middle of parsing a response packet.  Looking at our parsing
routines, I could see a compression routine that assumed the data
coming into it would always be numeric, and it would be BAD if the data
were ASCII.  This routine picks up data from the date field, which
should always be numeric...  Now the packet is error checked with LRC,
and we were confident that the LRC routine was correct, so I contacted
Visa and asked if there was ANY way that they could send me alphas in
the date field.  Big resounding NO.

So, I wrote up a simulator in "mimic", a c-like language that we
developed for quickly testing protocols.  I had it simulate every
error I could think of, including line noise and let it rip over a
weekend, using this blown terminal's merchant ID and login
information.  7000 transactions later, I had trapped five instances
where my parser caught alphas in the date field!  Hmm!

So now I look at that data, and every one happened when I had NAKed the
response packet -- but only a few of the re-sent packets had a problem.

So I hacked the mimic code to do only this one error, NAKing the first
response packet the host sends, no matter what.  VERY interesting..
Exactly 1/5 of the responses had alphas in the date field.  I mean
20.00% and not 20.01%.  That's interesting because there were, at that
time, five Tandem NonStop machines doing the transactions at Visa.  I
smelled a rat.

I called up Visa again, and fed them this info, and at their request,
left my machine running, so they could find it.. Sure enough, one of
their machines had a bug!  If the terminal NAKed the packet, this
machine would re-send the packet, but with the first 10 bytes
repeated.  What this did was throw off our parsing: after the 10th
byte, everything was totally out of position.

Now, back to the terminal.. I did mention that the error checking
routine was trusted?  Well, looking at it again, it WAS working fine.
But it had a little bit of a problem, and yet it wasn't really broken.

For some reason, it had been written to start at the back end of the
packet, and work forward till it found the <STX> char.  When the Visa
host got a <NAK>, it would send the packet like this:
<STX> nine bytes of packet <STX> real packet <ETX><LRC>

So the error checker started at the ETX, picked up the LRC, and ran up
the packet to the second <STX>, then said the packet was ok, but the
parser started at the front of the buffer, tossing junk till it found
the first <STX> and then since the packet was "ok", it parsed the rest
of the packet, and the compression routine, when handed alphas from
what should have been the date field, promptly threw up all over the
memory.
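The mismatch is easy to reproduce in a few lines. A sketch (LRC here is the common XOR-of-bytes convention over everything between <STX> and <ETX> inclusive of ETX; the real Visa framing details are assumptions):

```python
# Sketch: why a checker that scans backward to the *last* <STX> can pass
# a packet that a parser reading forward from the *first* <STX> mangles.
STX, ETX = b"\x02", b"\x03"

def lrc(data):
    out = 0
    for b in data:
        out ^= b
    return bytes([out])

def make_packet(body):
    return STX + body + ETX + lrc(body + ETX)

def check_from_back(pkt):
    """Backward checker: find the LAST <STX>, verify LRC over what follows."""
    start = pkt.rfind(STX)
    return lrc(pkt[start + 1:-1]) == pkt[-1:]

good = make_packet(b"0710021234")      # date-ish numeric body
resent = good[:10] + good              # host bug: first 10 bytes repeated
assert check_from_back(resent)         # backward checker says "fine"...
# ...but a parser starting at the FIRST <STX> treats the duplicated
# prefix as the body, and fixed-position fields land on garbage.
first = resent.find(STX)
assert resent[first + 1:first + 11] != b"0710021234"
```

The checker and the parser each do something defensible in isolation; the fiasco comes from them walking the same buffer in opposite directions.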

So, how many FUBARS can you count in that little fiasco?  :)

The predictable weak response from the programmers was that Visa
should never send bad packets, and I agree, but we all know that LRC
is weak, and malformed packets WILL pass through once in a while.
Checking from the back of the packet was not technically a mistake,
but it really should have worked in the same direction as the parser,
and on the same span of the buffer.  Then there's the compression
routine that saved a byte or so at the expense of a large vulnerability.
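The mismatch between the back-to-front checker and the front-to-back
parser can be modeled in a few lines. This is a hypothetical sketch in
Python; the XOR-based LRC convention and the sample payload are
assumptions for illustration, not the actual Visa format:

```python
STX, ETX = 0x02, 0x03

def lrc(data: bytes) -> int:
    # Longitudinal redundancy check, assumed here to be the XOR of
    # everything after <STX> up to and including <ETX>
    out = 0
    for b in data:
        out ^= b
    return out

def frame(payload: bytes) -> bytes:
    body = payload + bytes([ETX])
    return bytes([STX]) + body + bytes([lrc(body)])

def check_backward(pkt: bytes) -> bool:
    # Mimics the terminal's checker: start at the back, pick up the LRC,
    # and run up the packet to the *last* <STX>
    last_stx = pkt.rfind(bytes([STX]))
    body = pkt[last_stx + 1:-1]          # up to and including <ETX>
    return lrc(body) == pkt[-1]

def parse_forward(pkt: bytes) -> bytes:
    # Mimics the parser: toss junk till the *first* <STX>, then take
    # everything up to <ETX> as the packet fields
    first_stx = pkt.find(bytes([STX]))
    etx = pkt.find(bytes([ETX]), first_stx)
    return pkt[first_stx + 1:etx]

good = frame(b"071003APPROVED")   # invented date-plus-status payload
# The buggy host's resend: <STX> + nine bytes of packet, then the real packet
bad = good[:10] + good

assert check_backward(bad)                        # checker says "packet OK"
assert parse_forward(bad) != parse_forward(good)  # parser sees shifted fields
```

Running it shows that the duplicated 10-byte prefix sails through the
backward check while shifting every field the forward parser extracts,
which is exactly the stack of dependencies described above.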

In addition to those lessons learned, I also got into the habit of
using a "black box" in my code. In some cases, just a couple of bytes
of information, in some cases a circular buffer, but the idea is that
every significant task makes an entry, so that after a crash, I can
look into the black box, and see what the last thing was that we
started doing.. ISRs have a separate byte, loaded with an ISR ID when
the ISR starts and cleared to 0 on exit; mainline tasks have another
byte, or a buffer holding a small number of task IDs.
Believe me, it's a lot easier than trying to tease apart the contents
of a 32k SRAM that's been partially corrupted.
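The black-box scheme described above might look something like this in
miniature. Python stands in for the firmware here, and the task and ISR
IDs and the ring depth are invented:

```python
from collections import deque

class BlackBox:
    """Crash-forensics trace: a fixed-size ring of recent task IDs plus
    a single ISR byte, nonzero only while an ISR is running."""

    def __init__(self, depth: int = 8):
        self.tasks = deque(maxlen=depth)  # circular buffer of task IDs
        self.isr = 0                      # 0 means "not in an ISR"

    def task(self, task_id: int):
        # Every significant mainline task logs itself on entry
        self.tasks.append(task_id)

    def isr_enter(self, isr_id: int):
        self.isr = isr_id

    def isr_exit(self):
        self.isr = 0

    def dump(self):
        # What you'd read back out of SRAM after a crash
        return list(self.tasks), self.isr

bb = BlackBox(depth=4)
for tid in (1, 2, 3, 4, 5):    # five tasks ran; the ring keeps the last four
    bb.task(tid)
bb.isr_enter(0x20)             # crash happened inside ISR 0x20
print(bb.dump())               # ([2, 3, 4, 5], 32)
```

After a crash, the dump tells you the last few things the mainline code
started and whether an interrupt was in flight, with no need to pick
through the rest of memory.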

2007\10\02@112223 by Harold Hallikainen

Excellent Story!

THANKS!

Harold


{Quote hidden}


2007\10\02@133055 by wouter van ooijen

Sounds like the kind of fun work I used to do!

> So, how many FUBARS can you count in that little fiasco?  :)

One lesson to learn: after hunting the bug for some time, you (no
particular 'you' implied) should start looking for the bug*s*.

> The predictable weak response from the programmers was that
> Visa should never send bad packets,

did you have your whip and other punishing devices ready? *never* assume
the external world will behave, certainly not over a communication line!

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\10\03@084324 by M. Adam Davis

Reminds me of the kernel programmer's attitude:

The user is _NEVER_ to be trusted!

When debugging black boxes for the automotive industry one of the most
fruitful things to check is whether all the I/O are bounded and error
checked.  I have yet to see a program (even my own, after a few months
away from it) that fully enforces all the I/O limits it expects to be
followed.
"But the A/D hardware would never return a value above X!"
"So tell me, what's the precision of the resistors in the voltage
divider, and the voltage reference?"
"... I'll get back to you..."
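That exchange can be put in numbers. This is a hypothetical sketch in
Python, with an invented 12 V sensor, a 10k/5k divider, a 10-bit
converter, and a 5 V reference; none of these values come from the
thread:

```python
# All component values, the 10-bit ADC, and the 5 V reference are
# invented for illustration.
VREF, BITS = 5.0, 10
FULL = (1 << BITS) - 1                     # 1023 counts full scale

def max_count(v_max, r_top, r_bot, tol):
    """Largest ADC count a v_max-volt signal can produce through an
    r_top/r_bot divider whose resistors are off by up to tol."""
    # Worst case for a high reading: r_top runs low, r_bot runs high
    gain = r_bot * (1 + tol) / (r_top * (1 - tol) + r_bot * (1 + tol))
    return int(v_max * gain / VREF * FULL)

nominal = max_count(12.0, 10_000, 5_000, 0.00)   # 818 with perfect resistors
worst   = max_count(12.0, 10_000, 5_000, 0.01)   # 829 with 1% resistors

def read_sensor(raw):
    # Bound the input against the worst-case limit, not the nominal one
    if not 0 <= raw <= worst:
        raise ValueError(f"implausible ADC count {raw}")
    return raw * VREF / FULL                 # voltage at the ADC pin

# A count of 825 "can never happen" per the nominal math, yet it's a
# perfectly legitimate reading once resistor tolerance is included
assert nominal < 825 <= worst
```

The point is only that the plausible input window is wider than the
nominal math suggests, so the bounds check has to come from worst-case
component values rather than the datasheet ideal.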

-Adam

On 10/2/07, wouter van ooijen <wouterspamspam_OUTvoti.nl> wrote:
{Quote hidden}


2007\10\03@090946 by Russell McMahon

> Reminds me of the kernel programmer's attitude:
>
> The user is _NEVER_ to be trusted!

Anything that can happen will happen.
Anything that can't happen will happen as well.

Essentially, Murphy's law in one of its many variants.

In any program that you want to be bullet proof, it makes no sense to
assume anything about the possible range, nature, timing, amount,
values, or any other characteristics of the data that can be served up
to you.  Even if you do not make any such assumptions your program
will still not be bullet proof, but you can die with honour :-).


       Russell

2007\10\03@112807 by William \Chops\ Westfield


>> *never* assume the external world will behave, certainly
>> not over a communication line!

One of the things I did in the early days of the Internet was
The First Implementation of IP Options in a Commercial Router.
We took a box to Interop (which was a much smaller, wilder,
trade show back in those days) and proceeded to send out some
broadcast PING packets, with options, just to see how other
vendors on the show net would behave...
They did NOT behave very well at all.  Boxes were crashing
right and left.  If the unexpected options weren't enough,
some of the RESPONSES were totally broken.  Alas, one of the
boxes crashing was ours, since we weren't careful about checking
the packets we got back for CORRECTNESS before processing the
options...

BillW

2007\10\03@115515 by Alan B. Pearce

>Alas, one of the boxes crashing was ours, since we weren't
>careful about checking the packets we got back for
>CORRECTNESS before processing the options...

Whoops - time for the red face mask ... ;)))

2007\10\03@143129 by David VanHorn

> > The predictable weak response from the programmers was that
> > Visa should never send bad packets,
>
> did you have your whip and other punishing devices ready? *never* assume
> the external world will behave, certainly not over a communication line!

Yeah.. Sigh..  The reverse order of the LRC check was somewhat embarrassing.
Fixing any of those little things would have fixed the problem, but I
thought it was interesting just how tall a stack of dependencies there
were before we hit a problem.

In the end, just a small firmware change on our end, and a small fix
on Visa's end.

People DO get all bent out of shape when their money goes "poof" though.. :)

2007\10\03@195609 by Peter Todd

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, Oct 01, 2007 at 09:42:26AM -0400, M. Adam Davis wrote:
{Quote hidden}

I haven't worked directly with him on a project before, but he does have
a reputation for being difficult and argumentative. Heck, I'm sure I've
gotten a bit of such a reputation myself, but I'd like to think I more
readily change my mind!

{Quote hidden}

Well as it turned out my worry about convincing arguments may have been
a bit misplaced... I just had a meeting with our bosses, they asked how
my last meeting with that engineer went, I began to tell them about the
disagreement and before I even really told the story at all they simply
told me to finish up my side of the project and demo it by hand with
telnet. As for the client... well, either he follows my spec, or they
find someone else. They even decided to rewrite my contract
retroactively for double the hours.

Better get back to work and earn that trust!

- --
http://petertodd.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHBCqq3bMhDbI9xWQRAly/AJ4mwX4bmpLPj9aiu1s1MDDF0NgYLQCfW/gc
9Zd3QmNqDeAmjUpbC2Ru+sI=
=F4m+
-----END PGP SIGNATURE-----
