Searching \ for '[PIC]: Small Table Driven CRC16 Routine' in subject line. ()
Make payments with PayPal - it's fast, free and secure! Help us get a faster server
FAQ page: www.piclist.com/techref/method/errors.htm?key=crc
Search entire site for: 'Small Table Driven CRC16 Routine'.

Exact match. Not showing close matches.
PICList Thread
'[PIC]: Small Table Driven CRC16 Routine'
2002\05\12@224654 by Ashley Roll

flavicon
face
Hi Everyone,

I mentioned about a week ago that it was possible to implement a CRC routine
using smaller tables then the normal 256 element ones.

I have implemented this in C and PIC assembler. The ASM code is designed to
work on a PIC12C508 so it will work for any PIC. I've simulated it and it
does work. 85 instruction cycles per message byte, but it can probably be
improved a little.

http://www.digitalnemesis.com/ash/projects/EmbeddedCRC16/default.htm

Enjoy!

Cheers,
Ash.

---
Ashley Roll
Digital Nemesis Pty Ltd
http://www.digitalnemesis.com
Mobile: +61 (0)417 705 718

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2002\05\13@090533 by Scott Dattalo

face
flavicon
face
On Mon, 13 May 2002, Ashley Roll wrote:

> Hi Everyone,
>
> I mentioned about a week ago that it was possible to implement a CRC routine
> using smaller tables then the normal 256 element ones.
>
> I have implemented this in C and PIC assembler. The ASM code is designed to
> work on a PIC12C508 so it will work for any PIC. I've simulated it and it
> does work. 85 instruction cycles per message byte, but it can probably be
> improved a little.
>
> http://www.digitalnemesis.com/ash/projects/EmbeddedCRC16/default.htm


This is the CRC algorithm we used 15 years ago at I used to work for.
Nostalgia...

The PIC routine has numerous opportunities for optimization. While I don't
have time to go through each one, I can list a couple

1) Index = CRC16_High>>4;
  Index ^= CRC16_MessageByte>>4;

       SWAPF   CRC16_High, W
       ANDLW   00FH
       MOVWF   Index                   ; Index = CRC16_High >> 4
       SWAPF   CRC16_MessageByte, W
       ANDLW   00FH
       XORWF   Index, F                ; Index = Index ^
                                       ;(CRC16_MessageByte >> 4)

This can be written as:

       movf    CRC16_High,w
       xorwf   CRC16_MessageByte,w
       andlw   0x0f
       movwf   Index,f                 ; don't swap yet...


Then when index is needed:

       swapf   Index,w
       call    CRC16_Lookup

and
       swapf   Index,w
       iorlw   0x10
       call    CRC16_Lookup


2) Shift left 4.



       ; Shift the CRC Register left 4 bits
       SWAPF   CRC16_High, W
       ANDLW   0F0H
       MOVWF   Temp
       SWAPF   CRC16_Low, W
       ANDLW   00FH
       IORWF   Temp, W
       MOVWF   CRC16_High              ; CRC16_High = (CRC16_High << 4) |
(CRC16_Low >> 4)
       SWAPF   CRC16_Low, W
       ANDLW   0F0H
       MOVWF   CRC16_Low               ; CRC16_Low = CRC16_Low << 4

This can be written more efficiently as:

       movlw   0x0f
       andwf   hi,f
       swapf   hi,f

       swapf   lo,f
       andwf   lo,w
       iorwf   hi,f
       xorwf   lo,f


3) If space wasn't a major issue, I'd probably unroll the loop since only
two iterations are performed. This saves about 9 or 10 execution cycles at
the cost of ~15 or 20 instructions.

4) If space really wasn't a major issue, then you can combine the two
tables into one. E.g.

       call    CRC16_Lookup
       xorwf   CRC16_High,f



....
CRC16_Lookup
       swapf  Index,f     ; Index / 16 * 3
       movf   Index,w
       addwf  Index,w
       addwf  Index,w
       movwf  PCL

;0
       nop
       nop
       retlw  0

;1
       movlw  0x21
       xorwf  CRC16_Low,f
       retlw  0x10

;2
...


The low byte is CRC'd in the table, while the high is the same as before.
This increases the code size by 16-bytes for the table plus 3 bytes for
the index fixup, but saves about 9 execution cycles.


4) Notice that the high byte of the CRC array can be written as:

 crc_hi[i] = (i<<4) | (i>>3);

if the index is stored nibble swapped as suggested above, then the
constant for the high byte can be obtained like so:

       movf   Index,w
       btfsc  Index,7
       iorlw  1

       xorwf   CRC16_High,f

5) Notice that the low byte of the CRC array can be written as:

 crc_lo[i] = (i<<5) | i;

if the index is stored nibble swapped as suggested above, then the
constant for the low byte can be obtained like so:

       swapf  Index,w
       addwf  Index,w
       addwf  Index,w

       xorwf   CRC16_Low,f

.....

I said I wasn't going to do this, but here it goes...


;----------------------------------------------------------------------
; CRC_Update Subroutine
CRC_Update:
       ; Save the Message Byte in the W register
       MOVWF   CRC16_MessageByte

       ; We need to perform two iterations each processing 4 bits from
       ;the
       ; CRC16_MessageByte register. MSB first.
       MOVLW   2
       MOVWF   Count

CRC16_UpdateLoop:

 ; CRC16 <<= 4
       movlw   0x0f
       andwf   CRC16_High,f
       swapf   CRC16_High,f

       swapf   CRC16_Low,f
       andwf   CRC16_Low,w
       iorwf   CRC16_High,f
       xorwf   CRC16_Low,f

 ; extract the high nibble for xor'ing

       swapf   CRC16_MessageByte,w
       xorwf   CRC16_High,w
       andlw   0xf0
       movwf   Index,f                 ; don't swap yet...

       btfsc   Index,7
        iorlw  1

       xorwf   CRC16_High,f

       swapf  Index,w
       addwf  Index,w
       addwf  Index,w

       xorwf   CRC16_Low,f

       DECFSZ  Count, F
       GOTO    CRC16_UpdateLoop
       RETLW   0                               ; return Finished

---
This is untested, of course...

Scott

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2002\05\13@214933 by Ashley Roll

flavicon
face
Hi Scott,

Thanks for the optimisation effort.. I know what you mean when you say "I
wasn't going to do this.. but" That is exactly what happened when I wrote it
in the first place :)

{Quote hidden}

Hehe.. I knew I couldn't have been the first to do this :)

{Quote hidden}

That mask needs to be 0xF0 because we're after the two 4 bits.. Probably a
typo I'm sure.

{Quote hidden}

Looks good.. I'm adding it into a new version of the code..


{Quote hidden}

Nice.. I like that :) a little more concise them mine :)



> 3) If space wasn't a major issue, I'd probably unroll the
> loop since only
> two iterations are performed. This saves about 9 or 10
> execution cycles at
> the cost of ~15 or 20 instructions.

Yes, I'd initially considered unrolling it, but opted for the loop to save
the limited ROM space on the '508, thinking that if peopled wanted they
could unroll it themselves :)

> 4) If space really wasn't a major issue, then you can combine the two
> tables into one. E.g.
>
> [ interesting table stuff snipped ]

Interesting approach.. probably makes the code a little too unreadable for
my liking for a piece of example code that others can use..

> [ Example code snipped ]
> ---
> This is untested, of course...

The problem with your code was you were shifting the CRC register before
extracting the top 4 bits to XOR with the message byte to create the lookup
index... Easy change in order of the code and it should be fine.

I've added a new version of the PIC code on the web site. This is now down
to 74 instructions per message byte, but if the loop is unrolled and the
complex table scheme from Scott was implemented I'm sure it would be lower.

http://www.digitalnemesis.com/ash/projects/EmbeddedCRC16/default.htm

Cheers,
Ash.

---
Ashley Roll
Digital Nemesis Pty Ltd
http://www.digitalnemesis.com
Mobile: +61 (0)417 705 718

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2002\05\14@072254 by o-8859-1?Q?K=FCbek_Tony?=

flavicon
face
Hi Scott and Ashley,
nice initiative and somewhat challenging :) *gee* probably to much free time
today. Anyway I concluded by reading the original post the same result as
Scott
did, there were ways one could skip the table and make some parts shorter.
However
one thing was still bugging me, the double 4 bits shifts, I felt that these
should not be
needed. Wrapping my head around what exactely was happening ( I never have
dealt with CRC
or feedback registers before ) made me realise something, kind of an 'aha'
experience,
so without further ramblings here goes an *really breifly* commented version
( for challenge ).
It works with the test vector but don't take it for granted that it's fully
operational :)
( although I think so ).

It does not save much in regards to Scotts version (unrolled), but I guess
it is atleast
a few cycles shorther.

Ofcource I'll post an more commented version if there is anyone who actually
is interested,

The snippet is really migrane inducing :) and not for the faint hearted.

I needed three temps, and used the 'available' in the original snippet.
Executes in around 36 instr./byte, no additional stack or tables are needed
(including call/return).

But it wouldn't surprise me if Scott,Nikolai or Bob presents an 10 cycle
version :) *hint* hint*


;********************************************************************
;
;       CRC_Update: Calculates the CCITT CRC16 checksum for the byte in W
;       Updates: CRC16_High and CRC16_Low
;       Uses:    W,Index,Temp and CRC16_MessageByte
;       Executes in around 36 cycles/call/byte (including call/return)
;       Remember to initialize CRC16 to 0xFFFF before crc of the message
;       Test vector: "123456789" all ASCII without brackets generates 0x29B1
;
CRC_Update:
       ; calculate xor for the input byte with crc-high register
       XORWF   CRC16_High,W
       MOVWF   Temp    ; save
       ;MOVWF  Index
       SWAPF   Temp,W  ; swap nibbles to do additional xor with low nibble
       ANDLW   0x0F
       XORWF   Temp,F ; relating to original crc routine be Ashley temp now
contains
                      ; *both* the first table and second table index ( top
nibble first index, low nibble second index )
                      ; referred to as i and i2

       ; an observation of the original routine reveals that the crc is
rotated 8 bits in total
       ; for one byte input, hence what once was crc-low becomes crc-high.
       ; writing down the formula gives that the top nibble of the output
crc-high is:
       ; CRC_High:h = CRC_Low:h XOR TableHigh:l[i] XOR TableHigh:h[i2]
       ; and for the low nibble
       ; CRC_High:l = CRC_Low:l XOR TableLow:h[i] XOR TableHigh:l[i2]
       ; now by using the fact that the tables can be calculated instead
       ; we will have the following *nibble* tables:
       ; TableHigh:l = [0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1]
       ; TableHigh:h = [0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F]
       ; TableLow:l =  [0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F]
       ; TableLow:h =  [0,2,4,6,8,A,C,E,0,2,4,6,8,A,C,E]
       ; these will be hanled one by one below:
       ; note: remember thet temp holds both i and i2 at entry

       ; we start by calculating the table entry for the TableLow:h
       RLF     Temp,W
       ANDLW   0xF0
       MOVWF   Index
       SWAPF   Index,F
       SWAPF   Temp,W
       ANDLW   0xF0
       IORWF   Index,F ; w:h = TableHigh:h[i2], w:l= TableLow:h[i]
       MOVLW   0x00    ; prepare to do xor with TableHigh:l[i] *and*
TableHigh:l[i2]
                       ; for both nibbles at the same time
       BTFSC   Temp,7
       IORLW   0x10    ; if i >= 8 do XOR with 1 ( high nibble )
       BTFSC   Temp,0
       IORLW   0x01    ; if i2 >= 8 do xor with 1 ( low nibble )
       XORWF   Index,W ; w:h = TableHigh:l[i] XOR TableHigh:h[i2]
                       ; w:l = TableLow:h[i] XOR TableHigh:l[i2]

       XORWF   CRC16_Low,W ; finnally done with high(output) byte crc
       MOVWF   CRC16_High ; save as high byte of crc

       ; further observations can conclude that crc-low will have the
following
       ; content at exit:
       ; CRC_Low:h = TableLow:l[i] XOR TableLow:h[i2]
       ; and
       ; CRC_Low:l = TableLow:l[i2]
       ; note: remember thet temp holds both i and i2 at entry
       ; first we calculate TableLow:h[i2]
       MOVF    Temp,W
       ADDWF   Temp,W
       ANDLW   0x0F    ; w = TableLow:h[i2]
       MOVWF   CRC16_MessageByte
       MOVLW   0x0F
       ANDWF   Temp,W  ; w:l = i2 ( = TableLow:l[i2] )
       SWAPF   CRC16_MessageByte,F
       IORWF   CRC16_MessageByte,F
       MOVF    Temp,W
       ANDLW   0xF0    ; w = i ( = TableLow:l[i] )
       XORWF   CRC16_MessageByte,W

       MOVWF   CRC16_Low ; done with low output byte


       RETLW   0                               ; return Finished

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


2002\05\14@082218 by Ashley Roll

flavicon
face
Humm... My Brain Hurts.. You crazy optimising people are giving me a
headache trying to keep up :)

If you are interested, a full discussion about how CRCs work is presented at
http://www.geocities.com/SiliconValley/Pines/8659/crc.htm

This is what I based my original code on.

I was trying to produce a _simple_ and relatively small CRC routine and I
seem to have been out-done in a big way.. Oh well :)

Maybe I should have a look at this again in the morning when I've drunk less
beer.. :P Might make more sense.

Cheers,
Ash.

---
Ashley Roll
Digital Nemesis Pty Ltd
http://www.digitalnemesis.com
Mobile: +61 (0)417 705 718




> {Original Message removed}

2002\05\14@085445 by Jon Baker

flavicon
face
----- Original Message -----
From: "Ashley Roll" <spam_OUTashTakeThisOuTspamDIGITALNEMESIS.COM>
To: <.....PICLISTKILLspamspam@spam@MITVMA.MIT.EDU>
Sent: Tuesday, May 14, 2002 1:19 PM
Subject: Re: [PIC]: Small Table Driven CRC16 Routine


> Humm... My Brain Hurts.. You crazy optimising people are giving me a
> headache trying to keep up :)

Anybody up for doing a brute force search to find the optimal code? :)

--
Jon Baker

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


2002\05\14@090347 by Bob Ammerman

picon face
<snip>

>         BTFSC   Temp,0
>         IORLW   0x01    ; if i2 >= 8 do xor with 1 ( low nibble )

Either Temp,0 should be Temp,3 or the comment is wrong!

Bob Ammerman
RAm Systems

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


2002\05\14@093649 by o-8859-1?Q?K=FCbek_Tony?=

flavicon
face
Hi,

Ashley wrote:
>Humm... My Brain Hurts.. You crazy optimising people are giving me a
>headache trying to keep up :)

he he, the *real* (tm) optimising people hasn't woken up yet, I'm just
placing some bait ;)

<snip>
>I was trying to produce a _simple_ and relatively small CRC routine and I
>seem to have been out-done in a big way.. Oh well :)

Yes I know, but the one does not disclude the other, both versions have
their
uses. I could never have written my interpretation without using your code
as an base.
( never dealt with crc's before ). The c-version on your site is extremely
easy to grasp.
Not meant to dish your efforts in any way.
Sometimes one just get that 'hmmm..that should be doable in
x(cycles/dollars/effort/etc) less..'
feeling.

>Maybe I should have a look at this again in the morning when I've drunk
less
>beer.. :P Might make more sense.

Or perhaps additional vic-bitter's would improve one's perception of reality
:)

Cheers mate,

/Tony

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


2002\05\14@125151 by Scott Dattalo

face
flavicon
face
On Tue, 14 May 2002, K|bek Tony wrote:

{Quote hidden}

:)

Tony, this *is* a brilliant observation and am ashamed I didn't think of
it :). But, you do realize that it can be optimized...

; W = CRC Message byte
; CRC16 - current 16-bit CRC check sum
;
; let CRC16 = abcd
;     Message Byte = xy
;
; where abcd, xy are nibble variables
;
; Upon exit,
;
;  a = c ^ (i1>>3) ^ i2
;  b = d ^ ((i1<<1)&0xf) ^ (i2>>3)
;  c = i1 ^ ((i2<<1)&0xf)
;  d = i2
;
; First compute the nibble array indices:
;  i1 = a ^ x
;  i2 = i1 ^ b ^ y
;
;
; Notation  (I) : (J)  is a byte with I= high nibble, J= low nibble
;

CRC_Update
       xorwf   CRC16_High,w    ; (a^x):(b^y)
       movwf   Index           ;
       andlw   0xf0            ; W = (x^x):0
       swapf   Index,f         ; Index = (b^y):(a^x)
       xorwf   Index,f         ; Index = (a^b^x^y):(a^x) = i2:i1


 ; High byte
       rlf     Index,W         ;
       rlf     Index,W         ;
       andlw   0x1f            ; W = (i1>>3) : (((i1<<1)&0xf) | (i2>>3))
       xorwf   CRC16_Low,w     ; W = (i1>>3)^c : ((((i1<<1)&0xf) |
                               ;     (i2>>3)) ^ d)
       movwf   CRC16_High      ; low nibble of high byte is done

       movf    Index,w
       andlw   0xf0            ; W = i2 : 0
       xorwf   CRC16_High,f    ; High nibble is of high byte is done

 ; now low byte
       movwf   CRC16_Low       ; Low = i2 : 0
       addwf   CRC16_Low,f     ; Low = (i2<<1) : 0
       swapf   Index,W         ; W = i1 : i2
       xorwf   CRC16_Low,f     ; Low = i1 ^ (i2<<1) : i2

       retlw   0

17 Instructions, one extra working ram location. I'll let Nikolai, Bob,
and Dmitry squeeze out the last 5.

Scott

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


2002\05\14@140633 by Bob Ammerman

picon face
This is absolutely brilliant. I stood by in awe watching the various flavors
flying by, each more quickly than I could come up with a comparable
imrprovement. I don't think I'll be finding the last 5 instructions, either.

Bob Ammerman
RAm Systems

{Original Message removed}

2002\05\14@182252 by o-8859-1?Q?K=FCbek_Tony?=

flavicon
face
Hi,

Scott wrote:
<snip>
>But, you do realize that it can be optimized...

well yes :), I though it was 'clever' and just did
'straight' coding, never really thought about all
implifications. However my hunch was that it would
be an very effective solution compared to previous onces.

{Quote hidden}

<snip>

Absolutely brilliant, piclist at it best, hats off for you sir :)

Makes one wonder how many other routines out 'there' that can be cut by 50%
:)

So Ashley, there you go, the *real* (tm) optimising people has woken up
and the bait has been taken, a thing of beauty was created ( if one could
consider code beautiful ).
( oh and I forgot to mention Dmitry previously, sorry )

Thanks to you all,

/Tony

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


More... (looser matching)
- Last day of these posts
- In 2002 , 2003 only
- Today
- New search...