Searching \ for '[PIC]: Challenge: FIR Filter on an 18f series devi' in subject line. ()
Help us get a faster server
FAQ page: www.piclist.com/techref/microchip/math/filter.htm?key=filter
Search entire site for: 'Challenge: FIR Filter on an 18f series devi'.

Exact match. Not showing close matches.
'[PIC]: Challenge: FIR Filter on an 18f series devi'
2002\09\20@014057 by

An FIR filter, or finite impulse response filter, is a fancy name for a
dot product between two vectors. Or in C:

sum = 0;
for(i=0; i<LENGTH; i++)
sum = sum + weights[i] * data[i];

In other words, you have two arrays "weights" and "data". They're
multiplied together element by element and the resulting products are

Now the challenge is this:

Suppose the filter weights and the data samples are 8-bit numbers stored
in RAM. The filter weights started off as constants, but since you were
clever, you mixed the calibration data with filter weights. The data
resides in a circular buffer. Suppose the FIR filter has 8 taps (i.e.
there are 8 elements in the Weights vector) and that the circular data
buffer can hold 16 samples.

How efficiently can this be implemented in assembly:

#define FIR_WEIGHTS       8
#define MAX_DATA_SAMPLES  16

char weights[FIR_WEIGHTS];
char data_buffer[MAX_DATA_SAMPLES];
char data_index; /* points to most recently received data sample */

int fir(void)
{
char i,j;
sum;

j  = data_index - FIR_WEIGHTS;
if(j < 0)
j = j + MAX_DATA_SAMPLES;

sum = 0;
for(i=0; i<FIR_WEIGHTS; i++) {

sum = sum + weights[i] * data_buffer[j];
j = j + 1;
if(j >= MAX_DATA_SAMPLES)
j = 0;

}

return sum;

}

---

Scott
PS. I see 8 instructions for initialization and 12 for a loop and about 128
cycles total.

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics

Hi,
Scott Dattalo wrote :
<snip>
{Quote hidden}

So, sensei :) here are my humble stumblings, untested, etc..
I'll probably made some terrible misstakes but here goes:

A: weight can be aligned anywhere
data_buffer is aligned on an 32 byte boundary (&data_buffer[0] has
lowest five bits zeroed )
MAX_DATA_SAMPLES must be 16, and FIR_WEIGHTS must be less then 16
B: weight and data_buffer located anyware in ram
MAX_DATA_SAMPLES and FIR_WEIGHTS can be arbitrary ( but less than 256
)

FIR_A
CLRF    sum_l
CLRF    sum_h
LFSR    0,weights
LFSR    1,data_buffer+FIR_WEIGHTS
MOVF    data_index,W
MOVF    PLUSW1,F
MOVLW   FIR_WEIGHTS
MOVWF   i
FIR_A_LOOP
BCF     FSR1L,5
MOVF    POSTINC1,W
MULWF   POSTINC0
MOVF    PRODL,W
MOVF    PRODH,W
DECFSZ  i,F
GOTO    FIR_A_LOOP

FIR_B
CLRF    sum_l
CLRF    sum_h
LFSR    0,weights
LFSR    1,data_buffer+FIR_WEIGHTS
MOVF    data_index,W
MOVF    PLUSW1,F
MOVLW   FIR_WEIGHTS
MOVWF   i
FIR_B_LOOP
MOVLW   LOW(data_buffer+MAX_DATA_SAMPLES)
XORWF   FSR1L,W
BTFSC   STATUS,Z
LFSR    1,data_buffer
MOVF    POSTINC1,W
MULWF   POSTINC0
MOVF    PRODL,W
MOVF    PRODH,W
DECFSZ  i,F
GOTO    FIR_B_LOOP

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics

> An FIR filter, or finite impulse response filter, is a fancy name for a
> dot product between two vectors. Or in C:
>
>   sum = 0;
>   for(i=0; i<LENGTH; i++)
>     sum = sum + weights[i] * data[i];
>
> In other words, you have two arrays "weights" and "data". They're
> multiplied together element by element and the resulting products are

That's called a "convolution", and is the basis for much digital signal
processing.  This is exactly the algorithm that DSPs are optimized for.  In
a DSP you want to do each multiply-accumulate in one cycle, including the

{Quote hidden}

Microchip has done this sort of thing for the 18 series utilizing the
hardware multiplier.  They were distributing this code along with a bunch of
other digital filtering stuff on a small CD about a year ago.  I sorta
remember it was called something like "Filter Lab".  The convolution code
was particularly clever in that it actually did 9 (or was it 10) bit
multiplies with some neat tricks.

*****************************************************************
Embed Inc, embedded system specialists in Littleton Massachusetts
(978) 742-9014, http://www.embedinc.com

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics

On Fri, 20 Sep 2002, Olin Lathrop wrote:

> > An FIR filter, or finite impulse response filter, is a fancy name for a
> > dot product between two vectors. Or in C:
> >
> >   sum = 0;
> >   for(i=0; i<LENGTH; i++)
> >     sum = sum + weights[i] * data[i];
> >
> > In other words, you have two arrays "weights" and "data". They're
> > multiplied together element by element and the resulting products are
>
> That's called a "convolution", and is the basis for much digital signal
> processing.  This is exactly the algorithm that DSPs are optimized for.  In
> a DSP you want to do each multiply-accumulate in one cycle, including the

Well... technically it's correlation, but yeah you're right.

{Quote hidden}

Great! Maybe their code will win the challenge (but I doubt it -
especially now that Tony K. is interested).

Scott

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics

On Fri, 20 Sep 2002, Tony K|bek wrote:

> Hi,
> Scott Dattalo wrote :
> <snip>
> >How efficiently can this be implemented in assembly:
> >
<snip>

>
> So, sensei :) here are my humble stumblings, untested, etc..
> I'll probably made some terrible misstakes but here goes:
>
> A: weight can be aligned anywhere
>    data_buffer is aligned on an 32 byte boundary (&data_buffer[0] has
> lowest five bits zeroed )
>    MAX_DATA_SAMPLES must be 16, and FIR_WEIGHTS must be less then 16

Hmm.

> B: weight and data_buffer located anyware in ram
>    MAX_DATA_SAMPLES and FIR_WEIGHTS can be arbitrary ( but less than 256
> )

This is certainly more useful.

>
>
> FIR_A
>         CLRF    sum_l
>         CLRF    sum_h
>         LFSR    0,weights
>         LFSR    1,data_buffer+FIR_WEIGHTS
>         MOVF    data_index,W
>         MOVF    PLUSW1,F
;         ^^^^^^^^^^^^^^^^^ ???
{Quote hidden}

Hint: If you align "weights" properly too, you can get rid of "i" and use

FIR_A2
CLRF    sum_l
CLRF    sum_h
LFSR    0,weights+FIR_WEIGHTS-1
LFSR    1,data_buffer+FIR_WEIGHTS-1
MOVF    data_index,W
FIR_A2_LOOP:
BCF     FSR1L,5
MOVF    POSTDEC1,W
MULWF   POSTDEC0
MOVF    PRODL,W
MOVF    PRODH,W
BTFSS   FSR0L,3
bra    FIR_A2_LOOP

{Quote hidden}

;         ^^^^^  That's clever!
;

>         MOVF    POSTINC1,W
>         MULWF   POSTINC0
>         MOVF    PRODL,W
>         MOVF    PRODH,W
>         DECFSZ  i,F
>         GOTO    FIR_B_LOOP

This is what I was thinking about:

FIR_C:
CLRF    sum_l                   ;sum = 0
CLRF    sum_h
LFSR    0,weights+FIR_WEIGTHS-1 ;FSR0 -> Fir weights
LFSR    1,data_buffer           ;FSR1 -> data
MOVLW   FIR_WEIGHTS             ;i is the loop counter
; (we'll count down)

FIR_C_LOOP:
MOVWF   i
MOVFF   PLUSW0,temp             ;temp = weights[i]
SUBWF   data_index,W            ;W = data_index-i
MOVF    PLUSW1,W                ;W = data[data_index-i]
MULWF   temp                    ;weights * data
MOVF    PRODL,W                 ;
MOVF    PRODH,W                 ;
DECFSZ  i,W
bra    FIR_C_LOOP

Scott

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics

Scott,

> > That's called a "convolution", and is the basis for much digital signal
> > processing.  This is exactly the algorithm that DSPs are optimized for.  In
> > a DSP you want to do each multiply-accumulate in one cycle, including the
>
> Well... technically it's correlation, but yeah you're right.

why you sai it`s correlation ???
I remember from math that correlation is an average of a sample with
another sample with delat,
For istance I remember that
corr(x,y,tau)=<x(t)*y(t+tau)> averaged in t.

The stuff the guy described should be a particular convolution where the
kernel funciton is a FIR (casual).

I might be wrong, after all it`s a matter of words.

Thanks,

Stefano

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics

Hi,
Scott Dattalo wrote:

<snip>
>>         MOVF    data_index,W
>>         MOVF    PLUSW1,F
>;         ^^^^^^^^^^^^^^^^^ ???
<snip>
>Hint: If you align "weights" properly too, you can get rid of "i" and
use
>
>FIR_A2
>        CLRF    sum_l
>        CLRF    sum_h
>        LFSR    0,weights+FIR_WEIGHTS-1
>        LFSR    1,data_buffer+FIR_WEIGHTS-1
>        MOVF    data_index,W
<snip>

Well I thought I needed to get to the data_buffer[data_index]
the first line above is to make an 12 bit addition :) But thinking about it, you're right as I posed the requirment
for the data_buffer to be aligned, then there is no need for either
the variable i or the PLUSW1 line. One can just add data_index to fsrl.

Hmm there were quite a few errors in my previous post, this might
work better:

FIR_B
CLRF    sum_l
CLRF    sum_h
LFSR    0,weights+FIR_WEIGHTS-1 ; fsr0 = weights[fir_weights-1]
LFSR    1,data_buffer       ; fsr1 = data_buffer[0]
MOVF    data_index,W
MOVF    PLUSW1,F               ; fsr1 = data_buffer[data_index]
MOVLW   FIR_WEIGHTS
MOVWF   i
FIR_B_LOOP
MOVLW   LOW(data_buffer-1)     ; w = low(data_buffer[0]-1)
XORWF   FSR1L,W                ; test if fsr1l =
low(data_buffer[0]-1)          BTFSC   STATUS,Z
LFSR    1,(data_buffer+MAX_DATA_SAMPLES-1)     ; do wrap
around, fsr1 = data_buffer[max_data_samples-1]
MOVF    POSTDEC1,W     ; get data_buffer[data_index-i]
MULWF   POSTDEC0       ; mutliply with weights[fir_weights-i]
MOVF    PRODL,W        ; add to acumulator
MOVF    PRODH,W
DECFSZ  i,F            ; loop again ?
GOTO    FIR_B_LOOP

Still, it's not as fast or as clever as Scott's, btw very innovative use
of i.
However mine uses one byte less of ram :)

/Tony

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics

> why you sai it`s correlation ???
> I remember from math that correlation is an average of a sample with
> another sample with delat,
> For istance I remember that
> corr(x,y,tau)=<x(t)*y(t+tau)> averaged in t.
>
> The stuff the guy described should be a particular convolution where the
> kernel funciton is a FIR (casual).
>
> I might be wrong, after all it`s a matter of words.

I think Scott is right.  Performing one such array operation is a
correlation.  However, to do digital filtering with a kernal function
(impulse response), you perform multiple correlations as the input signal is
"slid" past the kernel.  The output signal is the result from successive
correlations.  That's a convolution.  In other words, a convolution is a
series of correlations.

*****************************************************************
Embed Inc, embedded system specialists in Littleton Massachusetts
(978) 742-9014, http://www.embedinc.com

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics

On Fri, 20 Sep 2002, Olin Lathrop wrote:

> > why you sai it`s correlation ???
> > I remember from math that correlation is an average of a sample with
> > another sample with delat,
> > For istance I remember that
> > corr(x,y,tau)=<x(t)*y(t+tau)> averaged in t.
> >
> > The stuff the guy described should be a particular convolution where the
> > kernel funciton is a FIR (casual).
> >
> > I might be wrong, after all it`s a matter of words.
>
> I think Scott is right.  Performing one such array operation is a
> correlation.  However, to do digital filtering with a kernal function
> (impulse response), you perform multiple correlations as the input signal is
> "slid" past the kernel.  The output signal is the result from successive
> correlations.  That's a convolution.  In other words, a convolution is a
> series of correlations.

I hate to argue with you Olin, but I think you were right the first time.

When you have a filter and some data then the act of filtering is
convolving.  The mechanics of the arithmetic are essentially identical. In
convolution, one vector is inverted before the sum of products is
computed.

Scott

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics

Scott,

I am assuming that the weights and data are both signed, as will be their
sum. Right?

Bob Ammerman
RAm Systems

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics

On Fri, 20 Sep 2002, Bob Ammerman wrote:

> Scott,
>
> I am assuming that the weights and data are both signed, as will be their
> sum. Right?

Bob,

We were having fun until you spoiled it! You're right, signed arithmetic
will certainly have an impact on the efficiency. At the very least, the
sign bit of the data will need to be examined this will add at least three
more cycles per loop iteration:

BTFSC  PLUSW0,7
bra   unsigned
signed:
....

unsigned:
....

Scott

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.

The 8x8 signed multiply from the 18 series datasheets looks like this:

movf    arg1,w
mulwf   arg2
btfsc     arg2,7
subwf    PRODH
movwf   arg2,w
btfsc      arg1,7
subwf    PRODH

Kinda cute, but messes us up when we have to use W in a PLUSW0 addressing
mode at the same time we are trying to use it to hold ARG1 and/or ARG2.

Of course, in the normal course of events the values of the filter
coefficients will be known in advance. Then, for maximum performance you can
unroll the loop, and build in the signedness stuff at least for the filter
coefficients.

Bob Ammerman
RAm Systems

{Original Message removed}
On Sat, 21 Sep 2002, Bob Ammerman wrote:

> The 8x8 signed multiply from the 18 series datasheets looks like this:
>
>     movf    arg1,w
>     mulwf   arg2
>     btfsc     arg2,7
>     subwf    PRODH
>     movwf   arg2,w
>     btfsc      arg1,7
>     subwf    PRODH
>
> Kinda cute, but messes us up when we have to use W in a PLUSW0 addressing
> mode at the same time we are trying to use it to hold ARG1 and/or ARG2.

Kinda cute, but kinda slow... But that's just the way it is. Perhaps the
dsPic handles this better?

I prefer to work exclusively with unsigned numbers on a PIC. If you're
careful then it's possible to express your data as though if it were
unsigned. If the data is coming from an A2D converter, then it's *always*
possible to treat the input stream as an unsigned number. But in general
you should be able to do this:

signed number = unsigned number  -  offset

Suppose we were computing the FIR of a vector of signed numbers:

Filt = sum ( w[i] * sdata[i])

where,
w[i] is the i'th filter coefficient
sdata[i] is the i'th signed data element
sum() is the sum-of-products operator
Filt is the result

Note that: sdata[i] = udata[i] + offset
where
udata[i] is the i'th data element represented as an unsigned entity

Substitute this into the FIR:

Filt = sum( w[i] * (udata[i] + offset) )
= sum( w[i] * udata[i] )  +  sum(w[i] * offset)
= sum( w[i] * udata[i] )  +  offset * sum(w[i])

The sum on the left is the one that can be optimized with the routines
Tony K. and I posted. The sum on the right doesn't depend on the data. If
the FIR weights are constant then so will this sum. If they're not, then
this sum can be precomputed - or at least re-computed only when the
weights change.

In addition, if the w[i]'s are signed a similar transformation can be
performed for them. But you shouldn't need to do this. A better approach
is to treat them (or transform them) as unsigned numbers and then convert
the final value, Filt, to a signed number.

> Of course, in the normal course of events the values of the filter
> coefficients will be known in advance. Then, for maximum performance you can
> unroll the loop, and build in the signedness stuff at least for the filter
> coefficients.

Yep.

Scott

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.

> > Kinda cute, but messes us up when we have to use W in a PLUSW0
> > mode at the same time we are trying to use it to hold ARG1 and/or ARG2.
>
> Kinda cute, but kinda slow... But that's just the way it is. Perhaps the
> dsPic handles this better?

I would indeed, if it existed :-)

Basically, the FIR loop on the dsPIC is a single instruction (not even a
go-back-to-the-top-of-the-loop instruction is needed). You would need a few
setup instructions and there are some alignment requirements for the
circular data buffer.

> I prefer to work exclusively with unsigned numbers on a PIC. If you're
> careful then it's possible to express your data as though if it were
> unsigned.

<snip formulae for doing everything with unsigned MUL>

Looks good!

Bob Ammerman
RAm Systems

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.

Hi,
Scott Dattalo wrote:
<snip>
>Kinda cute, but kinda slow... But that's just the way it is. Perhaps
the
>dsPic handles this better?
<snip>
Well it can use different representation but it has native support for
signed numbers. Data format is (can be) 1.15 fixed point, and can use
both
fractional and integer representation.

<snip>
>I prefer to work exclusively with unsigned numbers on a PIC. If you're
>careful then it's possible to express your data as though if it were
>unsigned. If the data is coming from an A2D converter, then it's
*always*
>possible to treat the input stream as an unsigned number. But in
general
>you should be able to do this:
<snip>

Yes I agree :) the only time I deal with signed numbers is normally at
presentation
level. All(well, as many as possible) calculations are done with signed
numbers,
if the input cannot be signed then it's just an matter to add an offset
as you also
concluded.
Then just before presenting the result (UI) remove the offset and do
conversion.

/Tony

--
http://www.piclist.com#nomail Going offline? Don't AutoReply us!
email listservmitvma.mit.edu with SET PICList DIGEST in the body

More... (looser matching)
- Last day of these posts
- In 2002 , 2003 only
- Today
- New search...