Truncated match.
PICList
Thread
'Bit counting (looking for old challenge emails)'
1998\07\21@111248
by
Steve Lawther
Folks,
I'm looking for a method of counting the ones bits in a byte. I sure
there were PICList mails on the subject a while back, but I can't find
them in my archive. Has anybody a rough month and year date for
downloading the archives.
An address for a fully searchable PICList archive would be just as
good.
Thanks in advance,
Steve Lawther
1998\07\21@115629
by
Bill Cornutt
1998\07\21@142952
by
Peter L. Peres

This is not an old challenge mail, it's a few methods. In the code, This
is a label, THIS is a constant number (or the address of a file register,
or the number to represent a bit, such as carry), and this is a mnemonic:
CountBits
clrf OUTPUT ; output register
movlwv 8 ; number of bits in a byte
movwf BIT_COUNTER ; bit counter register: destroyed
CountBits1
rrf INPUT_BITS, 1 ; shift each bit into carry
btfsc PSW, CY ; if cleared skip next
incfsz OUTPUT, 1 ; *** ; if set, increment counter
nop ; used only if incf sets CY (does it ?)
decfsz BIT_COUNTER, 1 ; any more bits ?
goto CountBits1 ; yes
rrf INPUT_BITS, 1 ; no. Shift input once more to restore orig.
; DONE
The incfsz at '***' is used because I don't have the data book here and
I don't remember if incf sets carry or not. In case it does not, the
incfsz should be an incf and the nop after it deleted. This also reduces
the timing to 51 T cycles, and the code size to 9 words.
The input data is unchanged in INPUT_BITS at exit, the number of words
used is of 10 and the run time should be 59 T cycles (off of my head).
There is a bruteforce approach that uses btfsc's and incf's for each
bit tested, for a run time of 17 T cycles and a code consumption of 17 T
cycles.
There is the fastest possible approach that uses a table lookup (can you
afford such a thing on a PIC ?!), and uses 258 words of program for a run
time of 5 T cycles (MAJOR overkill imho).
hope this helps,
Peter
1998\07\21@155027
by
Scott Dattalo
Bill Cornutt wrote:
>
> There is code at
>
> http://www.interstice.com/~sdattalo/technical/software/pic/picprty.html
>
> With the home page at
>
> http://www.interstice.com/~sdattalo/
>
Thanks for the plug Bill. However, the routines on my web page are by no
means the fastest (as I've been shown). One PICLIST member has an
isochronous bitcount routine that runs in 13 cycles. So unless he wishes
to post it, I guess there's a challenge for someone out there.
BTW. Peter's claim of a 5 cycle version neglects to take into account
the overhead of calling a lookup table. (I.e. a lookup table takes a
minimum of 6 cycles). But a 7 cycle table lookup approach is possible.
Scott
1998\07\21@164405
by
Myron Loewen

No need for a lookup table on this one.
We do a 10 bit parity in 11 cycles.
Here is an 8 bit version in 8 cycles that might work:
swapf Data,w ; Enter with 8 bit value in Data
xorwf Data,w
movwf Temp
rrf Temp,f
rrf Temp,f
xorwf Temp,w ; XOR is high if bits are different
rrf Temp,f ; Hence result is 1 if odd number of ones
xorwf Temp,f ; Exit with parity in Temp bit 0
{Quote hidden}>> There is code at
>>
>>
http://www.interstice.com/~sdattalo/technical/software/pic/picprty.html
>Thanks for the plug Bill. However, the routines on my web page are by no
>means the fastest (as I've been shown). One PICLIST member has an
>isochronous bitcount routine that runs in 13 cycles. So unless he wishes
>to post it, I guess there's a challenge for someone out there.
>
>BTW. Peter's claim of a 5 cycle version neglects to take into account
>the overhead of calling a lookup table. (I.e. a lookup table takes a
>minimum of 6 cycles). But a 7 cycle table lookup approach is possible.
>
>Scott
1998\07\21@173944
by
Scott Dattalo
Myron Loewen wrote:
>
> No need for a lookup table on this one.
> We do a 10 bit parity in 11 cycles.
> Here is an 8 bit version in 8 cycles that might work:
>
> swapf Data,w ; Enter with 8 bit value in Data
> xorwf Data,w
> movwf Temp
> rrf Temp,f
> rrf Temp,f
> xorwf Temp,w ; XOR is high if bits are different
> rrf Temp,f ; Hence result is 1 if odd number of ones
> xorwf Temp,f ; Exit with parity in Temp bit 0
I think you misunderstood the original question. We're trying to count
the total number of 'ones' in an 8bit number. But if you want just the
parity, why not use Payson's routine:
swapf X,w
xorwf X,f
rrf X,w
xorwf X,f
btfsc X,2
incf X,f
Scott
1998\07\21@181524
by
Dmitry Kiryashov
Myron Loewen wrote:
>
> No need for a lookup table on this one.
> We do a 10 bit parity in 11 cycles.
> Here is an 8 bit version in 8 cycles that might work:
>
> swapf Data,w ; Enter with 8 bit value in Data
> xorwf Data,w
> movwf Temp
> rrf Temp,f
> rrf Temp,f
> xorwf Temp,w ; XOR is high if bits are different
> rrf Temp,f ; Hence result is 1 if odd number of ones
> xorwf Temp,f ; Exit with parity in Temp bit 0
Some time ago John Payson was offer an excellent solution:
swapf Data,w
xorwf Data,f
rrf Data,w
xorwf Data,f
btfsc Data,2
incf Data,f
after that Data,0 contain calculated parity.
But question was to calculate number of "1" in a byte not parity.
WBR Dmitry.
1998\07\21@235545
by
Regulus Berdin

Hi to all,
> From: Scott Dattalo <.....sdattaloKILLspam@spam@UNIX.SRI.COM>
[SNIP]
> Thanks for the plug Bill. However, the routines on my web page are by no
> means the fastest (as I've been shown). One PICLIST member has an
> isochronous bitcount routine that runs in 13 cycles. So unless he wishes
> to post it, I guess there's a challenge for someone out there.
I took the challenge, here is what I got.
BITCOUNT: ;untested
RRF Q,W ;Q = abcdefgh
MOVWF T ;T = 0abcdefg
MOVLW B'01010101'
ANDWF Q ;Q = 0b0d0f0h
ANDWF T,W ;W = 0a0c0e0g
ADDWF Q ;Q = (a+b)(c+d)(e+f)(g+h)
RRF Q,W
MOVWF T
RRF T ;T = (xx)(a+b)(c+d)(e+f)
MOVLW B'00110011'
ANDWF Q ;Q = (00)(c+d)(00)(g+h)
ANDWF T,W ;W = (00)(a+b)(00)(e+f)
ADDWF Q ;Q = (a+b+c+d)(e+f+g+h)
SWAPF Q,W ;W = (e+f+g+h)(a+b+c+d)
ADDWF Q,W ;W = (a+b+c+d+e+f+g+h)(a+b+c+d+e+f+g+h)
ANDLW H'0F' ;W = (0000)(a+b+c+d+e+f+g+h)
Data is in Q and result is placed in W.
The routine is isochronous but 16 cycles. I cannot shave
it down to 13 or less :(. I wonder how this piclist member
did it or what method/algorithm he used.
Reggie
1998\07\22@124600
by
Scott Dattalo

Regulus Berdin wrote:
{Quote hidden}> I took the challenge, here is what I got.
>
> BITCOUNT: ;untested
> RRF Q,W ;Q = abcdefgh
> MOVWF T ;T = 0abcdefg
> MOVLW B'01010101'
> ANDWF Q ;Q = 0b0d0f0h
> ANDWF T,W ;W = 0a0c0e0g
> ADDWF Q ;Q = (a+b)(c+d)(e+f)(g+h)
>
> RRF Q,W
> MOVWF T
> RRF T ;T = (xx)(a+b)(c+d)(e+f)
> MOVLW B'00110011'
> ANDWF Q ;Q = (00)(c+d)(00)(g+h)
> ANDWF T,W ;W = (00)(a+b)(00)(e+f)
> ADDWF Q ;Q = (a+b+c+d)(e+f+g+h)
>
> SWAPF Q,W ;W = (e+f+g+h)(a+b+c+d)
> ADDWF Q,W ;W = (a+b+c+d+e+f+g+h)(a+b+c+d+e+f+g+h)
> ANDLW H'0F' ;W = (0000)(a+b+c+d+e+f+g+h)
>
> Data is in Q and result is placed in W.
>
> The routine is isochronous but 16 cycles. I cannot shave
> it down to 13 or less :(. I wonder how this piclist member
> did it or what method/algorithm he used.
I'm not surprised you'd step up, Reggie. This mystery person used a
method not too unlike yours. Although there is a mischievous trick that
saves a shift here and there.
The good news is that you have a week or so to deduce the mischief. Our
mystery person claims he will reveal all shortly...
Scott
1998\07\22@125458
by
Peter L. Peres
On Tue, 21 Jul 1998, Scott Dattalo wrote:
> BTW. Peter's claim of a 5 cycle version neglects to take into account
> the overhead of calling a lookup table. (I.e. a lookup table takes a
> minimum of 6 cycles). But a 7 cycle table lookup approach is possible.
Table
retlw 0
retlw 1
retlw 1
...
CountBits ; input in W, constant 'Table' stored in register TABLE
addwf TABLE, 0
movwf PC
With the call, 7 clocks, not 6, but one can cunningly combine this code
with the rx/tx subroutine code that needs it and spare the 2 T's for call.
That's 5 T's ;) Anyway, who can spare 256 program locations in a PIC for a
table...
Peter
1998\07\22@132114
by
Nuno Pedrosa

I have not enough background to produce any code now, but could you
split the byte into 2 nibbles and use just a 16 byte table? How many
cycles could you squeeze that into?
Nuno.
Peter L. Peres wrote:
{Quote hidden}>
> On Tue, 21 Jul 1998, Scott Dattalo wrote:
>
> > BTW. Peter's claim of a 5 cycle version neglects to take into account
> > the overhead of calling a lookup table. (I.e. a lookup table takes a
> > minimum of 6 cycles). But a 7 cycle table lookup approach is possible.
>
> Table
> retlw 0
> retlw 1
> retlw 1
> ...
>
> CountBits ; input in W, constant 'Table' stored in register TABLE
> addwf TABLE, 0
> movwf PC
>
> With the call, 7 clocks, not 6, but one can cunningly combine this code
> with the rx/tx subroutine code that needs it and spare the 2 T's for call.
> That's 5 T's ;) Anyway, who can spare 256 program locations in a PIC for a
> table...
>
> Peter

.^. _,^,_ /\
___(  )_____ Nuno Filipe Freitas Pedrosa __________ o(`}  , __/\/ /__
/*\\//*\ SIEMENS S.A. Portugal (] /' \ \/\
\(\\V//)/ Nuno.PedrosaKILLspamoen.siemens.de (]`% (") \/\ \
` = ' Tel. :0035114242454 (") /`/ / /\/
__B//\\P___________________________________________\' / `/______\/____
`' "Try and leave this world a little better than you found it..."
1998\07\22@180712
by
Peter L. Peres
On Wed, 22 Jul 1998, Nuno Pedrosa wrote:
> I have not enough background to produce any code now, but could you
> split the byte into 2 nibbles and use just a 16 byte table? How many
> cycles could you squeeze that into?
>
> Nuno.
CountBits
swapf INPUT, W
call CountNibble
movwf OUTPUT
movf INPUT, W
call CountNubble
addwf OUTPUT, F
; done
CountNibble
andlw 0Fh ; drop top nibble
addlw Table
movwf PC
Table
retlw 0
retlw 1
retlw 1
...
; 16 of these
; score: 25 W 20 T, 2 subroutines deep
Not a good score imho. The brute force method works better (17 W 17 T,
constant run time, and no subroutine calls required  also the other
methods shown).
Peter
1998\07\22@222021
by
Chip Weller
The best I can do is 17 cycles, including short call and the return
instruction:
CountBitsInW ; not tested.
movwf Tmp ; Tmp = 'a b c d e f g h' bits
andlw B'01010101' ; W = '0 b 0 d 0 f 0 h'
addwf Tmp,f ; carry, f = 'a+b c+d e+f g+h 0'
rrf Tmp,f ; shift carry back in, right justify
movf Tmp,w ; get the value.
andlw B'00110011' ; get c+d and g+h pairs.
xorwf Tmp,f ; Tmp = a+b and e+f pairs, shift up 2 bits.
rrf Tmp,f ; shift down by 2 bits, carry 0.
rrf Tmp,f ; next shift.
addwf Tmp,f ; add pairs together.
swapwf Tmp,w ; swap nibbles, get into w.
addwf Tmp,w ; add in other nibble
andlw 0x0F ; mask unwanted duplicate high byte.
return ; return with count in W.
{Original Message removed}
1998\07\22@230238
by
Regulus Berdin
> From: Chip Weller <.....awellerKILLspam.....columbus.rr.com>
> CountBitsInW ; not tested.
> movwf Tmp ; Tmp = 'a b c d e f g h' bits
> andlw B'01010101' ; W = '0 b 0 d 0 f 0 h'
> addwf Tmp,f ; carry, f = 'a+b c+d e+f g+h 0'
An error will occur at this point if there are 2 or more
consecutive 1's on your data because it will be carried
over to the next bit. Masking the b,d,f and h on Tmp
should eliminate this.
> rrf Tmp,f ; shift carry back in, right justify
> movf Tmp,w ; get the value.
> andlw B'00110011' ; get c+d and g+h pairs.
> xorwf Tmp,f ; Tmp = a+b and e+f pairs, shift up 2 bits.
> rrf Tmp,f ; shift down by 2 bits, carry 0.
> rrf Tmp,f ; next shift.
> addwf Tmp,f ; add pairs together.
Same error also exists here.
Reggie
1998\07\23@070359
by
Chip Weller

Reggie:
Sorry to disagree with your analysis. I just tested the routine (with a
small number of cases) and it works fine, except for misspelling "swapf" See
comments later.
CountBitsInW ; Tested.
movwf Tmp ; Tmp = 'a b c d e f g h' bits
andlw B'01010101' ; W = '0 b 0 d 0 f 0 h'
addwf Tmp,f ; carry, f = 'a+b c+d e+f g+h 0'
rrf Tmp,f ; shift carry back in, right justify
movf Tmp,w ; get the value.
andlw B'00110011' ; get c+d and g+h pairs.
xorwf Tmp,f ; Tmp = a+b and e+f pairs, shift up 2 bits.
rrf Tmp,f ; shift down by 2 bits, carry 0.
rrf Tmp,f ; next shift.
addwf Tmp,f ; add pairs together.
swapf Tmp,w
addwf Tmp,w
andlw 0x0F ; mask unused nibble.
return
>> From: Chip Weller <EraseMEawellerspam_OUTTakeThisOuTcolumbus.rr.com>
>> CountBitsInW ; not tested.
>> movwf Tmp ; Tmp = 'a b c d e f g h' bits
>> andlw B'01010101' ; W = '0 b 0 d 0 f 0 h'
>> addwf Tmp,f ; carry, f = 'a+b c+d e+f g+h 0'
>An error will occur at this point if there are 2 or more
>consecutive 1's on your data because it will be carried
>over to the next bit. Masking the b,d,f and h on Tmp
>should eliminate this.
The addwf Tmp,f will add bits b, d, f, h to each other. This effectively
shifts the data up and adds to to bits a, c, e, h, with a possible carry if
both a and b are 1. This is shifted back down in the next instruction.
>
>> rrf Tmp,f ; shift carry back in, right justify
>> movf Tmp,w ; get the value.
>> andlw B'00110011' ; get c+d and g+h pairs.
>> xorwf Tmp,f ; Tmp = a+b and e+f pairs, shift up 2 bits.
>> rrf Tmp,f ; shift down by 2 bits, carry 0.
>> rrf Tmp,f ; next shift.
>> addwf Tmp,f ; add pairs together.
>Same error also exists here.
I am not what you mean here. Tmp == a+b and e+f pairs aligned in each nibble
while W = c+d and g+h pairs.
>
>Reggie
1998\07\23@105254
by
Regulus Berdin

Hi all,
>Sorry to disagree with your analysis. I just tested the routine (with a
>small number of cases) and it works fine, except for misspelling "swapf"
I guess Chip is right. I am sorry :( .
I did not grasp the idea at first. For others sake, let me try to
further explain the concept of Chip's routine:
CountBitsInW ; Tested.
movwf Tmp ; Tmp = 'a b c d e f g h' bits
andlw B'01010101' ; W = '0 b 0 d 0 f 0 h'
addwf Tmp,f ; carry, f = 'a+b c+d e+f g+h 0'
;
; Tmp = a b c d e f g h
; W = 0 b 0 d 0 f 0 h
;
; Redefine Tmp as,
; Tmp = 0 b 0 d 0 f 0 h + a 0 c 0 d 0 g 0
; add W = 0 b 0 d 0 f 0 h
; 
; b 0 d 0 f 0 h 0 + a 0 c 0 d 0 g 0
;
; C&W = (a+b)(c+d)(e+f)(g+h)
rrf Tmp,f ; shift carry back in, right justify
;at this point Tmp=(ab)(cd)(ef)(gh) and C=0
movf Tmp,w ; get the value.
andlw B'00110011' ; get c+d and g+h pairs.
xorwf Tmp,f ; Tmp = a+b and e+f pairs, shift up 2 bits.
; the trick here is this,
; W = (00)(cd)(00)(gh)
; Xor Tmp = (ab)(cd)(ef)(gh)
; 
; Tmp = (ab)(00)(ef)(00)
rrf Tmp,f ; shift down by 2 bits, carry 0.
rrf Tmp,f ; next shift.
addwf Tmp,f ; add pairs together.
; then temp is aligned to W by shifting right 2 times
; before the 1st shift carry = 0 from above so 0 is shifted in
; and before the 2nd shift carry is also zero because bit0 of tmp=0
; before the shifts so,
;
; Tmp = 00(ab) (00)(ef)
; add W = 00(cd) (00)(gh)
; 
; Tmp = (a+b+c+d)(e+f+g+h)
swapf Tmp,w
addwf Tmp,w
andlw 0x0F ; mask unused nibble.
; swapping, adding the nibbles and masking the upper nibble
; derives the answer
I think this solution is actually 13 cycles excluding the return :).
Reggie
More... (looser matching)
 Last day of these posts
 In 1998
, 1999 only
 Today
 New search...