Truncated match.
PICList
Thread
'A challenge for square(r)s'
1999\03\22@142630
by
Scott Dattalo
It's been a while since we've had a challenge. And now that Dmitry's
back, I guess we better see if he's lost any of his pic skills :).
Now this isn't really a brand new challenge because I've seen bits and
pieces before posted to the pic list. But at any rate:
1) Shortest and/or fastest code to square an 8 bit number. (The shortest
is probably Andy's 8 by 8 multiply. The fastest is about 35 cycles, as I
recall).
2) Shortest and/or fastest code to square a 4bit number. (I can think
of one that takes only 10cycles)
No points  just glory!
1999\03\22@150244
by
Byron A Jeff
>
> It's been a while since we've had a challenge. And now that Dmitry's
> back, I guess we better see if he's lost any of his pic skills :).
>
> Now this isn't really a brand new challenge because I've seen bits and
> pieces before posted to the pic list. But at any rate:
>
> 1) Shortest and/or fastest code to square an 8 bit number. (The shortest
> is probably Andy's 8 by 8 multiply. The fastest is about 35 cycles, as I
> recall).
>
> 2) Shortest and/or fastest code to square a 4bit number. (I can think
> of one that takes only 10cycles)
Wouldn't the fastest simply be to have a jump table that returns the square
of the number? Only 16 entries and should take 6 cycles if the number to
square is already in W.
BAJ
1999\03\22@153610
by
Wagner Lipnharski

If 8^2 = 64 = 2^3 x 2^3 = 2^(3+3) = 2^6
then square root of 64 is 2^(6/2) = 2^3
It works for simple binary numbers.
Take a look at:
http://www.interstice.com/~sdattalo/technical/software/pic/picsqrt.html
Byron A Jeff wrote:
{Quote hidden}>
> >
> > It's been a while since we've had a challenge. And now that Dmitry's
> > back, I guess we better see if he's lost any of his pic skills :).
> >
> > Now this isn't really a brand new challenge because I've seen bits and
> > pieces before posted to the pic list. But at any rate:
> >
> > 1) Shortest and/or fastest code to square an 8 bit number. (The shortest
> > is probably Andy's 8 by 8 multiply. The fastest is about 35 cycles, as I
> > recall).
> >
> > 2) Shortest and/or fastest code to square a 4bit number. (I can think
> > of one that takes only 10cycles)
>
> Wouldn't the fastest simply be to have a jump table that returns the square
> of the number? Only 16 entries and should take 6 cycles if the number to
> square is already in W.
>
> BAJ
1999\03\22@155817
by
Marc
> 2) Shortest and/or fastest code to square a 4bit number. (I can think
> of one that takes only 10cycles)
A 4 bit number IMHO is most efficiently squared using a lookup table. That's
a bit less than 10 cycles :)
1999\03\22@164825
by
Scott Dattalo

Byron A Jeff wrote:
{Quote hidden}>
> >
> > It's been a while since we've had a challenge. And now that Dmitry's
> > back, I guess we better see if he's lost any of his pic skills :).
> >
> > Now this isn't really a brand new challenge because I've seen bits and
> > pieces before posted to the pic list. But at any rate:
> >
> > 1) Shortest and/or fastest code to square an 8 bit number. (The shortest
> > is probably Andy's 8 by 8 multiply. The fastest is about 35 cycles, as I
> > recall).
> >
> > 2) Shortest and/or fastest code to square a 4bit number. (I can think
> > of one that takes only 10cycles)
>
> Wouldn't the fastest simply be to have a jump table that returns the square
> of the number? Only 16 entries and should take 6 cycles if the number to
> square is already in W.
>
> BAJ
Of course... What I was thinking about applies to multiplication. For
example if you wanted to multiply two 4bit numbers stored in separate
registers how would you go about doing this? After I posted the
challenge, I realized that the solution I was thinking about requires 14
cycles. A lookup table can still be used, however 256 entries (at most I
guess) would be required.
So if I may, let's change the challenge to multiplication of 4bit
numbers instead of squaration (new term) of them. I'll by BAJ a Bud.
Scott
1999\03\22@175833
by
John Payson

So if I may, let's change the challenge to multiplication of 4bit
numbers instead of squaration (new term) of them. I'll by BAJ a Bud.
Well, let's see...
movlw 0
bcf C
btfss n1,0
addwf n2,w
rlf n2,f
btfss n1,1
addwf n2,w
rlf n2,f
btfss n1,2
addwf n2,w
rlf n2,f
btfss n1,3
addwf n2,w
rlf n2,f
swapf n2,f
15 cycles, putting the result in W and leaving n1 and n2 untouched at
the end. Striking the requirement that n2 be untouched would save the
last two cycles. Note that if n1<=15 and n1*n2<=255 this routine will
produce a correct result even if n2>15; the value in n2 will be trashed
in such a case, though.
Aiming for a result in n2...
swapf n1,w ; Assume top 4 bits zero
btfss n2,0
addwf n2,f
rrf n2,f
btfss n2,0
addwf n2,f
rrf n2,f
btfss n2,0
addwf n2,f
rrf n2,f
btfss n2,0
addwf n2,f
rrf n2,f
13 cycles, leaving the result in n2. This one won't work if either fac
tor is oversized.
BTW, on the subject of squaring, a potentially useful observation in
some cases is that
(a+b)^2 = a^2 + b^2 + 2ab
Thus, any multiplication could be performed as three squaring operations,
a subtraction, and a shift. Not generally worthwhile on a PIC, but if
one has an external OTP loaded with, e.g., a 16bit table of squares it
could be interesting in some cases...
Another useful trick with the above squaring formula, btw, is that when
squaring a large number some of the partial products will be identical.
For example, a 16bit number can be squared using two "general" 8x8 mul
tiplies and an 8bit square operation. To square a 32bit number (made
of 4byte chunks) one would compute:
a b c d
a b c d

ad bd cd dd
ac bc cc dc
ab bb cb db
aa ba ca da

aa  2ab  bb+2ac  2ad+2bc  2bd+cc  2cd  dd
Thus 10 multiplies (4 of them squares).
1999\03\23@110722
by
Ray Gardiner
><snip>
>So if I may, let's change the challenge to multiplication of 4bit
>numbers instead of squaration (new term) of them. I'll by BAJ a Bud.
>
>Scott
; X*Y ; result left in W
; X and Y are 4 bits
;
;
CLRW ; Clear Result
CLRC ; Clear Carry for RLF later ;
BTFSC Y,0 ; ?*1
ADDWF X,W ; W = ?X
RLF X,F ; X = 2X
;
BTFSC Y,1 ; ?*2
ADDWF X,W ; W = ?X + ?2X
RLF X,F ; W = 4X
BTFSC Y,2 ; ?*4
ADDWF X,W ; W = ?X + ?2X + ?4X
RLF X,F ; W = 8X
BTFSC Y,3 ; ?*8
ADDWF X,W ; ?X + ?2X + ?4X + ?8X
;;;;;;;;;;;;;;;;;;;;;;;;;
13 cycles result in W 13 bytes of code space
Looking back, I now realise that is what John Payson has already
posted. Albeit in a different form. Sigh(!)
Thanks for the challenge Scott, keep 'em coming.
Now, about that 10 cycle solution you mentioned....
Ray Gardiner spam_OUTrayTakeThisOuThdc.com.au
1999\03\23@131822
by
Andy David
Guys,
I've been off the list for a bit over a week, can anyone email me a summary
of this challenge? Not that I'm going to submit anything, I'm just
interested
Thanks,
 Andy.

Andrew David, Software Manager, Ultronics Ltd, Cheltenham.
.....akdavidKILLspam@spam@Ultronics.co.uk http://www.ultronics.com/

1999\03\23@214003
by
Scott Dattalo

John Payson wrote:
>
> So if I may, let's change the challenge to multiplication of 4bit
> numbers instead of squaration (new term) of them. I'll by BAJ a Bud.
>
> Well, let's see...
<snip>
> 15 cycles, putting the result in W and leaving n1 and n2 untouched at
> the end. Striking the requirement that n2 be untouched would save the
<snip>
>
> Aiming for a result in n2...
>
<snip>
> 13 cycles, leaving the result in n2. This one won't work if either fac
> tor is oversized.
and then Ray Gardiner likewise wrote:
>
<snip>
> 13 cycles result in W 13 bytes of code space
>
> Looking back, I now realise that is what John Payson has already
> posted. Albeit in a different form. Sigh(!)
These are the solutions that Andy Warren posted a couple of years ago (I
have no idea if he was the original author or not). I think this
shiftandadd trick is really clever. Good work, but what about the 8X8
squaring? Hint  it uses a variation of this same trick.
{Quote hidden}>
> BTW, on the subject of squaring, a potentially useful observation in
> some cases is that
>
> (a+b)^2 = a^2 + b^2 + 2ab
>
> Thus, any multiplication could be performed as three squaring operations,
> a subtraction, and a shift. Not generally worthwhile on a PIC, but if
> one has an external OTP loaded with, e.g., a 16bit table of squares it
> could be interesting in some cases...
>
> Another useful trick with the above squaring formula, btw, is that when
> squaring a large number some of the partial products will be identical.
> For example, a 16bit number can be squared using two "general" 8x8 mul
> tiplies and an 8bit square operation. To square a 32bit number (made
> of 4byte chunks) one would compute:
>
> a b c d
> a b c d
> 
> ad bd cd dd
> ac bc cc dc
> ab bb cb db
> aa ba ca da
> 
>
> aa  2ab  bb+2ac  2ad+2bc  2bd+cc  2cd  dd
>
> Thus 10 multiplies (4 of them squares).
Cute, Eh?
There's also a variation of this algorithm for multiplying two arbitrary
numbers:
a b
c d

ad bd
ac bc

ac  ad + bc  bd
now the middle term can be rewritten
ad + bc = ac + bd  (a  b)(c  d)
Which at first glance seems like an inefficient substitution. However,
notice that the 'ac' and 'bd' terms have already been computed. So two
multiplications and an addition are replaced with one multiplication and
two additions and two 'half sized' additions. The real savings occur for
multiplying large numbers. The efficiency compared to longhand
multiplication is comparable to efficiency of FFT's vs. DFT's.
Unfortunately, when I attempted to use this to multiply 16 bit numbers
on a pic (using 8 bit numbers for the a,b,c,d terms) I found that the
16bit longhand shiftandadd algorithm to be much faster (and
smaller).
Scott
1999\03\24@194214
by
Dmitry Kiryashov
John Payson wrote:
> Aiming for a result in n2...
>
> swapf n1,w ; Assume top 4 bits zero
> btfss n2,0
Why btfss ?
> addwf n2,f
> rrf n2,f
If we have Cy=1 before entering this code fragment and we skip
first addition we'll got error.
{Quote hidden}> btfss n2,0
> addwf n2,f
> rrf n2,f
> btfss n2,0
> addwf n2,f
> rrf n2,f
> btfss n2,0
> addwf n2,f
> rrf n2,f
>
> 13 cycles, leaving the result in n2. This one won't work if
> either factor is oversized.
At all code looks very impressive and compact.
WBR Dmitry.
1999\03\25@112330
by
John Payson
John Payson wrote:
> Aiming for a result in n2...
>
> swapf n1,w ; Assume top 4 bits zero
> btfss n2,0
Why btfss ?
> addwf n2,f
> rrf n2,f
If we have Cy=1 before entering this code fragment and we skip
first addition we'll got error.
[snips]
Mea culpae. Sorry 'bout that.
1999\03\29@121702
by
Dmitry Kiryashov
Hi guys. Sorry for such delayed reply.
Scott Dattalo wrote:
>
> John Payson wrote:
> >
> > 13 cycles, leaving the result in n2. This one won't work if either fac
> > tor is oversized.
How about this ?
; 4 X 4 multiplication
movfw x ;0000abcd
addwf x,F
btfss y,0 ;0000efgh
movlw 0
btfsc y,1
addwf x,W
rlf x,F
btfsc y,2
addwf x,W
rlf x,F
btfsc y,3
addwf x,W ;result in W
; 12 clocks/words
8 bit version maybe coming soon ;)
WBR Dmitry.
More... (looser matching)
 Last day of these posts
 In 1999
, 2000 only
 Today
 New search...