Truncated match.
PICList
Thread
'10 bit maths'
1996\11\01@055440
by
Andy David
Guys,
What's the fastest way to square a 10bit number?
 Andy.
*************************************************************
Andrew David Senior Project Engineer  Software
Ultronics Division spam_OUTAndyTakeThisOuTUltronics.co.uk
Ultra Hydraulics Ltd.
Anson Business Park
Cheltenham Road East
Staverton
Glos. GL2 9QN
Tel.: (01452) 858376 (Direct) Ultronics Fax.: (01452) 858377
*************************************************************
1996\11\01@130101
by
Chuck McManis
> What's the fastest way to square a 10bit number?
Look it up in a table. Given a 20 bit result, 10 bits is 1024
possible answers, a 32K bit eprom (2732) could hold the
result.
Actually squaring the number in PIC assembly, no idea. :)
Chuck
1996\11\04@031818
by
Stuart Taylor
Hi
>
> > What's the fastest way to square a 10bit number?
>
Try looking in the applications handbook, the maths routines include
8 and 16 bit multiplies. Use the 16 bit one but limit the first byte to
a 2 bit multiply.
Stuart Taylor
1996\11\04@040504
by
fastfwd

Stuart Taylor <.....PICLISTKILLspam@spam@MITVMA.MIT.EDU> wrote:
> > > What's the fastest way to square a 10bit number?
>
> Try looking in the applications handbook, the maths routines include
> 8 and 16 bit multiplies. Use the 16 bit one but limit the first byte
> to a 2 bit multiply.
Yeah, but those routines are really slow... For a specialized case
like this, a better choice (barring some clever squaring algorithm of
which I'm unaware) would be one of the following. Note that I've
only worked these out for 4bit numbers; the 10bit equivalents will
be a lot messier, but these examples'll give you the idea:
SOLUTION 1  Real fast for 4bit numbers, probably slower than
solution #2 for 10bit numbers:
; Enter with 4bit number to be squared in register "X".
; Exits with X*X in the W register.
movlw 0
clrc
btfsc x,0
addwf x,w
rlf x
btfsc x,2
addwf x,w
rlf x
btfsc x,4
addwf x,w
rlf x
btfsc x,6
addwf x,w
SOLUTION #2: Slower than #1 for 4bit numbers, but with
sufficientlygood ordering of instructions, could be pretty fast for
10bit numbers:
BASIC version first, showing the results of squaring all
possible 4bit numbers:
for a = 0 to 1
for b = 0 to 1
for c = 0 to 1
for d = 0 to 1
n = a*8 + b*4 + c*2 + d
w = 0
x = 0
y = 0
z = 0
if d then z = a*16 + b*8 + c*4 + 1
if c then y = a * 32 + b*16 + 4
if b then x = a*64 + 16
if a then w = 64
r = w+x+y+z
s = n*n
print n,s,r,: if s <> r then print "Error" else print "OK"
next:next:next:next
Now the PIC version:
; Enter with number to be squared in X.
; Exits with X*X in the Wregister.
btfss x,0
goto doc
clrc
rlf x,w
addlw 1
doc:
btfss x,1
goto dob
btfsc x,3
addlw 32
btfsc x,2
addlw 16
dob:
btfss x,2
goto doa
btfsc x,3
addlw 64
addlw 16
doa:
btfsc x,3
addlw 64
Personally, I'd go with Solution #1... As I was working out the
assembly code for Solution #2, it was quickly becoming unmanageable
even for only 4 bits.
Andy
Andrew Warren  fastfwdKILLspamix.netcom.com
Fast Forward Engineering, Vista, California
http://www.geocities.com/SiliconValley/2499
1996\11\04@043859
by
Keith Dowsett
>Guys,
>
>
>What's the fastest way to square a 10bit number?
>
>
Is there scope to break it down into three simpler multiplications?
(a+b)^2 = a^2 + b^2 + 2ab
This gives one two bit multiply (trivial) one 2 bit x 8 bit multiply
(straightforward) and one eight bit multiply (conventional but still a bit slow)
If you are subsequently rescaling the result to fit in 16bits you can save
a couple of cycles on the eight bit multiply too.
Keith.
==========================================================
Keith Dowsett "Variables won't; constants aren't."
Email: .....kdowsettKILLspam.....rpms.ac.uk
WWW: http://kd.rpms.ac.uk/index.html
1996\11\04@100333
by
Andy David

Stuart Taylor suggested...
>
> > > > What's the fastest way to square a 10bit number?
> >
> > Try looking in the applications handbook, the maths routines include
> > 8 and 16 bit multiplies. Use the 16 bit one but limit the first byte to
> > a 2 bit multiply.
Thanks, Stuart.
The apps handbook is always a good place to start, but those routines
are way too slow.
I was thinking that a 10 bit by 10 bit multiply is faster than a
16 by 16, and there might be a more efficient algorithm for squaring
than multiplying... I'm a little short of general reference material
maybe I'll ask my wife for some of Knuth's book for Christmas...
Keith Dowsett offered...
{Quote hidden}>
>Is there scope to break it down into three simpler multiplications?
>
>(a+b)^2 = a^2 + b^2 + 2ab
>
>This gives one two bit multiply (trivial) one 2 bit x 8 bit multiply
>(straightforward) and one eight bit multiply (conventional but still a bit
> slow)
>If you are subsequently rescaling the result to fit in 16bits you can
> save a couple of cycles on the eight bit multiply too.
Thanks Keith.
I hadn't thought about this type of appproach, but I expect that taking
the calculation 'head on' will be quicker. Time permitting I'll compare
this
approach to below and let you know how it works out.
Andrew Warren worked out...
>
> SOLUTION 1  Real fast for 4bit numbers, probably slower than
> solution #2 for 10bit numbers:
>
...[snip some lines of code]...
>
> SOLUTION #2: Slower than #1 for 4bit numbers, but with
> sufficientlygood ordering of instructions, could be pretty fast for
> 10bit numbers:
>
...[snip a few more lines of code]...
>
> Personally, I'd go with Solution #1...
thanks, Andy, that's very helpful. I'll complete these for 10bits and
if anyone is interested I'll post the code. As usual I've been sidetracked
today so it might be a day or 2.
For things like this, what reference material would you suggest?
 Andy.
*************************************************************
Andrew David Senior Project Engineer  Software
Ultronics Division EraseMEAndyspam_OUTTakeThisOuTUltronics.co.uk
Ultra Hydraulics Ltd.
Anson Business Park
Cheltenham Road East
Staverton
Glos. GL2 9QN
Tel.: (01452) 858376 (Direct) Ultronics Fax.: (01452) 858377
*************************************************************
1996\11\04@134244
by
Scott Dattalo

Keith Dowsett wrote:
>
> >Guys,
> >
> >
> >What's the fastest way to square a 10bit number?
> >
> >
> Is there scope to break it down into three simpler multiplications?
>
> (a+b)^2 = a^2 + b^2 + 2ab
>
Precisely what I was thinking. However, let me expand upon a
couple different branches.
Branch 1.
Suppose you divide the number you wish to square in half, that
is into 5 most significant bits and 5 lsb's:
N = a*32 + b
a = 5 msb's
b = 5 lsb's
Now using Keith's observation:
N^2 = 32^2 * a^2 + 64*a*b + b^2
= (a^2)<<10 + (a*b)<<5 + b^2
And the 10bit multiplication has been broken down into
three 5bit multiplications. However, with a little work
the 5bit squares may be evaluated via a tablelookup.
This would leave you with one general purpose 5bit
multiplication.
Branch 2.
Suppose you divide the number into three nibbles. The most
significant nibble would only be 2 bits wide.
N = a*256 + b*16 + c
N^2 = (a*256)^2 + (b*16)^2 + c^2 +
2*(a*256*b*16 + a*256*c + b*16*c)
= (a^2)<<16 + (b^2)<<8 + c^2 +
((a*b)<<12 + (a*c)<<8 + b*c<<4))<<1
Squaring the nibbles can be done trivially with a look up
table. The other three 4X4 bit multiplications can be
quickly computed with Andy W's 13 instruction suggestion.
If you wish to take advantage of 'a' being only 2bits
wide, then investigate
N^2 = (a^2)<<16 + (b^2)<<8 + c^2 +
((a*(b*16 + c)<<8 + b*c<<4))<<1
This reduces to: 3 table lookups, one 4X4 bit multiplication
and one 2X8 bit multiplication. My guess is that this could
be coded to execute in less than 60 cycles.
Scott
P.S. Have you considered running the square root routine
backwards? :)
1996\11\04@134451
by
fastfwd
Keith Dowsett <PICLISTspam_OUTMITVMA.MIT.EDU> wrote:
> > What's the fastest way to square a 10bit number?
>
> Is there scope to break it down into three simpler multiplications?
>
> (a+b)^2 = a^2 + b^2 + 2ab
>
> This gives one two bit multiply (trivial) one 2 bit x 8 bit multiply
> (straightforward) and one eight bit multiply (conventional but still
> a bit slow)
Keith:
Have you coded this into PIC assembly language? I have a feeling
that, in practice, it'll be slower than a standard 10bit x 10bit
multiplication.
Andy
=== Andrew Warren  @spam@fastfwdKILLspamix.netcom.com ===
=== Fast Forward Engineering  Vista, California ===
=== ===
=== Custodian of the PICLIST Fund  For more info, see: ===
=== http://www.geocities.com/SiliconValley/2499/fund.html ===
1996\11\04@142417
by
fastfwd

Andy David <KILLspamPICLISTKILLspamMITVMA.MIT.EDU> wrote:
> The apps handbook is always a good place to start, but those
> routines are way too slow. I was thinking that a 10 bit by 10 bit
> multiply is faster than a 16 by 16, and there might be a more
> efficient algorithm for squaring than multiplying... I'm a little
> short of general reference material maybe I'll ask my wife for some
> of Knuth's book for Christmas...
> ....
> For things like this, what reference material would you suggest?
Andy:
A 10x10 multiply IS faster than 16x16, mostly because the result is
guaranteed to fit in three bytes rather than 4. If you don't have
the code space to unroll the whole multiplication into inline code
(as I showed in my "Solution #1"), looped code can still be made
pretty fast by writing three loops:
One for the first few bits of the multiplier, where the result
of each addition of the multiplicand to the intermediate result
is guaranteed to fit in the leastsignificant two bytes,
one for the next group of bits, where the addition may affect
all three bytes of the result, and
one for the final few bits, where the addition is guaranteed to
only affect the two mostsignificant bytes of the result.
As for reference books... I don't know; I looked through Knuth and a
couple of other books, but didn't find any really fast squaring
algorithms.
As usual, though, I'll probably discover a month from now, while
investigating some other problem, that Knuth's book DOES have a good
squaring algorithm.
Andy
P.S. By the way, the "Solution #1" algorithm I posted earlier is
very similar to Keith Dowsett's suggestion; I just took it a little
farther and (for 4bit numbers) factored the problem into:
(a + b + c + d)^2
= a^2 + b^2 + c^2 + d^2 + 2ab + 2ac + 2ad + 2bc + 2bd + 2cd.
This removes ALL multiplications from the algorithm; each term just
reduces to either "0" or a single bit.
=== Andrew Warren  RemoveMEfastfwdTakeThisOuTix.netcom.com ===
=== Fast Forward Engineering  Vista, California ===
=== ===
=== Custodian of the PICLIST Fund  For more info, see: ===
=== http://www.geocities.com/SiliconValley/2499/fund.html ===
1996\11\05@063301
by
Andy David

> Keith Dowsett wrote:
> >
> > >
> > Is there scope to break it down into three simpler multiplications?
> >
> > (a+b)^2 = a^2 + b^2 + 2ab
> >
>
Scott Dattalo wrote:
> Precisely what I was thinking. However, let me expand upon a
> couple different branches.
>
hmmmm ...methinks I need an intuition implant...
my gut feel was that something like Andy W's "straight line"
solution would be a bit quicker than a few smaller calculations
whose results are combined. My gut feel isn't the most reliable
indication of anything, though :) I'm working on improving it.
Thanks to Keith, Andy and Scott for excellent suggestions
and examples... looks like I've got a few routines to
write! If I'm brave enough, I'll post the fastest one I can
come up with this week.
 Andy.
*************************************************************
Andrew David Senior Project Engineer  Software
Ultronics Division spamBeGoneAndyspamBeGoneUltronics.co.uk
Ultra Hydraulics Ltd.
Anson Business Park
Cheltenham Road East
Staverton
Glos. GL2 9QN
Tel.: (01452) 858376 (Direct) Ultronics Fax.: (01452) 858377
*************************************************************
1996\11\05@111351
by
John Payson

> > > Is there scope to break it down into three simpler multiplications?
> > >
> > > (a+b)^2 = a^2 + b^2 + 2ab
> hmmmm ...methinks I need an intuition implant...
> my gut feel was that something like Andy W's "straight line"
> solution would be a bit quicker than a few smaller calculations
> whose results are combined. My gut feel isn't the most reliable
> indication of anything, though :) I'm working on improving it.
>
> Thanks to Keith, Andy and Scott for excellent suggestions
> and examples... looks like I've got a few routines to
> write! If I'm brave enough, I'll post the fastest one I can
> come up with this week.
Something like this? [warning: untested]
clrf DstH
clrf DstM
clrf DstL
movf SrcL,w
andlw $0F
btfss Src,4
addwf DstM
rrf DstM
rrf DstL
btfss Src,5
addwf DstM
rrf DstM
rrf DstL
btfss Src,6
addwf DstM
rrf DstM
rrf DstL
btfss Src,7
addwf DstM
call Sqr4
addwf DstL
swapf SrcL
andlw $0F
call Sqr4
addwf DstM
; At this point, 16bit result is in DstM:DstH
; 25 words of code prior to this point (plus a
; 17word tablelookup). Total execution time:
; 35 cycles up to this point.
btfss SrcH,0
goto NoBit8
movf SrcL,w
btfsc C
incf DstH
addwf DstM
btfsc C
incf DstH
incf DstH
; Another 9 words for bit 8; 3 or 9 cycles to exec.
NoBit8:
btfss SrcH,1
goto NoBit9
movlw 4
btfss SrcH,0
movlw 8
addwf DstH
rlf SrcL,w
btfsc C
incf DstH
btfsc C
incf DstH
addwf DstM
btfsc C
incf DstH
addwf DstM
btfsc C
incf DstH
; Another 17 words for bit 9; 3 or 17 cycles to execute
; Total worstcase time: 35+26 = 61 cycles.
NoBit9:
retlw 0 ; All done!
Sqr4:
addwf PC
db 0,1,4,9,16,25,36,49,64,81,100,121,144,169,196,225
1996\11\05@152500
by
Matthew Mucker
>> > What's the fastest way to square a 10bit number?
>
>Try looking in the applications handbook, the maths routines include
>8 and 16 bit multiplies. Use the 16 bit one but limit the first byte to
>a 2 bit multiply.
>
>Stuart Taylor
What is this "applications handbook?" Could you please elaborate a little
more about whait it is, what's in it, and where to get it?
Thanks,
Matt
"DOS Computers manufactured by companies such as IBM, Compaq, Tandy, and
millions of others are by far the most popular, with about 70 million
machines in use wordwide. Macintosh fans, on the other hand, may note that
cockroaches are far more numerous than humans, and that numbers alone do
not denote a higher life form."
More... (looser matching)
 Last day of these posts
 In 1996
, 1997 only
 Today
 New search...