PICList Thread
'[PICLIST] [PIC] Number representation'
2002\03\13@065424 by Claudio Tagliola

Hi all,


I want to represent a certain value in as few 8-bit bytes as
possible. The value range is from -360.0 to 360.0, with 0.1 decimal
accuracy, maybe 0.01 if room permits. This (0.1 acc.) is possible in 16
bits, but what representation would be best/most convenient? I'm
thinking about fixed point, but I'm not sure (not my expertise). The
operation I would need most is simple addition, but gonio (trigonometric)
calculations are needed too, albeit less often. PIC platform is the 16
family. Code space is an issue, code execution speed is not (except when
it's > 1 ms per operation on a 40 MHz PIC).


Best regards,

Claudio

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2002\03\13@071644 by Spehro Pefhany

At 12:53 PM 3/13/02 +0100, you wrote:
>Hi all,
>
>
>I want to represent a certain value in as few 8-bit bytes as
>possible. The value range is from -360.0 to 360.0, with 0.1 decimal
>accuracy, maybe 0.01 if room permits. This (0.1 acc.) is possible in 16
>bits, but what representation would be best/most convenient? I'm
>thinking about fixed point, but I'm not sure (not my expertise). The
>operation I would need most is simple addition, but gonio (trigonometric)
>calculations are needed too, albeit less often. PIC platform is the 16
>family. Code space is an issue, code execution speed is not (except when
>it's > 1 ms per operation on a 40 MHz PIC).

I suggest scaling the numbers by 80. You then get a resolution of 0.0125,
and you can get a number for display by dividing by 8 (3 shifts),
giving you +/-3600 (i.e. tenths). 50 is another possibility.
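
A minimal sketch of the scale-by-80 idea in C, for illustration only. The function names and the choice of hundredths as the input unit are assumptions, not part of the suggestion above. It uses a plain division by 8 rather than explicit shifts, because right-shifting a negative value is implementation-defined in C.

```c
#include <stdint.h>

#define ANGLE_SCALE 80  /* stored counts per unit, so 1 count = 0.0125 */

/* Convert a value given in hundredths (e.g. 36000 for 360.00) to scaled
   counts. hundredths * 80 / 100 == hundredths * 4 / 5, rounded to nearest. */
int16_t angle_from_hundredths(int32_t hundredths)
{
    int32_t n = hundredths * 4;
    n += (n >= 0) ? 2 : -2;   /* half of the divisor 5, rounded down */
    return (int16_t)(n / 5);
}

/* Scaled counts -> tenths for display: divide by 8 (truncates toward 0). */
int16_t angle_to_tenths(int16_t counts)
{
    return (int16_t)(counts / 8);
}
```

With this scale the full range of +/-360 maps to +/-28800 counts, which fits in a signed 16-bit variable.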

Best regards,

Spehro Pefhany --"it's the network..."            "The Journey is the reward"
spam_OUTspeffTakeThisOuTspaminterlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
9/11 United we Stand



2002\03\13@093414 by Kübek Tony

Hi,

Claudio Tagliola wrote:
>I want to represent a certain value in as few 8-bit bytes as
>possible. The value range is from -360.0 to 360.0, with 0.1 decimal
>accuracy, maybe 0.01 if room permits. This (0.1 acc.) is possible in 16
>bits, but what representation would be best/most convenient? I'm
>thinking about fixed point, but I'm not sure (not my expertise). The
>operation I would need most is simple addition, but gonio (trigonometric)
>calculations are needed too, albeit less often. PIC platform is the 16
>family. Code space is an issue, code execution speed is not (except when
>it's > 1 ms per operation on a 40 MHz PIC).

Well, fixed point is definitely the way to go for limited-range numbers.
Floating point has its main purpose with highly varying data (i.e. data of
widely different magnitudes), and this is not the case here.

The selection of your fixed point format needs to consider several factors
if you want the most efficient (in speed) solution. One way, as Spehro
suggested, is definitely suitable if you want to present the data with 10x
resolution and the granularity is sufficient for your app. It will still
require 2 bytes/sample though. And it has the drawback that you need to
multiply each sample. If, for example, you add samples frequently to a
filter and extract data infrequently, then this would be somewhat limiting.

Normally one uses a fixed point format whose scale is a power of 2.
This way, particularly when using a filter such as an FIR with n samples
(where n is a power of 2), you just use the accumulated sum as a
'fixed point' value. For example, let's assume a 16-sample FIR filter. When
you use the accumulated sum you know it has the scale input*16, hence in
this case your 'fixed point' is located between bit numbers 3 and 4
(starting at 0 as the lowest bit). So bits 0-3 are the fractional part and
all above is the 'integer' part. To 'extract' the integer part one just
performs a right shift by 4.

Fixed point math is trivial in the sense that you do not need to know
that it is fixed point: use any standard add/subtract/multiply etc. routine
and you'll be fine. The only time one needs to know the format is when you
want to 'extract' your data to a 'readable' format. Nicolai's excellent code
generator for constant multiplication can perform wonders for this type of
calculation :)
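
The 16-sample accumulated-sum idea can be sketched in C roughly as follows. This is only an illustration: the boxcar_t type, the function names and the 8-bit sample width are assumptions made for the example.

```c
#include <stdint.h>

#define TAPS      16   /* number of samples, a power of 2 */
#define FRAC_BITS 4    /* log2(TAPS): the sum is mean * 16 */

typedef struct {
    uint8_t  samples[TAPS];  /* circular buffer of raw 8-bit samples */
    uint8_t  next;           /* index of the oldest sample */
    uint16_t sum;            /* running sum = mean with 4 fractional bits */
} boxcar_t;

/* Replace the oldest sample and return the updated sum (mean * 16). */
uint16_t boxcar_add(boxcar_t *f, uint8_t sample)
{
    f->sum = (uint16_t)(f->sum - f->samples[f->next] + sample);
    f->samples[f->next] = sample;
    f->next = (uint8_t)((f->next + 1) & (TAPS - 1));
    return f->sum;
}

/* Integer part of the mean: shift the fractional bits away. */
uint8_t boxcar_mean_int(const boxcar_t *f)
{
    return (uint8_t)(f->sum >> FRAC_BITS);
}

/* Fractional part of the mean in sixteenths (0..15). */
uint8_t boxcar_mean_frac16(const boxcar_t *f)
{
    return (uint8_t)(f->sum & (TAPS - 1));
}
```

No division is ever performed: the sum itself is the fixed point value, and only the display step needs the shift and mask.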

For example, consider the following:
Fixed point data in 3 bytes, where the upper two bytes are the integer part
and the lowest byte is the fractional part (a 256-sample sum, for example),
i.e. 16q8 format.
That is, real_value = Value_24bit/256; this is trivial of course. However,
now let's assume we want to have a resolution of 0.01 instead. We then
will have: real_value_in_hundredths = Value_24bit*100/256 =
Value_24bit*(100/256) = Value_24bit*0.390625
Now this could be a bit tricky and time consuming to calculate; here comes
the code generator to the rescue. Enter the input data size in Nicolai's
generator and enter the constant 0.390625. It will then generate very
efficient code to convert your fixed point raw data to 'readable' data
with resolution 0.01.
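
As a rough C illustration of the same conversion (not the output of the code generator, and with a made-up function name): since 100/256 = 25/64, the constant multiply reduces to a multiply by 25 followed by a 6-bit right shift. Unsigned input is assumed.

```c
#include <stdint.h>

/* Convert an unsigned 16q8 value (real value * 256, held in 24 bits)
   to hundredths: v * 100 / 256 == v * 25 / 64. */
uint32_t q8_to_hundredths(uint32_t v24)
{
    uint32_t scaled = v24 * 25u;   /* a 24-bit input times 25 fits in 32 bits */
    return (scaled + 32u) >> 6;    /* +32 rounds to nearest before the /64 */
}
```

On a PIC without a hardware multiplier, the multiply by 25 can itself be built from shifts and adds (25 = 16 + 8 + 1), which is the kind of code a constant-multiplication generator produces.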

It really is easy to deal with fixed point,


/Tony



2002\03\13@100511 by Claudio Tagliola

Hmm, I was thinking about fixed point already, but 24 bit is too much.
If I understand this fixed point matter correctly, the highest bit is
the sign bit (positive/negative), correct? So if I wanted to stuff it
into 16 bits, that leaves 15 bits for the actual value.

If I want to have a range of about 720, this would require at least 10
bits, with sign it would require 11 bits. If I want a 0.1 decimal
accuracy, this would need 4 bits. So it basically leaves only two
options: 11q5 or 12q4. Not very convenient, but representation size is
more important than convenience. Now, add/dec is straightforward with
fixed point. But how about multiply, divide and
sin/cos/tan/arcsin/arccos/arctan?
I've looked at the piclist, but it's all 8 bit based fixed point math
routines (e.g. 16q8, 16q16, etc).

{Original Message removed}

2002\03\13@133039 by David Koski

From -360.0 to 360.0 in 0.01 steps gives 72001 values.  16 bits has 65536
values.  If 0.02 steps is okay that gives 36001 values, and is within the 16 bit
range.  If 0.0 is binary 0x00 then 360.0 would be 18000 (0x4650) and -360.0
would be -18000 (0xb9b0).  What are you going to do with these values?  To
display them would require a divide by 10.  If they are just offloaded to
something else that can convert to -360.0..360.0 then you could just pass them
as is.

David

On Wed, 13 Mar 2002 12:53:49 +0100
Claudio Tagliola <.....cptagliolaKILLspamspam@spam@CHELLO.NL> wrote:

{Quote hidden}



2002\03\13@140438 by Kübek Tony

Hi,

Claudio Tagliola replied:

>Hmm, I was thinking about fixed point already, but 24 bit is too much.

Well I didn't propose that you should use that in *this* project :) it was
added just as an example of fixed point operations.

>
>If I understand this fixed point matter correctly, the highest bit is
>the sign bit (positive/negative), correct? So if I wanted to stuff it
>into 16 bits, that leaves 15 bits for the actual value.

Nope, the sign bit is not part of 'fixed point' per se. It matters, as in
any other case, only if you need to deal with numbers that can be both
positive *and* negative. It's not at all related to fixed point.
So if your numbers can be both negative and positive, then use the top bit
as sign storage, else don't! :) Regardless of fixed point usage.

>
>If I want to have a range of about 720, this would require at least 10
>bits, with sign it would require 11 bits. If I want a 0.1 decimal

Well, true (as it is stated); however, in your original question you posed
that the range was -360 to 360. This is 9 bits + sign bit; is this what you
refer to?

>accuracy, this would need 4 bits. So it basically leaves only two
>options: 11q5 or 12q4. Not very convenient, but representation size is

Well, if we are still talking about the range -360 to 360, you also have the
option 10q6 (including sign bit), which fills 16 bits.

>more important than convenience. Now, add/dec is straightforward with
>fixed point. But how about multiply, divide and

Well there can be some pitfalls if you want to have both operands as fixed
point numbers. But normally this is not the case.

Examples in fixed point format 4q2 (variables labelled with f) :


***** first, fixed point operands with 'normal' non-fixed point operands:

fA= 001010b ( i.e. 2.5 in decimal if 'fixed point' is considered 0010.10b)
this would be read ( as a 'normal' number ) as 0x0A or 10 decimal.

B= 0x02 ( 'normal' number )

some operations:

fA*B = 0x0A * 0x02 = 0x14 = 010100b which is 5.0 in fixed point as assumed.

fA/B = 0x0A/0x02 = 0x05 = 000101b which is 1.25 in fixed point as assumed

***** second, both operands are fixed point

Now the tricky bit: assume both operands are fixed point numbers:

fA*fA = 0x0A*0x0A = 0x64 = 1100100b which is 25.0 in fixed point. Apart
from the obvious 'overflow' (the result does not fit in 4q2 format),
it's not correct (or rather, not what we assumed). We assumed the outcome:
2.5*2.5 = 6.25 (0x19, 011001b). Hmm, let's consider what we have done,
trying to find what went wrong.
We have fA, which really is real_value*4 (i.e. a divide
by 4 leaves the integer part), as we chose our fixed point format
to be 4q2. Now when we do the multiplication,
we will have:

fA*fA=real_value*4 * real_value*4 = real_value^2 * 4^2

Here we can spot what happens: instead of the assumed output of
real_value^2 * 4 we have also squared the fixed point scale. So to
summarize, when performing multiplication where both operands are
fixed point numbers (assuming power of 2 fixed point):

1: Make sure the variable size is at least the number of fractional bits
larger than the input size, i.e. a 4q2 (6 bits) input needs to use a
mul routine operating on at least 4q2+2 bits (8 bits) operands.

2: After the operation, scale the result back by the same number of bits.
I.e. input 4q2*4q2 will generate an output with 4 fractional bits (q4); to
'get back' to q2, divide by the fractional scale, in this case 4.
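
The two rules above can be sketched in C along these lines. The names q2_mul and q2_div are invented for the example, the operands are treated as unsigned, and division is included on the assumption that it mirrors the multiply: there the scales cancel, so the dividend is pre-shifted instead.

```c
#include <stdint.h>

#define FRAC_BITS 2   /* 4q2: value = raw / 4 */

/* Multiply two 4q2 values. The raw product carries 4 fractional bits,
   so shift right by FRAC_BITS to return to 2 fractional bits. */
uint16_t q2_mul(uint8_t a, uint8_t b)
{
    uint16_t wide = (uint16_t)(a * b);     /* scale is 4 * 4 = 16 */
    return (uint16_t)(wide >> FRAC_BITS);  /* scale back to 4 */
}

/* Divide two 4q2 values. The scales cancel in a raw division, so
   pre-shift the dividend left by FRAC_BITS to keep 2 fractional bits.
   The caller must ensure b != 0. */
uint16_t q2_div(uint8_t a, uint8_t b)
{
    uint16_t wide = (uint16_t)((uint16_t)a << FRAC_BITS);
    return (uint16_t)(wide / b);
}
```

For the worked numbers above, q2_mul(0x0A, 0x0A) gives 0x19, i.e. 6.25, which is the expected 2.5*2.5.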


Perform the same calculation yourself and you'll see that it is fairly
trivial once you get over the initial threshold. As an exercise, do the
same with division.

Fixed point is not so hard; once you start using it, it'll become second
nature :)

>sin/cos/tan/arcsin/arccos/arctan?

Well, these need to be 'compensated' for the input size, but in general
not for a particular fixed point format (i.e. normally it doesn't matter
if you have 1q7, 2q6 or 3q5 etc.; it's just the 8 bits of input that matter).

>
>I've looked at the piclist, but it's all 8 bit based fixed point math
>routines (e.g. 16q8, 16q16, etc).

As I stated above ( ramblings :) ?), normally it doesn't matter until both
operands are fixed point.

/Tony



2002\03\13@141645 by Spehro Pefhany

>If I understand this fixed point matter correctly, the highest bit is
>the sign bit (positive/negative), correct?

Usually we use 2's complement so that 0 is 0x0000, 1 is 0x0001, -1
is 0xFFFF and -32768 is 0x8000. It's certainly possible to do it the
way you suggest. It slows calculation (adding and subtracting signed
numbers) but speeds up display routines. If you use tools such as the
Windows calculator, it uses 2's complement.

>So if I wanted to stuff it
>into 16 bits, that leaves 15 bits for the actual value.

This is true.


>If I want to have a range of about 720, this would require at least 10
>bits, with sign it would require 11 bits. If I want a 0.1 decimal
>accuracy, this would need 4 bits. So it basically leaves only two
>options: 11q5 or 12q4. Not very convenient, but representation size is
>more important than convenience. N

Usually we don't bother looking finer than byte resolution. To get a
range of +/- 3600 (0.1 resolution) you need

n ~= ln(7200)/ln(2) = 12.8 bits
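
For checking such bit counts without logarithms, a small helper along these lines may be handy. It is a sketch only, and bits_needed is a made-up name. A range of +/-3600 in steps of 1 has 7201 distinct values, which rounds the 12.8 above up to 13 bits.

```c
#include <stdint.h>

/* Smallest number of bits n such that 2^n >= count, i.e. ceil(log2(count)).
   count is the number of distinct values to encode (must be >= 1). */
unsigned bits_needed(uint32_t count)
{
    unsigned n = 0;
    uint64_t capacity = 1;        /* 2^n; 64 bits so it cannot wrap */
    while (capacity < count) {
        capacity <<= 1;
        n++;
    }
    return n;
}
```

The same helper gives 17 bits for the 72001 values of a 0.01 step and 16 bits for the 36001 values of a 0.02 step.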

If you were willing to go with less resolution, you could use
12 bits and pack two numbers into 3 bytes. Eg. 0.25 resolution, means
your number varies by 4 * 360 * 2 including sign or 2880 which fits
in the range of 2^12. I suppose you might need this if you were
implementing an FIR filter or something like that which needed more
RAM than your PIC had using 2 bytes per.
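
Packing two 12-bit numbers into 3 bytes could look something like the following C sketch. The byte layout and the function names are arbitrary choices made for this example, and the values are assumed to be 2's complement in the range -2048..2047.

```c
#include <stdint.h>

/* Pack two signed 12-bit values (-2048..2047) into three bytes.
   Layout (an arbitrary choice): out[0] = a[11:4],
   out[1] = a[3:0] : b[11:8], out[2] = b[7:0]. */
void pack12x2(int16_t a, int16_t b, uint8_t out[3])
{
    uint16_t ua = (uint16_t)a & 0x0FFFu;   /* keep the low 12 bits */
    uint16_t ub = (uint16_t)b & 0x0FFFu;
    out[0] = (uint8_t)(ua >> 4);
    out[1] = (uint8_t)(((ua & 0x0Fu) << 4) | (ub >> 8));
    out[2] = (uint8_t)(ub & 0xFFu);
}

/* Sign-extend a 12-bit 2's complement field to int16_t. */
static int16_t sext12(uint16_t v)
{
    int x = (int)(v & 0x0FFFu);
    return (int16_t)((x ^ 0x800) - 0x800);
}

/* Recover the two values stored by pack12x2(). */
void unpack12x2(const uint8_t in[3], int16_t *a, int16_t *b)
{
    uint16_t ua = (uint16_t)(((uint16_t)in[0] << 4) | (in[1] >> 4));
    uint16_t ub = (uint16_t)(((uint16_t)(in[1] & 0x0Fu) << 8) | in[2]);
    *a = sext12(ua);
    *b = sext12(ub);
}
```

At 0.25 resolution, +360 is stored as 1440 and -360 as -1440, both well inside the 12-bit range.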

>ow, add/dec is straightforward with
>fixed point. But how about multiply, divide and

Multiply with fixed point can be done by doing a n x n multiply
which yields a 2*n bit wide answer, the key thing then is to discard
the bits at the left which are ALWAYS zero regardless of what the
two numbers may be, and to keep enough bits on the right to get
enough resolution. This can be a non-trivial trade-off. You may
have to do calculations (and maintain intermediate results) to 16
or 32 bits to get good 12 or 16 bit answers in all cases.

Floating point shines here because it always keeps the same number of
bits of resolution (but that may well have to be more than your final
number to get a good answer depending on how your algorithms work).
For example, one of the things I do a lot of is polynomial evaluation
with polynomials derived from equations that are almost linearly
dependent. The condition number (the ratio of the largest to smallest
eigenvalues) of the resulting matrix is >> 1, meaning a slight error
in a coefficient or the intermediate calculations results
in a large error in the result. This can cause the result to jump by
more than the display resolution or do other undesirable things.
(This sort of thing is sometimes incorrectly blamed on "rounding
errors"). The tradeoff with floating points is that you lose some
bits to keeping track of the radix point in each number.

>sin/cos/tan/arcsin/arccos/arctan?

If it gets more complex than a square root, I generally use a C
compiler. I've not seen the advantage in getting into CORDIC or
LUTs for this yet, not that it's any great problem. Exceptions:
For fast (microseconds) 8-12 bit sin() and cos(), I do use a LUT,
and I've written some log/exp stuff for other micros.  Sounds like
you might want to at least consider using floating point C for your
processing if you are going to be doing a lot of polar-rectangular
coordinate transformations or similar stuff.

>I've looked at the piclist, but it's all 8 bit based fixed point math
>routines (e.g. 16q8, 16q16, etc).

Have a look on Microchip's site, there are 16 bit and floating point
math routines. ISTR some trig too.

Best regards,

Spehro Pefhany --"it's the network..."            "The Journey is the reward"
speffspamKILLspaminterlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
9/11 United we Stand



2002\03\13@154320 by Bob Ammerman

----- Original Message -----
From: "David Koski" <.....davidKILLspamspam.....KOSMOSISLAND.COM>
To: <EraseMEPICLISTspam_OUTspamTakeThisOuTMITVMA.MIT.EDU>
Sent: Wednesday, March 13, 2002 9:25 AM
Subject: Re: [PIC] Number representation


> From -360.0 to 360.0 in 0.01 steps gives 72001 values...

er, no it doesn't it gives 7201 values



2002\03\13@155447 by steve

> >sin/cos/tan/arcsin/arccos/arctan?
>
> If it gets more complex than a square root, I generally use a C
> compiler. I've not seen the advantage in getting into CORDIC or
> LUTs for this yet, not that it's any great problem.

I would recommend reading Jack Crenshaw's "Math toolkit for real-
time programming". The title and chapter summary make it sound
like it would be at a higher level than would be appropriate for 8 bit
microcontrollers, but how he goes about deriving methods is very
appropriate. It's also a pretty easy read for a math book.

Steve.

======================================================
Steve Baldwin                Electronic Product Design
TLA Microsystems Ltd         Microcontroller Specialists
PO Box 15-680, New Lynn      http://www.tla.co.nz
Auckland, New Zealand        ph  +64 9 820-2221
email: stevebspamspam_OUTtla.co.nz      fax +64 9 820-1929
======================================================



2002\03\13@160454 by Pic Dude

7201?  Me slightly confused.  Or is that femto-confused? :-)

Hmmm ... ((360 - (-360)) / .01) +1 = 72001.


{Original Message removed}

2002\03\13@170257 by Thomas McGahee

Of course, for the utmost in simplicity, use 2 as your storage
factor. Allows storage of values that when converted will have
a range from +65535 to -65535. The maximum error is off by 1.
For example, 35999 (which represents 359.99) is stored as
17999. When converted back you get 35998. ALL ODD values
will therefore be too small by exactly one count. Thus both
35998 and 35999 will ultimately convert to 35998.

If you can tolerate the least significant digit in the answer
being off by 1, then this divide by 2 storage method is
the way to go, as it is super simple to implement.

Fr. Thomas McGahee


{Original Message removed}

2002\03\13@172558 by Spehro Pefhany
At 05:05 PM 3/13/02 -0500, you wrote:
>Of course, for the utmost in simplicity, use 2 as your storage
>factor.

Your 2 is equivalent to my suggestion of 50.
360.00 * 50 = 18000

{Quote hidden}

The 80 scale factor gives 60% better resolution (0.0125 vs. 0.02) and
requires a  simple 3-bit shift to put it into a form where it can be
converted to BCD or ASCII for display with 0.1 resolution.  I thought
there might be cases  where the 50 is better, though, which is why I
mentioned it. I just can't think of any at the moment. 8-(

Best regards,

Spehro Pefhany --"it's the network..."            "The Journey is the reward"
@spam@speffKILLspamspaminterlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
9/11 United we Stand



2002\03\13@201331 by Bob Ammerman

Sorry,

I was fooled by the -360.0 to 360.0,

missed the 'by 0.01 steps'.

I would have written it as:

-360.00 to 360.00 by 0.01 steps

to keep the precision of all the values the same.

Sorry,

Bob Ammerman


{Original Message removed}

2002\03\14@165151 by Claudio Tagliola

Thanx all for the help (nice crash course Tony :) I'm looking at
ordering the book, it looks very good.

What I understand from Spehro is that the 2's complement signing method is
easier to do math with than the highest-bit sign. Display isn't
very important; that module has a separate PIC, so it has enough time to
get the numbers displayed. The other end is a PC, so that has enough
power to convert it back to a 'normal' format. The math is much more
important: we have one module which has to do a fair amount of gonio
routines (calculate a heading from the three-axis orientation), but
still only at about 10 Hz max, so that'll leave enough CPU cycles
to get that done.

However, as I'm using a C compiler, what would be the more advisable
route: convert data at the incoming point to a format the compiler can
handle or write some specific fixed point routines to handle the math I
want to do? From what I've read, the 12q4 format is good, with a range
of -2000 to 2000 and an accuracy of 0.067 (right? I messed up with the
range before, so a check can't hurt :).

(Look out, ramblings ahead...) This range is mainly used for angles (no
surprise with a 360 value range), math is mostly navigation. So one
other option is a binary value for a full rotation (256 for example). If
you put this into the 12q4 format, the 4 bit accuracy will be converted
back to roughly 0.1 degree (0.0879 to be exact). However, this will make
some calculations a bit easier, as any overflow of the 8q4 fixed point
can be discarded, as that would point to the same direction again. Total
range would be -2880 to 2880. Hmm... That's a bit much for angles.

Last ramble: 8q8 with a scaled 256 as a full 360 degree rotation. This
would be easy math; 8q8 operator 8q8 with 16q8 result math routines are
plentiful. Most results can discard the highest 8 bits, so that
would result in nice clean code. Am I overlooking something here?
Problems I might encounter?
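
To make the 8q8 "256 units per turn" idea concrete, here is a hedged C sketch. The type and function names are invented, angles are kept unsigned in 0..360, and hundredths of a degree are assumed as the external unit. It is meant only to show how the wrap-around falls out of the variable size.

```c
#include <stdint.h>

/* Binary angle in 8q8 format: 256.0 units (raw 65536) = one full turn,
   so a uint16_t wraps exactly where the angle does. */
typedef uint16_t bangle_t;

/* Add two angles; overflow past a full turn is discarded by the type. */
bangle_t bangle_add(bangle_t a, bangle_t b)
{
    return (bangle_t)(a + b);
}

/* Convert hundredths of a degree (0..35999) to a binary angle, rounded. */
bangle_t bangle_from_centideg(uint32_t centideg)
{
    /* raw = centideg * 65536 / 36000; fits in 32 bits for the stated range */
    return (bangle_t)((centideg * 65536UL + 18000UL) / 36000UL);
}

/* Convert a binary angle back to hundredths of a degree (0..35999). */
uint32_t bangle_to_centideg(bangle_t a)
{
    return (uint32_t)(((uint32_t)a * 36000UL + 32768UL) >> 16);
}
```

One count is 360/65536, about 0.0055 degree, so this format meets the 0.01 goal. Adding 270 and 180 degrees yields 90 degrees with no explicit range check.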

Regards,
Claudio

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2002\03\15@020415 by David Koski

Maybe I missed it but I don't see in your posts where you describe the input
(transducer?) or the algorithm.  Do you have something like this?

transducer -> number_format -> algorithm -> output

If so, the best solution will probably take all steps into consideration.  You
already mentioned that the output is offloaded for display.  If the algorithm
can be modified to accommodate integer input or what the transducer produces,
why not go with that?  It looks like the algorithm needs to call the shots, as
long as it produces meaningful values even if they have to be converted by the
other PIC.  Again, I am probably missing something.

An example of adapting an algorithm:  A transducer produces degrees C in 8q8
format.  The desired result is degrees F * 10.  It will be output to a device
that will handle the display routine like your example.  There will be one
decimal place of precision.  Now consider that the transducer value is actually
Deg C * 256.  So you could do something like this:

Deg F * 10 = ((((transducer_value/256)*9)/5)+32)*10

But we can do something like this instead:

Deg F * 10 = 18 * Deg_C + 320

It could be implemented something like this:

temp1 = transducer>>4; // temp1 = Deg C * 16
temp2 = transducer>>7; // temp2 = Deg C * 2
DegC18 = temp1 + temp2; // DegC18 = Deg C * 18
result = DegC18 + 320;  // Deg F * 10

This can be further refined.  Also, negatives are not considered here.  But
comparing to the previous example, much less code space is required as only
shifts and additions are needed.  Now we have something like this:

transducer -> algorithm -> output
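
Wrapped up as a complete C function, the shift-and-add version might look like this. It is a sketch under the same restriction mentioned above (non-negative temperatures only); the function name is made up.

```c
#include <stdint.h>

/* Convert a non-negative temperature in 8q8 degrees C (deg C * 256)
   to tenths of a degree F, using F*10 = 18*C + 320 with 18 = 16 + 2. */
uint16_t c_q8_to_f_tenths(uint16_t transducer)
{
    uint16_t c16 = transducer >> 4;   /* deg C * 16 */
    uint16_t c2  = transducer >> 7;   /* deg C * 2  */
    return (uint16_t)(c16 + c2 + 320u);
}
```

For 25.0 deg C the input is 6400 and the result is 770, i.e. 77.0 deg F. Each shift truncates separately, so the result can be low by up to about 0.2 deg F compared with an exact calculation.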

---

I don't see what difference using a compiler has since it produces assembly (or
machine) code anyway.  It is always good to keep an eye on the assembly list
file anyway.  BTW, what "format" is the "data at the incoming point"?

David

On Thu, 14 Mar 2002 22:48:48 +0100
Claudio Tagliola <KILLspamcptagliolaKILLspamspamCHELLO.NL> wrote:

{Quote hidden}

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


2002\03\15@025102 by Claudio Tagliola

My project doesn't have one linear manipulation line. There are a lot of
different inputs (some directly from a PIC's A/D, some digitally from
other PIC based modules, some from a PC), and outputs are also plenty.
Again, some to PIC based LCD displays, others to a PC and still others
to some PIC based actuator. One of the types of data in this system,
among many other types, is angles, used for navigation. Even the
navigation is not one single algorithm, there are different modes of
navigation, all with its own algorithms.

So the properties of the format have to mirror the use. Therefore, a
-360.00 to 360.00 with 0.01 accuracy should be easiest. However, it
doesn't have to mimic the actual scaling, as this is only needed for
display.

Another point, in the algorithms, all common math operators are needed,
including most of the sin/cos/tan/arcs/arcc/arct methods. On the other
hand, CPU cycles are plenty, as is code space (we plan to migrate to
larger 16k or 32k PICs if needed).

As for my question about the C compilers, I was wondering what the
penalty would be for conversion to supported multibyte
fixed/float/integer types, compared to writing one's own math library
for a certain number representation format.

Ok, gotta run now.

Regards,
Claudio

{Original Message removed}

2002\03\15@060656 by Kübek Tony

Hi,

Claudio Tagliola wrote:
<snip>
>What I understand from Spehro the 2's complement signing method is
>easier to do math with compared to the highest bit sign. Display isn't
<snip>

Well, either I'm confused or you are :) These are the same method. 2's
complement is used to negate a number; one performs:

1: 1's complement on the number ( invert all bits 1->0, 0->1)
2: add one

This also means (provided that the resulting number fits in the variable
size) that the top bit will have different polarity for positive and
negative numbers.
Normally one treats a top bit of 1 to mean that a number is negative.

Consider this:

A = 1 dec = 0x01  ( note top bit = 0 )

B = ~A (complement, on PICs COMF A ) = 0xFE

C = B+1 = 0xFF ( note top bit = 1 ); this is to be interpreted as -1 dec

Let's try with a higher number:

A2 = 100 dec = 0x64 ( top bit = 0 )
B2 = ~A2 = 0x9B
C2 = B2+1 = 0x9C ( top bit = 1 ), to be interpreted as -100 dec

Now this is all fine and dandy; however, the main thing about 2's complement
is that you basically only need to make sure that the inputs always fit
in the variables. Apart from that, the 'math' will work out without much
special consideration.
Check out:

C + A2 ( -1 + 100 ) = 0xFF + 0x64 = 0x163; the top bit (byte) is discarded
as we use only 8 bit math, and we will have 0x63 = 99 dec.

C + C2 ( -1 + -100 ) = 0xFF + 0x9C = 0x19B; as before we discard the top bit
(byte) and we will have 0x9B. As you can see, the top bit is 1, which we
decided means that the number is negative. To 'convert' this to a readable
'positive' form we do something like (in C-like code):

if (input & 0x80)               /* top bit set: number is negative */
{
    negative = true;
    number = (uint8_t)~input;   /* 1's complement */
    number += 1;                /* add one: 2's complement */
}

I.e. 2's complement again. Let's try with the last result:

we had 0x9B; the top bit set means this is a negative number,
so set a flag to send the '-' to the display if required.

Complement: ~0x9B = 0x64
Add one: 0x64+1 = 0x65 = 101 dec :) !!!!
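
The worked example can be reproduced with a few lines of C. This is only a sketch with made-up function names, using uint8_t so that the carry out of the top bit is dropped just as in 8-bit PIC math.

```c
#include <stdint.h>
#include <stdbool.h>

/* Negate an 8-bit value the way described above: invert, then add one.
   The cast discards the carry out of bit 7. */
uint8_t negate8(uint8_t x)
{
    return (uint8_t)(~x + 1u);
}

/* Add two 8-bit 2's complement values; the carry out of bit 7 is dropped. */
uint8_t add8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Split a 2's complement byte into a sign flag and a magnitude for display. */
uint8_t magnitude8(uint8_t x, bool *negative)
{
    *negative = (x & 0x80u) != 0;
    return *negative ? negate8(x) : x;
}
```

With these, add8(0xFF, 0x9C) yields 0x9B and magnitude8(0x9B, &neg) returns 0x65 (101) with neg set, matching the numbers above.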

Have fun :)

For what it's worth, I personally think signed math ( in asm ) is
harder than fixed point math. There are more 'obscure' pitfalls
with signed math.

/Tony


