Searching \ for '[PIC]: Super Assembler (parser)' in subject line. ()
Make payments with PayPal - it's fast, free and secure! Help us get a faster server
FAQ page: www.piclist.com/techref/microchip/devices.htm?key=pic
Search entire site for: 'Super Assembler (parser)'.

Exact match. Not showing close matches.
PICList Thread
'[PIC]: Super Assembler (parser)'
2001\05\02@042018 by Roman Black

flavicon
face
After reading Olin's post a few days back about
using the linker it got me thinking. With linking
modules you get some advantages over absolute code
and some disadvantages. But it did give me an idea.

What about a "super assembler parser" for the PIC?

You could write the code in assembly in an easy
format, using variable names and call/goto labels
without having to think about paging and banking
issues. For program branching you would use simple
mnemonics like "branch on zero" or "call on zero"
etc.

Then you run a little dos program (parser) that
goes through the code, and replaces all the branching
mnemonics with correct (and smallest) code including
all PCLATH and paging issues. And anything that was
obviously a table or computed goto would be relocated
to the correct code page using ORGs. It would generate
an output .asm file that was easy to read and did
everything for you.

You could just type in all your assembler code,
make sure it is marked as 16F84/16F877 etc at the
top and it would generate the correct .asm file
to run on that chip. change the chip tag and
re-parse it to reconfigure code for the new PIC.

Maybe you could even get it to use the same code
on 12xxx PICs by replacing the "addlw" etc with
the appropriate instructions.

Any thoughts on this?? :o)
-Roman

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\02@074900 by Kari Lehikko

flavicon
face
> Any thoughts on this?? :o)
> -Roman

Sounds like a C-compiler to me :P

- Kari -

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\02@104220 by Scott Dattalo

face
flavicon
face
On Wed, 2 May 2001, Kari Lehikko wrote:

> > Any thoughts on this?? :o)
> > -Roman
>
> Sounds like a C-compiler to me :P

Sure does!

http://www.dattalo.com/gnupic/sdcc.html

What Roman describes is being implemented in SDCC's PIC Port. Specifically,
there's a post code compile stage called "pCode" that takes SDCC's PIC generated
code and optimizes it. Currently it performs two type optimizations. One is a
conditional "peep hole" optimizer, the other is a register allocation optimizer.
Banking is not currently handled and the optimization is confined to just single
files.

The good news is that this code is written in such a way as to not require a C
compiler. For example, it'd be possible to convert say a .hex file into the
pCode format, perform the optimizations, and write the results back into a new
.hex file. However, to be honest I think that would be a very difficult thing to
do. All of the boundary conditions could prove insurmountable. Also, simple
things like delay loops or isochronous code would be "optimized" away.

A better approach is to integrate the optimization into the assembler (as Roman
suggests). Directives could be created to control the manner in which a program
may be simplified. For example, the BANK macro could be turned into a directive
that generates the optimum banking instructions based on where the code is
located. Or another nice feature would be to provide automatic variable scoping
and allocation.

To get the most functionality, one would need to exceed the bounds of the
assembler and encroach upon the linker. Well, the good news is that the gnupic
project (Craig Franklin) is working on a linker.

These are interesting issues that are being actively persued.

Scott

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\02@104659 by Roman Black

flavicon
face
Kari Lehikko wrote:
>
> > Any thoughts on this?? :o)
> > -Roman
>
> Sounds like a C-compiler to me :P

Yep i'm aware of a PIC C compilers benefits,
but assembler has a lot of benefits too. Too
many benefits...

What I was suggesting was a simple parser program
you could run and it generates a new .asm file
from your old one with perfect code page banking
all done and with the smallest code. And auto
relocating of some code "modules" like tables.
It could have a lot of the benefits of using the
linker but still generate good .asm source that
could be tweaked. You could write your whole code
and just run one program that checks all the paging
issues and re-writes any of the branching to be
perfect with no effort to you.

And re-write it for any PIC just by changing the
PIC type specified in the source code.

Maybe i'm the only one that thinks it would be
nice to have 8kwords of code and just press a
button to have it all "fixed". :o)
-Roman

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\02@110346 by Douglas Wood

picon face
What you're describing is EPICIS...

Douglas Wood
Software Engineer
spam_OUTdbwoodTakeThisOuTspamkc.rr.com

Home of the EPICIS Development System for the PIC and SX
http://epicis.piclist.com

{Original Message removed}

2001\05\02@110358 by Roman Black

flavicon
face
Scott Dattalo wrote:

{Quote hidden}

Yes, I thought of simplifying the entire branching
system in the .asm, by having custom "branch"
commands that the parser recognises and replaces with
asm code. For example, instead of doing:

       pagesel blah    ;
       goto blah       ; (long gotos or calls are a pain)

or even:

       sublw d'99'     ; sub to compare
       skpz            ; skp match
       goto not_match  ;
       goto match      ;   (very messy page select issues here!)

you could just do a:

       sublw d'99'             ;
       X_brz match             ; (special label)
       X_brnz not_match        ;

And all page select issues would be handled
for you. What about this?

       sublw d'99'                     ;
       X_callz match_function          ;
       X_callnz not_match_function     ; (nice twist!)

It's got me thinking, i've written a lot of C code
parsing and text utilities in years gone by, and it
would not be that hard to determine which page code
is on and generate the proper "branch" assembler to
do it. :o)
-Roman

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\02@111632 by jamesnewton

face picon face
Where do we get EPICIS? Your page just says to email you.

---
James Newton (PICList Admin #3)
.....jamesnewtonKILLspamspam@spam@piclist.com 1-619-652-0593
PIC/PICList FAQ: http://www.piclist.com or .org

{Original Message removed}

2001\05\02@124428 by jamesnewton

face picon face
How would you account for the movement of code near the end of the program
as a result of the insertion or deletion of page directives earlier in the
code? E.g. You process a forward goto that you see does not require a page
setting because it is in the same page but later you have to add page
setting code for another goto between the first one and the first ones
target. Now the first gotos target is pushed out of range and requires
paging code. Going back to insert it might even cause the second goto to
pushed in range of its target and removal of its page setting code may bring
the first goto target back in range.

---
James Newton (PICList Admin #3)
jamesnewtonspamKILLspampiclist.com 1-619-652-0593
PIC/PICList FAQ: http://www.piclist.com or .org

{Original Message removed}

2001\05\02@124840 by Douglas Wood

picon face
It would have to be at least a three pass assembler.

Douglas Wood
Software Engineer
.....dbwoodKILLspamspam.....kc.rr.com

Home of the EPICIS Development System for the PIC and SX
http://epicis.piclist.com

----- Original Message -----
From: "James Newton" <EraseMEjamesnewtonspam_OUTspamTakeThisOuTpiclist.com>
To: <PICLISTspamspam_OUTMITVMA.MIT.EDU>
Sent: Wednesday, May 02, 2001 11:42 AM
Subject: Re: [PIC]: Super Assembler (parser)


> How would you account for the movement of code near the end of the program
> as a result of the insertion or deletion of page directives earlier in the
> code? E.g. You process a forward goto that you see does not require a page
> setting because it is in the same page but later you have to add page
> setting code for another goto between the first one and the first ones
> target. Now the first gotos target is pushed out of range and requires
> paging code. Going back to insert it might even cause the second goto to
> pushed in range of its target and removal of its page setting code may
bring
> the first goto target back in range.
>
> ---
> James Newton (PICList Admin #3)
> @spam@jamesnewtonKILLspamspampiclist.com 1-619-652-0593
> PIC/PICList FAQ: http://www.piclist.com or .org
>
> {Original Message removed}

2001\05\02@125253 by Scott Dattalo

face
flavicon
face
On Wed, 2 May 2001, James Newton wrote:

> How would you account for the movement of code near the end of the program
> as a result of the insertion or deletion of page directives earlier in the
> code? E.g. You process a forward goto that you see does not require a page
> setting because it is in the same page but later you have to add page
> setting code for another goto between the first one and the first ones
> target. Now the first gotos target is pushed out of range and requires
> paging code. Going back to insert it might even cause the second goto to
> pushed in range of its target and removal of its page setting code may bring
> the first goto target back in range.

First of all, I don't have this implemented and have not given it a significant
amount of thought. But, one way to approach this problem is to begin by assuming
the worst case scenario - all gotos will cross the dreaded page boundary. Then
as a post processing step, you go back and analyze that assumption and remove
all cases that don't cross the boundary. The way the pCode optimizer works in
SDCC right now is that it will continue to loop on the rule processing until no
more rules are applicable. So the boundary condition you described (of
instructions being pushed away or towards the code page boundary) will be
handled by multiple passes through the optimizer.

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\02@125300 by David VanHorn

flavicon
face
At 09:42 AM 5/2/01 -0700, James Newton wrote:
>How would you account for the movement of code near the end of the program
>as a result of the insertion or deletion of page directives earlier in the
>code? E.g. You process a forward goto that you see does not require a page
>setting because it is in the same page but later you have to add page
>setting code for another goto between the first one and the first ones
>target. Now the first gotos target is pushed out of range and requires
>paging code. Going back to insert it might even cause the second goto to
>pushed in range of its target and removal of its page setting code may bring
>the first goto target back in range.

Start from the back then.
Scan through and get positions on all the jump targets.
Start at the front, and fix all the jumps for target "Z", then "Y"...

--
Dave's Engineering Page: http://www.dvanhorn.org
Where's dave? http://www.findu.com/cgi-bin/find.cgi?kc6ete-9

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\02@130752 by Bob Ammerman

picon face
Borland does this using an N pass algorithm to optimize jumps in their x86
assembler.

It just keeps redoing the assembly until nothing changes, or it reaches a
limiting number of passes defined on the command line.

Bob Ammerman
RAm Systems
(contract development of high performance, high function, low-level
software)

----- Original Message -----
From: "Douglas Wood" <KILLspamdbwoodKILLspamspamKC.RR.COM>
To: <RemoveMEPICLISTTakeThisOuTspamMITVMA.MIT.EDU>
Sent: Wednesday, May 02, 2001 12:53 PM
Subject: Re: [PIC]: Super Assembler (parser)


{Quote hidden}

program
> > as a result of the insertion or deletion of page directives earlier in
the
> > code? E.g. You process a forward goto that you see does not require a
page
{Quote hidden}

> > {Original Message removed}

2001\05\02@130951 by jamesnewton

face picon face
but, but, but...

Removing page code from a goto that doesn't need it may cause a goto after
that to move out of range of the target that was in range that you have
already removed paging code from which you would then have to put back and
which could then cause another goto to move out of range...

...I just don't see this one as being solvable this way.

Somebody please prove me wrong <GRIN>

---
James Newton (PICList Admin #3)
EraseMEjamesnewtonspampiclist.com 1-619-652-0593
PIC/PICList FAQ: http://www.piclist.com or .org

{Original Message removed}

2001\05\02@130956 by jamesnewton

face picon face
Same problem when processing a backward goto.

---
James Newton (PICList Admin #3)
RemoveMEjamesnewtonEraseMEspamEraseMEpiclist.com 1-619-652-0593
PIC/PICList FAQ: http://www.piclist.com or .org

{Original Message removed}

2001\05\02@132226 by Roman Black

flavicon
face
Douglas Wood wrote:
>
> It would have to be at least a three pass assembler.

>
> > How would you account for the movement of code near the end of the program
> > as a result of the insertion or deletion of page directives earlier in the
> > code? E.g. You process a forward goto that you see does not require a page
> > setting because it is in the same page but later you have to add page
> > setting code for another goto between the first one and the first ones
> > target. Now the first gotos target is pushed out of range and requires
> > paging code. Going back to insert it might even cause the second goto to
> > pushed in range of its target and removal of its page setting code may
> bring the first goto target back in range.


Not neccesarily. I imagined something more "modular".
The parser knows the PIC type and how many pages
are there.

Then maybe each function treated as a separate module
in the way that each code page would be filled with
"modules", and no module would span a page boundary.

A trick would be to allow AVERAGE number of words
(code space) for every branch when placing modules.
So as it starts from the front, each module has its
branch code modified, then hopefully because you
allowed for the average size some will compress and
some will expand, so most modules will not need to
be moved. A little extra headroom might help too.

I like the module idea, it's rare than any one
function would be more than 512words (or especially
2kwords) so a "smart" parser that can move function
modules around is not too hard to implement.
-Roman

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\02@132647 by jamesnewton

face picon face
Hey! That sounds like it might work... especially the "or it reaches a
limiting number of passes" part. Seems like MASM did something like that as
well... related to macro expansion...

Could also track the size of each attempt and then take the smallest, not
necessarily the last, version.

---
James Newton (PICList Admin #3)
RemoveMEjamesnewtonspam_OUTspamKILLspampiclist.com 1-619-652-0593
PIC/PICList FAQ: http://www.piclist.com or .org

{Original Message removed}

2001\05\02@150838 by Don Hyde

flavicon
face
I've thought a little about a PIC code optimizer, but quickly got
discouraged at how much work it was compared to just writing another program
and going on...

I was thinking along the lines of what I understand that sophisticated
optimizing compilers do.

Most high-level languages enforce rules that make a program conform to a
tree structure.  In C, main() calls other functions, which in turn call
other functions...  Some high-level-language IDE's even display this call
tree in a little window.  Individual statements in most languages are also
tree structures, grammatically.  The parsing process generally creates a
parse tree in memory.  The optimizer usually starts on this parse tree,
applying heuristics that recognize various patterns in the tree, and
rearrange the branches so that the code will be more efficient.

For example, if you write x = b + a*c, in many processors it is more
efficient to do the multiply first, which will leave a*c in some register,
then add b to it.  Math (addition is commutative) says it's OK to do so.

A Parse Tree:

         =
        / \
       x   +
          / \
         b   *
            / \
           a   c

The tree can be optimized by recognizing that the + has two things "hanging"
from it, one a simple variable and the other the result of some other
computation.  The hueristic says "swap it so that the complicated one comes
first, and the simple one after it."

This is a pretty dumb and simple example, but hopefully it gives you the
shape of the thing...

The first thing I would like to get is the call tree.  That alone could give
a definitive answer to the question "can I have a stack overflow?"  At this
point, it might even be possible for the thing to automatically identify any
code that is called from only one place, and replace the call-return with
goto-goto, or even drop it inline.  This just might fix the stack overflow.

While we are at this level, the optimizer might just be able to recognize
when temporary variables are being used in such a way that they could share
the same physical RAM location.

The next thing I would like is for my optimizer to recognize "modules" of
code and analyze how closely they are related to one another on the basis of
their calls and goto's.  Two modules that had a lot of calls and goto's to
one another are more closely related than two modules that neither call nor
goto one another.

Clearly it would be beneficial to put closely related modules in the same
page with one another.

So the optimizer would compute that pairwise "relatedness" scores of the
modules, and then start dropping them into pages on that basis.

The next thing it needs to know is how big the modules are so that it can
tell when a page is filled up.  Initially, this is just an estimate, but it
probably won't be very far off.

Perhaps you get the idea.  Eventually, a code spitter just goes through the
whole thing and spits out the actual code.  It could be a little trying to
debug, since it will no longer be in the same order as the source code,
but...

The problem with doing all this kind of stuff with assembly code is that
assembly code does not have to be structured so that this kind of stuff will
work (it's legal to goto or call from anywhere to anywhere), and at the same
time it provides very few clues as to what the actual structure of the code
is.  There's nothing simple like the {} around a function in C.  Assembly
language just doesn't lend itself to automatic optimization very well.

Which is why there are languages like C and optimizing C compilers.

So, in the end, if I did all this and did a really good job of it, I might
wind up with something almost as good as a really good optimizing C
compiler.  But other people have already done some pretty good C
compilers...

So, I can hand write and hand optimize assembly code, or I can use a C
compiler.  Sometimes the hand work seems justified, and results in squeezing
the application into a one-size-smaller chip than I could get with C.  Other
times the extra labor of assembler doesn't pay and hurts the time-to-market.

> {Original Message removed}

2001\05\02@164921 by Bill Westfield

face picon face
I think the Intel 8086 assembler did a bunch of the things you are talking
about (although it had different things to do, of course.)

You could type:

       call fooo

fooo    proc near
       move  A,B
       ret
       endp


and the exact instructions produced for "call", "move", and "ret" would
depend on the "declarations" of A, B, and fooo.

I found it REALLY ANNOYING, and referred to it as a "strongly typed
assembler" (which was supposed to sound funny.)  It ended up hiding all
sorts of information that is the bread and meat of assembly language
programming and optimization, and I was frequently having to override
the typing with the equivilent of casts to get pointers to variables,
low/high bytes of variables, etc, etc.

Perhaps a PIC has less to hide and more to gain, but...

BillW

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\02@180637 by Olin Lathrop

face picon face
> Then you run a little dos program (parser) that
> goes through the code, and replaces all the branching
> mnemonics with correct (and smallest) code including
> all PCLATH and paging issues. And anything that was
> obviously a table or computed goto would be relocated
> to the correct code page using ORGs. It would generate
> an output .asm file that was easy to read and did
> everything for you.

I've thought about this too.  I've already got a PIC ASM preprocessor that
expands some directives MPASM doesn't know about.  So far it only handles
/INBIT and /OUTBIT.

I have thought about expanding it to do what you suggested.  The problem is
that it's a fairly large amount of work that is hard to get to.  It is
tempting to think of this as a glorified preprocessor, but it will
essentially need to be an assembler with additional linker smarts.  My
current thinking is a combination assembler/linker.  It still allows
separate code modules each with its own private name space, but all these
modules still have to be presented to it all at once.  I was envisioning a
sortof INCLUDE directive that treated the included file like a module with a
private namespace and the like.  The top source file would MODULE_INCLUDE
all the modules.  Each module could be divided into sections as allowed by
the source code.  Code from sections with no references to them would not be
included in the build.  This gives you the pick and chose feature of
linkable libraries.

The program would at least need to know the size of each instruction and
look for specific ones, like CALL and GOTO.  Each module's initial size
would assume all external CALLs and GOTOs require page switching code.
Affinity functions would be computed between each module.  Modules would be
layed out in memory with large ones first, but also with some "pull" from
the affinity functions.  For simplicity and to guarantee completion in a
finite number of iterations, modules would stay on whatever page they
started on since they can only shrink as new modules are placed on the same
page.  The full solution is a tree searching algorithm which could be quite
lengthy, so I'd probably use huristics and decide 100 iterations (or
whatever) was usually good enough.

All this may be interesting to think about, but it's not likely to happen.
As I've stated before, I think extra page switching instructions between
modules that do happen to end up on the same page is a relatively minor
overhead.  It doesn't sound worth this kind of effort to solve (reduce,
actually).

I'd much rather spend the time adding more intelligent macro and assembly
state handling to the preprocessor.  I'd like floating point handling and
decent string munging in macros so that labels can be generated from macro
string arguments like in "real" assemblers.  I've been thinking of using my
source to source translater technology to add a Pascal like meta-language
that emits assembler instructions.  I've already got code that can read in
Pascal and generate an in-memory description of the operations to perform.


********************************************************************
Olin Lathrop, embedded systems consultant in Littleton Massachusetts
(978) 742-9014, RemoveMEolinTakeThisOuTspamspamembedinc.com, http://www.embedinc.com

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\02@190610 by Tony Nixon

flavicon
picon face
Roman Black wrote:
>
> After reading Olin's post a few days back about
> using the linker it got me thinking. With linking
> modules you get some advantages over absolute code
> and some disadvantages. But it did give me an idea.
>
> What about a "super assembler parser" for the PIC?

I was thinking of starting that same project just the other day.

Still thinking about it....

The trouble is, the harder you think about it, the more complicated is
the problem to solve.

--
Best regards

Tony

mICros
http://www.bubblesoftonline.com
EraseMEsalesspamspamspamBeGonepicnpoke.com

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\05\03@010735 by Andrew Warren

face
flavicon
face
James Newton <RemoveMEjamesnewtonKILLspamspampiclist.com> wrote:

> but, but, but...
>
> Removing page code from a goto that doesn't need it may cause a goto
> after that to move out of range of the target that was in range that
> you have already removed paging code from which you would then have to
> put back and which could then cause another goto to move out of
> range...
>
> ...I just don't see this one as being solvable this way.
>
> Somebody please prove me wrong <GRIN>

James:

If all "long jumps" are the same length -- in other words, if you
don't have a 3-word "medium jump" and a 5-word "long jump" -- then it
IS solvable... You just start with all jumps set to "short", then go
through your code from the lowest address to the highest, adjusting
the jumps from 1-word to 5-word as necessary.

The process is "mostly" monotonic, so it's guaranteed to complete.
I'll buy a beer for anyone who can give an example which won't
complete within 10 passes. [Actually, I think it'll always complete
within 5, but if you can find an example that'll take 6, you should
be able to find one that takes 10].

For the PIC, this is probably the best algorithm; for processors
whose "short" and "long" jumps depend on the actual distance between
source and destination (e.g., a short jump can traverse 128 bytes
while a long jump can traverse 64K), a different algorithm is needed,
since the simplistic one described above is neither guaranteed to
complete nor guaranteed to produce the most-optimal mix of jump
instructions on those processors.

For THOSE processors, the very best algorithm is:

   Start everything at the shortest possible branch, and throw all
   branches on a worklist.  Pull a branch from the worklist; if
   it's out of range, increase it to the next-larger size, then put
   all the branches that SPAN the just-increased branch on the
   worklist.  Repeat till stable.

This algorithm is guaranteed to complete in linear time and to
produce the most-optimal code; the proof (very set-theoretical) was
first shown to me by Cliff Click, PhD, who was (and maybe still is) a
compiler researcher/designer at Motorola.

-Andy


=== Andrew Warren - fastfwdSTOPspamspamspam_OUTix.netcom.com
=== Fast Forward Engineering - San Diego, California
=== http://www.geocities.com/SiliconValley/2499

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2001\05\03@062411 by uter van ooijen & floortje hanneman

picon face
> It would have to be at least a three pass assembler.
Actually N-pass. You must continue 'sweeping' through the code, inserting
page handling instrcutions, untill a sweep inserts no more instructions.
Wouter

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2001\05\03@062427 by uter van ooijen & floortje hanneman

picon face
> First of all, I don't have this implemented and have not given it a
significant
> amount of thought. But, one way to approach this problem is to begin by
assuming
> the worst case scenario - all gotos will cross the dreaded page boundary.
Then
> as a post processing step, you go back and analyze that assumption and
remove
> all cases that don't cross the boundary. The way the pCode optimizer works
in
> SDCC right now is that it will continue to loop on the rule processing
until no
> more rules are applicable. So the boundary condition you described (of
> instructions being pushed away or towards the code page boundary) will be
> handled by multiple passes through the optimizer.

I think that the opposite will be slightly better: add the page handling
untill no more is needed. Consider

  page here
here:
  page there
  goto there
  <lots of code>
there:

For the right amount of <lots of code> the page instruction(s) will not be
detected as removeable because there is just over the edge of the current
page. But when the 'page there' was not there in the first place there would
just be in the current page.

Wouter

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2001\05\03@062433 by uter van ooijen & floortje hanneman

picon face
> This algorithm is guaranteed to complete in linear time and to
> produce the most-optimal code; the proof (very set-theoretical) was
> first shown to me by Cliff Click, PhD, who was (and maybe still is) a
> compiler researcher/designer at Motorola.

I guess the complexity is not a big worry for the program size that fits in
a PIC. But I wonder how this will be handled:

  incfsz x, f
    goto there

inserting page handling instructions before the goto is not a good idea. The
inline assembly in Jal has a page pseudo-instruction (that still confuses
people because it is not documented), so the fragment would be

  page there
  incfsz x,f
     goto there

When compiler finds out that the page is not needed it will be removed (at
least that is how it will be implemented, at the moment the page just
generates zero code on non-paged targets).

Wouter

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2001\05\03@065853 by Roman Black

flavicon
face
Andrew Warren wrote:

> If all "long jumps" are the same length -- in other words, if you
> don't have a 3-word "medium jump" and a 5-word "long jump" -- then it
> IS solvable... You just start with all jumps set to "short", then go
> through your code from the lowest address to the highest, adjusting
> the jumps from 1-word to 5-word as necessary.


All this discussion is pretty exciting. I think the
thing that excites me is being able to use the
power of assembler but to add an extra functionality
like "smart" branching. Like assembler ++ ?? ;o)

I also think it is do-able, especially in a simple
form where code is in functions, and the parser just
needs to work on "branch" mnemonics. All my asm
code is in definite functions anyway.

I would also think an upgrade to the RISC branching
instructions would be a good move forward, like
replacing all test/goto pairs with a special
branch mnemonic. Ideally this would be a dual-branch
instruction... Then the parser would always insert
the best code for branching or paging.
This can't be that hard to do, and if branching is
simplified, then assembler becomes much closer to
C for ease of use but keeps all the power.

Keeping them in pairs:
       sublw 0x21
       BRZ match       (goto labels)
       BRNZ no_match

or:
       sublw 0x21
       BRZ match
       BRNZ            (no label needed)
       (other code here)

This would make the code so much easier to read,
especially re banking issues, and just let the
parser handle it. What about:
       btfss reg2,5            ;
       goto bit_clear          ;
       goto bit_set            ;

could be replaced with:
       BR_BIT reg2,5 bit_set   ;
       BR_NBIT bit_clear       ;

So it makes sense to suggest mnemonics for all
goto or skip PAIRS??
-Roman

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2001\05\03@085913 by Nicholas Irias

flavicon
face
>
> This would make the code so much easier to read,
> especially re banking issues, and just let the
> parser handle it. What about:
>         btfss reg2,5            ;
>         goto bit_clear          ;
>         goto bit_set            ;
>
> could be replaced with:
>         BR_BIT reg2,5 bit_set   ;
>         BR_NBIT bit_clear       ;
>
> So it makes sense to suggest mnemonics for all
> goto or skip PAIRS??
> -Roman
>

Yes.  But making the pre-processed code look more like C would be even
better.  Why not make it look like:

IF reg2,5
goto bit_set
ELSE
goto bit_clear

This makes the code much more readable

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2001\05\03@091008 by uter van ooijen & floortje hanneman

picon face
{Quote hidden}

But then why not put the actual code between the IF/ELSE? Of course you
would need something to delimit the then and else parts, so what about { }?
Wouter

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2001\05\03@101335 by Mohamed Eldesoky

flavicon
face
Wow
Just Another Language     JAL 2
lol
;)

Mohamed Eldesoky
~~~~~~~~~~~~~
Life is to try.
~~~~~~~~~~~~~
{Original Message removed}

2001\05\03@155404 by Peter L. Peres

picon face
>How would you account for the movement of code near the end of the
>program

Compilers handle this by using two passes. By the second pass all the
information required is known (including the size of jumps etc). A
loophole optimizer may add additional passes (several) precisely for the
reason you gave. None of this happens at the file level, it happens in the
data structures of the compiler.

Peter

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2001\05\03@155410 by Peter L. Peres

picon face
Roman, most decent C compilers allow you to mix assembly with C. You do
things like:

void a_function(char of_b)
{
 // some C code here

 #pragma asm

 movf  a,w
 movwf PORTB


 #pragma endasm

} // end of function

Notice how you access C variables from assembly and how the code is placed
by the compiler. I don't know if sdcc has this.

Peter

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2001\05\03@155421 by Peter L. Peres

picon face
Roman, you have re-invented C compilers ;-). I understand that pic basic
pro does this too.

Peter

--
http://www.piclist.com hint: The list server can filter out subtopics
(like ads or off topics) for you. See http://www.piclist.com/#topics


2001\05\03@162031 by jamesnewton

face picon face
I think its more about learning how its done and finding a different (maybe
better) way to do it.

Do you know of any good docs on how C compilers do this sort of
optimization? Books? Examples? How does one learn this?

---
James Newton (PICList Admin #3)
spamBeGonejamesnewtonSTOPspamspamEraseMEpiclist.com 1-619-652-0593
PIC/PICList FAQ: http://www.piclist.com or .org

{Original Message removed}

2001\05\04@052807 by Peter L. Peres

picon face
> I think its more about learning how its done and finding a different
> (maybe better) way to do it.

Yes. Take a deep breath and dive. Unless you have taken a course in
compilers you will need to work hard to understand the existing solutions
first, before you come up with something new and unexpected. You have to
understand that everyone who wrote a computer language, interpreted or
not, in the last ~50 years, has walked down this path at some time or
other, so solutions may exist for what you are trying to do.

The reference university text book is 'The Dragon Book', which is actually
called 'Compilers, Principles, Techniques and Tools' by Aho, Sethi and
Ullman (Addision Wesley). It has gone through X editions so far and it is
called the dragon book because of the dragon picture on the cover.

This is a university textbook and like any university textbooks it
contains only the theory of fishing, and no fish ;-)

A better book for finding fish imho is 'Unix Tool Building' by Kenneth
Ingham (Academic Press / Harcourt Brace).

The regular expressions that define a grammar can get pretty complex fast
and 'Mastering Regular Expressions' by Friedl (O'Reilly) will help here.

I understand that the most notable feature that defines parser grammars is
the complete lack of comments therein no matter who wrote them ;-).

The largest problem wrt grammars and newbies is, that a grammar is a 5th
generation computer language and almost none of the mistakes made there
lead to direct effects (i.e. you need to understand how it all works to
find the logical fault).

There are also other books about this (many). They are usually specialized
and require a lot of background.

Peter

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


2001\05\04@144324 by uter van ooijen & floortje hanneman

picon face
> books on compiler construction

The difficulty I have with most such books is that they all go into much
detail explaining the easy parts of a compiler (scanner/parser, code
generation for a stack automaton, activation records, static and dynamic
link) which I can easily find out for myself (although it surely helps to
read a few of those books), and a few describe some advanced techniques that
I find not applicable to PICs (optimization of register allocation). But
none describe the real hard things, like code/data allocation and page/bank
optimization (that are 4 different things!) and handling a fixed-depth
stack. This can partly be attributed to the fact that the PIC architecture
is way beside the current mainstream of CPU architectures. Two books I
remember as eye-openers are a description of the CHILL-11 compiler (Wulff et
al) and 'A microprogrammed APL interpreter' by R.Zaks.

Wouter

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


2001\05\04@203440 by Dincer Aydin

picon face
       Hello,

     I think that the article entitled "Data Memory Paging Management"
at http://www.embedded.com/2000/0001/0001feat2.htm may be related to
this thread.
    Also, I think that using indentation for scoping is a nice idea,as it
is done in Python scripting language. "Life is better without braces"
they say.


    As for compiler construction info available
on the net, I insert urls from my bookmarks:

Anthony Aaby's
Compiler Design
http://cs.wwc.edu/~aabyan/464/Book/
http://cs.wwc.edu/~aabyan/464/

Introduction to Programming Languages
 http://cs.wwc.edu/~aabyan/221_2/PLBOOK/

Compilers and Compiler Generators (downloadable Book)
 http://scifac.ru.ac.za/compilers/

Compiler Course Slides
 http://csis.pace.edu/~bergin/slides/CompilerSlides.html

compiler design
 http://www.cs.rpi.edu/~moorthy/Courses/compiler/Lectures/lectures/lecture1/

Algonquin College Computer Studies CST 8152 Compilers
 http://www.algonquinc.on.ca/computerstudies/alleni/cst8152/97s/slides/index.htm

compilerconstruction.org Catalog of Compiler Construction Tools
 http://www.compilerconstruction.org/

compilers.net
 http://www.compilers.net/

flipCode - Tutorial - Implementing A Scripting Engine
 http://www.flipcode.com/tutorials/tut_scr01.shtml

Links and Selected Readings
 http://gcc.gnu.org/readings.html

Programming Language Creator
 http://www.adeptsoftware.com/plc/

Researchers in Programming Languages and Compilers
 http://www.cs.cmu.edu/afs/cs.cmu.edu/user/mleone/web/language-people.html

Sergey Gorelov's Homepage
 http://www.sygot.hotmail.ru/

Translators >> course
 http://www.cs.ru.ac.za/CSc301/Translators/trans.htm

Visualization of Compiler Graphs
 http://www.cs.uni-sb.de/RW/users/sander/html/gsvcg1.html

Welcome to MATHS
 http://www.csci.csusb.edu/www/dick/maths/welcome.html

Write a compiler next time
 http://openmap.bbn.com/~kanderso/building/tinycomp/

BOOK Parsing Techniques - A Practical Guide
 http://www.cs.vu.nl/~dick/PTAPG.html

--
Best regards,
Dincer                            KILLspamdinceraydinspamBeGonespamsofthome.net

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


More... (looser matching)
- Last day of these posts
- In 2001 , 2002 only
- Today
- New search...