A Self-Wiring Array of "Bicores" for Robotic Control

by Terry Newton


Described here is a minimalistic robotic platform and a program that I wrote for it. Influencing both the mechanics and the software were ideas from the field of BEAM robotics, a hobby/science that treats minimalism as an art form and as a way to demonstrate lessons from nature. BEAM comes from the works and ideas of Mark Tilden, combined with the crazy stuff that happens when a bunch of eager roboticists, not necessarily with electronics experience, tinker with concepts more advanced than they realise. One seemingly simple but popular BEAM element is the bicore, the two-neuron form of Mark Tilden's nervous network. Only recently have studies been done on two coupled bicores, and even that configuration resists definition. I'm not that versed in higher math, and the why of it interests me much less than exploring what it can do. The program I shall describe simulates the equivalent of five coupled bicores, next to impossible to describe mathematically. Since I cannot pretend to know how such a network should be connected, the software connects itself by simple random selection. To allow adaptation to proceed without barriers, the last bicore is used to drive the motors through differentiators, with motor reverse signals coming from opposing feelers, an arrangement similar to Mark Tilden's "Beam Ant" design. Amusing and useful things happen when the connections between the bicores are allowed to reconnect themselves. I hesitate to call it evolution or genetics, as the population size is one (unless the slots I'll get to later count as "members") and only environmental pressure from excessive feeler contact drives the randomising process. Because of the inherent richness of the modes of expression exhibited by coupled bicores, self-wiring is particularly easy to implement: try anything, there's a good chance of it doing something useful.
I also suspect that networks of coupled bicores connected in arbitrary manners perform a unique class of computations of their own, capturing data in the phase space between oscillations so that it can be acted upon later. I suspect the same thing happens in living things.

The provided algorithms simulate a form of nervous network, a robotic control structure invented and patented by Mark Tilden. Commercial use is prohibited unless arrangements are made with the patent holder. A certain amount of my own intellectual property is contained within: the hardware, and the software that simulates a network of bicores. The software and schematic design are protected by copyright; commercial use is prohibited unless arrangements are made with Terry Newton, and with Mark Tilden as well if it involves bicores.

The Platform

The robot I used for my experiments is a miniature version of my PicBot II design based on a PIC16LF84 processor, which features 1K words of program store (14-bit), 68 bytes of ram and 64 bytes of nonvolatile eeprom. Motor signals are amplified by a pair of miniature Zetex H-bridge chips and delivered to a pair of miniature "pager" motors which are angled in the classic BEAM "popper" arrangement. Power is provided by 0.5 farads of capacitive store charged by a 37 mm by 33 mm solar cell that delivers a maximum of 15 mA (usually much less) with an open-circuit voltage of approximately 4.8 volts. A voltage trigger chip signals the processor when the charge is greater than 3.2 volts, sufficient for several inches of movement before discharging to the point of sluggishness. After a timed activity phase the software should go to sleep and wait for the trigger to indicate sufficient voltage before moving again. Just in case the programmer overdoes it, and to protect from random processor pin states that occur at low voltages, the H-bridge motor drivers do not respond to low-voltage signals. For best results the discharge should not go below about 2.7 volts before putting the processor to sleep.

Pictures of the miniature PicBot II, side view and top view without the solar cell...

Schematic of the miniature PicBot II...

The PIC is programmed in-circuit (it has to be, since it is soldered in place) using simple hardware between a PC's printer port and freely available software. The circuit itself is not BEAM; it is what is placed inside the processor that determines what it does and by whose philosophy. The mechanical design however is very BEAM-ish. It was simply the easiest thing to make.

Controlling an Autonomous Robot with a Processor

One of the big problems with processors is they view the world in discrete steps. Everything is a bit and every number is an integer. This often leads to the programmer trying to specify exactly what each input combination should do:
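For illustration, the hard-coded approach might look like the following Python sketch (the original article's example is not reproduced here; the sensor and action names are purely illustrative, not from the actual program):

```python
# Hard-coded control: the programmer explicitly decides what every
# input combination should do. Brittle by design - any situation the
# programmer didn't anticipate gets a possibly-wrong fixed response.
def hard_coded_action(left_feeler, right_feeler):
    if left_feeler and right_feeler:
        return "reverse"      # both feelers touching: back up
    if left_feeler:
        return "turn_right"   # obstacle on the left
    if right_feeler:
        return "turn_left"    # obstacle on the right
    return "forward"          # nothing touching: go straight
```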

This method is useless if the environment is variable and does not yield to such simplistic logic, or the environment is not known well enough for the programmer to predict every combination. Plus it's boring.

It is better if the robot can decide for itself what actions should go with what sensor combinations. This can be directly mapped using memory:
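A direct memory mapping can be sketched like this in Python (again illustrative; the real program works with PIC registers, not Python objects):

```python
# Direct mapping via memory: the sensor state indexes a table of actions,
# and the robot rewrites a table entry when that action isn't working out.
import random

ACTIONS = ["forward", "turn_left", "turn_right", "reverse"]

class DirectMap:
    def __init__(self, n_inputs=2):
        # one randomly chosen action per input combination
        self.table = [random.randrange(len(ACTIONS))
                      for _ in range(2 ** n_inputs)]

    def act(self, left_feeler, right_feeler):
        index = (left_feeler << 1) | right_feeler
        return ACTIONS[self.table[index]]

    def punish(self, left_feeler, right_feeler):
        # things aren't going well: change its mind for this state
        index = (left_feeler << 1) | right_feeler
        self.table[index] = random.randrange(len(ACTIONS))
```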

Now the robot is free to pick what it does in response to a given environmental condition, and change its mind if things aren't going well. But it is still direct mapping. Although many tricks can be applied to improve performance, it cannot correlate the present with the past beyond one move, except through environmental cues. A more sophisticated structure is needed.

This is where processor meets BEAM. Bicores are basically two-pole oscillators with inputs which affect the time each node takes to change state. Bicores are complementary: anything done to one input has the opposite effect on the other node's output. If node A is made to be high 70% of the time, node B will be high 30% of the time. This makes things easy, since bicore effects show up whether the inputs tend towards high or low and whether the nodes are active high or low; differences are compensated for by merely swapping either the inputs or outputs. No attempt was made to make the simulated bicores exactly model real bicores - for one thing real bicores are made with inverters, but that's an extra step for a computer - I did what seemed natural at the time. A nice effect one gets from loosely coupling two or more bicores is that if the rates are similar they can store information for many cycles in the form of the phase relation between the bicores. When many bicores are coupled with more than one input to each bicore node things get really interesting - the network can, for a given pattern of connections, replicate complex patterns in response to the environment and time. The pattern of interconnections becomes a selector for a wide range of behaviors, leading to effects like accidental mapmaking.

But how to take an inherently analog thing like a bicore and run it on an inherently stepped device like a processor? Easy, take tiny steps and don't worry too much about exactly matching reality. Oftentimes the core properties of analog systems show up just fine when 0..1 real is represented as say 0..50 integer. Sometimes the extra nonlinearity helps. Concepts like information coding in phase space and chaotic couplings in recurrent neural networks work fine in discrete steps so long as the steps are not so large they erase the information being encoded. For practical purposes activation levels are restricted from 0 to 127 (usually set much lower) and connection weights vary between 1 and 15, or 0 for no connection.

The following overall structure is used in the self-wiring bicore array program...
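The original listing is not reproduced here, but the shell described in the text amounts to something like this Python sketch (function bodies are stubs; the real steps are covered later in the article):

```python
# Overall shell for a real-time network simulation: initialise once,
# then loop - read senses, simulate one time slice, evaluate, and
# change the network as needed. The step functions are passed in so
# the same shell can host different network types.
def run(initialise, read_senses, simulate_slice, evaluate,
        change_network, steps):
    state = initialise()          # e.g. bicore states begin complementary
    for _ in range(steps):
        senses = read_senses()
        simulate_slice(state, senses)
        result = evaluate(state, senses)
        change_network(state, result)
```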

This general shell is good for simulating many types of real-time networks, not just the nervous variety. For simulating bicores, the initialise step is to ensure that all bicore states begin complementary. Evaluation and network change methods can vary widely, for my experiments these steps are extremely simplistic.

Simulating Evolving Bicores

For evaluation, I hard-coded the "horse" layer so that reversal can only take place in appropriate situations, so that is not a factor. It is wired so that there is no standing-still state, so the evaluator doesn't have to check for that either. Currently it evaluates success or failure at avoiding feeler contact by subtracting from the score if the feelers touch, subtracting more if they touch too much, adding a point if there has been no recent feeler contact, and adding several points for feeler touches followed by no touches. This evolves networks that do whatever they can to avoid feeler contact. To "program" other behaviors, other or additional scoring factors can be applied - light level to evolve light-seeking networks, for example.
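In rough Python form the scoring scheme might look like this (the thresholds and point values are my assumptions; the text only says "too much" and "several points"):

```python
# Sketch of the feeler-avoidance scoring described above.
# touched: feelers in contact this period
# recent_touches: count of contacts in the recent past
def update_score(score, touched, recent_touches):
    if touched:
        score -= 1
        if recent_touches > 3:    # "touch too much" threshold (assumed)
            score -= 2            # extra punishment for repeated contact
    else:
        if recent_touches == 0:
            score += 1            # a point for no recent contact
        else:
            score += 3            # "several points": touches followed
                                  # by a clear run (value assumed)
    return score
```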

The evolution algorithm is equally simplistic. For diversity, three slots are defined in eeprom; when called upon to save or restore a network, one of the three slots is randomly selected. When a network's score reaches a maximum value, it is saved to eeprom. When the score drops to 0, a previously saved network is loaded into operating ram. If the network still continues to score poorly, all connections are randomised. For incremental changes a running changeat value is maintained; when the score dips below the changeat value, a connection input source or weight value is changed. Bits that cause the feelers to become disconnected from the reversers are randomly changed 25% of the time, giving the robot the ability to temporarily ignore its feelers and squeeze past obstacles that would otherwise block its path. Also randomly changed is the value for the motor differentiators, altering the intensity of motor drive. Often this value tends toward the minimum, since that results in the fewest feeler hits per period, saving higher speeds for problem situations.
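The save/restore/mutate logic can be sketched as follows (MAX_SCORE and the network encoding are illustrative; the sketch also simplifies the "still scoring poorly" case by randomising only when no saved network exists):

```python
# Sketch of the slot-based evolution step described above: save at max
# score, restore a random slot at zero, otherwise mutate one connection
# when the score dips below the running changeat value.
import random

SLOTS = 3
MAX_SCORE = 100               # assumed maximum score
eeprom = [None] * SLOTS       # three network slots

def randomise_all(network):
    for i in range(len(network)):
        network[i] = random.randrange(16)   # sources/weights are 4-bit

def mutate_one_connection(network):
    network[random.randrange(len(network))] = random.randrange(16)

def adapt(network, score, changeat):
    if score >= MAX_SCORE:
        eeprom[random.randrange(SLOTS)] = list(network)   # save
    elif score <= 0:
        saved = eeprom[random.randrange(SLOTS)]
        if saved is not None:
            network[:] = saved                            # restore
        else:
            randomise_all(network)                        # start over
    elif score < changeat:
        mutate_one_connection(network)                    # small change
```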

The network simulation component is where most of the processing occurs; the evaluation and adaptation parts are simple mainly because of the richness of interconnected bicores combined with an effective horse layer based upon Mark Tilden's Beam Ant design. To simulate a bicore one need not be concerned with resistors, capacitors, thresholds and all that, but only consider what a bicore does: it flips from one state to another with the outputs always complementary, and the time it takes to flip is influenced from a nominal value by any inputs applied to it. To handle input for each bicore node, an integrating accumulator is used to translate input from multiple sources into a numeric value. Inputs which are currently "on" add their weight to the accumulator; inputs which are "off" subtract their weight. In an oscillating environment the accumulator will hold its approximate position if the input duty cycle is exactly 50%, but amplify any difference. This is a deviation from most hard-wired bicore networks that was used mainly for convenience, but it probably has the effect of making the networks more chaotic. To determine the "off" period of a bicore, a counter is used, decremented once for each time slice whenever the node is off. When it reaches 0 the node turns on, the opposing node turns off, and the counter is reloaded with a nominal value minus whatever is in the input accumulator. Thus all inputs are excitatory, decreasing the node's "off" time and increasing the frequency.

The following illustrates one time slice of one bicore...
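The original listing is not reproduced here; the following Python sketch reconstructs the time-slice logic from the description above. NOMINAL and MAX_ACT are assumed constants, and which counter gets reloaded on a flip is my reading of the text:

```python
# One simulated bicore: two complementary nodes, each with an input
# accumulator and an "off" counter. A node's counter decrements while
# it is off; at zero the node turns on, the other turns off, and the
# newly-off node's counter is reloaded with NOMINAL minus its
# accumulator - so all inputs are excitatory.
NOMINAL = 40      # nominal off period in time slices (assumed)
MAX_ACT = 127     # activation levels restricted to 0..127

class Bicore:
    def __init__(self):
        self.on = [True, False]        # nodes start complementary
        self.counter = [0, NOMINAL]
        self.accum = [0, 0]

    def integrate(self, node, input_on, weight):
        # "on" inputs add their weight, "off" inputs subtract it
        a = self.accum[node] + (weight if input_on else -weight)
        self.accum[node] = max(0, min(MAX_ACT, a))

    def slice(self):
        for node in (0, 1):
            if not self.on[node]:
                self.counter[node] -= 1
                if self.counter[node] <= 0:
                    other = 1 - node
                    self.on[node] = True      # node turns on
                    self.on[other] = False    # opposing node turns off
                    # reload the newly-off node's counter
                    self.counter[other] = max(
                        1, NOMINAL - self.accum[other])
```

With no inputs applied this simply oscillates with a 50% duty cycle; applied inputs shorten one node's off time and lengthen the effective duty cycle of the other.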

Note that input activations are delayed until the opposing node relinquishes control; this wasn't intentional, it's just how the code came out. This is but one way to simulate bicore-like activity, the most obvious way at the time I wrote the program, and I'm sure the method can be improved. For each time slice the bicore procedure is applied to all of the elements in the array, then all of the node states are changed at once. Nodes which drive a motor set the motor's differentiator counter when they trigger and clear the opposing motor's counter. The differentiator counters decrement (down to 0) each time slice; the motor is activated as long as its counter is above 0.
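The motor differentiators amount to something like this (DIFF_VALUE is illustrative; in the real program it is evolvable, as described later):

```python
# Motor differentiators: a triggering node sets its motor's counter
# and clears the opposing motor's; counters count down each slice and
# a motor is driven while its counter is above zero.
DIFF_VALUE = 30   # illustrative; the real value is evolvable

class MotorDiffs:
    def __init__(self):
        self.count = [0, 0]                  # left, right

    def trigger(self, motor):
        self.count[motor] = DIFF_VALUE       # set this motor's counter
        self.count[1 - motor] = 0            # clear the opposing motor

    def slice(self):
        active = [c > 0 for c in self.count]
        self.count = [max(0, c - 1) for c in self.count]
        return active                        # motors driven this slice
```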

The Full Algorithm

This puts all of the components together. Each of the four bicores has four evolvable connections, two per node. The fourth bicore drives the motor differentiator counters, which drive the forward-direction terminals of the motors. The fifth bicore pair, or photo bicore, has inputs hardwired to the absolute light levels and is not evolvable. The numbering scheme for the bicores is arbitrary. The reverse motor terminals are wired to opposing feelers, but are ignored if the corresponding ignore-feeler bit is set. Input sources range from 0 to 15, representing always-0, leftfeeler, rightfeeler, lightonleft, lightonright, photobicoreA, photobicoreB, badenv, and the eight bicore node outputs.
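The 4-bit input-source selector can be sketched like this (the ordering follows the list above; the exact bit assignments in the real program may differ):

```python
# Decode a 4-bit input-source index: values 0-7 select the eight fixed
# sense signals, values 8-15 select one of the eight bicore node outputs.
def read_source(index, senses, node_outputs):
    fixed = [False,                       # 0: always 0
             senses["leftfeeler"],        # 1
             senses["rightfeeler"],       # 2
             senses["lightonleft"],       # 3
             senses["lightonright"],      # 4
             senses["photobicoreA"],      # 5
             senses["photobicoreB"],      # 6
             senses["badenv"]]            # 7
    if index < 8:
        return fixed[index]
    return node_outputs[index - 8]        # 8-15: bicore node outputs
```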

I practically had to rewrite the program in my mind to express it structurally, as the machine code is anything but structured, so the above doesn't exactly represent the program but is mostly equivalent. Not everything is specified; some of the lines represent entire subroutines, like read senses. I tried to generalise it as much as possible for easy translation to other platforms; I don't really have 2-dimensional arrays, but it's the same idea.

With only 1K of program store available and few coding tools that support structured programming, the real program is written as unstructured spaghetti code. This is normal for PIC programming; users of larger processors with real compilers have it easy in this respect, but this is the price one has to pay to make use of a single-chip computer that runs on 3 volts and a milliamp. To solarise and miniaturise a semi-intelligent autonomous agent one has to throw accuracy, intricacy and structure to the wind and work with the technology available. Or be stuck with batteries and a low survival score. Besides, the really good principles remain true even when simplified, so I am not complaining. Simple hardware forces simple solutions.

One advantage of simple processors, spaghetti or not, is that there is only so much there, making program verification relatively easy. There are only so many ways it can go, and only the simulation engine needs to be bug-free. I don't think very many people fully understand the concept being simulated; I don't either, but that is why I am simulating it. It may remain a mystery, but even if it does I still have an autonomous robot that has personality and can solve problems others in its class cannot.

Assembled Object Code

The following hex code represents the instructions for the PIC for running on the PicBot II Miniature robot platform. To replicate, copy the code between the cut lines into a .hex file then use PIC programming software to load into the processor. This hex dump was taken August 16 1999, program dated 8-16-99 12:12am.

-------------------------- cut --------------------------
-------------------------- cut --------------------------

The eeprom data block in the above translates into the following bicore array:

Numbers in parentheses are weights; 0 is the same as no connection. A and B correspond with left and right for the photo-core (bicore 5 in the pseudocode); bicore 4-A feeds the left motor, 4-B the right. The actual motor differentiator value is equal to DiffNum times 4 plus the minimum allowed value. With the minimum set at 30, DiffNum values 0-7 range the diffvalue from 30 to 58. NoDiffsInReverse is a bit that when set does what it says; it is not represented in the pseudocode.
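The differentiator arithmetic is simple enough to state directly:

```python
# Actual differentiator value: DiffNum (0-7) times 4 plus the minimum
# allowed value. With the minimum set at 30 this gives 30..58.
def diff_value(diff_num, minimum=30):
    return minimum + 4 * diff_num
```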

Technical eeprom data details:


The network in the code sample has connected itself mostly as a photovore: left light is applied to the right motor, right light is (indirectly) applied to the left motor. However, if the light is blocked by an obstacle then strongly photovoric wiring schemes will score poorly once they get there, forcing development of alternate plans that spend about as much time going away from the light as towards it, avoiding punishment and taking the one-point reward for not hitting anything. When it comes to self-organising systems the programmer is not in control, and can only make suggestions. Often the programmer's plans are shoved aside in favor of plans that better satisfy the robot...

It seems to have turned into a light-avoiding robot.

The ReverseLeft/Right and IgnLeft/RightFeel flags are not supposed to show up in saved copies; they are supposed to be cleared after three good moves. Must be a bug or a misunderstanding. Newer versions of the program are coded to reset those flags so I won't have to see them, but behaviorally they seem no better or worse. I can't say that my scoring function is particularly effective; the simpler method of just punishing for feeler contact and rewarding otherwise works too. Observably the effects are very subtle and difficult to qualify without considerable study. The underlying horse layer in this particular creation is very adept at solving basic navigational problems all by itself; generally it will randomly select the reverse action, back away in about 8 moves, and the behavior should then stick (unless it learns to do otherwise). The code is programmed to clear the flags so it will use the feelers and not keep backing up (another last-minute hack - I thought it would be nice to have self-control over reverse actions), but for some strange reason networks still get saved with the bits enabled. I think I'm beginning to understand why (maybe not, but trying to think positively): reverse actions are not seen at all by the scoring function, so they are perfectly acceptable; they will clear themselves if there is no need; and with the new scoring function there is a reward for successful obstacle avoidance. Careful - it might very well learn to seek out obstacles just to avoid them, not making contact enough to negatively impact the score, which it can do by twiddling the ignore and reverse bits. Then it saves. This part of it has nothing to do with bicores, but having an adaptable layer underneath the bicores frees them to do what they do best - generate semi-chaotic patterns in response to input stimuli, leading to the production of path-tracing oscillations that guide it around obstacles. If you don't reward too much for hitting them.
The programmer can suggest, but often the environment-robot system will interpret the cues in unexpected ways. The programmer is not really in control, having only the power to "mess it up", but not necessarily make it work as expected.

I grew tired of messing it up, so I went back to the hex code I used in this document for another overnight run, this time in a smaller tray devoid of obstacles besides the walls. The light was positioned so that it was beyond the edge of the tray. Here's what it produced:

At least it didn't save with those pesky horse flags enabled. It appears to have completely transformed from the starting state, apparently wiring up light-avoiding connections as in the previous run, but that is hard to say without running the network to see what kind of output it produces. At least there is a bit more diversity - still not what I'd like to see, but not surprising given the limited population size. Lesson: just because a term is present in the scoring function does not mean the robot will do what it implies, particularly if the term conflicts with other terms. Because the robot cannot get to the light, the light term cannot be satisfied without scoring poorly from excessive feeler contact.

Tentative Conclusions

It acts like it is alive. It fights if you mess with it, or tries to run. If something is in its way and it can't get around, it will try to push the object out of the way. Neat, but this is simply a side-effect of being able to select different speeds and being able to temporarily ignore its feelers.

This needs more research before hard conclusions can be reached, but tentatively I conclude:

Web links and other reference material

"Living Machines" by Brosl Hasslacher and Mark W. Tilden:
http://www.geocities.com/SouthBeach/6897/living_machines.pdf (found on Chiu-Yuan Fang's beam web site: http://www.geocities.com/SouthBeach/6897/beam2.html)

Mark Tilden's Nervous Network Patent:
http://www.xs4all.nl/~vsim/amiller/patent.html (found on A.A. van Zoelen's robotic page: http://www.xs4all.nl/~vsim/robotics.html in his mirror of Andrew Miller's walker page)

Brian Bush's BEAM faq:

"Biomorphic Robots and Nervous Net Research: A New Machine Control Paradigm" by Mark W. Tilden: http://people.ne.mediaone.net/bushbo/beam/Biomorphic.html (from Brian Bush's page)

The PicBot II robot design, by Terry Newton:

Mirror of David Tait's page for obtaining SPP programming software, look for "16F84 in-circuit programmers": http://www.dontronics.com/dtlinks.html

Just a (very) small sampling of evolved/genetic neural network papers...

"Incremental Evolution of Neural Network Architectures for Adaptive Behavior" by D.Cliff, I.Harvey, P.Husbands, CSRP256, December 1992 University of Sussex

"An Evolutionary Algorithm that Constructs Recurrent Neural Networks" by P.Angeline, G.Saunders, J.Pollack, Laboratory for Artificial Intelligence Research, Ohio State University

"Embodied Evolution: Embodying an Evolutionary Algorithm in a Population of Robots" by R.Watson, S.Ficici, J.Pollack, Dynamical and Evolutionary Machine Organization, Volen Center for Complex Systems, Brandeis University

Hundreds of papers partially about this stuff are available at Cora Research Paper Search:
How much of it is correct I cannot say, but there's a lot of it. Don't let your head explode!

This document was last modified August 16, 1999, minor editing March 25, 2000
Contents (C) Copyright 1999, 2000 by Terry Newton
email: wtnewton@nc5.infi.net

End of Document