Neural net project comments
by Trystan on Nov.29, 2008, under Cognitive Science, Programming
I began experimenting with neural network code a number of months ago when I decided to implement a network capable of maze navigation. I had read about NNs but I wanted to have the level of understanding that comes with actually implementing such an algorithm. In this post I shall discuss the design of the NN framework, the NN intended to solve mazes, the problems with said design and my thoughts on NNs in general and connectionism. The code was all written in Common Lisp (SBCL) in Eclipse/CUSP on a Linux system.
The maze environment was realized as a 2D array of 0s and 1s where a 1 represented a wall. The mazes were randomly generated using a recursive spatial partitioning algorithm. There are some pretty pictures in the programming category of this blog. The decision to have mazes be the problem domain was somewhat arbitrary and introduced some complexity. In retrospect, finding efficient solutions to mazes was not the best problem to tackle with a simple NN. Although it seems totally doable with the right network design.
The implementation of the NN framework leveraged the Lisp object system allowing for simple extension of the algorithm. The default Neuron object was associated with a set of generic methods Activation, Learning and Propagation. A user wishing to change the Activation function could simply inherit from Neuron and override Activation for their Neuron type. Different types of Neurons and their associated functions could be easily intermixed within the same network without much explicit ‘glue’ code. The implementation also included an object/functional abstraction of a neural layer as used for feedforward networks. The highest level abstraction, feedforward-nn, encapsulated a set of neural layers, an explicit input layer and output layer, and user defined functions that specify the mapping from the problem domain to the input and output layers. At least from my own use, these abstractions facilitated very good flexibility and ease of reuse. They also map back intuitively to typical theoretical specifications of NNs.
I quickly discovered the difficult aspect of NNs is not the implementation of an NN framework but rather the design of a network to solve a particular problem. Initially, I considered ‘evolving’ a network using genetic algorithms. Indeed, this would have been neat especially if it worked. However, I felt that would divert my focus too much and require even more time. I settled on an intuitive design that included 9 input neurons and 8 output neurons; there were no intermediate layers. The neurons were wired in feed-forward fashion with this exception of the output layer feeding back into the input layer (more on this in a moment). The first 8 neurons of the input layer represented a ‘touch sense’ where the neuron was set to fire if a wall was present in that direction (8 possible directions for a 2D cellular environment). The output layer represented the direction to move where the sum of the output neuron’s activation levels was [0, 8]. The 9th neuron on the input layer was a ‘smell sense’ which fired if the agent was closer to the cheese/exit than it was on the previous pass. Now, in order to potentiate a ‘correct’ move, a move closer to the cheese, the agent must ‘remember’ the firing sequence of the output neurons from the last pass; hence the aforementioned feedback from the output layer to the input layer.
Unfortunately, this critter never does get to the cheese. Typically after a few moves it settles into a corner and never departs. After a direction is established, the critter becomes more and more determined to keep going that direction. To be successful, there must be some notion of de-potentiation after continued lack of stimulus on neuron 9. That way, it would eventually try another direction even when the input stimulus remains constant. Therefore, I believe some additional tweaking could produce a network capable of solving the mazes.
Regardless, I feel I’ve accomplished what I set out for which was to obtain a fundamental understanding of NNs, their potential and their limitations. Of particular interest to me when I started the project was to evolve my own opinion regarding the biological validity of NNs. I believe a major shortcoming of the current NN definitions, in the scope of biological simulation, is the total reliance on changes in ‘physical’ representation to map to changes in behavior. Given identical input to an NN you will have identical output until the weights, part of the physical representation, are altered. However, I’m skeptical that changes in biological behavior can be modeled purely from these representational changes. There are meta-representational aspects that must be considered such as firing frequency. More simply, you have the wire but you also have what is transmitted ‘over’ the wire. NNs appear to focus on changing the wires to produce changes in behavior. Indeed, it is known that long term potentiation does exist and it likely does produce changes in behavior. However, I assert that is likely not the only mode under which behavioral changes occur and NNs do not currently encompass any mode besides LTP. Rhetorically, how would one describe the behavior of working memory using a model dependent on LTP? To be clear, this is not to say that the necessary behavior can not ‘emerge’ from an NN. I’m merely stating that the current formalism, in of itself, doesn’t seem to allow for such behavior.
Another concern regarding the use of NNs to understand biological function is the potential complexity of the NN resulting from an attempt to model an abstract biological function such as the meaning of an object (I am assuming that the mind is ultimately reducible to the physical or computational realm). As I eluded in the previous paragraph, I assert that abstract concepts are meta-representational, or that one must look beyond a simple neural correlate snapshot to understand how the more complex areas of the mind function. If this is true, then a model at the level of an NN may prove to be as much as a mystery as the physical reality it was intended to model and resist proofs of its correctness.
In the end, NNs will probably continue to prove valuable to those researching at the granular level, neuroscientists, biopsychologists, etc. There is also the professional domain that can apply them to interesting problems regardless of their biological validity. However, I’m skeptical regarding the applicability to the investigation of abstract constructs such as the meaning of objects and reasoning.