Paul Grobstein, Bryn Mawr College Department of Biology
Clare Congdon, Bryn Mawr College Computer Science Program
Your experiences last week should have given you a feeling that maybe it is really true that any calculation or process, no matter how complex or sophisticated, might indeed be performed by a suitable interacting assembly of quite simple elements. You should also have a sense of the difference between an algorithmic process (an explicit list of instructions) to achieve such results and the parallel, distributed way in which networks do it. Finally, you probably have a sense that while networks can (perhaps) perform any task in a parallel, distributed way, it is not particularly easy to "design" a network to perform any given task. This week, we will explore networks which can "learn" and, in so doing, structure themselves so as to achieve solutions to particular tasks.
The core concept of most "learning networks" is that of modifiable synaptic weights, with modifications depending on the network's activity and performance. We will talk in class about one simple form of such a network, the perceptron. You will also be introduced to "back propagation" networks, which contain not only input and output elements with modifiable synapses between them but also one or more layers of intermediate elements ("hidden units"), and to a software package, tlearn, which will allow you to explore properties of such networks. The software accompanies Exercises in Rethinking Innateness, by Kim Plunkett and Jeffrey Elman (MIT Press, 1997), copies of which are in the lab. Chapter 2 provides an introduction to the use of the software, and there is a user's manual in the appendix at the end.
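To make the idea of modifiable weights concrete before you open tlearn, here is a minimal sketch of the perceptron learning rule in Python. It is not tlearn and not the class's exact formulation; the learning rate, the threshold output unit, and the AND-function training set are illustrative assumptions.

```python
import numpy as np

def train_perceptron(inputs, targets, epochs=20, lr=0.1, rng=None):
    """Train a single threshold unit: weights change only when the output is wrong."""
    rng = rng or np.random.default_rng(0)
    w = rng.normal(0, 0.1, inputs.shape[1])   # one modifiable weight per input line
    b = 0.0                                    # bias (an adjustable threshold)
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            y = 1 if x @ w + b > 0 else 0      # threshold output
            w += lr * (t - y) * x              # adjust weights in proportion to the error
            b += lr * (t - y)
    return w, b

# Illustrative training set: logical AND of two inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
T = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, T)
print([1 if x @ w + b > 0 else 0 for x in X])  # expect [0, 0, 0, 1]
```

The point of the sketch is only that nothing "designs" the solution: the weights drift toward one under repeated exposure to input/output pairs.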
Beyond their ability to generate solutions to problems, two related aspects of back-propagation networks are particularly interesting. One is their ability to generate different solutions to the same problem (including solutions you might not have thought of) and the other is their ability to "generalize". By the latter, one means that the networks yield outputs which reflect their "training" not only for inputs on which they were trained but also for inputs on which they were not. The two are related in that a network's "categorization" of novel inputs may or may not correspond to what you expected when you "trained" the network, depending on whether the network has or has not found a solution to the problem which is similar to the one you expected.
To get a feeling for these issues, you should during this week implement and train several back-propagation networks (with the same structure but different starting points) on some problem where you can imagine a possible solution. You should then look at the structure of each, as well as the responses to novel inputs, to see whether each (or any) network came up with the solution you imagined. One easy example would be to train a three-layer network (perhaps four elements in each layer) on an autoassociation task, that is, to produce output activity that replicates the input for some restricted training set of input/output relations. It is easy to imagine one structure of a network which would do this. Do your networks have that structure? Do they give an output corresponding to the input for inputs other than those on which they were trained?
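If you want to prototype this experiment outside tlearn, the sketch below trains a 4-4-4 back-propagation network on autoassociation over the four one-hot patterns and then probes it with an untrained input. The layer sizes, learning rate, number of epochs, and squared-error gradients are assumptions chosen for illustration; tlearn's own settings and training sets will differ.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Restricted training set: reproduce each of the four one-hot patterns
X = np.eye(4)
T = X.copy()

# 4-4-4 network: input -> hidden -> output, small random starting weights
W1 = rng.normal(0, 0.5, (4, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 0.5, (4, 4)); b2 = np.zeros(4)

lr = 0.5
for epoch in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # backward pass: squared-error gradient propagated through the sigmoids
    dy = (y - T) * y * (1 - y)
    dh = (dy @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ dy; b2 -= lr * dy.sum(axis=0)
    W1 -= lr * X.T @ dh; b1 -= lr * dh.sum(axis=0)

# Outputs for the trained patterns should now be close to the inputs
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))
# A novel (untrained) input: does the network still "copy" it?
novel = np.array([[1, 1, 0, 0]])
print(np.round(sigmoid(sigmoid(novel @ W1 + b1) @ W2 + b2), 2))
```

Training the same architecture from several different random starting points (change the seed) and then inspecting the two weight matrices is one way to ask whether the networks all found the structure you imagined or found different ones.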
There are lots of other interesting explorations that can be done with back-propagation networks if you have additional time, or perhaps want to use them for a term project. Can one, for example, teach a back-propagation network to give an output which reflects the number of active inputs, irrespective of which ones they are? If so, how many different ways are there to solve this problem? Do any of them generalize beyond the input/output relations on which the network was trained? Will a back-propagation network yield a "lateral inhibition" structure of the kind talked about last week? What kinds of training would and would not yield such a structure? And so forth.
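One way to set up the counting question is sketched below. The code only builds the input/output patterns (each four-unit input paired with a target coding how many inputs are active) and holds a few patterns out so that generalization can be checked afterwards; the four-unit input, the "thermometer" style target coding, and the particular held-out patterns are all assumptions, not part of the lab materials.

```python
import numpy as np
from itertools import product

# All 4-bit input patterns, each paired with a target coding its number of active inputs.
# "Thermometer" coding: k active inputs -> the first k target units are on.
patterns = np.array(list(product([0, 1], repeat=4)), dtype=float)
targets = np.array([[1.0 if i < p.sum() else 0.0 for i in range(4)] for p in patterns])

# Hold out a few patterns so generalization can be probed after training on the rest.
rng = np.random.default_rng(2)
held_out = rng.choice(len(patterns), size=4, replace=False)
train_mask = np.ones(len(patterns), dtype=bool)
train_mask[held_out] = False

X_train, T_train = patterns[train_mask], targets[train_mask]
X_test, T_test = patterns[~train_mask], targets[~train_mask]
print("train on", len(X_train), "patterns; test on", len(X_test))
```

Training a network (in tlearn or with a sketch like the one above) on X_train and then comparing its outputs for X_test with T_test is one concrete way to ask whether a given solution generalizes.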
As in previous weeks, you should record in your journal your experiences with the lab assignment, as well as any additional thoughts about the readings and the relation of both to the larger issues of better understanding information processing in complex systems.