CS372: Artificial Intelligence (Fall 1998)

Week 3: Responses

Readings


Diana Applegate


This week's material has proven to be the most challenging by far due to its heavy mathematical emphasis. Since there weren't any prerequisites for this class, I figured that my Calc-1-only mathematics background would suffice...but now I wish that I had taken more mathematics prior to AI. I feel like I'm missing out on some really important stuff, especially since I want to go on in computer science. I find the symbols involved in the various equations we've looked at to be pretty confusing. But of course, if the rest of the class is clear on this stuff, then I'll see you during office hours to straighten that out. In addition, it's very difficult to understand the Calc that the book lays out, even though I have taken Calc before. What makes me even more uneasy are the graphical representations of the vectors associated with the TLU. I haven't got a clue as to how to interpret them, or what they're even trying to tell me (example: fig 3.1, pg 39 in text). Specifically, I'm not clear about what a hyperplane is. Will we have to reproduce these graphs or answer any questions about them on future assignments? Luckily, I think I do understand the main points about neural nets and how they're supposed to work. The computer simulation that we looked at in class on alphabet learning really helped put this in perspective for me. It was great to be able to see the training phases take place, and then to observe that the computer "learned" to recognize some of the letters with surprisingly good accuracy. On a different note, I think that this week's robot lab went pretty smoothly. I was, however, a little confused about some of the terminology used in the instructions. When you talk about the robot "seeing the light" and, for example, moving forward in response to this sensory reading...is the robot seeing ambient light, or is the robot seeing light that's brighter than ambient? 
In other words, for our purposes, is there a distinction between ambient and significantly brighter than ambient (like if I shone a flashlight at the bot)? Or, is it simply that everything less than ambient is considered dark, and ambient and up is considered light? I thought that it would be more interesting if, in fact, there was a distinction between ambient and bright light. Then we'd be able to embed more actions/behaviors into our robots.



Jocelyn Arcari

This week dealt with more technical terms and focused on understanding learning in S-R agents, Perceptrons, Adaline, etc. When speaking about training in S-R agents, I thought it was interesting that we only have to teach it a certain number of possible actions and it can produce others that have not specifically been taught to it, once it has had sufficient training. Calling this "learning" seems to be an appropriate term, in that sense. The three types of learning discussed -- supervised learning, reinforcement learning, and unsupervised learning -- reminded me a lot of our experimental psychology class, where pigeons' actions were either reinforced with food when they acted appropriately, or were not reinforced, depending on which type of learning we were studying. Depending on which scenario we had set up, the learning in the pigeons occurred more or less quickly. So too, the manner in which the learning is occurring in our agents probably differs depending on which method is being used to "teach" the agent certain actions. I am confused as to whether there is a difference between "batch mode" and incremental learning.

The example given in class of a situation which utilizes Adaline (Adaptive Linear Element) was very interesting. Averaging out the "right and wrong" responses the agent gives over a training set (the least mean square error rule) has been used to cancel out the heartbeat of a mother and detect the heartbeat of a fetus in ultrasound tests. To me, that's an amazing application of computer learning.
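For readers who want to see the least mean square idea with numbers, here is a minimal sketch of the LMS (Widrow-Hoff) update for a single linear unit. It is not the heartbeat application described above; the function name `lms_train`, the learning rate, and the four made-up samples are assumptions chosen only for illustration.

```python
def lms_train(samples, rate=0.1, epochs=500):
    """Fit a linear unit by nudging each weight in proportion to the error."""
    n = len(samples[0][0])
    weights = [0.0] * n
    for _ in range(epochs):
        for inputs, desired in samples:
            # The unit's output is the dot product of weights and inputs.
            output = sum(w * x for w, x in zip(weights, inputs))
            error = desired - output
            # Widrow-Hoff step: move each weight along its own input.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights


# Hypothetical samples generated from the target 2*x1 - 1*x2.
SAMPLES = [
    ((1.0, 0.0), 2.0),
    ((0.0, 1.0), -1.0),
    ((1.0, 1.0), 1.0),
    ((2.0, 1.0), 3.0),
]

learned = lms_train(SAMPLES)
print(learned)
```

With these samples the learned weights should settle close to 2 and -1, the values used to generate the targets.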

The algorithm given (in class) for the delta rule helps highlight the importance of "If, Then" statements when writing programs as we did in the lab. In the training (using the delta rule), if the response was correct, then you are able to move on and apply the next pattern. If not, you'd have to repeat it again until the correct output is given. This relates to the If, Then statements we needed to use when telling "Timid," "Indecisive," and "Paranoid" what to do -- "If there is light, move forward, If not, do something else." The demonstration of teaching the computer to recognize the 7 alphabetical characters reinforced my understanding of backpropagation methods and of changing the weights of different layers in neural networks. A little more practice with these ideas might be helpful.
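The "if correct, move on; if not, adjust and try again" loop can be sketched as follows. This is an illustrative error-correction procedure for a single TLU, assuming a threshold folded in as an extra weight on a constant input of 1; it is not necessarily the exact algorithm handed out in class. The names `tlu_output`, `train_tlu`, and `AND_PATTERNS` are made up for the example.

```python
def tlu_output(weights, inputs):
    """Threshold logic unit: fire (1) when the weighted sum is non-negative."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= 0 else 0


def train_tlu(patterns, rate=1.0, max_epochs=100):
    """Repeat passes over the training set until every pattern is correct."""
    n = len(patterns[0][0])
    weights = [0.0] * (n + 1)  # last weight plays the role of the threshold
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, desired in patterns:
            augmented = list(inputs) + [1.0]
            actual = tlu_output(weights, augmented)
            if actual != desired:
                # If the response is wrong, then adjust the weights.
                mistakes += 1
                for i, x in enumerate(augmented):
                    weights[i] += rate * (desired - actual) * x
            # If the response is right, simply move on to the next pattern.
        if mistakes == 0:
            break
    return weights


# Training set for logical AND (an assumed toy example).
AND_PATTERNS = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
trained = train_tlu(AND_PATTERNS)
print(trained)
```

The If/Then structure is the same as in the robot lab; the difference is that here the "Then" branch changes the numbers that decide future responses.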


David Costello

Last week's classes about neural networks resolved many of the issues I asked about in my previous e-mail. Though neural networks are a form of brain-function modeling, they are not nearly as complex as our own minds. Neural networks can be made to learn like the human brain but only in specific fields such as speech or writing methods. However, I still have a few questions which I hope you can address. Does this inadequacy in neural networks stem from the inability to model the complexity of the brain or rather from the fact that the brain has other parts which cannot yet be modeled? In other words, if we had the ability to create infinitely complex networks would we have the ability to model the brain?

Another question I have has to do with the hidden middle layers. You were telling a student about how the number of middle layer nodes affects the output layer. I would appreciate it if you could explain this relationship further so I can understand this dynamic. Lastly, you told us that there were three kinds of computer learning: supervised learning, reinforcement learning, and unsupervised learning. What I don't understand is how unsupervised learning can be applied in computing. Isn't learning without a teacher nearly impossible? Why would someone have a machine learn unsupervised when it can learn through supervision or reinforcement? Besides these issues I'm pretty ok with the rest of the material.



Sonia Dubielzig

Seeing the computer model of a neural network in action, I could see how TLU's make a surprisingly good model of learning in the human brain. Simply understanding how the TLU operates by threshold values, and learns by repetitively manipulating and fine-tuning weight vectors, had not prepared me for the computer demonstration on Thursday.

What struck me most about the demonstration was how, like biological neural networks, some of the connections between cells in the input layer, middle layer, and output layer gained in strength over time. Others, however, lost their strength and became less important when determining the identity of the letter. Just like synapses in the brain, as time goes on, some connections grow stronger and clearer, while others, not being used, grow fainter.

The ability of the program to recognize its own mistakes was also somewhat of a surprise. I wonder how it could be programmed to do that. In the program used by the US postal service, a more complex system of learning must have been done to cope with all the enormous variation in handwriting style. And certainly, no programmer could have sat and entered in different ways to type each letter 100 times, like they did with the ancestor to the handwriting recognition program that we saw in the video a couple weeks ago.

I am still having problems with understanding the term "vector" as it applies to "feature vector" and "weight vector". Although you drew a quick 3-dimensional graphical analysis of weight vectors and the concept of gradient descent on the board last Thursday, you quickly dismissed it, saying we didn't have to worry about it. However, it was at that point that I felt I was closest to grasping the connection between physical vectors that have magnitude and angle, and the concept of vectors used in programming.


Emily Greenfest

The lecture, reading, and lab this past week were to me pretty straightforward. I've worked a bit with neural networks before and it was nice to finally get an explanation of them in computer science terminology (i.e. without all of the extra descriptive terms thrown into the picture by the biological sciences). There were two odd things that I heard outside of the classroom this first week that struck me as amusing and that I think are worth discussing a bit here.

The first is a response my group received when we were showing a biology student our robot. We just asked her if she would look at it run and she did, and we got from her the response you are most likely to get: "cool!" (with some personal touch). Since she responded "cool," that made me think that something about it interested her, or some aspect was novel, or there was something there that caught her attention. So I asked her what she thought, and she replied that it was neat, but it was made out of Legos and she didn't see how anything made out of Legos could be profound. I think at that point I got the "huh?" look on my face and tried to explain to her that not only was (at this stage) nothing absolutely "profound" going on, but that if it were, the Legos certainly weren't responsible for it. And she replied that she knew what she observed was a result of the program the computer was running ... it was just that it was attached to Legos that made it not likely to ever be profound. That got me thinking.

How much of intelligence is actually intelligent behavior and how much of it is the package it comes in? Are we less likely to call animals intelligent, even though they perceive and learn and whatnot, because of the packaging that performs those operations?

The other event of this past week that struck me was Deepak's comment in the last class that it is the middle layer of the neural network that does the learning. I suddenly had this vision of the middle layer getting frustrated with its non-intellectual outer layers and leaving. Again, this little scenario just raises a couple of questions: is the whole entity learning, or just the middle layer? Can the middle layer learn without the inputs and outputs? I think the answer to that is no -- without inputs the layer would have no frame of reference and no reason to learn; without outputs, it would have no way to convey that it had learned. On the other hand, if the middle layer can learn on its own, then what about us? Do we need our senses and our means of expressing ourselves? Would we be intelligent without them? It makes you wonder if the old science-fiction claim that someday humans were going to evolve into a super-intelligent being composed solely of "brain" is at all feasible. For, if a being must be able to receive input and display output as well as process data to be intelligent, then it is not actually our brains that make us an intelligent species, but the specific interaction of brain with the rest of our body.


Ayishih Hakim

Reading and hearing more about TLUs in class and from Nilsson, I am beginning to realize that TLUs are a huge basis for AI. Is this true?

In another computer science class, Principles of Computer Organization, we discussed threshold units very briefly. The lecture on threshold units ended with a final comment that threshold units are not widely used because it has proved to be overly cumbersome to physically simulate a TLU. After this lecture I concluded that using threshold units in actual computers/robots is not practical. The only purpose that I could see for TLUs is to provide a theoretical understanding of a processing unit. However, Nilsson puts overwhelming emphasis on the structure and reliability of these systems.

On another note:

When reading over the syllabus I noticed that the short video on the robot responding to light is an example of perceptrons at work. As I was watching the video it never occurred to me that perceptron learning was taking place. I figured that the video was demonstrating something similar to what we were assigned to do for lab. What I did for lab definitely did not employ any type of TLU.

Can you please address this in class?


Ada Hogan

I'm wondering how Adaline is used more in communications, in clearing out background noise for example. How does the program distinguish between white noise and language? Is it by word recognition, in a similar procedure to the handwriting recognition with a lot of "noise" in it that we saw in the video in class last Thursday? I was looking at the learning rule we were discussing with regard to back propagation as well, and how a higher number of hidden/middle nets to process information from the input layer improves learning time. I'm thinking about this in relation to the learning process of a human. When we first started looking at learning processes, we took the example of the missionaries and the cannibals, documented human reasoning processes, and then used this to program a computer. The computer had a similar "thought" process, and "thought" at the same speed. Now, with neural networks and added middle nets, the computer could theoretically learn more and more quickly. Faster than a human? Than what kind of human, a child or an adult? If we take the handwriting recognition program, it certainly seemed that it could learn to identify letters, in print OR cursive, much faster than a child of 5 would learn. We are able to speed up the thought process, but to what point do we want it to go? At this point, I start rethinking the goals of AI, and again how we define "intelligence". What is the scientist really interested in? Perhaps in a robot that outshines human performance, or perhaps a robot that mimics human learning, even in its mistakes. I suppose, at this point, it might be useful to distinguish between programs that are really defined as being goal related in accomplishing a certain task (handwriting recognition or phone communications) and others that follow a given behavior, such as shadow seeking. Faced with the two programs, I would want the first to fulfill its task as quickly and effectively as possible.
As for the second, I'm more interested in following its behavior. How far would I want to go in perfecting its response time? On the other hand, how are we to judge just what behavior will be comparable with human behavior?


Peter Ingebretson

I like neural networks; they seem like a very good idea. I am surprised that no one thought of sticking TLUs together into neural nets for as long as they did. In fact, I would like to learn more about what other people have done with them so far: what has worked and what hasn't. I look forward to an assignment using them.

I had a curiosity about something that could be done with neural nets. Have people before tried training networks by running them backwards? By this I mean training the network for a while using generalized delta, then selecting a desired output, and for each step selecting a possible set of inputs that could generate that output. After doing this for each layer, a random input would be generated; one that would give the output we started with. If this input were supposed to give the output, the network would not be changed, otherwise it would. The advantage would be that we would only need a finite training set to use as examples, then we could tell the network if it were on the right track by looking at what it considered right.

This was just a thought; I have no idea whether this has been tried or whether this might work. At any rate, I think networks are versatile, and I wish we could spend more time working with them.


Sarah Klaum

As I have mulled over this past week and the material covered, I must honestly admit that the thought most pervasive in my mind right now is one of confusion. I seem to have gotten a little tangled up in the onslaught of formulas used to formulate and describe perceptrons and neural networks. The concepts themselves at first seem relatively straightforward, and I am sure that with a bit more perseverance, I will be able to understand the root of my current confusion. In an attempt to gain more insight, I went to the interactive site on serendip that deals with perceptrons. What I found most intriguing were points brought up in the conclusion. I realize that I had been thinking about perceptrons in very black-and-white terms. Either the result of the training session was a perceptron that did what the trainer wanted it to do, or it didn't. I had never really dealt with the thought that perhaps a perceptron can learn different categories from the same training set that are equally valuable, just as we as humans may learn different lessons from the same experience. I believe the point was brought up (on the site) that we clearly learn from interaction among ourselves, in a sharing of experiences and knowledge. Perhaps this is precisely what occurs in a neural net, but as I myself still feel somewhat in the dark about this, the idea intrigues me. How could two separate perceptrons trained on the same set of inputs, and arriving at different learned states, exchange their information, their "experience"? Could they integrate parts of the other system? In short, what exactly are the possibilities and limits for such interaction?

In thinking about interaction, I also began to contemplate last week's lab. It of course brought to mind again the question of intelligence really as a function of the environment. I wondered, then, is the robot we built really in an environment? Or part of an interaction with the user, who controls the light source? In categorizing the situation as an interaction, would we then choose to attribute greater or lesser intelligence to the system?


David Rothstein

Watching the program at the end of class on Thursday that was able to learn the letters of the alphabet was extremely interesting. I think I would like to see more details of how that program worked; since the neural net involved did not have that many elements, it seems as though the program would not be too hard to understand. I am still having a little trouble understanding the exact function of the middle layers in a neural network. I can see how they must be necessary, since perceptrons are incapable of implementing certain basic logical commands on their own (such as exclusive-or, as we discussed in class). However, I am not sure I understand the details of how the middle layers in a neural net can solve this problem. I just don't feel like I have an intuitive understanding yet of how and when to use these middle layers, although I guess this will come in time. In any case, after learning about neural networks this week, I think they provide an extremely interesting approach to artificial intelligence. The idea that they can be simply programmed and then essentially "reprogram" themselves just by adjusting a few numbers (the weights) during a training stage seems very exciting, as does the fact that they represent a very basic model of how the human brain works. These facts suggest to me that neural networks are an extremely important step in the development of artificial intelligence.
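One way to see what a middle layer buys is to wire up exclusive-or by hand. The sketch below uses two hidden TLUs and one output TLU with weights and thresholds picked by hand for illustration; they are assumptions, not values from the class demonstration or from the text, and a trained network might well arrive at different ones.

```python
def tlu(weights, threshold, inputs):
    """Fire (1) when the weighted sum of the inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def xor_net(x1, x2):
    # Middle layer: each unit draws one straight-line boundary.
    h_or = tlu((1, 1), 1, (x1, x2))   # on if at least one input is on
    h_and = tlu((1, 1), 2, (x1, x2))  # on only if both inputs are on
    # Output unit: "OR but not AND", which a single TLU on the raw
    # inputs cannot express.
    return tlu((1, -1), 1, (h_or, h_and))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

The middle units re-describe the input in terms (OR, AND) that make the final decision linearly separable, which is one intuition for what hidden layers are for.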


Frank Rusch

Because the main operation in the TLU is the dot product, which is not expandable into its factors, the results seem mysterious to me. In other words, working backwards, if the weight vector wasn't known specifically, there would be no way to solve for it from just looking at the input vector and the desired output. The weight vector (in my understanding) could be numerous things and still produce the "right" answer for a given input. Am I correct in thinking this? In a simple example, if the input vector were (1,1,1), the dot product after multiplying through the corresponding 3 weights will always be the sum of those weights. When given new inputs, after training, it seems that some of these (formerly indiscriminate) weight vectors would perform worse than others.
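The (1,1,1) example can be checked with a few lines of arithmetic. In this sketch the two weight vectors, the threshold of 2, and the second input are made-up values chosen only to illustrate the point: different weight vectors with the same sum agree on (1,1,1) but can disagree on a new input.

```python
def dot(weights, inputs):
    """Dot product: multiply matching entries and add them up."""
    return sum(w * x for w, x in zip(weights, inputs))


def tlu(weights, inputs, threshold=2.0):
    return 1 if dot(weights, inputs) >= threshold else 0


w_a = (1.0, 2.0, 3.0)  # hypothetical weight vector, entries sum to 6
w_b = (3.0, 3.0, 0.0)  # a different vector with the same sum

x_seen = (1, 1, 1)     # for this input the dot product is just the sum
x_new = (0, 0, 1)      # a new input that looks at only the third weight

print(dot(w_a, x_seen), dot(w_b, x_seen))  # both 6.0
print(tlu(w_a, x_new), tlu(w_b, x_new))    # 1 and 0
```

So a single input/output pair does not pin down the weights; it takes several different training inputs to narrow the choice, and even then more than one weight vector may remain consistent with the whole training set.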

To me it appears that, in order for a network to generalize, there needs to be a pattern of connection between the input and the output. In the example from class, we trained the computer program to recognize letters ACEQSTV and display their corresponding binary codes on sight. But there is nothing apparent in these binary codes themselves that confers an association with any particular letter -- I presume these codes were developed by our arbitrary ordering of the alphabet. So, the way in which 'A' differs from 'B' does not appear to be congruent with the way their corresponding binary codes differ. Maybe a multi-layered network is capable of uncovering (or imparting) a relationship that is too complex for us to be readily aware of. But it seems impossible that such a system could look at an entirely new input and return an output whose association is completely arbitrary.


One question I have is whether computers can be switched into training mode whenever they see something novel. I think people react that way, becoming more attentive when we see something we haven't seen before. And how is an independently operating computer to know that a new input is a training exemplar and not a distorted (noisy) version of something it already knows?




Edina Sarajlic


Previous to taking this class, I was only aware that machine learning is possible, but I did not know much about the actual implementation of the learning process. This week's lectures on neural networks have been very useful in that respect, because they discussed a way in which an S-R agent can learn through training on a subset of possible inputs. I was surprised that such a simple device as a threshold logic gate can represent the basic learning unit. However, this is not so surprising if we recall that these gates were made to mimic the operation of neurons, which are the physiological basis of our own intelligent behavior. There are many applications of neural networks in AI, so I could not help but wonder about the extent of their limitations. From the class discussion and reading, we have seen that there is a minimum number of intermediate layer neurons, below which the training cannot be effective enough. There is also a requirement that the training input set has more members than there are degrees of freedom in the given neural net. Otherwise, as Nilsson explained, there is a possibility of overfitting the training set data. This implies that complex networks, which employ large numbers of neurons, need very large training sets, which is not always practical or even attainable. Neural nets are an example where naturally occurring intelligence is used as a model for AI artifacts. Watching the lego robot being trained to recognize light reminded me of the analogy between a learning child and a computer made in the MIT video. This analogy is most flattering to the computer, because it implies that the computer learning capabilities could be perfected and generalized to such an extent as to resemble our own. However, given that there are billions of neurons in our brain, I doubt that this kind of neural complexity can be matched by a machine learning program. Even if it were possible to create, what kind of training set and weight correction method would that program need to use?



Jim Speer

Last week the course material started to become confusing. I must say that the material in chapter 3 was very thick stuff, presented with almost no examples. Also, the book's material has diverged significantly from the hands-on experience in the robot labs. I found last week both encouraging, due to my group's success in programming the lego bug, and discouraging, in that the book ran so quickly ahead of my experience and understanding. I think what I would like to see is a lot more examples presented in the book, as in actually running some numbers through these formulae to show me how they do what they do and why.

Our lego bug seems to be the simplest possible S-R agent, since not only does it simply respond to a condition, the condition itself is only one of two things (light or darkness). Our programs simply wait for the condition to change, alter the behavior, and wait for it to change back again. I attempted to visualize the bug as a very simple animal, with simple concerns, and this seemed to work for me. I started to see why some AI scientists model their programs' behaviors after those of insects, whose stimulus-response systems are very predictable and limited. I can see that the addition of a few more programmed responses would allow the lego bug to appear more bug-like in the future. But this brought to my mind a new question regarding what we consider to be intelligence: Is something intelligent only because it appears to be intelligent? The Turing Test's sole requirement for intelligence is that the subject is able to fool someone into believing it is smart. Could not an animal or an object possess an intelligence, but not be recognized as such because it has no value or holds no interest to humans? Why is a ball not intelligent for rolling down a slope -- is this because it holds no mystery, and no power to fool us?

Myself, I would not classify our lego robot's behavior as "intelligent" any more than I would a light bulb becoming bright when the switch is turned on. But I'm sure this will soon change. I'm looking forward to programming more complex behaviors and "personalities" in the bug as the labs go on. But I am also impatient to see the workings of error correction and neural networks, even simplified versions, so that I can get back my handle on the direction of the lectures and book.


Ben Sprecher

I found the demonstration of the backpropagation learning you ran in class very exciting. If we can teach a neural network to identify characters by example, then perhaps we can teach it many other recognition techniques as well. For example, given enough training time, it doesn't seem unreasonable that a sufficiently complex neural network could take an array of pixels (i.e. an image) and identify whether or not that image contains a human face. The training set may need to be several thousand images, but it could then generalize its findings to other faces never seen before. (This is different from the example in class, since you can't generalize recognition of an "A" to recognition of an "L".)

The really exciting part about this method of training is that we don't really understand what the weights mean when we look at them individually. This is analogous to our lack of understanding of how real neurons in the human brain function. It is easy to imagine that the human brain _does_ function in the same way as our artificial neural nets - only it is constantly undergoing training. This raises the question: if we built a neural network the size and complexity of the human brain, trained it for 20 years, and interacted with it as we would a normal human, would it attain human-level intelligence in its response to stimuli? Barring the issues of consciousness, I can certainly imagine that it would.

I can't wait to start implementing neural nets of our own.


Emily Sweeney-Samuelson

I was so glad to learn more about neural networks this week! (Already? Finally?) As probably all of you know by now, I was dying to find out how they work. I wasn't consciously expecting them to work in a certain way, but I was still surprised. It intrigues me that the notion was abandoned for so long; how far advanced would the field be, if Minsky and Papert hadn't published their book, and people had continued to feverishly work at producing more intelligent agents? It's unfortunate that we don't yet know, but it leaves more room for our generation to advance the field.

The AI frontier seems to be a very popular one to advance right now, but there is still so far to go. I'm glad we have an opportunity to learn about this field while we are still undergraduates, because any of us who does decide to go into AI will have an early start, and perhaps more importantly, knowledge about its history, philosophy, and related fields. We will have discussed the future of the field, and its impact on society.

The interactive online lesson in perceptrons was the best way for me to learn about them. It was a great way to explore some capability of perceptrons without having to construct that type of thing first, and all of the links explained (usually) fascinating related information that helped to complete my understanding of perceptrons. Among thinking machines in general, those with successful unsupervised learning would be the most exciting kind, of course. Now, I am anxious to learn about the possibilities for that behavior. Every week brings up another topic to wonder about for a while- there are so many things in AI that capture the imagination.


Tim Waring

First of all, I don't understand a lot of the math, so sometimes reading the descriptions in the book can be tedious. I have never taken partial derivatives and the like. But from the philosophical standpoint, which is all I am left to make judgments with, I believe that neural networks are the best approach to achieving consciousness. More than that, recurrent neural networks allow the network to process its own machinations as inputs. In other words, it allows the network to affect its output in ways other than input-directed output alone. It seems to me that humans do this sort of thing all the time. I sense my world and come up with a feature vector of what the world is shaped like, then I apply my own ideas to that, and then I act according to how my own ideas and inputs were computed together. Also, a recurrent network, if implemented properly, gives the network the ability to continue computation AFTER the output. This is directly analogous to thinking without solving a problem, just by thinking. It seems to me the best strategy is to have a very large recurrent neural network, with massive amounts of sensory input, and a willing trainer.



Sarah Waziruddin

This week's reading and lectures answered a lot of questions for me. I was skeptical about the idea of machines learning and wanted to know how they did this. I learnt that learning is not the same as thinking. The TLU's do learn through a set of given inputs and outputs, a program and a series of calculations, but they don't actually think. They are given a set of inputs and a set of actions along with formulae used to calculate weight adjustment. This is not the same as examining a set of inputs, processing the information without a given formula and coming up with an appropriate output. Thinking means forming our own opinions and making up our own actions. TLU's and robots are not able to think.

I like the analogy that Nilsson makes regarding the learning that TLU's achieve. He likens their learning to fitting a set of points on a curve. This is an act of processing a given set of data, not the act of thinking.

Again, I am surprised by the number of options AI scientists have when programming. There are several different methods to compute error and calculate weight adjustment. There are also two methods of constructing networks -- a recurrent method and a feed forward, or layered, method.

I have a question regarding the adjustment of weights though. Are the weights in the TLU's adjusted manually by the AI scientist as Deepak showed us when using the learning program on the computer or are they adjusted by the TLU itself while processing the training set using given calculations? Or do we have both options?



Leslie Zavisca

I am having trouble with my hyper terminal so I have to use my worldnet account to write this week's response. I hope this is okay. Also, you might know this already but the link for week two's responses goes to week one's so nobody's been able to read last week's.

Of last week's reading and lecture, I was most interested in the demonstration of the computer program that learned to recognize certain letters of the alphabet under all sorts of noise and learning conditions. The thing that puzzles me the most is why we used more than one middle layer, or rather, why would anyone use more than one middle layer for such a program. Instinct suggests that more middle layers only make room for more errors to occur and, therefore, slower learning. I am curious about how different numbers of layers affect different categories of learning. For example, what constitutes adding or subtracting layers in a learning program? I am guessing that the answer to my question probably has a lot to do with backpropagation. I also liked the reading on generalization (and the analogy with curve fitting) and cross validation. This appears to me to be the truest learning we've discussed so far for the course.

I had a lot of fun doing the week's lab exercises. At first, I was very intimidated (especially when my originally designated group didn't show up to the second lab meeting), but that feeling went away almost immediately and I started to have a lot of fun. All of the exercises were fairly straightforward and I could certainly come up with the pseudocode for them. The real problem was figuring out how to deal with the limitations of Interactive C when I am so used to C++. I am very interested to see all of the different ways that other groups came up with to accomplish the same assignment. It was also fun to show our robot to other people even if they didn't think it profound or capable of someday taking over the world. I look forward to next week's exercises.
