Machine Learning for the (CS) Masses

Curriculum Descant
From ACM Intelligence Magazine, Volume 12, Number 3, Summer 2001. ACM Press.
Formerly considered a somewhat esoteric subfield of computer science, machine learning is now seeing broad use in computer science applications, for example, in search engines, computer games, adaptive user interfaces, personalized assistants, web bots, and scientific applications. However, few colleges and universities require a course in machine learning as part of an undergraduate major. It is time for us as computer science educators to recast an introduction to machine learning concepts as a staple of a computer science education.

Of course, there are many possible flavors of machine learning that might be emphasized, and in a one-course introduction to the field, one must chart a course consistent with the educational environment and the instructor's background. For example, one might focus a course on a particular approach to machine learning, such as neural nets, or on a particular form of machine learning, such as data mining. Regardless of the specific focus of the course, it is imperative that the course entail a significant hands-on component, working with actual systems. This component is essential for giving computer science students concrete exposure to working with programs that adapt, rather than the abstract knowledge that such a thing is possible.

In a course focused on the data mining form of machine learning, students can work with a variety of systems. They can ask the question "is system A better than system B?", learn to refine that question by adding "in this context" or "on this type of problem", and learn some simple statistical methods (such as the use of confidence intervals and t-tests) that are part of answering the question. Statistical approaches are often used within machine learning systems as well as to evaluate and compare the performance of different systems, so this provides a particularly nice context for introducing students to statistics.
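The "is system A better than system B?" comparison can be made concrete with a short sketch. The code below runs a paired t-test and builds a 95% confidence interval over hypothetical per-fold accuracies for two systems; the numbers, the use of ten folds, and the variable names are all invented for illustration, and only the standard library is used.

```python
import math
from statistics import mean, stdev

# Hypothetical cross-validation accuracies for two learners on the
# SAME ten folds (illustrative numbers, not real results).
system_a = [0.81, 0.79, 0.84, 0.80, 0.78, 0.83, 0.82, 0.80, 0.79, 0.81]
system_b = [0.78, 0.77, 0.80, 0.79, 0.75, 0.81, 0.79, 0.77, 0.76, 0.78]

# Pairing by fold removes fold-to-fold variance from the comparison.
diffs = [a - b for a, b in zip(system_a, system_b)]
n = len(diffs)
d_bar = mean(diffs)                # mean paired difference
se = stdev(diffs) / math.sqrt(n)   # standard error of the mean difference

t_stat = d_bar / se
# Critical value of Student's t for 95% confidence, df = n - 1 = 9.
t_crit = 2.262
ci = (d_bar - t_crit * se, d_bar + t_crit * se)

print(f"t = {t_stat:.2f}, 95% CI for the difference: "
      f"({ci[0]:.3f}, {ci[1]:.3f})")
# If the interval excludes 0 (equivalently |t| > t_crit), the difference
# is significant at the 5% level -- on these folds and this data only.
```

This is exactly the discipline the column asks for: students report an interval and a significance level instead of declaring a winner from two averages.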
In addition to hands-on experience with data mining (and as time permits), students may do additional reading on other forms of machine learning to broaden their view of what the field includes, without becoming overwhelmed with implementation details. Useful data sets can often be obtained from faculty members in other departments on campus. Faculty members in other departments are likely to use other approaches (such as logistic regression) when analyzing their data, and some will welcome having a student try an entirely different approach to gain insight on the problem. The students, in turn, are able to see their computer science talents as useful to another discipline's questions.

Another reason the course is attractive is that it gives students the experience of using and modifying large software systems. Several machine learning systems are available for download, for example, at the University of California, Irvine (UCI) Machine Learning Repository [2], the Carnegie Mellon AI Repository [3], and the AAAI Education Repository [1]. Students can be exposed to a culture where data sets and code are shared, and can be expected to get oriented to a large system and make changes or extensions to a specific part of such a system.

An undergraduate course in machine learning can be offered with minimal prerequisites, for example, to students who have completed the data structures course, which enables a broad range of students to take it. Not even an AI course need be a prerequisite, though of course key topics such as search, heuristics, and representation must be introduced. While a graduate-level machine learning course is often structured around reading scores of journal papers that describe different systems (perhaps working with three or four systems), an undergraduate-level course can eschew the "breadth" of machine learning and adopt a specific focus, such as the focus on the "data mining" form of machine learning described here.
One possible structure for such a course proceeds as follows. At the beginning of the course, lectures focus on an overview of machine learning while students work on writing a "data preprocessor" that reads data in from a file and collects simple statistics on it, for example, the percentage of positive and negative examples. Next, students read about decision trees and continue their programming work to write their own decision tree program. Third, students do experiments with their system, evaluating the effects of varying the metric used to choose splits in the tree and of running on different data sets. For this project, students should be taught and expected to use confidence intervals and paired t-tests, and not to blithely state that one system is "best" based on simple average performance. Fourth, students can work with an alternate form of data mining, for example, developing a genetic algorithm or neural net approach to data mining by refining "off the shelf" code to work with their problem. (Throughout most of the semester, students can work with publicly available data sets, such as those available at the UCI repository.)

As final projects, students should choose a problem, ideally a real research problem from faculty members in other departments, and either experiment with different variations on a single machine learning system or compare two or three different systems. (Data sets from the UCI repository can be used as a fallback.) Students should be expected at this point to be able to pose a research question (perhaps with some guidance), conduct their experiments according to the methodology described above (including the use of statistics), and present their findings as a research paper. It may be useful to pose this as "preliminary research" for a larger project, perhaps to investigate which of two or three approaches seems to hold the most promise for a specific data set.
Throughout the semester, short labs can be used to give students hands-on experience with systems written by others, such as Cobweb, Autoclass, and neural nets. (As you may guess, the course described here is one that I have designed and delivered. Online materials are available.)

An undergraduate course in machine learning provides an opportunity to introduce students to research, the experimental paradigm, and the use of statistical methods. Students can conduct actual, if modest, original research in the context of the class, which can lead to larger projects pursued as independent study. The course can also provide experience in working on a large software system and can expose students to a culture where data and code are shared. And, of course, an undergraduate course in machine learning introduces students to a vital and increasingly important area of computer science, providing them with skills, factual knowledge, and insights that are likely to serve them in many ways in their futures.

References

[1] The AAAI Education Repository.
[2] Blake, C. L. & Merz, C. J. (1998). The UCI Repository of Machine Learning Databases (also contains software). Irvine, CA: University of California, Department of Information and Computer Science.
[3] The Carnegie Mellon University (CMU) Artificial Intelligence Repository (particularly the "learning" subtree, which includes Genesis, Classweb, and source code and notes for a neural networks course).
About Curriculum Descant

Curriculum Descant has been a regular column in ACM's Intelligence magazine (formerly published as ACM SIGART's Bulletin). The column is edited by Deepak Kumar. The column features short essays on any topic relating to the teaching of AI from anyone willing to contribute. If you would like to contribute an essay, please contact Deepak Kumar.