Bryn Mawr College
CS 325: Computational Linguistics
Fall 2005
Course Materials
General Information
Instructor: Deepak Kumar, 248 Park Hall, 526-7485
E-Mail: dkumar at cs brynmawr dot edu
WWW: http://cs.brynmawr.edu/~dkumar
Lecture Hours: Tuesdays & Thursdays, 2:30 p.m. to 4:00 p.m.
Room: Park 336
Lab: Mon & Fri 10:00a - 11:00a in Room Park 230 (I will also be
available Fri 3-5p in the lab)
Laboratories:
- The Emergent Intelligence Lab (Park 230)
- Computer Science Lab Room 231 (Science Building)
Texts & Software
Speech and Language Processing: An Introduction to Natural Language
Processing, Computational Linguistics, and Speech Recognition, by Daniel
Jurafsky and James Martin, Prentice Hall Publishers, 2000.
Important Dates
August 30: First lecture
September 22: Exam 1
November 3: Exam 2
December 8: Last lecture/Exam 3
Assignments
- Homework (Due Thursday, September 8): Get access to Python
(either on CS servers or your own computer) and try out the examples in the Python
for Linguists tutorial. Links to download and install your own version
of Python are in the tutorial. There is nothing to submit for this homework;
just do the tutorial and get used to Python.
- Homework (Due in class on Thursday, September 15): Click
here for details.
- Homework (Due in class on Thursday, September 22): Click
here for details.
- Homework (Due in class on Tuesday, October 4): Click
here for details.
- Homework (Due in class on Tuesday, October 25): Click
here for details.
- Homework (Due in class on Tuesday, November 1): Click
here for details.
Lectures
- Week 1 (August 30, September 1)
August 30: Course Introduction. Overview of topics: Words, syntax,
semantics, discourse, etc.
Read: Chapter 1 from Jurafsky & Martin. You
can download a copy of Chapter 1 from here.
Links: Click
here to go to The Language Computer Q&A demo.
September 1: Knowledge of Language, Ambiguity, Models and Algorithms. WORDS:
Regular Expressions: An Introduction.
Read: Chapter 1 from Jurafsky & Martin.
Links: Click
here for an online version of ELIZA.
- Week 2 (September 6, 8)
September 6: Regular Expressions: for searching and specifying languages.
Basic elements of regular expressions: expressions, anchors, counters,
operator precedence, substitution, memory, examples. Introduction to the
Python language. Regular expressions in Python.
Read: Chapter 2 from Jurafsky & Martin. A good Python-based
tutorial on regular expressions is available here.
There are several tutorials available on Python. Here is a list:
Python for Linguists (written by Deepak for this class, under development)
Another Python intro for linguists (at the NLTK site)
A quick Introduction to Python (mainly for those who know how to program)
Python for Computer Scientists
Also, see the official Python web site
Homework: Get access to Python (either on CS servers or
your own computer) and try out the examples in the Python
for Linguists tutorial. Links to download and install your own version of Python are
in the tutorial.
September 8: Finite State Automata: as language recognizers, as
language generators. Deterministic and Non-deterministic automata. Compositional
Machines/Languages. Introduction to Python, contd.
Homework (Due in class on Thursday, September 15): Click here for details.
Read: Chapter 2 from Jurafsky & Martin.
- Week 3 (September 13, 15)
September 13: Regular Expressions in Python. Introduction to basic
morphology.
Read: Chapter 3 from Jurafsky & Martin.
September 15: Morphology: Affixes, inflection, derivation, compounding.
Finite state transducers for morphological parsing & recognition: lexicons,
morphotactics, orthographic rules.
Homework (Due in class on Thursday, September 22): Click
here for details.
- Week 4 (September 20, 22)
September 20: Word classes and Parts of speech. Tagging and tagsets.
Read: Chapter 8 from Jurafsky & Martin.
September 22: Exam 1 is today.
- Week 5 (September 27, 29)
September 27: Tokenization and tagging. Tokenization in Python.
Homework (Due in class on Tuesday, October 4): Click here for details.
September 29: NLTK_LITE: looking at and processing nltk_lite corpora.
Tagging: default, regular expression, affix, unigram, bigram, and N-gram
taggers.
- Week 6 (October 4, 6)
October 4: Tagging contd. Syntax: Context Free Grammars.
Note on Homework: Modify your program to use the processing regime discussed
in class and run it on the data files supplied on the assignment's
page (see text there in red)
Read: Chapter 9 from Jurafsky & Martin
October 6: More tagging. Cascading taggers in NLTK_LITE.
Enjoy your Fall Break!
- Week 7 (October 11, 13)
No classes, Fall Break!!
- Week 8 (October 18, 20)
October 18: Context Free Grammars: Ordering, constituency, generativity,
derivation, parsing, sentence types, common structures for NP, VP, and
PP, agreement, subcategorization, grammar equivalence, Chomsky Normal Form,
CFG's and finite-state machines, recursion, RTN's (Recursive Transition
Networks).
Read: Chapter 9 from Jurafsky & Martin
Homework (Due in class on Tuesday, October 25): Click
here for details.
October 20: Parsing using CFG's. Top-down and bottom-up parsing.
Practical top-down parsing: recursive descent parsers; using RTN's for
top-down parsing. Practical bottom-up parsing: Shift-Reduce Parser. Parsing
(and visualization) using NLTK_LITE.
Read: Chapter 10 from Jurafsky & Martin. Also see
Python For Linguists #5 (Context Free Grammars & Parsing)
- Week 9 (October 25, 27)
October 25: Discussion and comparison of taggers from Assignment #4.
Parsing, contd. Top-down chart parsing (Earley Algorithm). Augmented Transition
Networks (ATN's).
Read: Handouts on RTNs and ATNs.
Homework (Due in class on Tuesday, November 1): Click
here for details.
October 27: POS-tagging revisited: Details of the Brown tagset (augmented),
Factors affecting tagger performance. Human parsing: ambiguity resolution,
lexical subcategorization, garden path sentences. Language and complexity:
Chomsky Hierarchy.
Read: Chapter 13 from Jurafsky & Martin.
- Week 10 (November 1, 3)
November 1 :
November 3: Exam 2 is today.
- Week 11 (November 8, 10)
November 8: Semantic Interpretation. Meaning Structure of Language.
Meaning representation languages. First-Order Predicate Calculus as a representation
language.
November 10: First-Order Predicate Calculus, contd. Inference
rules and inference engines. Tell-ask model of knowledge representation
and reasoning.
Read: Chapter 14 from Jurafsky & Martin.
- Week 12 (November 15, 17)
November 15: Review of Exam 2. Semantic Networks for representing
meaning.
November 17: Examples and demos of Meaning Representation Languages:
Logic: SNePSLOG. Semantic Networks: SNePS. ATN Parsing: SNaLPS.
- Week 13 (November 22, 24)
November 22: Discussion of future topics to cover. We decided on
semantic analysis (syntax-based) and then, time permitting, language generation,
and machine translation. Watched the Tour Guide robot video and discussed
its shortcomings vis-a-vis language use.
November 24: Happy Thanksgiving!!
- Week 14 (November 29, December 1)
November 29: Semantic analysis: compositional semantics, syntax-based
semantic analysis into FOPC meaning representation, lambda reductions,
handling quantifiers, issues with quantifier scoping.
Read: Chapter 15 from Jurafsky & Martin.
December 1: Semantic attachments for standard sentences and phrases.
Semantic analysis using ATN's (Demo). Architecture of a Natural Language
system. Language Generation.
- Week 15 (December 6, 8)
December 6: Machine Translation: Language issues: typology, morphology,
syntax, words, gender, cultural... Models of automated machine translation:
Transfer Model, Interlingua Model. Demos of machine translation systems.
Read: Chapter 21 from Jurafsky & Martin.
December 8: Exam 3 is today.
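As a small illustration of the regular-expression elements listed for September 6 (anchors, counters, substitution, memory), here is a minimal sketch using Python's standard re module. The sample sentence and the patterns are made up for illustration and are not taken from the course handouts or the tutorial; the code is written for a current Python 3 interpreter.

```python
import re

# Illustrative sentence; not from the course materials.
text = "the cat sat on the the mat"

# Anchor: ^ matches only at the start of the string.
starts_with_the = re.search(r"^the\b", text) is not None

# Counter: {3} asks for exactly three word characters between word boundaries.
three_letter_words = re.findall(r"\b\w{3}\b", text)

# Memory: \1 refers back to whatever the first group matched,
# so this pattern finds a word that is immediately repeated.
repeated = re.search(r"\b(\w+) \1\b", text)

# Substitution: replace the doubled word with a single copy of it.
fixed = re.sub(r"\b(\w+) \1\b", r"\1", text)

print(starts_with_the)
print(three_letter_words)
print(repeated.group(1))
print(fixed)
```

The same memory pattern is reused for the search and for the substitution, which is one simple way to see how a remembered group can both detect and repair a repeated word.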
Grading
All graded work will receive a grade of 4.0, 3.7, 3.3, 3.0, 2.7, 2.3, 2.0, 1.7,
1.3, 1.0, or 0.0. At the end of the semester, final grades will be calculated
as a weighted average of all grades according to the following weights:
Exam 1: 15%
Exam 2: 15%
Exam 3: 15%
Labs & Written Work: 55%
Total: 100%
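To make the weighted average concrete, here is a minimal Python sketch. The weights are the ones listed above; the sample grades are hypothetical, and the sketch assumes that Labs & Written Work has already been combined into a single grade (how individual assignments are combined is not stated here).

```python
# Weights from the Grading section, expressed as fractions of the final grade.
WEIGHTS = {
    "Exam 1": 0.15,
    "Exam 2": 0.15,
    "Exam 3": 0.15,
    "Labs & Written Work": 0.55,
}

def final_grade(grades):
    """Return the weighted average of one grade per component."""
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)

# Hypothetical grades for one student (assumption, for illustration only).
example = {
    "Exam 1": 3.0,
    "Exam 2": 3.0,
    "Exam 3": 4.0,
    "Labs & Written Work": 4.0,
}

result = final_grade(example)
print(round(result, 2))
```

For these sample grades the computation is 0.15 × 3.0 + 0.15 × 3.0 + 0.15 × 4.0 + 0.55 × 4.0 = 3.7.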
Links
Text's Home Page (Jurafsky & Martin)
The Association for Computational Linguistics (ACL)
The Language Computer Q&A demo
An online version of ELIZA
NLTK Home page
NLTK LITE Tutorials
NLTK LITE API Documentation
Created by dkumar@cs.brynmawr.edu on August 15, 2005.
Development of this course was supported in part by a Curriculum Development
Grant from the Provost's Office at Bryn Mawr College. Thanks!