
CS594 Special Topics: Detecting and Learning Processes and Behaviors

Syllabus

Winter 2009

CS594 website: http://www.calstatela.edu/faculty/vcrespi/CS/CS594/Lects/cs594.html

Lectures:

R 6:10-10:00pm, ET A331 (three and a half hours of lectures/presentations overall)

The course is structured as a mixture of lectures and student presentations based on readings.

Instructor:

Valentino Crespi
vcrespi@calstatela.edu
(323) 343-4596
ET-A318

Office Hours:

R 5:00-6:00pm

Abstract: CS594 is a graduate seminar focused on learning and detecting processes and behaviors. In particular, we will study problems and methods for learning regular languages (DFAs), probability distributions over strings, and stochastic processes (Probabilistic Finite State Automata, Hidden Markov Models, and Bayesian Graphical Models).

The course material includes several conference and journal publications (see references), as some of the applications and questions of interest are still a matter of current scientific investigation. Background concepts include automata theory and languages, stochastic processes (Markov chains, Hidden Markov Models, etc.), statistical learning (PAC learning), information theory (entropy and compression), filtering (Viterbi algorithm, Kalman filtering), and Bayesian graphical models.

Particular emphasis will be placed on concrete applications of these ideas to the development of modern computer technologies.

Course Goals: At the end of the course, students will be able to
  1. Understand basic concepts and ideas about detecting and learning processes, behaviors, and signatures.
  2. Know the important learning algorithms and inference techniques relevant to the specific domain of languages and stochastic processes.
  3. Understand the importance of those learning approaches in the development of advanced, cutting-edge technologies.
Recommended Prerequisites: The class is self-contained. However, some knowledge of Automata Theory, Theory of Computation, Basic Probability, and Stochastic Processes is recommended.
Course Materials and Textbooks: Class References
Topics:
  • Review of Automata and regular languages. Learning DFAs.
  • Review of Probability and Stochastic Processes. Entropy and Entropy rates. Laws of large numbers. Ergodicity.
  • Computational Learning Theory: the PAC model. KL-PAC learning of automata.
  • Probabilistic Automata: nondeterministic and deterministic. Hidden Markov models. Probabilistic Suffix Trees and Probabilistic Suffix Automata.
  • Dirichlet Processes and Bayesian Graphical Models. Learning HMMs with an unknown number of states.
Grading Policy: Student presentations will be graded on their quality and on the level of understanding of the subject being presented. There will be one in-class midterm examination by the fifth week of class, a few programming assignments/projects during the course, and a final class presentation at the end of the course. There is no final exam.

In-class Midterm Exam (30%), Presentations (40%), Homework Assignments/Project (20%), Participation/Attendance (10%).
Score (%) Letter Grade
90-100 A
80-89 B
60-79 C
50-59 D
0-49 F
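
As an illustration of how the weights and the letter scale above combine, here is a minimal Python sketch. It is not an official grade calculator: the names `WEIGHTS`, `CUTOFFS`, `final_score` and `letter_grade` are hypothetical, each component is assumed to be scored on a 0-100 scale, and the lower bound of each range is assumed to act as the cutoff for fractional totals (the table does not say how a score such as 89.5 is handled).

```python
# Weights taken from the grading policy (fractions of the final score).
WEIGHTS = {
    "midterm": 0.30,
    "presentations": 0.40,
    "homework": 0.20,
    "participation": 0.10,
}

# Lower bound of each letter-grade range, taken from the table.
# Assumption: a fractional total falls into the range whose lower bound it reaches.
CUTOFFS = [(90, "A"), (80, "B"), (60, "C"), (50, "D"), (0, "F")]


def final_score(scores):
    """Weighted average of the component scores (each assumed to be in 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Map a 0-100 total to a letter using the cutoffs above."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# Made-up component scores, for illustration only.
example = {"midterm": 85, "presentations": 90, "homework": 80, "participation": 100}
total = final_score(example)
print(round(total, 1), letter_grade(total))
```

With these made-up scores the weighted total is 87.5, which falls in the B range.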

Academic Integrity: Students are allowed and encouraged to discuss reading materials with each other. However, homework assignments must be solved and written individually. If you obtain a solution with help, you should acknowledge your source in the paper and then write your own solution independently.

Cheating will not be tolerated. Cheating on any assignment or exam will be taken seriously. All parties involved will receive a grade of F for the course and be reported.

General Policies:
  • Makeup Exams: No.
  • Use of Cell Phones: forbidden.
  • Office:
    Students are warmly invited to visit the instructor (during the announced office hours) for questions and clarifications.
  • E-mail:
    E-mails addressed to vcrespi@calstatela.edu must include the keyword CS594 in the subject (e.g. Subject: CS594 ...). E-mails will most likely be processed in the evening, so replies may be delayed. Be careful: the keyword in the subject is used for automatic filtering, and a wrong subject may result in the accidental loss of the message.