CS 236601: Information Retrieval and Digital Libraries

Syllabus

  • Applications
    • Web indexing
    • Help systems
    • Email/document filtering/routing
  • Use models
    • Document search, with relevance ranking
    • Document clustering
    • Automatic topic hierarchy generation (e.g., automatic Yahoo!)
    • Document classification
  • Performance evaluation
    • Precision versus recall
    • Basic experiment design
  • Vector Space Model
    • Boolean
    • Weighted
  • Latent Semantic Indexing
  • Features
    • Word stemming
    • Case folding
    • Stop words
    • Thesauri
    • N-grams
  • Feature selection
  • Relevance ranking
    • Cosine
    • IDF, TF-IDF
    • Link-based scoring
    • Structure-based scoring
  • Implementation issues
    • Inverted indexes
    • Dictionaries
    • Parsing
    • Distributed indexing/retrieval
    • Compression