Extracting meaningful patterns from random samples of large data sets. Statistical analysis of the resulting problems. Common algorithmic paradigms for such tasks. Central concepts: VC-dimension, margins of a classifier, sparsity and description length. Performance guarantees: generalization bounds, data-dependent error bounds and computational complexity of learning algorithms. Common paradigms: neural networks, kernel methods and support-vector machines. Applications to data mining. [Note: Lab is not scheduled and students are expected to find time in open hours to complete their work.] Prereq: CM 339/CS 341 and (STAT 230 or 240); Computer Science students only.