Course Outline
Optimal Bayesian classifiers: all probabilities are known.
When the probabilities are not known, we must learn them from the data:
- Parametric methods.
- Non-parametric methods.
The relation between learning and compression:
- Information theory and data compression.
Patterns and randomness: Kolmogorov complexity.
Sample complexity: how fast can we learn?
If time allows: Bayesian networks, support vector machines.