Biography
Preface
Part 1. INTRODUCTION
Chapter 1. Statistical Machine Learning
1.1. Types of Learning
1.2. Examples of Machine Learning Tasks
1.3. Structure of This Textbook
Part 2. STATISTICS AND PROBABILITY
Chapter 2. Random Variables and Probability Distributions
2.1. Mathematical Preliminaries
2.2. Probability
2.3. Random Variable and Probability Distribution
2.4. Properties of Probability Distributions
2.5. Transformation of Random Variables
Chapter 3. Examples of Discrete Probability Distributions
3.1. Discrete Uniform Distribution
3.2. Binomial Distribution
3.3. Hypergeometric Distribution
3.4. Poisson Distribution
3.5. Negative Binomial Distribution
3.6. Geometric Distribution
Chapter 4. Examples of Continuous Probability Distributions
4.1. Continuous Uniform Distribution
4.2. Normal Distribution
4.3. Gamma Distribution, Exponential Distribution, and Chi-Squared Distribution
4.4. Beta Distribution
4.5. Cauchy Distribution and Laplace Distribution
4.6. t-Distribution and F-Distribution
Chapter 5. Multidimensional Probability Distributions
5.1. Joint Probability Distribution
5.2. Conditional Probability Distribution
5.3. Contingency Table
5.4. Bayes’ Theorem
5.5. Covariance and Correlation
5.6. Independence
Chapter 6. Examples of Multidimensional Probability Distributions
6.1. Multinomial Distribution
6.2. Multivariate Normal Distribution
6.3. Dirichlet Distribution
6.4. Wishart Distribution
Chapter 7. Sum of Independent Random Variables
7.1. Convolution
7.2. Reproductive Property
7.3. Law of Large Numbers
7.4. Central Limit Theorem
Chapter 8. Probability Inequalities
8.1. Union Bound
8.2. Inequalities for Probabilities
8.3. Inequalities for Expectation
8.4. Inequalities for the Sum of Independent Random Variables
Chapter 9. Statistical Estimation
9.1. Fundamentals of Statistical Estimation
9.2. Point Estimation
9.3. Interval Estimation
Chapter 10. Hypothesis Testing
10.1. Fundamentals of Hypothesis Testing
10.2. Test for Expectation of Normal Samples
10.3. Neyman-Pearson Lemma
10.4. Test for Contingency Tables
10.5. Test for Difference in Expectations of Normal Samples
10.6. Nonparametric Test for Ranks
10.7. Monte Carlo Test
Part 3. GENERATIVE APPROACH TO STATISTICAL PATTERN RECOGNITION
Chapter 11. Pattern Recognition via Generative Model Estimation
11.1. Formulation of Pattern Recognition
11.2. Statistical Pattern Recognition
11.3. Criteria for Classifier Training
11.4. Generative and Discriminative Approaches
Chapter 12. Maximum Likelihood Estimation
12.1. Definition
12.2. Gaussian Model
12.3. Computing the Class-Posterior Probability
12.4. Fisher’s Linear Discriminant Analysis (FDA)
12.5. Hand-Written Digit Recognition
Chapter 13. Properties of Maximum Likelihood Estimation
13.1. Consistency
13.2. Asymptotic Unbiasedness
13.3. Asymptotic Efficiency
13.4. Asymptotic Normality
13.5. Summary
Chapter 14. Model Selection for Maximum Likelihood Estimation
14.1. Model Selection
14.2. KL Divergence
14.3. AIC
14.4. Cross Validation
14.5. Discussion
Chapter 15. Maximum Likelihood Estimation for Gaussian Mixture Model
15.1. Gaussian Mixture Model
15.2. MLE
15.3. Gradient Ascent Algorithm
15.4. EM Algorithm
Chapter 16. Nonparametric Estimation
16.1. Histogram Method
16.2. Problem Formulation
16.3. KDE
16.4. NNDE
Chapter 17. Bayesian Inference
17.1. Bayesian Predictive Distribution
17.2. Conjugate Prior
17.3. MAP Estimation
17.4. Bayesian Model Selection
Chapter 18. Analytic Approximation of Marginal Likelihood
18.1. Laplace Approximation
18.2. Variational Approximation
Chapter 19. Numerical Approximation of Predictive Distribution
19.1. Monte Carlo Integration
19.2. Importance Sampling
19.3. Sampling Algorithms
Chapter 20. Bayesian Mixture Models
20.1. Gaussian Mixture Models
20.2. Latent Dirichlet Allocation (LDA)
Part 4. DISCRIMINATIVE APPROACH TO STATISTICAL MACHINE LEARNING
Chapter 21. Learning Models
21.1. Linear-in-Parameter Model
21.2. Kernel Model
21.3. Hierarchical Model
Chapter 22. Least Squares Regression
22.1. Method of LS
22.2. Solution for Linear-in-Parameter Model
22.3. Properties of LS Solution
22.4. Learning Algorithm for Large-Scale Data
22.5. Learning Algorithm for Hierarchical Model
Chapter 23. Constrained LS Regression
23.1. Subspace-Constrained LS
23.2. ℓ2-Constrained LS
23.3. Model Selection
Chapter 24. Sparse Regression
24.1. ℓ1-Constrained LS
24.2. Solving ℓ1-Constrained LS
24.3. Feature Selection by Sparse Learning
24.4. Various Extensions
Chapter 25. Robust Regression
25.1. Nonrobustness of ℓ2-Loss Minimization
25.2. ℓ1-Loss Minimization
25.3. Huber Loss Minimization
25.4. Tukey Loss Minimization
Chapter 26. Least Squares Classification
26.1. Classification by LS Regression
26.2. 0/1-Loss and Margin
26.3. Multiclass Classification
Chapter 27. Support Vector Classification
27.1. Maximum Margin Classification
27.2. Dual Optimization of Support Vector Classification
27.3. Sparseness of Dual Solution
27.4. Nonlinearization by Kernel Trick
27.5. Multiclass Extension
27.6. Loss Minimization View
Chapter 28. Probabilistic Classification
28.1. Logistic Regression
28.2. LS Probabilistic Classification
Chapter 29. Structured Classification
29.1. Sequence Classification
29.2. Probabilistic Classification for Sequences
29.3. Deterministic Classification for Sequences
Part 5. FURTHER TOPICS
Chapter 30. Ensemble Learning
30.1. Decision Stump Classifier
30.2. Bagging
30.3. Boosting
30.4. General Ensemble Learning
Chapter 31. Online Learning
31.1. Stochastic Gradient Descent
31.2. Passive-Aggressive Learning
31.3. Adaptive Regularization of Weight Vectors (AROW)
Chapter 32. Confidence of Prediction
32.1. Predictive Variance for ℓ2-Regularized LS
32.2. Bootstrap Confidence Estimation
32.3. Applications
Chapter 33. Semisupervised Learning
33.1. Manifold Regularization
33.2. Covariate Shift Adaptation
33.3. Class-Balance Change Adaptation
Chapter 34. Multitask Learning
34.1. Task Similarity Regularization
34.2. Multidimensional Function Learning
34.3. Matrix Regularization
Chapter 35. Linear Dimensionality Reduction
35.1. Curse of Dimensionality
35.2. Unsupervised Dimensionality Reduction
35.3. Linear Discriminant Analyses for Classification
35.4. Sufficient Dimensionality Reduction for Regression
35.5. Matrix Imputation
Chapter 36. Nonlinear Dimensionality Reduction
36.1. Dimensionality Reduction with Kernel Trick
36.2. Supervised Dimensionality Reduction with Neural Networks
36.3. Unsupervised Dimensionality Reduction with Autoencoder
36.4. Unsupervised Dimensionality Reduction with Restricted Boltzmann Machine
36.5. Deep Learning
Chapter 37. Clustering
37.1. k-Means Clustering
37.2. Kernel k-Means Clustering
37.3. Spectral Clustering
37.4. Tuning Parameter Selection
Chapter 38. Outlier Detection
38.1. Density Estimation and Local Outlier Factor
38.2. Support Vector Data Description
38.3. Inlier-Based Outlier Detection
Chapter 39. Change Detection
39.1. Distributional Change Detection
39.2. Structural Change Detection
References
Index