Lebeau, Hugo (2021) A random matrix analysis of on-line learning PFE - Project Graduation, ENSTA.
![]()
| PDF 4Mb |
Abstract
This work is an analysis of kernel spectral clustering in streaming context. Assuming data x1, x2, . . . comes as a continuous flow and only a few number L of them can be kept in the learning pipeline, the Gram kernel matrix can only be computed around its diagonal: K = X⊤ X ⊙ T where X ∈ Rp×n is the data matrix and T ∈ {0, 1}n×n is a mask that keeps only the elements which can be computed in this on-line setting. As n, p, L → +∞ with p/n → c ∈ ]0, +∞[ and (2L − 1)/n → ε ∈ ]0, +∞[, we show how the limiting spectral distribution of K can be computed and we study the behavior of isolated eigenvalues and eigenvectors, which carry the information. This analysis reveals that c affects much more the performance than ε, which can fortunately be kept small. We detail how to perform on-line kernel spectral clustering and apply our algorithm to an image classification task. This work is a new step towards a better understanding of machine learning in order to make cost-efficient algorithms.
Item Type: | Thesis (PFE - Project Graduation) |
---|---|
Subjects: | Mathematics and Applications |
ID Code: | 8914 |
Deposited By: | Hugo Lebeau |
Deposited On: | 30 sept. 2021 11:03 |
Dernière modification: | 30 sept. 2021 11:03 |
Repository Staff Only: item control page