Lebeau, Hugo (2021) *A random matrix analysis of on-line learning* PFE - Project Graduation, ENSTA.

| PDF 4Mb |

## Abstract

This work is an analysis of kernel spectral clustering in streaming context. Assuming data x1, x2, . . . comes as a continuous flow and only a few number L of them can be kept in the learning pipeline, the Gram kernel matrix can only be computed around its diagonal: K = X⊤ X ⊙ T where X ∈ Rp×n is the data matrix and T ∈ {0, 1}n×n is a mask that keeps only the elements which can be computed in this on-line setting. As n, p, L → +∞ with p/n → c ∈ ]0, +∞[ and (2L − 1)/n → ε ∈ ]0, +∞[, we show how the limiting spectral distribution of K can be computed and we study the behavior of isolated eigenvalues and eigenvectors, which carry the information. This analysis reveals that c affects much more the performance than ε, which can fortunately be kept small. We detail how to perform on-line kernel spectral clustering and apply our algorithm to an image classification task. This work is a new step towards a better understanding of machine learning in order to make cost-efficient algorithms.

Item Type: | Thesis (PFE - Project Graduation) |
---|---|

Subjects: | Mathematics and Applications |

ID Code: | 8914 |

Deposited By: | Hugo Lebeau |

Deposited On: | 30 sept. 2021 11:03 |

Dernière modification: | 30 sept. 2021 11:03 |

Repository Staff Only: item control page