CALANDRAS, Alexia (2016) Copula-based mixture models for clustering applications. PRE - Research Project, ENSTA.

Data clustering is a statistical method for data analysis with many applications. The aim is to partition objects into several groups (clusters) according to their observed characteristics, where the group membership of each object is treated as an unobserved variable. This project studies a data set of 493 NBA players, with the goal of clustering them according to their performance. Among the many unsupervised clustering algorithms available, we focus on mixture modelling approaches. Drawing on mixture model theory, the Gaussian copula, and the Cholesky decomposition of correlation matrices, we consider 20 different models: four different initialisations of the correlation matrices, each combined with 1 to 5 clusters. The parameters of each model are estimated with the EM (Expectation-Maximisation) algorithm. Finally, the best model for the sample is selected using the Bayesian Information Criterion (BIC).
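
The estimate-then-select workflow described in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the copula-based model of the project: it fits a plain univariate Gaussian mixture by EM for 1 to 5 components and keeps the component count with the lowest BIC. All function names, the synthetic data, the variance floor, and the BIC sign convention (lower is better) are assumptions made for illustration only.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_density(x, weights, means, variances):
    """Density of the mixture at x, floored to avoid log(0) and division by zero."""
    dens = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return max(dens, 1e-300)


def fit_mixture_em(data, k, n_iter=100, seed=0):
    """Fit a k-component univariate Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    rng = random.Random(seed)
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    var_floor = 1e-3 * grand_var  # assumption: keeps a component from collapsing

    weights = [1.0 / k] * k
    means = rng.sample(data, k)  # k observations picked at random as starting means
    variances = [grand_var] * k

    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            total = mixture_density(x, weights, means, variances)
            resp.append([w * normal_pdf(x, m, v) / total
                         for w, m, v in zip(weights, means, variances)])
        # M-step: re-estimate weights, means and variances from those posteriors.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-300)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)

    loglik = sum(math.log(mixture_density(x, weights, means, variances)) for x in data)
    return weights, means, variances, loglik


def bic(loglik, k, n):
    """BIC = -2 log L + p log n, with lower values preferred."""
    n_params = 3 * k - 1  # k means + k variances + (k - 1) free weights
    return -2.0 * loglik + n_params * math.log(n)


def select_by_bic(data, max_k=5):
    """Fit 1..max_k components and return (best_k, {k: BIC})."""
    scores = {}
    for k in range(1, max_k + 1):
        loglik = fit_mixture_em(data, k)[3]
        scores[k] = bic(loglik, k, len(data))
    return min(scores, key=scores.get), scores


# Synthetic example: two well-separated groups (not the NBA data).
rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(8.0, 1.0) for _ in range(100)]
best_k, scores = select_by_bic(sample)
print(best_k, scores)
```

In the project itself, each candidate would instead be a copula-based mixture with its own correlation-matrix initialisation, so the parameter count entering the BIC penalty would differ from the `3k - 1` used in this sketch.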

Item Type: Thesis (PRE - Research Project)
Uncontrolled Keywords: Clustering - Mixture model - Gaussian Copula - Correlation Matrix - Cholesky Decomposition - EM Algorithm
Subjects: Mathematics and Applications
ID Code: 6720
Deposited By: Alexia Calandras
Deposited On: 07 March 2017 10:59
Last Modified: 07 March 2017 10:59
