OBLED, M. Antoine (2023) Cardinality and fairness constrained clustering using $K$-means PFE - Project Graduation, ENSTA.
Full text not available from this repository.
Abstract
In this report we study the $K$-clustering problem (which consists in distributing data points into $K$ clusters with a minimal cost) and more especially to its variants, such as cardinality constrained $K$-clustering, cardinality balanced $K$-clustering and fair $K$-clustering. These variants permit to tackle the main drawback of the original problem, which is to find empty or almost empty clusters. The cardinality constrained $K$-clustering imposes an upper and a lower bound on the cluster sizes, while the balanced $K$-clustering imposes that the clusters size varies at most by one data point. The fair $K$-clustering guarantees a fair distribution of element types in all the clusters. To solve these problems, we use the $K$-means algorithm structure and we formulate its assignment step as a minimum cost flow problem. To solve this minimum cost flow problem, we create a graph for each category of problem. These graphs are based on the creation of fake points which allow to reach the maximum capacity and monitoring the flow to impose constraints on the size of clusters. We solve the minimum cost flow problem with two methods : one based on the successive shortest path algorithm and the other one based on the resolution of a linear program. We also study the influence of the initialisation and the stopping criteria on our algorithms. After that, we compare empirically our methods with methods from literature.
Item Type: | Thesis (PFE - Project Graduation) |
---|---|
Subjects: | Mathematics and Applications |
ID Code: | 9919 |
Deposited By: | Antoine Obled |
Deposited On: | 23 nov. 2023 14:13 |
Dernière modification: | 23 nov. 2023 14:13 |
Repository Staff Only: item control page