CHORNA, Ms Sofiia (2025) Mapping atomistic datasets for machine learning potentials. PFE (final-year project), ENSTA.
File(s) associated with this document:
PDF, 7 MB
Abstract
The increasing complexity of atomistic machine learning potentials (MLIPs) necessitates robust datasets and interpretable models to predict atomic-scale properties efficiently. This work presents a comprehensive analysis and visualization of the Massive Atomic Diversity (MAD) dataset, developed at the Laboratory of Computational Science and Modeling (COSMO, EPFL) for training universal MLIPs, such as PET-MAD. We introduce a generalizable approach to map atomistic datasets into intuitive, low-dimensional representations by leveraging the last-layer features of ML models. This method directly compares MAD’s chemical and structural diversity against other established benchmarks. Furthermore, we investigate the latent spaces learned by prominent MLIPs to gain an understanding of their underlying atomic representations. The study demonstrates a systematic framework for the characterization, visualization, and integration of large-scale atomistic datasets, thereby advancing the development of more efficient and interpretable machine learning models in materials modeling.
Document type: | Report or thesis (PFE, final-year project) |
---|---|
Free keywords: | Interatomic potentials, atomic representations, data visualization, latent spaces, material modeling |
Subjects: | Information and communication science and technology; Materials science, mechanics, mechanical engineering |
ID code: | 10835 |
Deposited by: | Sofiia CHORNA |
Deposited on: | 08 Oct. 2025 10:34 |
Last modified: | 08 Oct. 2025 10:34 |