BERNAS, M. Raphael (2025) On the learning representation of LLM: A concept-oriented study. PFE - Project Graduation, ENSTA.


Abstract

Explainable Artificial Intelligence (XAI) has emerged from the need for transparent and interpretable AI predictions, especially as models grow in complexity. This need is particularly acute in Deep Learning (DL), where model decisions are often opaque. Mechanistic interpretability, a subfield of XAI, addresses this challenge by seeking to understand the internal structure and functioning of Large Language Models (LLMs) in order to explain their outputs. However, most existing approaches focus on post-hoc analysis, applied after training, and relatively few have explored mechanistic methods as a way to monitor the learning process itself. In this work, we investigate the evolution of model internals during training using the EuroBERT model and its intermediate checkpoints. By applying mechanistic interpretability techniques throughout the training phase, we aim to derive insights into the learning dynamics of LLMs, ultimately contributing to more structured training frameworks and a better understanding of learning dynamics.
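
The checkpoint-wise analysis described in the abstract can be illustrated with a minimal probing sketch in Python. This is a hypothetical illustration, not the method of the thesis: it assumes the EuroBERT checkpoints are exposed as Hugging Face revisions (the revision list below is a placeholder), that the model follows the standard transformers API, and it uses toy texts and labels.

import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "EuroBERT/EuroBERT-210m"  # revision names used below are placeholders

def pooled_states(revision, texts, layer=-1):
    # Mean-pooled hidden states at one training checkpoint, one row per text.
    tok = AutoTokenizer.from_pretrained(MODEL_ID, revision=revision)
    if tok.pad_token is None:  # guard: some tokenizers ship without a pad token
        tok.pad_token = tok.eos_token
    model = AutoModel.from_pretrained(
        MODEL_ID, revision=revision, trust_remote_code=True
    ).eval()
    with torch.no_grad():
        batch = tok(texts, padding=True, return_tensors="pt")
        h = model(**batch, output_hidden_states=True).hidden_states[layer]
        mask = batch["attention_mask"].unsqueeze(-1)
        return ((h * mask).sum(1) / mask.sum(1)).numpy()

# Toy probing task: can a linear probe read a concept (here, sentiment)
# out of the representations at each checkpoint?
texts = ["A wonderful film.", "A dreadful film.", "I loved it.", "I hated it."]
labels = [1, 0, 1, 0]
for rev in ["main"]:  # swap in intermediate-checkpoint revisions if published
    X = pooled_states(rev, texts)
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    print(rev, "probe accuracy:", probe.score(X, labels))

Tracking such probe scores across checkpoints is one simple way to observe when a concept becomes linearly decodable during training.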

Item Type: Thesis (PFE - Project Graduation)
Uncontrolled Keywords: Learning, LLM, Concept, Explainability, Mechanistic Interpretability, AI
Subjects: Mathematics and Applications
ID Code: 10883
Deposited By: Raphaël BERNAS
Deposited On: 31 Oct 2025 14:57
Last Modified: 31 Oct 2025 14:57
