Ghrabli, M Mehdi (2022) End-of-studies internship report: Data-driven deep neural network quantization. Final-year project (PFE - Projet de fin d'études), ENSTA.

File(s) associated with this document:

PDF (476 KB)

Abstract

Research in deep neural networks (DNNs) is advancing rapidly, allowing these technologies to be used in many aspects of everyday life. However, DNNs are notoriously costly in computation time, energy, and memory. This is a major problem both for edge devices, which have very limited resources, and for data centers, whose costs are mostly due to power consumption. A possible solution is to use more hardware-friendly number representations, such as fixed-point representation with a lower bit-width, which speeds up calculations and reduces memory consumption. These methods, among others, form the core of neural network quantization. During my internship, I first implemented and combined various quantization methods and evaluated them on simple image classification problems. I focused on WrapNet, published at ICLR 2021, an innovative quantization technique that tackles an often overlooked problem in the domain: accumulator size. Reducing accumulator size is of paramount importance for some industrial applications (e.g. cryptographic inference). A second article of great interest was DSQ (Differentiable Soft Quantization), which allows one to train quantized networks more efficiently by improving upon the canonical solution for estimating the gradient of the rounding operator. Finally, I tested the most relevant combination of these methods on harder problems such as image segmentation and object detection.

Document type: Report or thesis (PFE - Projet de fin d'études)
Subjects: Mathematics and its applications
ID code: 9244
Deposited by: Mehdi Ghrabli
Deposited on: 05 June 2023 10:04
Last modified: 05 June 2023 10:04
