Naud, Maxence (2020) Neural Network Language Models for Speech Recognition: Training and Evaluation in a Limited Data Setup. PRE - Research Project, ENSTA.

PDF (1442Kb)

Abstract

Automatic Speech Recognition systems have mainly been trained with performance in mind and on huge amounts of data. However, as the demand for privacy-driven applications keeps rising, traditional data collection for improving data-hungry Language Models has become a major problem. This report explores the training of Neural Network-based Language Models in limited data setups. It considers two sources of external data to improve an initial model trained only on labelled in-domain data: labelled out-of-domain data and unlabelled in-domain data. The Language Model turns out to benefit more from unlabelled in-domain data. The best-performing model is a threefold linear interpolation combining two pairwise linear interpolations, each of an n-gram LM and an LSTM LM trained on the same corpus; the first pair is trained on labelled and the second on unlabelled in-domain data. A new method of loss weighting, using the word posterior probabilities from the confusion network ("sausage") output by the ASR system, is also explored and appears promising.
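As a rough illustration of the interpolation scheme sketched in the abstract, the Python snippet below combines per-word probabilities from an n-gram LM and an LSTM LM, first per training corpus and then across the two corpora, and scores a sequence by perplexity. The function names, the default weights, and the toy probabilities are all hypothetical placeholders, not values from the report; the loss-weighting helper is likewise only a plausible reading of the sausage-based weighting idea.

```python
import math

def interpolate(p_a: float, p_b: float, lam: float) -> float:
    """Linear interpolation of two LM probabilities for the same word."""
    return lam * p_a + (1.0 - lam) * p_b

def threefold(p_ngram_lab: float, p_lstm_lab: float,
              p_ngram_unlab: float, p_lstm_unlab: float,
              lam_lab: float = 0.5, lam_unlab: float = 0.5,
              mu: float = 0.5) -> float:
    """Hypothetical three-level scheme: interpolate the n-gram/LSTM pair
    trained on labelled in-domain data, the pair trained on unlabelled
    in-domain data, then interpolate the two resulting models."""
    p_lab = interpolate(p_ngram_lab, p_lstm_lab, lam_lab)
    p_unlab = interpolate(p_ngram_unlab, p_lstm_unlab, lam_unlab)
    return mu * p_lab + (1.0 - mu) * p_unlab

def perplexity(word_probs: list[float]) -> float:
    """Perplexity of a sequence from its per-word probabilities."""
    nll = -sum(math.log(p) for p in word_probs)
    return math.exp(nll / len(word_probs))

def weighted_nll(token_probs: list[float], posteriors: list[float]) -> float:
    """Sketch of sausage-based loss weighting: scale each token's
    negative log-likelihood by its ASR confusion-network posterior."""
    return -sum(w * math.log(p) for p, w in zip(token_probs, posteriors))

# Toy example: per-word probabilities from four (hypothetical) models.
probs = [threefold(0.10, 0.12, 0.08, 0.15),
         threefold(0.30, 0.25, 0.40, 0.35)]
print(perplexity(probs))
```

In practice the interpolation weights would be tuned on held-out data (for example with EM or a grid search) rather than fixed; the 0.5 defaults above are placeholders only.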

Item Type: Thesis (PRE - Research Project)
Uncontrolled Keywords: Neural Network, Language Model, N-Gram, adaptation, interpolation, domain, Natural Language Processing, limited data setup
Subjects: Information and Communication Sciences and Technologies
ID Code: 8022
Deposited By: Maxence NAUD
Deposited On: 17 May 2021 14:14
Last Modified: 17 May 2021 14:14
