The Weekly Newsletter of MIT Linguistics

CompLang 11/15 - Hao Tang (MIT CSAIL)

Speaker: Hao Tang (MIT CSAIL)

Title: Automatic speech recognition without linguistic knowledge?

Date and time: Thursday, 11/15, 5-6pm

Location: 46-5165


Building state-of-the-art speech recognizers requires, besides a large corpus of transcribed speech, several additional ingredients, such as a phoneme inventory, a lexicon, and a language model. These ingredients carry linguistic constraints that make training more feasible and more sample-efficient. Recently, there has been a push towards building speech recognizers end to end, i.e., using few or even none of the aforementioned ingredients. This raises a fundamental question: is it possible to train a speech recognizer without any linguistic constraints?

How much data do we need to make it possible? What linguistic constraints are necessary for building a speech recognizer? In this talk, I will review the inner workings of conventional and end-to-end speech recognizers and, to help answer some of those questions, present empirical results from training end-to-end speech recognizers without any linguistic constraints.