Voice-driven automated agents such as personal assistants are becoming increasingly popular. However, in a multilingual and multicultural country like India, deploying such agents to engage with large sections of the population is highly challenging. A major obstacle is the difficulty these agents face in understanding the varied accents of their users. Even when the language of interaction with the underlying automatic speech recognition (ASR) system is restricted to a lingua franca (such as English), a speaker's accent can vary dramatically with their cultural and linguistic background, posing a fundamental challenge for ASR systems. Tackling this challenge is a necessary first step towards building socially accepted and commercially successful agents in the Indian context.
The main focus of this project is to take this first step by improving the state-of-the-art performance of ASR systems on accented speech, specifically speech with Indian accents. We shall develop acoustic models based on deep neural networks, trained not only on accented speech data but also on speech in the native languages associated with each accent. We shall also develop a tool that is trained to identify various Indian accents automatically. Finally, we shall investigate how ASR for accented speech can be effectively incorporated into intelligent agents to help them act in socio-culturally appropriate ways.