Deep Learning for Speech Recognition


In Automatic Speech Recognition (ASR), the state of the art for generic conversation has reached superhuman levels. However, results are far worse in specialized knowledge domains: attempting to transcribe vendor-customer or intra-vendor conversations often results in high double-digit error rates. Given this low performance on real data, it becomes imperative to customize the end-to-end probabilistic model instead of analyzing the Language Model and the Acoustic Model separately. This session will focus on grapheme-based end-to-end Recurrent Neural Network architectures that transcribe audio directly. We will also do a reality check on latency and look at how to reduce it by tweaking the model at inference time.
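
To make "grapheme-based" concrete, here is a minimal sketch of how per-frame character scores from such a network could be turned into text. It assumes a CTC-style output layer with a blank symbol and greedy (best-path) decoding; the abstract does not say which output layer or decoder the session uses, so CTC, the vocabulary `VOCAB`, the function `greedy_ctc_decode`, and the hand-written `frames` scores are all illustrative assumptions.

```python
from itertools import groupby

# Hypothetical grapheme vocabulary; index 0 is the CTC blank symbol (assumption).
VOCAB = ["_", " ", "a", "c", "t"]
BLANK = 0


def greedy_ctc_decode(frame_scores):
    """Pick the best grapheme per frame, merge repeats, then drop blanks."""
    # Index of the highest-scoring grapheme in each frame.
    best = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # Collapse runs of the same index into a single occurrence.
    collapsed = [idx for idx, _ in groupby(best)]
    # Remove blanks and map the remaining indices to characters.
    return "".join(VOCAB[i] for i in collapsed if i != BLANK)


# Hand-written scores standing in for the network's per-frame output.
frames = [
    [0.10, 0.00, 0.10, 0.70, 0.10],  # c
    [0.10, 0.00, 0.10, 0.70, 0.10],  # c (repeat, merged)
    [0.80, 0.00, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.00, 0.80, 0.05, 0.05],  # a
    [0.10, 0.00, 0.10, 0.10, 0.70],  # t
    [0.10, 0.00, 0.10, 0.10, 0.70],  # t (repeat, merged)
]

print(greedy_ctc_decode(frames))  # cat
```

Greedy decoding needs only one pass over the frames and no external language model, which is one reason it is often considered when inference latency matters. A beam search with a language model would typically trade extra latency for accuracy.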

