IITM Speech Recognition System

1. This service helps you generate a subtitle file and verbatim notes (both plain-text files) for technical lectures.
2. Please upload your recorded audio or video of a technical lecture. This demo system accepts only .wav, .mp4, and .mp3 files (.wav is the default).
3. Once the file is uploaded, we will share links from which you can download the subtitle and verbatim files. Please note or copy the link locations before closing the browser tab.
4. The average turnaround time for text generation is roughly 0.1x the input audio duration, depending on load; for example, one hour of audio takes about 10 minutes to decode.
5. Our ASR system is trained for the Computer Science, Electrical Engineering, Humanities, and Mechanical domains (it works passably for other streams).
6. The quality of the ASR output depends on the quality of the input audio (especially the noise level). It works best with wired, close-talking headset microphones and for basic courses.
7. In many evaluations, our system performs better than or is competitive with commercial systems such as Temi, Otter, Google, Microsoft, Amazon, HappyScribe, and Vocalmatic.
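
As a rough illustration of points 2 and 4 above, the following Python sketch checks a file name against the accepted extensions and estimates the decode time before you upload. It is not part of the service: the names is_accepted and estimated_turnaround_minutes are hypothetical, and matching extensions case-insensitively is an assumption. A strict 0.1x factor gives 6 minutes for a one-hour recording, so the estimate is best read as a lower bound that grows with load.

```python
from pathlib import Path

# Extensions the page lists as accepted by the demo system.
ACCEPTED_EXTENSIONS = {".wav", ".mp4", ".mp3"}

# Turnaround factor quoted on the page; real times depend on server load.
TURNAROUND_FACTOR = 0.1


def is_accepted(filename: str) -> bool:
    """Return True if the file extension is one the demo accepts.

    Case-insensitive matching is an assumption, not documented behaviour.
    """
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS


def estimated_turnaround_minutes(audio_minutes: float,
                                 factor: float = TURNAROUND_FACTOR) -> float:
    """Rough decode-time estimate: factor times the input duration."""
    if audio_minutes < 0:
        raise ValueError("audio duration must be non-negative")
    return audio_minutes * factor


if __name__ == "__main__":
    for name in ("lecture01.wav", "lecture02.MP4", "slides.pdf"):
        status = "accepted" if is_accepted(name) else "rejected"
        print(f"{name}: {status}")
    print(f"Estimated minutes for a 60-minute lecture: "
          f"{estimated_turnaround_minutes(60):.1f}")
```

Passing a larger factor (for example, factor=0.17) gives an estimate closer to the 10-minute figure quoted for a one-hour recording.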

Upload the Audio or Video File

Instruction Video

Contact Us

Contact us at speechiitm@ee.iitm.ac.in

About us

The Speech Lab at IIT Madras is headed by Prof. S. Umesh and is part of the Department of Electrical Engineering. Our focus is on building state-of-the-art speech recognition systems, especially for Indian languages. Our research interests are low-resource modelling, multilingual speech recognition, and speaker normalisation.

Contact Info

  • Speech lab, Department of Electrical Engineering, IIT Madras, Chennai, Tamil Nadu, India

  • 044 2257 5477

  • speechiitm@ee.iitm.ac.in / speechiitm@gmail.com