Offline ASR System

IITM Speech Recognition System

1. This service generates a subtitle file and verbatim notes (both plain-text files) for technical lectures.
2. Upload your recorded audio or video of a technical lecture. This demo system accepts only .wav, .mp4, and .mp3 files (.wav is the default).
3. Once the file is uploaded, we will share links from which you can download the subtitle and verbatim files. Note or copy the link locations before closing the browser or tab.
4. The average turnaround time for text generation depends on load and is on the order of 0.1x the input audio duration; for example, one hour of audio takes about 10 minutes to decode.
5. Our ASR system is trained on the Computer Science, Electrical Engineering, Humanities, and Mechanical domains, but it works acceptably for other streams.
6. The quality of the ASR output depends on the quality of the input audio, especially the noise level. It works best with wired close-talking headset microphones and for basic courses.
7. In many evaluations, our system performs better than or is competitive with commercial systems such as Temi, Otter, Google, Microsoft, Amazon, HappyScribe, and Vocalmatic.
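
The file-type restriction in step 2 can be checked locally before uploading. The following is a minimal sketch, not part of the service itself: the function name `is_accepted_upload` is hypothetical, and treating extensions as case-insensitive is an assumption, since the page lists only the lowercase forms.

```python
from pathlib import Path

# Extensions the page says the demo system accepts.
ACCEPTED_EXTENSIONS = {".wav", ".mp4", ".mp3"}


def is_accepted_upload(filename: str) -> bool:
    """Return True if the file name ends in one of the accepted extensions.

    Hypothetical helper: the comparison is case-insensitive, which is an
    assumption; the service may be stricter about upper-case extensions.
    """
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["lecture01.wav", "lecture02.mp4", "slides.pdf"]:
        status = "ok to upload" if is_accepted_upload(name) else "not accepted"
        print(f"{name}: {status}")
```

This checks only the file name, not the actual contents of the file, so a mislabeled file would still pass.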

Upload the Audio or Video File

Instruction Video