Rev AI’s Streaming Speech-to-Text API enables real-time transcription for streaming audio. It can be used with both WebSocket and RTMP streams, and has a time limit of 3 hours per stream.
While this is more than sufficient for most scenarios, there are cases where live streams can run longer than 3 hours – for example, live transcription of commentary for a day-long sporting event. When this happens, the recommended practice is to initialize a new concurrent connection to the API and switch to it.
This sounds simple, but in practice application developers often struggle to implement it. The two most common problems we’ve heard about from developers are:
- Losing audio data during the switchover
- Re-aligning transcript timestamps correctly to the audio stream
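To make these two problems concrete, here is a minimal Python sketch of the bookkeeping a client might do around a switchover. It is not the code from the tutorial and it does no networking. The class and method names are hypothetical, and the `ts`, `end_ts` and `elements` fields are assumed to follow the shape of the API's final hypotheses. The idea is to keep every audio chunk until a final hypothesis covers it, resend whatever is left on the new connection, and shift the new session's timestamps by the stream time at which that session starts.

```python
from collections import deque


class SwitchoverState:
    """Hypothetical helper: tracks unconfirmed audio and the timestamp offset."""

    def __init__(self):
        # Chunks sent but not yet covered by a final hypothesis, stored as
        # (start_seconds, duration_seconds, chunk_bytes) on the stream clock.
        self.pending = deque()
        self.sent_seconds = 0.0    # total audio sent so far
        self.offset = 0.0          # stream time at which the current session began
        self.last_final_end = 0.0  # stream time covered by final hypotheses

    def record_sent(self, chunk, duration):
        """Remember a chunk that was just sent to the current session."""
        self.pending.append((self.sent_seconds, duration, chunk))
        self.sent_seconds += duration

    def realign(self, hypothesis):
        """Shift session-relative timestamps onto the stream clock."""
        shifted = dict(hypothesis)
        shifted["ts"] = hypothesis["ts"] + self.offset
        shifted["end_ts"] = hypothesis["end_ts"] + self.offset
        # Elements without timing (assumed: punctuation) are copied unchanged.
        shifted["elements"] = [
            {**e, "ts": e["ts"] + self.offset, "end_ts": e["end_ts"] + self.offset}
            if "ts" in e else dict(e)
            for e in hypothesis.get("elements", [])
        ]
        self.last_final_end = shifted["end_ts"]
        # Chunks that end at or before the transcribed point can be dropped.
        while self.pending and sum(self.pending[0][:2]) <= self.last_final_end:
            self.pending.popleft()
        return shifted

    def start_new_session(self):
        """Return the chunks to resend first on the new connection."""
        # The new session's clock starts at the first chunk it will receive.
        self.offset = self.pending[0][0] if self.pending else self.sent_seconds
        return [chunk for _, _, chunk in self.pending]
```

In use, the client would call `record_sent` for each chunk, pass every final hypothesis through `realign`, and call `start_new_session` once the new connection is open, sending the returned chunks before any new audio. Resending from a chunk boundary can repeat a small amount of already-transcribed audio, which this sketch does not try to deduplicate.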
If you’ve ever found yourself in this situation, read our tutorial on recovering from connection errors and timeouts in Rev AI streaming sessions. It proposes solutions to both problems and includes sample code that you can adapt for your specific use case. And if you find it helpful, don’t forget to write in and tell us!