Accessibility is crucial to ensuring inclusivity.
Discord is a popular platform for online events and friend groups, but its lack of captions in voice calls limits accessibility.
To address this problem, I built Echo, a Discord bot that provides real-time caption generation for voice calls.
Echo uses discord.js, a JavaScript wrapper for Discord’s API, to get audio data from calls, and Vosk, a lightweight, offline transcription model, to generate captions. The captions are shown both in-chat and in a web UI built with SvelteKit.
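One step a pipeline like this usually needs is audio format conversion between the call and the recognizer. The sketch below is illustrative only and is not taken from Echo's source: it assumes the call audio has already been decoded to 48 kHz, 16-bit, interleaved stereo PCM (Discord's voice format once the Opus packets are decoded) and that the recognizer expects 16 kHz mono, which is typical for Vosk models. The function name `toMono16k` is hypothetical.

```typescript
// Hypothetical helper, not part of Echo, discord.js, or Vosk.
// Assumption: input is 48 kHz, 16-bit, interleaved stereo PCM (L, R, L, R, ...).
// Assumption: the recognizer wants 16 kHz mono 16-bit PCM.
function toMono16k(stereo48k: Int16Array): Int16Array {
  const frames = Math.floor(stereo48k.length / 2); // one frame = one L + one R sample
  const outLength = Math.floor(frames / 3);        // 48 kHz / 16 kHz = 3 frames per output sample
  const out = new Int16Array(outLength);

  for (let i = 0; i < outLength; i++) {
    let sum = 0;
    // Average both channels across 3 consecutive frames.
    // This doubles as a crude low-pass filter before dropping samples.
    for (let j = 0; j < 3; j++) {
      const frame = i * 3 + j;
      sum += stereo48k[2 * frame] + stereo48k[2 * frame + 1];
    }
    out[i] = Math.round(sum / 6);
  }
  return out;
}
```

Averaging groups of three frames keeps the code short and cheap enough for a Raspberry Pi, at the cost of some audio quality; a proper resampler would filter more carefully. Any trailing frames that do not fill a complete group of three are dropped.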
Echo is designed to be lightweight and self-hostable while focusing on privacy. It even works on a Raspberry Pi, lowering the barrier to entry.
Note: Echo is a proof-of-concept and is not intended for production use. It is not affiliated with Discord.