Echo's web UI.
Echo providing transcription in chat.

Accessibility is crucial to ensuring inclusivity.

Discord is a popular platform for online events and friend groups, but its lack of captions in voice calls limits accessibility.

To address this problem, I built Echo, a Discord bot that provides real-time caption generation for voice calls.

Echo uses discord.js, a JavaScript wrapper for Discord’s API, to get audio data from calls. It then passes that audio to Vosk, a lightweight, offline transcription model, to provide captions both in-chat and in a web UI built with SvelteKit.
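One step a pipeline like this usually needs is a format conversion between the two libraries. Discord voice audio is typically decoded to 48 kHz stereo 16-bit PCM, while many Vosk models expect 16 kHz mono input. The sketch below shows one simple way to bridge the two, assuming those formats. The function name `downmixTo16kMono` is hypothetical and is not taken from Echo's source, and the averaging used here is a crude stand-in for a proper resampling filter.

```typescript
// Hypothetical helper for illustration; not taken from Echo's source.
// Assumes input is interleaved stereo (L, R, L, R, ...) 16-bit PCM at 48 kHz.
// Produces mono 16-bit PCM at 16 kHz.
function downmixTo16kMono(stereo48k: Int16Array): Int16Array {
  const frames = Math.floor(stereo48k.length / 2); // one frame = one L + one R sample
  const outLength = Math.floor(frames / 3);        // 48000 / 16000 = 3 frames per output sample
  const out = new Int16Array(outLength);

  for (let i = 0; i < outLength; i++) {
    let sum = 0;
    // Average both channels across 3 consecutive frames (6 samples total).
    // This box average acts as a rough low-pass filter before decimation.
    for (let j = 0; j < 3; j++) {
      const frame = i * 3 + j;
      sum += stereo48k[2 * frame] + stereo48k[2 * frame + 1];
    }
    out[i] = Math.round(sum / 6);
  }
  return out;
}
```

The resulting samples could then be handed to a speech recognizer in whatever chunk size it accepts. Any trailing frames that do not fill a full group of three are dropped, so a real implementation would likely carry them over to the next chunk.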

Echo is designed to be lightweight and self-hostable while focusing on privacy. It even works on a Raspberry Pi, lowering the barrier to entry.

Note: Echo is a proof-of-concept and is not intended for production use. It is not affiliated with Discord.