A Discord bot that uses a locally run Llama 2 model (served through Ollama) and TTS to hold natural voice conversations in real time.
Project Details
Discord AI Voice Chat Bot is a personal project that enables natural, real-time conversations in Discord voice channels by combining a local LLM with speech-to-text and text-to-speech.
Architecture & Flow:
- A user speaks in a voice channel; the audio is captured and transcribed by a speech-to-text module
- The transcribed text is sent to Llama 2 (7B parameter model), running locally through Ollama, which generates the response
- The response is then converted to speech using the Windows Japanese voice pack (for an intentionally quirky vibe)
- The spoken response is played back directly into the Discord voice channel
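
The four steps above can be sketched as a single async function. This is only an illustration of the flow, not the project's actual code: `handleUtterance` and the four stage names (`transcribe`, `generate`, `synthesize`, `play`) are hypothetical, and each stage is passed in so that the speech-to-text module, the LLM call, the TTS engine, and the Discord playback can be swapped or stubbed independently.

```javascript
// Hypothetical orchestration of the capture -> LLM -> TTS -> playback flow.
// All four stage functions are assumptions supplied by the caller.
async function handleUtterance(audio, stages) {
  const { transcribe, generate, synthesize, play } = stages;

  // Step 1: speech-to-text on the captured voice-channel audio.
  const transcript = (await transcribe(audio)).trim();
  if (transcript === '') return null; // nothing intelligible was said

  // Step 2: ask the locally running model for a reply.
  const reply = await generate(transcript);

  // Step 3: turn the reply into audio with the TTS engine.
  const speech = await synthesize(reply);

  // Step 4: play the audio back into the voice channel.
  await play(speech);

  return { transcript, reply };
}
```

Keeping the stages as plain functions means the same flow can be exercised with stubs, without a Discord connection or a running model.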
Key Features:
- Supports both voice interaction and text command prompts
- Flexible interaction modes: users can switch between speaking and typing to the bot
- Lightweight, fully local LLM handling, with Llama 2 served through Ollama
- Customizable voice output (with Japanese voice pack as default for charm)
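
As an illustration of the fully local LLM handling, here is a minimal sketch of a request to Ollama's local HTTP API. It assumes Ollama is listening on its default address (`http://localhost:11434`), that the model was pulled under the `llama2:7b` tag, and that Node.js 18 or later is used so `fetch` is available globally. The helper names `buildGenerateRequest` and `askLocalModel` are hypothetical.

```javascript
// Assumed default address of a locally running Ollama server.
const OLLAMA_URL = 'http://localhost:11434/api/generate';

// Build the fetch options for a single, non-streaming completion.
// The 'llama2:7b' model tag is an assumption about the local setup.
function buildGenerateRequest(prompt, model = 'llama2:7b') {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // stream: false asks for one JSON object instead of a chunked stream.
    body: JSON.stringify({ model, prompt, stream: false }),
  };
}

// Send the prompt and return the generated text.
// fetchImpl is injectable so the call can be stubbed without a server.
async function askLocalModel(prompt, fetchImpl = fetch) {
  const res = await fetchImpl(OLLAMA_URL, buildGenerateRequest(prompt));
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = await res.json();
  return data.response.trim();
}
```

Because the request never leaves `localhost`, no API key or cloud account is involved. A voice bot may prefer the streaming mode to start speaking sooner, at the cost of more complex handling.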
Technical Stack:
- Node.js and Discord.js for bot framework and event handling
- Llama 2 (7B), run through Ollama, for on-device LLM processing
- Speech-to-text for capturing and interpreting voice commands
- Windows TTS (Japanese voice pack) for audio responses
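
One way to drive Windows TTS from Node.js is to call the .NET `System.Speech.Synthesis.SpeechSynthesizer` class through PowerShell and write the result to a WAV file that the bot can then play. The sketch below assumes that approach; it is not necessarily how this project does it. The voice name `Microsoft Haruka Desktop` is an assumed default for the Japanese voice pack and should be checked against the voices actually installed. The helper names are hypothetical.

```javascript
const { execFile } = require('node:child_process');

// Quote a value as a PowerShell single-quoted string ('' escapes a quote).
function psQuote(value) {
  return `'${String(value).replace(/'/g, "''")}'`;
}

// Build a PowerShell script that speaks `text` into a WAV file.
// The default voice name is an assumption about the installed voice pack.
function buildSpeakScript(text, wavPath, voice = 'Microsoft Haruka Desktop') {
  return [
    'Add-Type -AssemblyName System.Speech',
    '$s = New-Object System.Speech.Synthesis.SpeechSynthesizer',
    `$s.SelectVoice(${psQuote(voice)})`,
    `$s.SetOutputToWaveFile(${psQuote(wavPath)})`,
    `$s.Speak(${psQuote(text)})`,
    '$s.Dispose()',
  ].join('; ');
}

// -EncodedCommand expects Base64 of the UTF-16LE script, which avoids
// command-line quoting problems and keeps non-ASCII text intact.
function encodeForPowerShell(script) {
  return Buffer.from(script, 'utf16le').toString('base64');
}

// Run the script (Windows only) and resolve with the WAV path when done.
function synthesizeToWav(text, wavPath, voice) {
  const encoded = encodeForPowerShell(buildSpeakScript(text, wavPath, voice));
  return new Promise((resolve, reject) => {
    execFile(
      'powershell.exe',
      ['-NoProfile', '-EncodedCommand', encoded],
      (err) => (err ? reject(err) : resolve(wavPath)),
    );
  });
}
```

Swapping the `voice` argument is enough to change the output voice, which matches the customizable voice output described earlier.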
Outcome:
- Enabled seamless AI-driven voice conversations in Discord
- Flexible enough to run on local machines without needing cloud APIs
- Fun and interactive — ideal for community servers or tech demos