# text-to-speech
3 tools tagged
MiniMax MCP
MCP server for MiniMax speech, video, and image APIs
The official MiniMax Model Context Protocol (MCP) server gives AI applications and code editors access to MiniMax's text-to-speech, voice cloning, image generation, video generation, and music creation APIs. It is designed for integration with Claude Desktop, Cursor, and Windsurf, supports both stdio and SSE transports, provides separate regional API endpoints for the global and China regions, and offers flexible handling of generated resources.
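As a rough illustration of the stdio setup described above, the sketch below builds the kind of `mcpServers` entry that clients such as Claude Desktop read from their JSON config. The launch command (`uvx minimax-mcp`), the environment variable names, and the `url`/`local` resource modes are assumptions that should be checked against the project's README. The API host and output directory shown are placeholders, not real endpoints.

```python
import json


def build_mcp_config(api_key, api_host, output_dir, resource_mode="url"):
    """Return an `mcpServers` config dict for a stdio-launched MiniMax MCP server.

    All env var names below are assumptions; verify them in the README.
    """
    # Assumed modes: "url" returns links, "local" saves files to output_dir.
    if resource_mode not in ("url", "local"):
        raise ValueError(f"unsupported resource mode: {resource_mode!r}")

    return {
        "mcpServers": {
            "MiniMax": {
                # Assumed launch command: the client spawns this process
                # and talks to it over stdin/stdout (stdio transport).
                "command": "uvx",
                "args": ["minimax-mcp"],
                "env": {
                    "MINIMAX_API_KEY": api_key,
                    # Regional endpoint (global or China), supplied by the caller.
                    "MINIMAX_API_HOST": api_host,
                    "MINIMAX_MCP_BASE_PATH": output_dir,
                    "MINIMAX_API_RESOURCE_MODE": resource_mode,
                },
            }
        }
    }


if __name__ == "__main__":
    config = build_mcp_config(
        api_key="YOUR_API_KEY",                # placeholder
        api_host="https://api.example.invalid",  # placeholder, not a real host
        output_dir="/tmp/minimax-output",      # placeholder
    )
    # Paste the printed JSON into the client's MCP configuration file.
    print(json.dumps(config, indent=2))
```

Keeping the region-specific host as a parameter means the same helper covers both the global and China endpoints; only the value passed in changes.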
Coqui TTS
Open-source deep learning text-to-speech toolkit
Coqui TTS is an open-source deep learning toolkit for text-to-speech synthesis, originally built by former Mozilla TTS engineers. It supports multi-speaker and multilingual synthesis, voice cloning from as little as six seconds of audio, and ships pre-trained models for 20+ languages. After Coqui shut down in 2023, the Idiap Research Institute forked the project and now actively maintains it. With 45K+ GitHub stars, it remains one of the most popular open-source TTS frameworks in Python.
Chatterbox
State-of-the-art open-source text-to-speech with emotion control
Chatterbox is an open-source text-to-speech model by Resemble AI that offers fine-grained control over emotion and speaking style. The model supports zero-shot voice cloning from short audio samples, produces natural-sounding speech across multiple speaking styles, and runs locally without cloud dependencies. With over 24,000 GitHub stars, it has become a leading open-source alternative to commercial TTS services for developers building voice-enabled AI applications.