Text to Voice API — Clone & Speak
Clone a voice from reference audio and synthesize speech in that voice. Multiple languages are supported.
Text to Voice (Cloning)
Clone a target voice from a reference clip, then synthesize new speech in that voice.
curl -X POST 'https://stablediffusionapi.com/api/v6/text_to_voice' \
-H 'Content-Type: application/json' \
-d '{
"key": "YOUR_API_KEY",
"text": "Speak this in the cloned voice.",
"reference_audio": "https://example.com/voice-sample.wav",
"language": "en"
}'
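The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_request` and `synthesize` are hypothetical, the `Content-Type: application/json` header is an assumption based on the JSON body, and the response is assumed to be JSON, since its schema is not described here.

```python
import json
import urllib.request

API_URL = "https://stablediffusionapi.com/api/v6/text_to_voice"


def build_request(api_key, text, reference_audio, language="en"):
    """Build the POST request with the same four fields as the curl example."""
    payload = {
        "key": api_key,
        "text": text,
        "reference_audio": reference_audio,  # URL of the voice sample
        "language": language,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


def synthesize(api_key, text, reference_audio, language="en", timeout=60):
    """Send the request and return the parsed body (assumed to be JSON)."""
    req = build_request(api_key, text, reference_audio, language)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `synthesize("YOUR_API_KEY", "Speak this in the cloned voice.", "https://example.com/voice-sample.wav")` would send the request shown above. Building the request separately from sending it makes the payload easy to inspect before any network call.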
For best results, use a clean 5-10 second reference clip with a single speaker and no background music.
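The length guidance can be checked on a local file before it is uploaded. The sketch below assumes the reference clip is an uncompressed PCM WAV file on disk, which is the format Python's `wave` module reads. The function names are hypothetical, and the check covers duration only, not noise, speaker count, or music.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_recommended_length(path, min_s=5.0, max_s=10.0):
    """True if the clip falls in the suggested 5-10 second range."""
    return min_s <= wav_duration_seconds(path) <= max_s
```

A clip that fails this check can still be submitted. The range is a recommendation for quality, not a limit stated by the API.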