Text to Audio Endpoint

Overview

The text to audio endpoint generates audio from a text prompt and a valid audio URL. The generated audio uses the same voice as the audio at the URL you pass in.

Request

curl --request POST 'https://stablediffusionapi.com/api/v5/text_to_audio' \

Make a POST request to the https://stablediffusionapi.com/api/v5/text_to_audio endpoint and pass the required parameters in the request body.

Body Attributes

| Parameter | Description |
| --- | --- |
| key | Your API key, used for request authorization. |
| prompt | Text prompt describing the audio you want to generate. |
| init_audio | A valid URL of the audio whose voice you want to clone. |
| language | The language of the voice. As of now, only English is supported. |
| decoder_iterations | Number of decoder iterations. |
| webhook | A URL that receives a POST API call once the audio generation is complete. |
| track_id | An ID that is returned in the webhook API call. Use it to identify the webhook request. |
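
The body attributes above can be assembled and sanity-checked before sending. The following is a minimal sketch, not part of the API: `buildTextToAudioBody` is a hypothetical helper, the defaults (`"english"`, `30`, `null`) are simply copied from the example on this page, and treating `key`, `prompt`, and `init_audio` as required is an assumption, since this page does not say which fields are optional.

```javascript
// Hypothetical helper (not provided by the API): builds the JSON body
// using the parameter names from the Body Attributes list.
function buildTextToAudioBody({
  key,
  prompt,
  initAudio,
  language = "english",      // assumption: default copied from the example
  decoderIterations = 30,    // assumption: default copied from the example
  webhook = null,
  trackId = null,
}) {
  // Assumption: key and prompt are treated as required here.
  if (!key) throw new Error("key is required");
  if (!prompt) throw new Error("prompt is required");

  // new URL() throws a TypeError when the string is not a valid absolute URL.
  new URL(initAudio);
  if (webhook !== null) new URL(webhook);

  return {
    key,
    prompt,
    init_audio: initAudio,
    language,
    decoder_iterations: decoderIterations,
    webhook,
    track_id: trackId,
  };
}
```

Validating `init_audio` locally only checks that it is a well-formed URL; whether the file is reachable and in a format the service accepts is still decided by the API.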

Example

Body

{
  "key": "",
  "prompt": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
  "init_audio": "https://pub-f3505056e06f40d6990886c8e14102b2.r2.dev/audio/tom_hanks_1.wav",
  "language": "english",
  "decoder_iterations": 30,
  "webhook": null,
  "track_id": null
}

Request

// Send the body as JSON.
const myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");

// Request body; set "key" to your API key.
const raw = JSON.stringify({
  "key": "",
  "prompt": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
  "init_audio": "https://pub-f3505056e06f40d6990886c8e14102b2.r2.dev/audio/tom_hanks_1.wav",
  "language": "english",
  "decoder_iterations": 30,
  "webhook": null,
  "track_id": null
});

const requestOptions = {
  method: 'POST',
  headers: myHeaders,
  body: raw,
  redirect: 'follow'
};

// POST to the endpoint and log the raw response text.
fetch("https://stablediffusionapi.com/api/v5/text_to_audio", requestOptions)
  .then(response => response.text())
  .then(result => console.log(result))
  .catch(error => console.log('error', error));
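
The same request can also be written with async/await and an explicit HTTP status check. This is a sketch under assumptions: `textToAudio` and its `fetchImpl` parameter are hypothetical names introduced here, and the global `fetch` default assumes a browser or a Node.js version that ships `fetch`. Passing `fetchImpl` lets the function be exercised without network access.

```javascript
// Hypothetical wrapper around the request shown above.
// fetchImpl is injectable so a stub can replace the real network call.
async function textToAudio(body, fetchImpl = fetch) {
  const response = await fetchImpl(
    "https://stablediffusionapi.com/api/v5/text_to_audio",
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    }
  );

  // fetch only rejects on network failure, so check the HTTP status explicitly.
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }

  // Parse the JSON response instead of logging raw text.
  return response.json();
}
```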

Response

{
  "status": "success",
  "generationTime": 6.007939100265503,
  "id": 621,
  "output": [
    "https://pub-3626123a908346a7a8be8d9295f44e26.r2.dev/generations/3fdf4b44-7617-4175-b64f-0fd5eda947a5.wav"
  ],
  "meta": {
    "decoder_iterations": 30,
    "file_prefix": "3fdf4b44-7617-4175-b64f-0fd5eda947a5",
    "init_audio": "https://pub-f3505056e06f40d6990886c8e14102b2.r2.dev/audio/tom_hanks_1.wav",
    "prompt": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
    "language": "arabic",
    "outdir": "out",
    "seed": 1
  }
}
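
Given a response shaped like the one above, the generated audio URLs can be read from the `output` array. The helper below is a hypothetical sketch: `extractAudioUrls` is not part of the API, and because this page documents only the `"success"` status, any other status value is treated as an error by assumption.

```javascript
// Hypothetical helper: returns the list of generated audio URLs
// from a parsed response object.
function extractAudioUrls(result) {
  // Assumption: only "success" is documented here, so anything else is rejected.
  if (result.status !== "success") {
    throw new Error(`Unexpected status: ${result.status}`);
  }
  if (!Array.isArray(result.output) || result.output.length === 0) {
    throw new Error("Response contains no output URLs");
  }
  return result.output;
}
```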