Audio Model - ELKAPI Docs

Text to Speech
Speech to Text
Speech to Speech
Related Links

Text to Speech

Endpoint: /audio/speech

Main request parameters:

Parameter	Description
`model`	Model used for speech synthesis. See the supported model list.
`input`	Text content to be converted into audio.
`voice`	Reference voice. Supports system preset voices, user preset voices, and user dynamic voices.

curl https://api.elkapi.com/v1/audio/speech \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini-tts",
    "input": "The quick brown fox jumped over the lazy dog.",
    "voice": "alloy"
  }' \
  --output speech.mp3
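
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names `build_speech_request` and `synthesize_to_file` are hypothetical, and the code assumes (as the `--output speech.mp3` flag in the curl example suggests) that the response body is the raw audio.

```python
import json
import urllib.request

# Base URL taken from the curl example above.
BASE_URL = "https://api.elkapi.com/v1"


def build_speech_request(api_key, text, voice="alloy", model="gpt-4o-mini-tts"):
    """Build (but do not send) a POST request for /audio/speech.

    Hypothetical helper; the payload mirrors the three documented parameters.
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_key, text, out_path="speech.mp3", **kwargs):
    """Send the request and write the returned bytes to out_path.

    Assumes the response body is the audio itself.
    """
    request = build_speech_request(api_key, text, **kwargs)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as fh:
        fh.write(audio)
    return out_path
```

Calling `synthesize_to_file(api_key, "The quick brown fox jumped over the lazy dog.")` would then correspond to the curl command above. Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.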

Speech to Text

Endpoint: /audio/transcriptions

Content-Type: multipart/form-data

Main request parameters:

Parameter	Description
`model`	Model used for speech-to-text. See the supported model list.
`file`	Audio file to be converted to text.

curl https://api.elkapi.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3" \
  -F model="gpt-4o-transcribe"
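
For readers calling the endpoint without curl, the multipart/form-data body can be assembled by hand. The sketch below uses only the Python standard library; `encode_multipart` and `build_transcription_request` are hypothetical helper names, and the generic `application/octet-stream` part type is an assumption, since this page does not state which part content types the service expects. The response format is also not described here, so the sketch stops at building the request.

```python
import os
import uuid
import urllib.request

# Base URL taken from the curl example above.
BASE_URL = "https://api.elkapi.com/v1"


def encode_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        parts.append(f"{value}\r\n".encode())
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    # Assumption: a generic binary content type is acceptable for the file part.
    parts.append(b"Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcription_request(api_key, audio_path, model="gpt-4o-transcribe"):
    """Build (but do not send) a POST request for /audio/transcriptions."""
    with open(audio_path, "rb") as fh:
        audio = fh.read()
    body, content_type = encode_multipart(
        {"model": model}, "file", os.path.basename(audio_path), audio
    )
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/transcriptions",
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
```

The returned request can be passed to `urllib.request.urlopen` to send it. Note that the `Content-Type` header must carry the boundary string used in the body, which is why the helper returns both together.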

Speech to Speech

This scenario is currently supported only by ElevenLabs models. Please refer to the corresponding documentation.

Set OPENAI_BASE_URL to https://api.elkapi.com/v1.
Set OPENAI_API_KEY to your API key.
Most models have been adapted to the OpenAI-compatible interface, but some have not. Refer to the documentation for the specific model.
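
As an illustration of how these two variables resolve into request URLs, the following standard-library sketch reads them from the environment. The helper names `load_client_config` and `endpoint_url` are hypothetical, and falling back to the ELKAPI base URL when OPENAI_BASE_URL is unset is a choice made for this example only. Clients that read these variables themselves (the official OpenAI Python client does) need no such code.

```python
import os

# Fallback used only by this sketch when OPENAI_BASE_URL is unset.
DEFAULT_BASE_URL = "https://api.elkapi.com/v1"


def load_client_config(env=None):
    """Read the base URL and API key from environment-style variables."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    # Strip any trailing slash so paths can be joined predictably.
    base_url = env.get("OPENAI_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return {"base_url": base_url, "api_key": api_key}


def endpoint_url(config, path):
    """Join the configured base URL with an endpoint path such as /audio/speech."""
    return f"{config['base_url']}/{path.lstrip('/')}"
```

With both variables exported in the shell, `endpoint_url(load_client_config(), "/audio/speech")` yields the URL used in the Text to Speech example.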

Related Links

OpenAI Official Docs: OpenAI Audio API
OpenAI Official Docs: OpenAI TTS Guide