Inputs:
- Text prompt
- Reference audio
Limits:
- Reference audio is required
- Supported reference audio formats: MP3, OGG, WAV, M4A, AAC
- Output sample rate: 48 kHz
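
The limits above can be checked before a request is sent. The following is a minimal sketch, not part of any official client: the function name `check_reference_audio` is hypothetical, and mapping each listed format to a single file extension is an assumption (an extension check does not verify the actual encoding of the file).

```python
from pathlib import Path

# Formats listed under Limits. Mapping each format to one file
# extension is an assumption made for this sketch.
SUPPORTED_EXTENSIONS = {".mp3", ".ogg", ".wav", ".m4a", ".aac"}

# Output sample rate stated under Limits, for callers that need to
# configure playback or resampling downstream.
OUTPUT_SAMPLE_RATE_HZ = 48_000


def check_reference_audio(path):
    """Return None if the path looks usable, else a short error message.

    Hypothetical helper: it only inspects the file name, not the audio data.
    """
    if not path:
        # Reference audio is required, so a missing path is an error.
        return "reference audio is required"
    ext = Path(path).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return f"unsupported reference audio format: {ext or '(none)'}"
    return None
```

A caller would run this check first and only submit the text prompt and reference audio when it returns `None`.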
Tips:
- Use it for voice cloning, narration, dialogue, and character speech
- Use clean reference audio with minimal noise, music, or background speech
- Keep the reference voice consistent in tone and distance from the microphone
- Punctuate the text prompt carefully for better rhythm and pauses
- Split long text into shorter chunks for better control over delivery
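
One way to apply the last two tips is to split long text at sentence-ending punctuation and generate each chunk separately. The sketch below is illustrative only: `chunk_text` and the `max_chars` default of 200 are assumptions, not documented limits, and the regular expression handles only `.`, `!`, and `?` as sentence boundaries.

```python
import re


def chunk_text(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars characters.

    Hypothetical helper. A single sentence longer than max_chars is kept
    whole rather than cut mid-sentence.
    """
    # Split after ., !, or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own text prompt with the same reference audio, which keeps the voice consistent while allowing individual chunks to be regenerated.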