Fish Audio S2 Pro Long-Form TTS Lab
Build 1-60 minute multilingual TTS tests with inline prosody control, optional voice cloning, and Space-ready long-form chunking.
- Model: fishaudio/s2-pro
- Source runtime: Fish Speech
- Strengths surfaced here: 80+ languages, free-form [tag] controls, low-latency oriented presets, long-form narration planning
Generation preset
Common control tag
Preset note: Recommended default for multi-minute multilingual narration and medium-length exports.
Long-form plan
Add text to preview sectioning, timing, and guidance.
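As a rough illustration of what such a preview might compute, the sketch below estimates spoken duration and section count from a script. It is not the Space's actual planner: the function name `estimate_plan`, the 150 words-per-minute speaking rate, and the 30-second section target are all assumptions chosen for illustration. Whitespace word counts also do not transfer to languages written without spaces.

```python
import math
import re


def estimate_plan(script: str,
                  words_per_minute: float = 150.0,   # assumed speaking rate
                  section_seconds: float = 30.0) -> dict:  # assumed section target
    """Hypothetical preview: estimate duration and section count for a script."""
    # Drop inline [tag] controls so they are not counted as spoken words.
    spoken = re.sub(r"\[[^\]]*\]", " ", script)
    words = len(spoken.split())
    total_seconds = words / words_per_minute * 60.0
    sections = math.ceil(total_seconds / section_seconds) if words else 0
    return {
        "words": words,
        "estimated_seconds": round(total_seconds, 1),
        "sections": sections,
    }
```

Calling `estimate_plan(text)` on a pasted script gives a quick sanity check on whether a target length in the 1-60 minute range is realistic before any audio is generated.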
Control tips
- Put style tags directly into the script: [whisper], [excited], [professional broadcast tone]
- Use Smart sections for multi-minute and hour-scale narration so long passages are synthesized in stable chunks
- Use a clean 5-10 second reference clip plus an exact transcript for the best cloning behavior
- One-hour exports will create many sections and can take a long time on hosted GPUs
- First run will be slower because the runtime downloads checkpoints and warms the model
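The idea of synthesizing long passages in stable chunks can be sketched as a greedy, sentence-aware splitter. This is a minimal sketch, not how Smart sections is implemented: `split_into_sections` and the `max_chars` budget are hypothetical, and the splitter assumes sentences end in `.`, `!`, or `?` followed by whitespace, and that inline [tag] controls contain none of that punctuation.

```python
import re


def split_into_sections(script: str, max_chars: int = 600) -> list[str]:
    """Hypothetical chunker: pack whole sentences into sections near max_chars."""
    # Split after sentence-ending punctuation followed by whitespace,
    # so inline tags such as [whisper] stay attached to their sentence.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", script) if s.strip()]
    sections: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the section.
            sections.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        sections.append(current)
    # Note: a single sentence longer than max_chars is kept whole, not cut.
    return sections
```

Each returned section could then be sent to the synthesizer separately and the audio concatenated afterwards. Splitting only at sentence boundaries avoids cutting a style tag or a phrase in half.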
Runtime compatibility
Checking GPU fit...
Result summary
Generate audio to see render details.
Multilingual examples