A Roundup of Voice Cloning Tools | 川流不息 Dev

2026-05-05

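Voice cloning has long been one of the biggest application areas for AI. The Reddit post "Stop searching for free voice cloning tools — here are the ones that actually work (2026)" recommends a number of open-source projects that can be self-hosted locally, along with free online services (some of which charge for certain features).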

This post supplements and annotates that Reddit post, and rounds up the open-source or free services with voice cloning support that I personally recommend.

Open-source projects

Qwen3-TTS

https://github.com/QwenLM/Qwen3-TTS

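Alibaba's open-source TTS model, supporting professional-grade voice cloning, voice design, and emotion control. It is currently the undisputed king of open-source voice cloning.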

Voicebox

https://voicebox.sh/

https://github.com/jamiepine/voicebox

A local-first open-source voice cloning solution with macOS and Windows clients, built on Qwen3-TTS.

VibeVoice

https://github.com/microsoft/VibeVoice

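Microsoft's open-source speech generation framework, well suited to generating long-form conversations (up to 90 minutes), multi-speaker podcasts, and highly realistic speech content.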

Out of concern about deepfake abuse, Microsoft restricts VibeVoice's voice prompts to an embedded format (see the reference notes).

For voice cloning, you can use the community fine-tuned version instead.

VibeVoice community edition: https://github.com/vibevoice-community/VibeVoice

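The official VibeVoice repository once removed part of its code over Responsible AI risks. The community fork:

Restores the model and code
Adds fine-tuning and training capabilities
Can be trained on new languages / new voices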

Fine-tuning combined with a speaker embedding is enough to achieve voice cloning.

For ease of use, the mainstream VibeVoice-based voice cloning setup is the VibeVoice community edition plus VibeVoice-ComfyUI.

VibeVoice-ComfyUI: https://github.com/Enemyx-net/VibeVoice-ComfyUI

Other good open-source projects that support voice cloning

index-tts: https://github.com/index-tts/index-tts (open-sourced by Bilibili)

GPT-SoVITS: https://github.com/RVC-Boss/GPT-SoVITS

F5-TTS: https://github.com/SWivid/F5-TTS

Fish Speech: https://github.com/fishaudio/fish-speech

CosyVoice: https://github.com/FunAudioLLM/CosyVoice

KokoClone: https://huggingface.co/PatnaikAshish/kokoclone

VoxCPM: https://github.com/OpenBMB/VoxCPM

MOSS-TTS: https://github.com/OpenMOSS/MOSS-TTS

ChatTTS: https://github.com/2noise/ChatTTS

Higgs Audio: https://github.com/boson-ai/higgs-audio

Chatterbox TTS: https://github.com/resemble-ai/chatterbox

Pocket-TTS: https://github.com/kyutai-labs/pocket-tts

Online voice cloning services

Twoshot: https://twoshot.app/coproducer (based on Qwen3-TTS)

NiceVoice: https://nicevoice.org/

TTSMaker: https://ttsmaker.com/

KikiVoice: https://kikivoice.ai/

Fish Audio: https://fish.audio (offers a free quota)

MiniMax Voice Clone: https://www.minimax.io/audio/voices-cloning (offers a free quota)

ElevenLabs: https://elevenlabs.io (limited free quota; the king of voice cloning and lip sync)
