XTTS v2, the new version of Coqui's open-source text-to-speech model.
Thanks for sharing! I have a question about training the vocoder with GPT outputs.
How did you generate the GPT outputs used for vocoder training?
The GPT model takes input of the form <condition> <text token> <mel token>, and the final-layer output is used for vocoder training, but how is the condition selected?
In the XTTS v1 technical report, the conditioning mel was shuffled when training the GPT; how is the conditioning mel processed when training the vocoder?
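To make the question concrete, here is a minimal sketch of the input layout I am assuming. All dimensions and the stand-in transformer layer are hypothetical, purely for illustration; the real model is the XTTS GPT with its own sizes and conditioning encoder:

```python
import torch

# Hypothetical dimensions for illustration only; not XTTS's real sizes.
d_model = 64
n_cond, n_text, n_mel = 3, 10, 20

# Assumed input layout from the question: <condition> <text token> <mel token>.
cond = torch.randn(1, n_cond, d_model)  # conditioning latents from a reference mel
text = torch.randn(1, n_text, d_model)  # embedded text tokens
mel = torch.randn(1, n_mel, d_model)    # embedded mel (audio) tokens

# Concatenate along the sequence axis to form the GPT input.
gpt_input = torch.cat([cond, text, mel], dim=1)

# Stand-in transformer layer; the real model is the XTTS GPT stack.
layer = torch.nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
hidden = layer(gpt_input)

# Presumably only the final-layer hidden states over the mel positions
# would feed the vocoder.
mel_hidden = hidden[:, n_cond + n_text:, :]
print(mel_hidden.shape)
```

My question is about the `cond` part above: which reference mel is encoded into those conditioning latents at vocoder-training time, given that it was shuffled during GPT training.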