Train custom voice instead of the default ljs speaker.

I am trying to train a custom speaker to use with the provided inference script in place of the default ljs speaker. The inference output is muffled, similar to the problem described in this [issue](https://github.com/NVIDIA/radtts/issues/7), and I am not sure how to fix it.

My current training pipeline has two steps. First, I train the decoder, and its checkpoints are saved to `/decoder_checkpoints/ag_decoder`. Then I run warm-start training for the DAP model, with checkpoints saved to `/dap_checkpoints/rad_ag`. At inference time, I pass the `rad_ag` checkpoint as the `rad_tts` checkpoint and use the provided vocoder checkpoint `hifigan_libritts100360_generator0p5.pt`. This means the `ag_decoder` checkpoint is never passed to the inference script.

Am I making a mistake in this approach? Should the decoder and the DAP model be trained to the same checkpoint path? See this [colab notebook](https://colab.research.google.com/drive/1RrCplzElkFA_7xNK9fCTWWcvdB9Rxh-p?usp=sharing) for more details. I would appreciate guidance through the process, or any relevant documentation.
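For reference, the two training stages described above can be sketched as the commands they correspond to. This is a rough sketch, not the exact commands from my notebook: it assumes `train.py` takes a config via `-c` and parameter overrides via `-p` (`train_config.output_directory`, `train_config.warmstart_checkpoint_path`) as I understand the repository README, and that the warm start points at a decoder checkpoint. The config file names and the `model_latest` checkpoint name are hypothetical placeholders.

```python
import shlex

# Output directories from the pipeline described above.
DECODER_DIR = "/decoder_checkpoints/ag_decoder"
DAP_DIR = "/dap_checkpoints/rad_ag"


def train_command(config, output_dir, warmstart=None):
    """Build (but do not run) a train.py command line.

    Assumption: train.py accepts -c for the config file and -p for
    key=value overrides, as understood from the repository README.
    """
    cmd = [
        "python", "train.py",
        "-c", config,
        "-p", f"train_config.output_directory={output_dir}",
    ]
    if warmstart is not None:
        # Warm start: initialise this stage from an earlier checkpoint.
        cmd.append(f"train_config.warmstart_checkpoint_path={warmstart}")
    return cmd


# Hypothetical checkpoint file name; use whatever the decoder stage wrote.
decoder_ckpt = f"{DECODER_DIR}/model_latest"

stages = [
    # Stage 1: decoder only (hypothetical config name).
    train_command("config_ag_decoder.json", DECODER_DIR),
    # Stage 2: DAP model, warm-started from the stage 1 checkpoint.
    train_command("config_ag_dap.json", DAP_DIR, warmstart=decoder_ckpt),
]

for cmd in stages:
    print(shlex.join(cmd))
```

Written this way, the decoder checkpoint enters the pipeline only as the warm-start input of the second stage, which would explain why it does not appear again at inference time.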

Train custom voice instead of the default ljs speaker. #31

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Train custom voice instead of the default ljs speaker. #31

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions