This is the demo page for the unofficial NANSY++ implementation, an open-source repository published by MWM.

🦜 Zero-shot voice conversion

The following section showcases the zero-shot voice conversion ability of backbone checkpoints: one trained on HifiTTS with the open-source repo ("OS repo") and another trained on internal data ("our best"). The inferencer class was used to synthesize the results. Source and target audio samples are unseen examples from the VCTK corpus; they can be either extracted from the full dataset or found in the static sub-directory.

| Source | Target | NANSY++ (OS repo) | NANSY++ (our best) |
| --- | --- | --- | --- |
| p225.wav | p226.wav | p225-to-p226-400k.wav | high-res-p225-to-p226.wav |
| p225.wav | p227.wav | p225-to-p227-400k.wav | high-res-p225-to-p227.wav |
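
The "OS repo" output file names listed above follow a `<source>-to-<target>-<tag>.wav` pattern. The sketch below reproduces that naming for the two source/target pairs shown here, which may help when matching converted files to their inputs. The helper `converted_name` and its `tag` parameter are hypothetical names introduced for illustration; they are not part of the repository, and the meaning of the `400k` tag is not stated on this page.

```python
from pathlib import Path


def converted_name(source: str, target: str, tag: str = "400k") -> str:
    """Build an output file name such as 'p225-to-p226-400k.wav'.

    Hypothetical helper: only mirrors the naming seen on this page.
    """
    src = Path(source).stem  # 'p225.wav' -> 'p225'
    tgt = Path(target).stem  # 'p226.wav' -> 'p226'
    return f"{src}-to-{tgt}-{tag}.wav"


# Source/target pairs listed on this page.
pairs = [("p225.wav", "p226.wav"), ("p225.wav", "p227.wav")]
outputs = [converted_name(s, t) for s, t in pairs]
print(outputs)
```

Because only the file stem is used, the same helper works whether the inputs are given as bare names or as paths into the static sub-directory.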