nicolalandro/autovc

 
 

License: MIT Open In Colab

AUTOVC

This repo is a fork of AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss. It aims to provide an easy-to-use demo of these models.

Run Notebooks

  • Open on Colab: Open In Colab
  • Use in a local environment (with a CUDA GPU):
    • Install the dependencies
    • Clone the project
    • Download the models into the cloned folder
    • Run jupyter
    • Open Demo.ipynb: it takes two audio clips and generates a new one with the voice of the second and the words of the first
  • Train on Colab: Open In Colab
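
The "download the models into the cloned folder" step can be scripted. The following is a minimal sketch, not part of the repository: the helper names `checkpoint_url` and `download_checkpoints` are hypothetical, and the two release URLs are the ones used by the training commands in this README. Whether the demo notebook needs further files is not stated here, so treat the list as an assumption.

```python
import urllib.request
from pathlib import Path

# Release location and file names taken from the wget commands in this README.
RELEASE_BASE = "https://github.com/nicolalandro/autovc/releases/download/0.1"
CHECKPOINTS = ["3000000-BL.ckpt", "autovc.ckpt"]


def checkpoint_url(name):
    """Build the release URL for one checkpoint file (hypothetical helper)."""
    return f"{RELEASE_BASE}/{name}"


def download_checkpoints(target_dir="."):
    """Download each checkpoint into target_dir, skipping files already present.

    Returns the list of local paths, in the same order as CHECKPOINTS.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in CHECKPOINTS:
        destination = target / name
        if not destination.exists():
            # Only reached when the file is missing; requires network access.
            urllib.request.urlretrieve(checkpoint_url(name), str(destination))
        paths.append(destination)
    return paths
```

Calling `download_checkpoints(".")` from inside the cloned folder would place both files next to the notebooks. Files that already exist are left untouched, so the call is safe to repeat.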

Dependencies

  • Python 3
  • jupyter
  • Numpy
  • PyTorch >= v0.4.1
  • wavenet_vocoder: install with pip install wavenet_vocoder. For more information, please refer to https://github.com/r9y9/wavenet_vocoder
  • librosa
  • soundfile
  • scipy
  • tqdm
  • matplotlib (possibly needed)
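
Before opening the notebooks, it can help to check which of the libraries above are importable. The following is a minimal sketch under stated assumptions: the import names are guesses from the package list (PyTorch imports as `torch`), the helper `missing_modules` is hypothetical, and the sketch does not verify the PyTorch version or the jupyter installation.

```python
from importlib.util import find_spec

# Assumed import names for the packages listed above.
REQUIRED = ["numpy", "torch", "wavenet_vocoder", "librosa", "soundfile", "scipy", "tqdm"]
# The list above marks matplotlib as uncertain, so it is treated as optional here.
OPTIONAL = ["matplotlib"]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]


def report():
    """Print one line per missing package, required ones first."""
    for name in missing_modules(REQUIRED):
        print(f"missing required package: {name}")
    for name in missing_modules(OPTIONAL):
        print(f"missing optional package: {name}")
```

Running `report()` prints nothing when every listed module is found. `find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as PyTorch.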

Pre-trained models

Train on VoxCeleb

# Make the spectrogram files (run once).
# Note: "&!" is zsh syntax (run in background and disown); in bash use "&".
nohup python make_spect_for_vox_cel.py \
    --root-dir="/home/super/datasets-nas/Vox2celeb/vox2celeb-1/wav" \
    --target-dir="/home/super/datasets-nas/Vox2celeb/vox2celeb-1/spmel" \
    > make_spec.log 2>&1 &!

# Make the train.pkl file (run once).
wget https://github.com/nicolalandro/autovc/releases/download/0.1/3000000-BL.ckpt
CUDA_VISIBLE_DEVICES="0" nohup python make_metadata.py --root-dir="/home/super/datasets-nas/Vox2celeb/vox2celeb-1/spmel" \
    > make_metadata.log 2>&1 &!

# Train, starting from the pre-trained autovc.ckpt checkpoint.
wget https://github.com/nicolalandro/autovc/releases/download/0.1/autovc.ckpt
CUDA_VISIBLE_DEVICES="0" nohup python main.py --data_dir="/home/super/datasets-nas/Vox2celeb/vox2celeb-1/spmel" \
    --outfile-path="/home/super/Models/autovc_voxceleb/generator.pth" \
    --num_iters 10000 --batch_size=10 --dim_neck 32 --dim_emb 256 --dim_pre 512 --freq 32 --pretrained "autovc.ckpt" \
    > train.log 2>&1 &!

Train with new vocoder

python3.8 make_spect.py  # creates the spmel folder
python3.8 make_spect_other_vocoder.py  # creates the spmel_other folder
CUDA_VISIBLE_DEVICES="0" python3.8 make_metadata.py --root-dir="./spmel"  # creates spmel/train.pkl by running the speaker encoder on spmel
cp spmel/train.pkl spmel_other  # copies spmel/train.pkl to spmel_other/train.pkl
CUDA_VISIBLE_DEVICES="0" python3.8 main.py --data_dir="spmel_other" \
    --outfile-path="/home/super/Models/autovc_simple/generator.pth" \
    --num_iters 10000 --batch_size=6 --dim_neck 32 --dim_emb 256 --dim_pre 512 --freq 32
CUDA_VISIBLE_DEVICES="0" python3.8 test_audio.py
