s2vc's People
Forkers
chenchy trendingtechnology olegjakushkin hamaguchikazuki zge turian kunzhou9646 bprihasto ahmeftah mmmmichaelzhang ga642381 shaun95 andy-lau-boy come880412 amorjnyh chienlinhuang1116
s2vc's Issues
Could you provide ppg-extracting code?
Dear author,
In your paper, you mention that you extracted PPG and SSL features with the s3prl toolkit. However, I cannot find how to extract PPGs in s3prl. Could you provide the code or a guideline for extracting PPGs? Thanks a lot!
What are vocoder-ckpt-*.pt?
You release the following vocoder checkpoints:
vocoder-ckpt-apc.pt
vocoder-ckpt-cpc.pt
vocoder-ckpt-wav2vec2.pt
What are they?
Are they vocoders fine-tuned on the output of a particular model? I didn't see that described in the paper. Why is this needed, if the S2VC output is a mel? If it's because different models produce different mels, do you use vocoder-ckpt-cpc.pt when the target model is cpc? And if so, how did you do the fine-tuning?
Cannot find f2114342ff9e813e18a580fa41418aee9925414e in https://github.com/s3prl/s3prl
Running convert_batch.py throws ValueError: Cannot find f2114342ff9e813e18a580fa41418aee9925414e in https://github.com/s3prl/s3prl
The error originates from line 18 of data/feature_extract.py (at commit 8a6dceb):
  File "convert_batch.py", line 61, in main
    src_feat_model = FeatureExtractor(src_feat_name, wav2vec_path, device)
  File "/deepmind/experiments/howard1337/s2vc/data/feature_extract.py", line 18, in __init__
    torch.hub.load("s3prl/s3prl:f2114342ff9e813e18a580fa41418aee9925414e", feature_name, refresh=True).eval().to(device)
  File "/storage/usr/conda/envs/s2vc/lib/python3.8/site-packages/torch/hub.py", line 402, in load
    repo_or_dir = _get_cache_or_reload(repo_or_dir, force_reload, verbose, skip_validation)
  File "/storage/usr/conda/envs/s2vc/lib/python3.8/site-packages/torch/hub.py", line 190, in _get_cache_or_reload
    _validate_not_a_forked_repo(repo_owner, repo_name, branch)
  File "/storage/usr/conda/envs/s2vc/lib/python3.8/site-packages/torch/hub.py", line 160, in _validate_not_a_forked_repo
    raise ValueError(f'Cannot find {branch} in https://github.com/{repo_owner}/{repo_name}. '
ValueError: Cannot find f2114342ff9e813e18a580fa41418aee9925414e in https://github.com/s3prl/s3prl. If it's a commit from a forked repo, please call hub.load() with forked repo directly.
Any idea on how to solve this?
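A possible workaround, offered as a sketch rather than a fix confirmed by the repository authors: the traceback shows the error is raised by torch.hub's fork validation, and that the installed torch.hub already takes a skip_validation parameter. Passing skip_validation=True to torch.hub.load, or loading from a manually cloned checkout with source="local", may avoid the lookup that fails. The helper names split_hub_spec and load_feature_model and the local_dir parameter are hypothetical and do not come from the S2VC code.

```python
def split_hub_spec(spec):
    """Split an 'owner/repo:ref' hub spec into (owner, repo, ref)."""
    repo_part, _, ref = spec.partition(":")
    owner, _, repo = repo_part.partition("/")
    return owner, repo, (ref or None)


def load_feature_model(spec, feature_name, local_dir=None, **hub_kwargs):
    """Load a feature extractor through torch.hub (hypothetical helper).

    Assumption: either a checkout of the pinned commit exists at local_dir,
    or skipping torch.hub's fork validation is acceptable for your setup.
    """
    import torch  # imported lazily so split_hub_spec works without PyTorch

    if local_dir is not None:
        # Use hubconf.py from a checkout made by hand (git clone + git checkout).
        return torch.hub.load(local_dir, feature_name, source="local", **hub_kwargs)
    # Skip the branch/tag lookup that raised the ValueError in the traceback.
    return torch.hub.load(spec, feature_name, skip_validation=True, **hub_kwargs)


SPEC = "s3prl/s3prl:f2114342ff9e813e18a580fa41418aee9925414e"
owner, repo, ref = split_hub_spec(SPEC)
# Print the commands for making the local checkout by hand.
print(f"git clone https://github.com/{owner}/{repo}")
print(f"git -C {repo} checkout {ref}")
```

With a checkout in place, load_feature_model(SPEC, "cpc", local_dir="s3prl") would load from disk; without one, load_feature_model(SPEC, "cpc") tries the GitHub download with validation skipped. Whether the pinned commit is still downloadable is not something this sketch can guarantee.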
Can you provide a pre-trained model?
Checkpoints for cpc-mel and mel-cpc?
Do you mind providing checkpoints for cpc-mel and mel-cpc, and describing how to use them?
Is SourceEncoder dead code?
I don't see any code using SourceEncoder.
Is this dead code? Or is it part of the paper's model, with more code to be released?
Trying your pre-trained model to convert a wav file to another voice
Hi,
I am trying your pre-trained model to convert one voice to another.
The changed parts of convert_batch.py are below (I changed the paths ...):
def parse_args():
    """Parse command-line arguments."""
    parser = ArgumentParser()
    parser.add_argument("info_path", type=str)
    parser.add_argument("output_dir", type=str, default=".")
    parser.add_argument("-c", "/content/S2VC/chckpt",
                        default="checkpoints/cpc-cpc.pt")
    parser.add_argument("-s", "src_feat_name", default="cpc")
    parser.add_argument("-r", "ref_feat_name", default="cpc")
    parser.add_argument("-w", "/content/S2VC/wav2vec_small.pt",
                        default="checkpoints/wav2vec_small.pt")
    parser.add_argument("-v", "/content/S2VC/wav2vec_small.pt",
                        default="checkpoints/vocoder.pt")
    parser.add_argument("--sample_rate", type=int, default=16000)
    return vars(parser.parse_args())
The error is below:
File "", line 1
python /content/S2VC/convert_batch.py
^
SyntaxError: incomplete input
What should I do to fix it?
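Two things may be going on here; the following is a sketch under assumptions, not the authors' answer. First, the strings after "-c", "-w" and "-v" in add_argument are option names, not values, and argparse rejects an option string that does not start with "-", so paths are better passed on the command line than written into the parser. Second, the SyntaxError points at the line python /content/S2VC/convert_batch.py, which suggests the shell command was entered where Python code is expected. The long option names below (--ckpt_path and so on) and the file names info.yaml and out are assumptions for illustration.

```python
from argparse import ArgumentParser


def parse_args(argv=None):
    """Parse command-line arguments (long option names are assumed)."""
    parser = ArgumentParser()
    parser.add_argument("info_path", type=str)
    parser.add_argument("output_dir", type=str, default=".")
    # Option names stay fixed; the paths are supplied when the script is run.
    parser.add_argument("-c", "--ckpt_path", default="checkpoints/cpc-cpc.pt")
    parser.add_argument("-s", "--src_feat_name", default="cpc")
    parser.add_argument("-r", "--ref_feat_name", default="cpc")
    parser.add_argument("-w", "--wav2vec_path", default="checkpoints/wav2vec_small.pt")
    parser.add_argument("-v", "--vocoder_path", default="checkpoints/vocoder.pt")
    parser.add_argument("--sample_rate", type=int, default=16000)
    return vars(parser.parse_args(argv))


# Simulate a command line; info.yaml and out are hypothetical names.
args = parse_args([
    "info.yaml", "out",
    "-c", "/content/S2VC/chckpt",
    "-w", "/content/S2VC/wav2vec_small.pt",
])
print(args["ckpt_path"], args["wav2vec_path"])
```

From a shell, the equivalent call would pass the same options after python /content/S2VC/convert_batch.py. In a notebook cell, shell commands usually need a leading "!".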
Training with other features (apc, timit_posteriorgram, etc.) does not work
I tried training with features other than cpc on my own corpus.
However, the training script fails at the loss function (train.py, line 69).
I found that the size of the output vector out is hard-coded, which is inconsistent with the size of the target Mel spectrogram for other features.
The sizes of some of the model's vectors are:
- apc case:
Input dim: 512, Reference dim: 512, Target dim: 240
- cpc case:
Input dim: 256, Reference dim: 256, Target dim: 80
I prepared the input feature vectors using preprocess.py, e.g. python .\preprocess.py (my own corpus) apc .\checkpoints\wav2vec_small.pt processed/apc.
I modified the model by changing the sizes of the vectors, and train.py now runs.
In model.py, in __init__() of S2VC, I replaced 80 with a function argument and pass in the size of the Mel vector.
But I cannot tell whether this modification is appropriate, as I am not familiar with NLP.
convert_batch.py with the pre-trained models works well, as described in README.md.
Other details of my situation are:
- Windows 10, PowerShell
- pytorch 1.7.1 + cu110
- torchaudio 0.7.1
- sox 1.4.1
- tqdm 4.42.0
- librosa 0.8.1
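The idea of replacing the hard-coded 80 with a value taken from the data can be sketched as follows. This is only an illustration of the bookkeeping, not the S2VC model itself: ModelDims, dims_from_shapes and check_output are hypothetical names, and the sketch assumes the feature size is the last axis of each shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelDims:
    """Hypothetical container for the three sizes quoted in the issue."""
    input_dim: int
    ref_dim: int
    target_dim: int = 80  # the value the issue reports as hard-coded


def dims_from_shapes(src_shape, ref_shape, tgt_shape):
    """Infer sizes from one preprocessed sample (feature axis assumed last)."""
    return ModelDims(src_shape[-1], ref_shape[-1], tgt_shape[-1])


def check_output(dims, out_shape):
    """Fail early, with a readable message, before the loss is computed."""
    if out_shape[-1] != dims.target_dim:
        raise ValueError(
            f"model output has {out_shape[-1]} features, "
            f"target has {dims.target_dim}"
        )


# Sizes quoted in the issue; the frame counts (100, 120) are made up.
apc = dims_from_shapes((100, 512), (120, 512), (100, 240))
cpc = dims_from_shapes((100, 256), (120, 256), (100, 80))
print(apc)
print(cpc)

check_output(cpc, (100, 80))      # matches, no error
try:
    check_output(apc, (100, 80))  # an output fixed at 80 cannot match 240
except ValueError as err:
    print("mismatch:", err)
```

Passing target_dim into the model constructor, as the issue describes, keeps the output layer and the target in agreement for every feature type. Whether a 240-dimensional target is what the authors intended for apc is not something this sketch can settle.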