gemelo-ai / vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Home Page: https://gemelo-ai.github.io/vocos/
License: MIT License
Thanks a lot for this repository! It is very useful, and thanks a lot for the great notebook on Bark+Vocos integration!
I tried to follow the Bark+Vocos.ipynb notebook but encountered the following error:
Traceback (most recent call last):
File ".../bark_vocos_usage.py", line 66, in <module>
torchaudio.save("encodec.mp3", encodec_output[None, :], 44100, compression=128)
File ".../venv/lib/python3.10/site-packages/torchaudio/_backend/utils.py", line 312, in save
return backend.save(
File ".../venv/lib/python3.10/site-packages/torchaudio/_backend/ffmpeg.py", line 351, in save
raise ValueError(
ValueError: ('FFmpeg backend expects non-`None` value for argument `compression` to be of ', "type `torchaudio.io.CodecConfig`, but received value of type <class 'int'>")
For me it works if I replace these last two lines:
torchaudio.save("encodec.mp3", encodec_output[None, :], 44100, compression=128)
torchaudio.save("vocos.mp3", vocos_output, 44100, compression=128)
with these:
torchaudio.save("encodec.mp3", encodec_output[None, :], 44100, compression=torchaudio.io.CodecConfig(bit_rate=320))
torchaudio.save("vocos.mp3", vocos_output, 44100, compression=torchaudio.io.CodecConfig(bit_rate=320))
Just in case this helps somebody someday: for me it works after installing torch with pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 and IPython with pip install ipython. These are the packages that ended up installed:
annotated-types==0.6.0 asttokens==2.4.1 audioread==3.0.1 boto3==1.29.3 botocore==1.32.3 certifi==2022.12.7 cffi==1.16.0 charset-normalizer==2.1.1 cmake==3.25.0 decorator==5.1.1 einops==0.7.0 encodec==0.1.1 exceptiongroup==1.1.3 executing==2.0.1 filelock==3.9.0 fsspec==2023.10.0 funcy==2.0 huggingface-hub==0.19.4 idna==3.4 inflect==7.0.0 ipython==8.17.2 jedi==0.19.1 Jinja2==3.1.2 jmespath==1.0.1 joblib==1.3.2 lazy_loader==0.3 librosa==0.10.1 lit==15.0.7 llvmlite==0.41.1 MarkupSafe==2.1.3 matplotlib-inline==0.1.6 mpmath==1.3.0 msgpack==1.0.7 networkx==3.0 numba==0.58.1 numpy==1.24.1 packaging==23.2 parso==0.8.3 pexpect==4.8.0 Pillow==9.3.0 platformdirs==4.0.0 pooch==1.8.0 progressbar==2.5 prompt-toolkit==3.0.41 ptyprocess==0.7.0 pure-eval==0.2.2 pycparser==2.21 pydantic==2.5.1 pydantic_core==2.14.3 Pygments==2.17.1 python-dateutil==2.8.2 PyYAML==6.0.1 regex==2023.10.3 requests==2.28.1 rotary-embedding-torch==0.3.5 s3transfer==0.7.0 safetensors==0.4.0 scikit-learn==1.3.2 scipy==1.11.4 six==1.16.0 soundfile==0.12.1 soxr==0.3.7 stack-data==0.6.3 suno-bark @ git+https://github.com/suno-ai/bark.git@773624d26db84278a55aacae9a16d7b25fbccab8 sympy==1.12 threadpoolctl==3.2.0 tokenizers==0.13.3 torch==2.1.1+cu118 torchaudio==2.1.1+cu118 torchvision==0.16.1+cu118 tortoise-tts @ git+https://github.com/neonbjb/tortoise-tts@80f89987a5abda5e2b082618cd74f9c7411141dc tqdm==4.66.1 traitlets==5.13.0 transformers==4.31.0 triton==2.1.0 typing_extensions==4.8.0 Unidecode==1.3.7 urllib3==1.26.13 vocos==0.1.0 wcwidth==0.2.10
Thank you for open sourcing this great work!
One of the great advantages I see in vocoders operating in the time domain is how easy it is to combine the vocoding task with super-resolution: you just upsample some more and use audio with a higher sampling rate as the target signal. Is the same somehow possible with Vocos? Could I train a model that takes 16 kHz spectrograms as input but produces a 24 kHz wave?
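Not an authoritative answer, but conceptually the frame rate ties input and output together: mels computed from 16 kHz audio with hop 256 arrive at 62.5 frames per second, so a head configured with hop_length 384 would emit 24,000 samples per second from the same frames. A minimal sketch of a feature extractor along those lines, assuming 24 kHz training audio that is resampled internally (the class name and parameter choices are hypothetical, not part of Vocos):
import torch
import torchaudio
from vocos.feature_extractors import FeatureExtractor

class DownsampledMelFeatures(FeatureExtractor):
    """Hypothetical: consume 24 kHz audio, compute mels at 16 kHz.

    With hop_length=256 at 16 kHz (62.5 frames/s), an ISTFTHead using
    hop_length=384 would reconstruct 24,000 samples per second from the
    same frame sequence.
    """
    def __init__(self, n_fft=1024, hop_length=256, n_mels=100):
        super().__init__()
        self.resample = torchaudio.transforms.Resample(24000, 16000)
        self.mel_spec = torchaudio.transforms.MelSpectrogram(
            sample_rate=16000, n_fft=n_fft, hop_length=hop_length,
            n_mels=n_mels, center=True, power=1,
        )

    def forward(self, audio: torch.Tensor, **kwargs) -> torch.Tensor:
        mel = self.mel_spec(self.resample(audio))
        return torch.log(torch.clamp(mel, min=1e-7))  # vocos-style safe log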
Hi
I trained a model based on Matcha TTS, and I tried to use Vocos with it. Unfortunately, vocoding with a checkpoint trained on the default Vocos config gives robotic output with very low volume.
The only config values I changed are sample_rate (=22050) and n_mels (=80).
I assumed there was a parameter mismatch between the mel spectrogram generated by Matcha TTS and the mel spectrogram Vocos expects.
So I wrote a feature extractor class that generates mel spectrograms using the same parameters as Matcha TTS. Most of the code is copied directly from Matcha's source code.
import numpy as np
import torch
from librosa.filters import mel as librosa_mel_fn
from vocos.feature_extractors import FeatureExtractor
class MatchaMelSpectrogramFeatures(FeatureExtractor):
"""
Generate MelSpectrogram from audio using same params
as Matcha TTS (https://github.com/shivammehta25/Matcha-TTS)
    This is also useful with Tacotron, WaveGlow, etc.
"""
def __init__(
self,
*,
mel_mean,
mel_std,
sample_rate=22050,
n_fft=1024,
win_length=1024,
n_mels=80,
hop_length=256,
center=False,
f_min=0,
f_max=8000,
):
super().__init__()
self.sample_rate = sample_rate
self.n_mels = n_mels
self.n_fft = n_fft
self.win_length = win_length
self.hop_length = hop_length
self.center = center
self.f_min = f_min
self.f_max = f_max
# Data-dependent
self.mel_mean = mel_mean
self.mel_std = mel_std
# Cache
self._mel_basis = {}
self._hann_window = {}
    def forward(self, audio: torch.Tensor, **kwargs) -> torch.Tensor:
        # Accept (T,) or (batch, T) input and keep the batch dimension intact.
        if audio.dim() == 1:
            audio = audio.unsqueeze(0)
        mel = self.mel_spectrogram(audio)  # (batch, n_mels, frames)
        return normalize(mel, self.mel_mean, self.mel_std)
def mel_spectrogram(self, y):
mel_basis_key = str(self.f_max) + "_" + str(y.device)
han_window_key = str(y.device)
if mel_basis_key not in self._mel_basis:
mel = librosa_mel_fn(
sr=self.sample_rate,
n_fft=self.n_fft,
n_mels=self.n_mels,
fmin=self.f_min,
fmax=self.f_max
)
self._mel_basis[mel_basis_key] = torch.from_numpy(mel).float().to(y.device)
self._hann_window[han_window_key] = torch.hann_window(self.win_length).to(y.device)
pad_vals = (
(self.n_fft - self.hop_length) // 2,
(self.n_fft - self.hop_length) // 2,
)
y = torch.nn.functional.pad(
y.unsqueeze(1),
pad_vals,
mode="reflect"
)
y = y.squeeze(1)
spec = torch.stft(
y,
self.n_fft,
hop_length=self.hop_length,
win_length=self.win_length,
window=self._hann_window[han_window_key],
center=self.center,
pad_mode="reflect",
normalized=False,
onesided=True,
return_complex=True,
)
spec = torch.view_as_real(spec)
spec = torch.sqrt(spec.pow(2).sum(-1) + (1e-9))
spec = torch.matmul(self._mel_basis[mel_basis_key], spec)
spec = spectral_normalize_torch(spec)
return spec
def spectral_normalize_torch(magnitudes):
output = dynamic_range_compression_torch(magnitudes)
return output
def dynamic_range_compression_torch(x, C=1, clip_val=1e-5):
return torch.log(torch.clamp(x, min=clip_val) * C)
def normalize(data, mu, std):
if not isinstance(mu, (float, int)):
if isinstance(mu, list):
mu = torch.tensor(mu, dtype=data.dtype, device=data.device)
elif isinstance(mu, torch.Tensor):
mu = mu.to(data.device)
elif isinstance(mu, np.ndarray):
mu = torch.from_numpy(mu).to(data.device)
mu = mu.unsqueeze(-1)
if not isinstance(std, (float, int)):
if isinstance(std, list):
std = torch.tensor(std, dtype=data.dtype, device=data.device)
elif isinstance(std, torch.Tensor):
std = std.to(data.device)
elif isinstance(std, np.ndarray):
std = torch.from_numpy(std).to(data.device)
std = std.unsqueeze(-1)
return (data - mu) / std
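As a quick smoke test of the class above (a sketch: the mel_mean/mel_std values are the ones from the config that follows, and the waveform is stand-in noise):
extractor = MatchaMelSpectrogramFeatures(mel_mean=-6.38385, mel_std=2.541796)
wav = torch.randn(1, 22050)  # one second of stand-in noise at 22.05 kHz
mel = extractor(wav)
print(mel.shape)  # expected: torch.Size([1, 80, 86]) with hop_length=256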
And I used it with the following config:
# pytorch_lightning==1.8.6
seed_everything: 4444
data:
class_path: vocos.dataset.VocosDataModule
init_args:
train_params:
filelist_path: ./datasets/train.txt
sampling_rate: 22050
num_samples: 16384
batch_size: 16
num_workers: 4
val_params:
filelist_path: ./datasets/val.txt
sampling_rate: 22050
num_samples: 48384
batch_size: 16
num_workers: 4
model:
class_path: vocos.experiment.VocosExp
init_args:
sample_rate: 22050
initial_learning_rate: 5e-4
mel_loss_coeff: 45
mrd_loss_coeff: 0.1
num_warmup_steps: 0 # Optimizers warmup steps
pretrain_mel_steps: 0 # 0 means GAN objective from the first iteration
# automatic evaluation
evaluate_utmos: true
evaluate_pesq: true
evaluate_periodicty: true
feature_extractor:
class_path: matcha_feature_extractor.MatchaMelSpectrogramFeatures
init_args:
sample_rate: 22050
n_fft: 1024
n_mels: 80
hop_length: 256
win_length: 1024
f_min: 0
f_max: 8000
center: False
mel_mean: -6.38385
mel_std: 2.541796
backbone:
class_path: vocos.models.VocosBackbone
init_args:
input_channels: 80
dim: 512
intermediate_dim: 1536
num_layers: 8
head:
class_path: vocos.heads.ISTFTHead
init_args:
dim: 512
n_fft: 1024
hop_length: 256
padding: same
trainer:
logger:
class_path: pytorch_lightning.loggers.TensorBoardLogger
init_args:
save_dir: /content/drive/MyDrive/vocos/logs
callbacks:
- class_path: pytorch_lightning.callbacks.LearningRateMonitor
- class_path: pytorch_lightning.callbacks.ModelSummary
init_args:
max_depth: 2
- class_path: pytorch_lightning.callbacks.ModelCheckpoint
init_args:
monitor: val_loss
filename: vocos_checkpoint_{epoch}_{step}_{val_loss:.4f}
save_top_k: 2
save_last: true
- class_path: vocos.helpers.GradNormCallback
# Lightning calculates max_steps across all optimizer steps (rather than number of batches)
# This equals to 1M steps per generator and 1M per discriminator
max_steps: 2000000
# You might want to limit val batches when evaluating all the metrics, as they are time-consuming
limit_val_batches: 128
accelerator: gpu
strategy: ddp
devices: [0]
log_every_n_steps: 100
I trained Vocos using the above feature extractor and config, but this fails too, with even worse vocoding quality and even lower volume.
Does the head expect mel spectrograms generated with certain parameters? I believe many open-source TTS models use the same code to extract mel spectrograms, so resolving this would help with training Vocos for use with those TTS models.
Best
Another question: I notice that in the paper you said "the replacement of ResBlocks with ConvNeXt further improves performance." How much does the ConvNeXt version improve over the ResBlock1 (HiFi-GAN v1) version?
Thanks!
I can't debug in VS Code. I put breakpoints in the feature extractor, but execution doesn't stop at them during training. I tried pdb, and it still doesn't work. Any ideas how to do this? Or is this a problem with Lightning? I can't find any documentation on it.
Any help is appreciated.
Thanks for open-sourcing this. When I try to run the model, I face the following problem:
File "/mnt/nfs/dev-aigc-0/data1/xiuyuanqin/work/vocos/vocos/modules.py", line 109, in ResBlock1
dilation: tuple[int] = (1, 3, 5),
TypeError: 'type' object is not subscriptable
Could you help me with this problem?
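For reference, this error comes from PEP 585 built-in generics: annotations like tuple[int] need Python 3.9 or newer, and raise exactly this TypeError on older interpreters. Two hedged workarounds, sketched (neither is an official fix from the repo):
# Option 1: defer annotation evaluation (add at the very top of vocos/modules.py):
from __future__ import annotations

# Option 2: use the typing equivalent of the annotation instead:
from typing import Tuple
dilation: Tuple[int, int, int] = (1, 3, 5)  # a 3-tuple, matching the default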
I tried the pretrained model with mel spectrograms directly generated from singing samples and it does not sound as good as: https://github.com/yl4579/HiFTNet
The linked vocoder uses neural source-filter (https://nii-yamagishilab.github.io/samples-nsf/) like some other singing voice conversion models.
So could the architecture be the reason for the difference in quality, or does Vocos just need to be additionally trained on singing to reach the same quality?
Thanks for sharing this project!
I've followed the instructions to train a custom model. TensorBoard is showing decent progress, and the audio predictions are starting to sound good. But I am unable to load the custom model checkpoint for inference. Can you share how to use custom-trained checkpoints for inference?
Thanks,
Emmanuel
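For reference, a hedged sketch of loading a Lightning training checkpoint directly for inference (the paths are stand-ins; strict=False skips the discriminator and loss weights that live in the training checkpoint but not in the inference model):
import torch
from vocos import Vocos

vocos = Vocos.from_hparams("configs/vocos.yaml")          # hypothetical path
ckpt = torch.load("checkpoint.ckpt", map_location="cpu")  # hypothetical path
vocos.load_state_dict(ckpt["state_dict"], strict=False)
vocos.eval()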
I'm training a Vocos decoder for my DAC autoencoder. When I set hop length = 256 and n_fft = 1024 in the iSTFT head the discriminators quickly win within 1000 steps. However, this doesn't happen when I set n_fft = 512, 768, or 1026. Do you know why this is happening and whether using 1026 would affect quality? I don't completely understand the COLA property.
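Not an answer to the discriminator dynamics, but for checking the window property itself, scipy ships helpers; a small sketch (note that iSTFT with window-sum normalization technically needs the weaker NOLA condition rather than COLA):
import scipy.signal as sig

win = sig.get_window("hann", 1024)
hop = 256
print(sig.check_COLA(win, 1024, 1024 - hop))  # constant overlap-add
print(sig.check_NOLA(win, 1024, 1024 - hop))  # nonzero overlap-add (iSTFT)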
@hubertsiuzdak Hi, thanks for your great work! But I hit an error while trying to train a mel model.
Here's the error log:
Epoch 0: 0%| | 0/853 [00:00<?, ?it/s]Traceback (most recent call last):
File "/home/zj/workspace/TTS/vocos/train.py", line 6, in <module>
cli.trainer.fit(model=cli.model, datamodule=cli.datamodule)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 603, in fit
call._call_and_handle_interrupt(
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 90, in launch
return function(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
self._run(model, ckpt_path=self.ckpt_path)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
results = self._run_stage()
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
self._run_train()
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1200, in _run_train
self.fit_loop.run()
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 214, in advance
batch_output = self.batch_loop.run(kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
outputs = self.optimizer_loop.run(optimizers, kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 200, in advance
result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.optimizer_position])
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 247, in _run_optimization
self._optimizer_step(optimizer, opt_idx, kwargs.get("batch_idx", 0), closure)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 357, in _optimizer_step
self.trainer._call_lightning_module_hook(
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1342, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/core/module.py", line 1661, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/core/optimizer.py", line 169, in step
step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/strategies/ddp.py", line 281, in optimizer_step
optimizer_output = super().optimizer_step(optimizer, opt_idx, closure, model, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 234, in optimizer_step
return self.precision_plugin.optimizer_step(
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 121, in optimizer_step
return optimizer.step(closure=closure, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 68, in wrapper
return wrapped(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/optim/optimizer.py", line 373, in wrapper
out = func(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/optim/optimizer.py", line 76, in _use_grad
ret = func(self, *args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/optim/adamw.py", line 161, in step
loss = closure()
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 107, in _wrap_closure
closure_result = closure()
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 147, in __call__
self._result = self.closure(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 133, in closure
step_output = self._step_fn()
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 406, in _training_step
training_step_output = self.trainer._call_strategy_hook("training_step", *kwargs.values())
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1480, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/strategies/ddp.py", line 352, in training_step
return self.model(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1519, in forward
else self._run_ddp_forward(*inputs, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1355, in _run_ddp_forward
return self.module(*inputs, **kwargs) # type: ignore[index]
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/pytorch_lightning/overrides/base.py", line 98, in forward
output = self._forward_module.training_step(*inputs, **kwargs)
File "/home/zj/workspace/TTS/vocos/vocos/experiment.py", line 142, in training_step
loss_fm_mp = self.feat_matching_loss(fmap_r=fmap_rs_mp, fmap_g=fmap_gs_mp) / len(fmap_rs_mp)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/zj/anaconda3/envs/vocos/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/zj/workspace/TTS/vocos/vocos/loss.py", line 112, in forward
loss += torch.mean(torch.abs(rl - gl))
RuntimeError: The size of tensor a (669) must match the size of tensor b (667) at non-singleton dimension 2
and this is my config (I am using vocos-imdct.yaml):
# pytorch_lightning==1.8.6
seed_everything: 4444
data:
class_path: vocos.dataset.VocosDataModule
init_args:
train_params:
filelist_path: /home/zj/workspace/TTS/vocos/filelist.train
sampling_rate: 16000
num_samples: 12041
batch_size: 16
num_workers: 8
val_params:
filelist_path: /home/zj/workspace/TTS/vocos/filelist.val
sampling_rate: 16000
num_samples: 4201
batch_size: 16
num_workers: 8
model:
class_path: vocos.experiment.VocosExp
init_args:
sample_rate: 16000
initial_learning_rate: 5e-4
mel_loss_coeff: 45
mrd_loss_coeff: 0.1
num_warmup_steps: 0 # Optimizers warmup steps
pretrain_mel_steps: 0 # 0 means GAN objective from the first iteration
# automatic evaluation
evaluate_utmos: true
evaluate_pesq: true
evaluate_periodicty: true
feature_extractor:
class_path: vocos.feature_extractors.MelSpectrogramFeatures
init_args:
sample_rate: 16000
n_fft: 2048
hop_length: 200
n_mels: 80
padding: center
backbone:
class_path: vocos.models.VocosBackbone
init_args:
input_channels: 80
dim: 400
intermediate_dim: 2448
num_layers: 8
head:
class_path: vocos.heads.IMDCTCosHead
init_args:
dim: 400
mdct_frame_len: 400 # mel-spec hop_length * 2
padding: center
trainer:
logger:
class_path: pytorch_lightning.loggers.TensorBoardLogger
init_args:
save_dir: logs/
callbacks:
- class_path: pytorch_lightning.callbacks.LearningRateMonitor
- class_path: pytorch_lightning.callbacks.ModelSummary
init_args:
max_depth: 2
- class_path: pytorch_lightning.callbacks.ModelCheckpoint
init_args:
monitor: val_loss
filename: vocos_checkpoint_{epoch}_{step}_{val_loss:.4f}
save_top_k: 3
save_last: true
- class_path: vocos.helpers.GradNormCallback
# Lightning calculates max_steps across all optimizer steps (rather than number of batches)
# This equals to 1M steps per generator and 1M per discriminator
max_steps: 2000000
# You might want to limit val batches when evaluating all the metrics, as they are time-consuming
limit_val_batches: 100
accelerator: gpu
strategy: ddp
devices: [3]
log_every_n_steps: 100
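For reference, the mismatch (669 vs 667 at dimension 2) indicates the generated and reference waveforms reach the feature-matching loss with slightly different lengths. With num_samples: 12041, which is not a multiple of hop_length: 200, and the IMDCT head using center padding, a few-sample length difference is plausible; the stock configs pair num_samples: 16384 with hop_length: 256, an exact multiple. If adjusting num_samples doesn't help, a defensive trim before the discriminators is a common workaround; a sketch, assuming y and y_hat are the reference and generated audio in training_step of vocos/experiment.py:
import torch

def align_lengths(y: torch.Tensor, y_hat: torch.Tensor):
    # Hypothetical guard for training_step: trim reference and generated
    # audio to a common length so discriminator feature maps line up.
    n = min(y.shape[-1], y_hat.shape[-1])
    return y[..., :n], y_hat[..., :n]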
Hello, does the repository support multi-GPU inference with DataParallel, or some other method?
Primarily, I'm looking to encode with EnCodec and run that on multiple GPUs (using codes and the feature extractor, etc.). Multi-GPU decoding via Vocos would also be great.
Thanks for the nice work!
I have a question about VISQOL.
For the evaluation, you used VISQOL's audio mode.
However, the input waveform must have a 48 kHz sampling rate for this. I'd like to know how you upsampled the ground-truth samples and the generated samples, respectively.
Thanks!
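For reference, a minimal resampling sketch with torchaudio; the tensor names are stand-ins, and whether the authors used this exact method is not confirmed here:
import torch
import torchaudio.functional as F

gt_24k = torch.randn(1, 24000)   # stand-in for a ground-truth clip
gen_24k = torch.randn(1, 24000)  # stand-in for a generated clip
gt_48k = F.resample(gt_24k, orig_freq=24000, new_freq=48000)
gen_48k = F.resample(gen_24k, orig_freq=24000, new_freq=48000)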
I followed the README to install Vocos, but when testing samples, there is a problem:
LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
I also downloaded the PyTorch weights file, but how can Vocos.from_pretrained() load the local files?
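For reference, a sketch of loading from local files, assuming the downloaded config.yaml and pytorch_model.bin sit next to each other (Vocos.from_hparams is the same helper the ONNX export script later in this thread uses):
import torch
from vocos import Vocos

vocos = Vocos.from_hparams("config.yaml")  # hypothetical local path
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
vocos.load_state_dict(state_dict)
vocos.eval()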
Hi, I hit this error:
I filled in the train and validation paths in configs/vocos.yaml and ran the CLI: "python train.py -c configs/vocos.yaml", and this error happens: "train.py: error: Parser key "data": 'type' object is not subscriptable".
How should I solve it? Many thanks.
@hubertsiuzdak
I want to fix some old recordings so they sound crisp. Can someone help me with a dumbed-down walkthrough of how to install this and create a Vocos .pth with my own 48 kHz dataset? Recommended training settings would help as well.
Training Loss, Generated Outputs.
I hope this will be a reference for model training.
I noticed that when saving feature maps for the GAN loss there is the condition if i > 0, which means the feature maps of the first convolutional layer are not considered. Is this an optimization trick? Does training work better this way?
Hey, I am curious: what was the reason to pick "1" instead of the commonly used "2" for Vocos?
Recently Bark added support for long-form generation with multiple speakers enabled (link). Can the example notebook be modified to also take long-form text into consideration?
Hello, I'm trying to train Vocos to generate 32 kHz waveforms.
I simply changed the mel loss and the head's parameters to hop_size=1600, n_fft=400, sample_rate=32000, mel_channels=120, segment_length=81, which fits my mel spectrogram format.
But although the Vocos model converged well after 420k training steps, I can see some stripes in the 28-32 kHz frequency range.
Is there something I have to change, like the VocosBackbone module?
Any advice or help would be appreciated.
Thank you.
I'm trying to import the "vocos" module, but I'm getting the following traceback error.
Is there anyone who can help me solve this issue?
FYI, all dependencies are installed, and I just want to try inference with the pretrained models.
Thanks in advance.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[1], line 1
----> 1 from vocos import Vocos
File ~/conda/codec/lib/python3.8/site-packages/vocos/__init__.py:1
----> 1 from vocos.pretrained import Vocos
4 __version__ = "0.0.3"
File ~/conda/codec/lib/python3.8/site-packages/vocos/pretrained.py:7
5 from huggingface_hub import hf_hub_download
6 from torch import nn
----> 7 from vocos.feature_extractors import FeatureExtractor, EncodecFeatures
8 from vocos.heads import FourierHead
9 from vocos.models import Backbone
File ~/conda/codec/lib/python3.8/site-packages/vocos/feature_extractors.py:8
5 from encodec import EncodecModel
6 from torch import nn
----> 8 from vocos.modules import safe_log
11 class FeatureExtractor(nn.Module):
12 """Base class for feature extractors."""
File ~/conda/codec/lib/python3.8/site-packages/vocos/modules.py:89
85 x = x * scale + shift
...
112 ):
113 super().__init__()
114 self.lrelu_slope = lrelu_slope
TypeError: 'type' object is not subscriptable
Hi
This is great work. I noticed that the ISTFT head does not use the "same" padding that you implemented, but instead uses the center padding provided by the torch API, according to your config file.
For neural vocoding, the spectrogram frames and samples should be time-aligned (num_frames * hop_len = samples). Why didn't you use the "same" padding you implemented, given that property?
The paper says there are 2M iterations.
The README says there are 2.5M iterations.
Which is correct?
Hi,
Any plans to enable ONNX export for Vocos?
I developed a script to do it, but it has some issues with some PyTorch operators that Vocos uses.
# coding: utf-8
import argparse
import logging
import os
import random
from pathlib import Path
import numpy as np
import torch
import yaml
from torch import nn
from vocos.pretrained import Vocos
DEFAULT_OPSET_VERSION = 18
_LOGGER = logging.getLogger("export_onnx")
class VocosGen(nn.Module):
def __init__(self, vocos):
super().__init__()
self.vocos = vocos
def forward(self, mels):
x = self.vocos.backbone(mels)
audio_output = self.vocos.head(x)
return audio_output
def export_generator(config_path, checkpoint_path, output_dir, opset_version):
with open(config_path, "r") as f:
config = yaml.safe_load(f)
class_module, class_name = config["model"]["class_path"].rsplit(".", 1)
module = __import__(class_module, fromlist=[class_name])
vocos_cls = getattr(module, class_name)
    # NOTE: from_hparams only builds the architecture from the config;
    # checkpoint_path is not loaded anywhere in this script, so the exported
    # weights stay randomly initialized unless loading is added.
    components = Vocos.from_hparams(config_path)
params = config["model"]["init_args"]
vocos = vocos_cls(
feature_extractor=components.feature_extractor,
backbone=components.backbone,
head=components.head,
sample_rate=params["sample_rate"],
initial_learning_rate=params["initial_learning_rate"],
num_warmup_steps=params["num_warmup_steps"],
mel_loss_coeff=params["mel_loss_coeff"],
mrd_loss_coeff=params["mrd_loss_coeff"],
)
model = VocosGen(vocos)
model.eval()
Path(output_dir).mkdir(parents=True, exist_ok=True)
    # Placeholder values, used only to build the output filename
    epoch = 200
    global_step = 1000000
onnx_filename = f"vocos-epoch={epoch}.step={global_step}.onnx"
onnx_path = os.path.join(output_dir, onnx_filename)
dummy_input = torch.rand(1, vocos.backbone.input_channels, 64)
dynamic_axes = {
"mels": {0: "batch_size", 2: "time"},
"audio": {0: "batch_size", 1: "time"},
}
# Conventional ONNX export
#torch.onnx.export(
# model=model,
# args=dummy_input,
# f=onnx_path,
# input_names=["mels"],
# output_names=["audio"],
# dynamic_axes=dynamic_axes,
# opset_version=opset_version,
# export_params=True,
# do_constant_folding=True,
# )
# Using the new dynamo export
export_output = torch.onnx.dynamo_export(model, dummy_input)
export_output.save(onnx_path)
return onnx_path
def main():
logging.basicConfig(level=logging.DEBUG)
parser = argparse.ArgumentParser(
prog="export_onnx",
description="Export a vocos checkpoint to onnx",
)
parser.add_argument("--config", type=str, required=True)
parser.add_argument("--checkpoint", type=str, required=True)
parser.add_argument("--output-dir", type=str, required=True)
parser.add_argument("--seed", type=int, default=1234, help="random seed")
parser.add_argument("--opset", type=int, default=DEFAULT_OPSET_VERSION)
args = parser.parse_args()
random.seed(args.seed)
np.random.seed(args.seed)
torch.manual_seed(args.seed)
torch.cuda.manual_seed(args.seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
_LOGGER.info("Exporting model to ONNX")
_LOGGER.info(f"Config path: `{args.config}`")
_LOGGER.info(f"Using checkpoint: `{args.checkpoint}`")
onnx_path = export_generator(
config_path=args.config,
checkpoint_path=args.checkpoint,
output_dir=args.output_dir,
opset_version=args.opset
)
_LOGGER.info(f"Exported ONNX model to: `{onnx_path}`")
if __name__ == '__main__':
main()
Hi, thanks for your work. I would like to ask: can Vocos decode in a streaming fashion when reconstructing audio from EnCodec tokens?
Hi,
I'd like to train on a single-speaker dataset similar to LJSpeech, and I'm looking for guidance. I have a few questions.
Has any experimentation been done on single-speaker datasets such as LJSpeech with Vocos, and if so, what were the metrics at convergence? How many steps should I train for on a single-speaker dataset? Also, which metrics should I focus on to tell whether the model has converged?
Any help regarding this would be very valuable to me.
Thanks!
I've been trying to train my own model with Vocos using Google Colab, but I came across the following error when I ran the train.py file:
'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
The full log is below, but as far as I know I have everything configured correctly according to the README.md. I'm a bit new to audio synthesis, so any help is great!
Full Log:
Global seed set to 4444
Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.model_summary.ModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
[rank: 0] Global seed set to 4444
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 1 processes
----------------------------------------------------------------------------------------------------
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
2023-07-09 17:26:37.331298: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
| Name | Type | Params
---------------------------------------------------------------------------------
0 | feature_extractor | MelSpectrogramFeatures | 0
1 | feature_extractor.mel_spec | MelSpectrogram | 0
2 | backbone | VocosBackbone | 13.0 M
3 | backbone.embed | Conv1d | 358 K
4 | backbone.norm | LayerNorm | 1.0 K
5 | backbone.convnext | ModuleList | 12.6 M
6 | backbone.final_layer_norm | LayerNorm | 1.0 K
7 | head | ISTFTHead | 526 K
8 | head.out | Linear | 526 K
9 | head.istft | ISTFT | 0
10 | multiperioddisc | MultiPeriodDiscriminator | 41.1 M
11 | multiperioddisc.discriminators | ModuleList | 41.1 M
12 | multiresddisc | MultiResolutionDiscriminator | 600 K
13 | multiresddisc.discriminators | ModuleList | 600 K
14 | disc_loss | DiscriminatorLoss | 0
15 | gen_loss | GeneratorLoss | 0
16 | feat_matching_loss | FeatureMatchingLoss | 0
17 | melspec_loss | MelSpecReconstructionLoss | 0
18 | melspec_loss.mel_spec | MelSpectrogram | 0
---------------------------------------------------------------------------------
55.2 M Trainable params
0 Non-trainable params
55.2 M Total params
220.950 Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]Traceback (most recent call last):
File "/content/drive/MyDrive/Colab Notebooks/train.py", line 9, in <module>
cli.trainer.fit(model=cli.model, datamodule=cli.datamodule)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 603, in fit
call._call_and_handle_interrupt(
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 90, in launch
return function(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
self._run(model, ckpt_path=self.ckpt_path)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
results = self._run_stage()
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
self._run_train()
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1190, in _run_train
self._run_sanity_check()
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1255, in _run_sanity_check
val_loop._reload_evaluation_dataloaders()
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 234, in _reload_evaluation_dataloaders
self.trainer.reset_val_dataloader()
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1635, in reset_val_dataloader
self.num_val_batches, self.val_dataloaders = self._data_connector._reset_eval_dataloader(
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 357, in _reset_eval_dataloader
dataloaders = self._request_dataloader(mode)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 446, in _request_dataloader
dataloader = source.dataloader()
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 524, in dataloader
return method()
File "/usr/local/lib/python3.10/dist-packages/vocos/dataset.py", line 38, in val_dataloader
return self._get_dataloder(self.val_config, train=False)
File "/usr/local/lib/python3.10/dist-packages/vocos/dataset.py", line 28, in _get_dataloder
dataset = VocosDataset(cfg, train=train)
File "/usr/local/lib/python3.10/dist-packages/vocos/dataset.py", line 44, in __init__
self.filelist = f.read().splitlines()
File "/usr/lib/python3.10/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
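For what it's worth, the decode fails while reading the filelist itself (vocos/dataset.py, line 44 in the traceback), and a leading 0xff byte is typical of binary data, so filelist_path most likely points at an audio file rather than a UTF-8 text file of audio paths. A quick check, with a stand-in path:
# The filelist must be plain UTF-8 text with one audio path per line.
with open("filelist.train", "rb") as f:
    print(f.read(64))  # should look like b'/path/to/clip1.wav\n...'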
Hi, thank you for the great project you have made available!
I added it to my one-click-install package of AI-based audio generators (link).
Here's the notebook I quickly created:
https://github.com/rsxdalv/tts-generation-webui/blob/main/notebooks/vocos.ipynb
I wonder if using this in a pipeline with SunoAI/Bark has a different impact than with something else. I couldn't manage to link up the raw EnCodec codes, so I used the final wav files.
I saw the best result when using 12 kbps bandwidth, although if I remember correctly the Bark model runs at 6 kbps.
In my small sample size I didn't see an unsupervised improvement, although I found an example where it gives more "quality" to a sound sample (I included it next to the notebook).
I would love to see how it would go if I could link it up with the EnCodec tokens from Bark, and how best to go about using it.
Hi. Impressive speed-up. Do you plan to release the source code?
Hi @hubertsiuzdak, I am trying to figure out how to convert my ckpt to pytorch_model.bin so that I can load the model via vocos.pretrained. Or is there any way to load a ckpt directly for inference?
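For reference, a sketch of one way to do the conversion, assuming a standard Lightning checkpoint; the kept key prefixes match the generator submodules shown in the model summary earlier in this thread (feature_extractor, backbone, head), while discriminators and losses are training-only:
import torch

ckpt = torch.load("last.ckpt", map_location="cpu")  # hypothetical path
keep = ("feature_extractor.", "backbone.", "head.")
state = {k: v for k, v in ckpt["state_dict"].items() if k.startswith(keep)}
torch.save(state, "pytorch_model.bin")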
RuntimeError: torchaudio.sox_effects.sox_effects.apply_effects_tensor requires sox extension, but TorchAudio is not compiled with it. Please build TorchAudio with libsox support.
How can we fix this error on Windows?
Hi!
If it is possible to train Vocos at 22.05 kHz, I would like to see the config parameters, because my attempt has failed.
I have the following config:
# pytorch_lightning==1.8.6
seed_everything: 4444
data:
class_path: vocos.dataset.VocosDataModule
init_args:
train_params:
filelist_path: /home/yehor/Work/github/vocos/tetiana-dataset/filelist.train
sampling_rate: 22050
num_samples: 15053
batch_size: 16
num_workers: 8
val_params:
filelist_path: /home/yehor/Work/github/vocos/tetiana-dataset/filelist.val
sampling_rate: 22050
num_samples: 44453
batch_size: 16
num_workers: 8
model:
class_path: vocos.experiment.VocosExp
init_args:
sample_rate: 22050
initial_learning_rate: 2e-4
mel_loss_coeff: 45
mrd_loss_coeff: 0.1
num_warmup_steps: 0 # Optimizers warmup steps
pretrain_mel_steps: 0 # 0 means GAN objective from the first iteration
# automatic evaluation
evaluate_utmos: false
evaluate_pesq: false
evaluate_periodicty: false
feature_extractor:
class_path: vocos.feature_extractors.MelSpectrogramFeatures
init_args:
sample_rate: 22050
n_fft: 1024
hop_length: 256
n_mels: 80
padding: center
backbone:
class_path: vocos.models.VocosBackbone
init_args:
input_channels: 80
dim: 512
intermediate_dim: 1536
num_layers: 8
head:
class_path: vocos.heads.ISTFTHead
init_args:
dim: 512
n_fft: 1024
hop_length: 256
padding: center
trainer:
logger:
class_path: pytorch_lightning.loggers.TensorBoardLogger
init_args:
save_dir: logs/
callbacks:
- class_path: pytorch_lightning.callbacks.LearningRateMonitor
- class_path: pytorch_lightning.callbacks.ModelSummary
init_args:
max_depth: 2
- class_path: pytorch_lightning.callbacks.ModelCheckpoint
init_args:
monitor: val_loss
filename: vocos_checkpoint_{epoch}_{step}_{val_loss:.4f}
save_top_k: 3
save_last: true
- class_path: vocos.helpers.GradNormCallback
# Lightning calculates max_steps across all optimizer steps (rather than number of batches)
# This equals to 1M steps per generator and 1M per discriminator
max_steps: 2000000
# You might want to limit val batches when evaluating all the metrics, as they are time-consuming
limit_val_batches: 100
accelerator: gpu
strategy: ddp
devices: [0]
log_every_n_steps: 100
Fails with the following error:
File "/home/yehor/Work/github/vocos/vocos/experiment.py", line 142, in training_step
loss_fm_mp = self.feat_matching_loss(fmap_r=fmap_rs_mp, fmap_g=fmap_gs_mp) / len(fmap_rs_mp)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/yehor/Tools/anaconda3/envs/vocos/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/yehor/Work/github/vocos/vocos/loss.py", line 112, in forward
loss += torch.mean(torch.abs(rl - gl))
~~~^~~~
RuntimeError: The size of tensor a (837) must match the size of tensor b (825) at non-singleton dimension 2
Has anyone observed vibration artifacts when reconstructing speech? I can reproduce this with the provided pretrained model by converting the original audio into a mel spectrogram and using Vocos to reconstruct it (no TTS in between). Example (pay attention to the word "carnival"): Original vs Generated
I trained my own model on my own dataset, which contains the exact speech in question, but still observed the artifacts. Does anyone have suggestions on how to fix the artifact?
I trained a Vocos model myself; how do I use it? I see that the published model is loaded from a pytorch_model.bin file, and the parameters in it are named differently from a directly trained checkpoint. Can you give me an example of how to use it? Thanks!
I was trying to train Vocos and ran the command python train.py -c configs/vocos.yaml, but it showed this error:
(autodl_test) root@autodl-container-da7148a975-fb23289d:~/autodl-tmp/PycharmProjects/vocos-main# python train.py -c configs/vocos.yaml usage: train.py [-h] [-c CONFIG] [--print_config[=flags]] [--seed_everything SEED_EVERYTHING] [--trainer CONFIG] [--trainer.logger.help CLASS_PATH_OR_NAME] [--trainer.logger LOGGER] [--trainer.enable_checkpointing {true,false}] [--trainer.callbacks.help CLASS_PATH_OR_NAME] [--trainer.callbacks CALLBACKS] [--trainer.default_root_dir DEFAULT_ROOT_DIR] [--trainer.gradient_clip_val GRADIENT_CLIP_VAL] [--trainer.gradient_clip_algorithm GRADIENT_CLIP_ALGORITHM] [--trainer.num_nodes NUM_NODES] [--trainer.num_processes NUM_PROCESSES] [--trainer.devices DEVICES] [--trainer.gpus GPUS] [--trainer.auto_select_gpus {true,false}] [--trainer.tpu_cores TPU_CORES] [--trainer.ipus IPUS] [--trainer.enable_progress_bar {true,false}] [--trainer.overfit_batches OVERFIT_BATCHES] [--trainer.track_grad_norm TRACK_GRAD_NORM] [--trainer.check_val_every_n_epoch CHECK_VAL_EVERY_N_EPOCH] [--trainer.fast_dev_run FAST_DEV_RUN] [--trainer.accumulate_grad_batches ACCUMULATE_GRAD_BATCHES] [--trainer.max_epochs MAX_EPOCHS] [--trainer.min_epochs MIN_EPOCHS] [--trainer.max_steps MAX_STEPS] [--trainer.min_steps MIN_STEPS] [--trainer.max_time MAX_TIME] [--trainer.limit_train_batches LIMIT_TRAIN_BATCHES] [--trainer.limit_val_batches LIMIT_VAL_BATCHES] [--trainer.limit_test_batches LIMIT_TEST_BATCHES] [--trainer.limit_predict_batches LIMIT_PREDICT_BATCHES] [--trainer.val_check_interval VAL_CHECK_INTERVAL] [--trainer.log_every_n_steps LOG_EVERY_N_STEPS] [--trainer.accelerator.help CLASS_PATH_OR_NAME] [--trainer.accelerator ACCELERATOR] [--trainer.strategy.help CLASS_PATH_OR_NAME] [--trainer.strategy STRATEGY] [--trainer.sync_batchnorm {true,false}] [--trainer.precision PRECISION] [--trainer.enable_model_summary {true,false}] [--trainer.num_sanity_val_steps NUM_SANITY_VAL_STEPS] [--trainer.resume_from_checkpoint RESUME_FROM_CHECKPOINT] [--trainer.profiler.help CLASS_PATH_OR_NAME] [--trainer.profiler PROFILER] [--trainer.benchmark {true,false,null}] [--trainer.deterministic DETERMINISTIC] [--trainer.reload_dataloaders_every_n_epochs RELOAD_DATALOADERS_EVERY_N_EPOCHS] [--trainer.auto_lr_find AUTO_LR_FIND] [--trainer.replace_sampler_ddp {true,false}] [--trainer.detect_anomaly {true,false}] [--trainer.auto_scale_batch_size AUTO_SCALE_BATCH_SIZE] [--trainer.plugins.help CLASS_PATH_OR_NAME] [--trainer.plugins PLUGINS] [--trainer.amp_backend AMP_BACKEND] [--trainer.amp_level AMP_LEVEL] [--trainer.move_metrics_to_cpu {true,false}] [--trainer.multiple_trainloader_mode MULTIPLE_TRAINLOADER_MODE] [--trainer.inference_mode {true,false}] [--model.help CLASS_PATH_OR_NAME] --model CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE [--data.help CLASS_PATH_OR_NAME] [--data CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE] [--optimizer.help CLASS_PATH_OR_NAME] [--optimizer CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE] [--lr_scheduler.help CLASS_PATH_OR_NAME] [--lr_scheduler CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE] error: Parser key "data": Problem with given class_path 'vocos.dataset.VocosDataModule': No module named 'encodec'
I am new to model training. Has anyone had the same problem?
Is there any ancestry or inspiration behind the model name 'Vocos'?
I successfully trained Vocos on other datasets (e.g. Japanese), and the model is very good.
First of all, thanks so much for your great model and its open-sourced code/weights.
Just out of curiosity, why is the model named 'Vocos'?
The name reminds me of "vocoder" or "cosine", but at least in the paper there seems to be no description of the naming.
I tried to run it on MPS and hit an issue in decode: in 1j * y, one matrix is complex and the other isn't, which triggered an assert in the MPS backend (binary GEMM). Any ideas how I can solve this?
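For reference, the failing op is likely the complex multiply in the ISTFT head; building the complex spectrum from two real tensors sidesteps the real-times-complex kernel. A sketch with stand-in shapes (whether MPS then supports the downstream istft is a separate question):
import torch

mag = torch.rand(1, 513, 100)  # stand-ins for the head's magnitude/phase
p = torch.rand(1, 513, 100)
x, y = torch.cos(p), torch.sin(p)
# Instead of S = mag * (x + 1j * y), which multiplies real by complex:
S = torch.complex(mag * x, mag * y)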
Hi, how should I understand the influence of the UTMOS and periodicity metrics?
AttributeError: 'WeightNorm' object has no attribute 'name'. Did you mean: 'ne'?
A single mel works, but multiple loops do not.