
ctcnet's Introduction

Hey 👋🏽, I'm Kai Li!

My name is Kai Li (Chinese name: 李凯). I'm a second-year master's student in the Department of Computer Science and Technology, Tsinghua University, supervised by Prof. Xiaolin Hu (胡晓林). I am also a member of the TSAIL Group directed by Prof. Bo Zhang (张钹) and Prof. Jun Zhu (朱军). I am an intern at Tencent AI Lab, mainly doing research on causal speech separation, supervised by Yi Luo (罗艺).

🤗   I open-source my work to the best of my ability.

🤗   I am currently doing research on multimodal speech separation, and I am also interested in related topics (e.g., pre-trained models and neuroscience). If you would like to collaborate, please contact me. Many thanks.

🔖 Homepages

Kai Li   ·   Jusper Lee   ·   cslikai.cn

📅 News

  • 2023.07: 🎲 One paper is accepted by ECAI 2023.
  • 2023.05: 🧩 Two papers are accepted by Interspeech 2023.
  • 2023.05: 🎉 We won first prize 🥇 in the Cinematic Sound Demixing Track 2023 on both Leaderboard A and Leaderboard B.
  • 2023.05: 🎉 We won first prize 🥇 at ASC23, along with the Best Application Award.
  • 2023.04: 🎲 One paper appeared on arXiv.
  • 2023.02: 🧩 One paper is accepted by ICASSP 2023.
  • 2023.01: 🧩 One paper is accepted by ICLR 2023.

📰 Selected Publications:

See Google Scholar for a full list of publications.

Speech Separation

Neuroscience

Cloud Removal

Super Resolution

ctcnet's People

Contributors

JusperLee


ctcnet's Issues

A probable typo in the code?

Hello! When I try to train the model, I get the error AttributeError: 'VideoBlock' object has no attribute 'get_block_block', but there is no get_block_block in the VideoBlock class in videosubnetwork.py. Can you help fix this? Thanks in advance.

AVSpeech Dataset

Hi, I have downloaded the videos from the AVSpeech dataset. How can I preprocess them to train this model?
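If it helps as a starting point, below is a rough preprocessing sketch (not the authors' official pipeline): it resamples each clip's audio to 16 kHz mono and dumps frames at 25 fps with ffmpeg. The sampling rate, frame rate, and directory layout are assumptions, and mouth-region cropping for the visual stream would still be a separate step.

```python
# Rough sketch, not the authors' pipeline: extract 16 kHz mono audio and
# 25 fps frames from downloaded AVSpeech clips with ffmpeg. The rates and
# directory layout are assumptions.
import subprocess
from pathlib import Path

def preprocess_clip(mp4_path: Path, out_dir: Path) -> None:
    out_dir.mkdir(parents=True, exist_ok=True)
    # Audio: mono WAV resampled to 16 kHz
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(mp4_path), "-ac", "1", "-ar", "16000",
         str(out_dir / f"{mp4_path.stem}.wav")],
        check=True,
    )
    # Video: dump frames at 25 fps (mouth-region cropping would follow separately)
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(mp4_path), "-r", "25",
         str(out_dir / f"{mp4_path.stem}_%05d.png")],
        check=True,
    )

if __name__ == "__main__":
    for clip in Path("avspeech/raw").glob("*.mp4"):  # hypothetical layout
        preprocess_clip(clip, Path("avspeech/processed"))
```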

about frcnn_128_512.backbone.pth.tar

Thanks for sharing your great work.

I was trying to run the model; however, I did not find the pretrained frcnn_128_512.backbone.pth.tar for videonet. Could you please share it? Thanks.

Misspelling in model code

The file ctcnet.py contains a few self.video_block.get_block_block calls. These should evidently be self.video_block.get_video_block instead.
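Until the repository is patched, one hedged workaround, assuming (as this issue suggests) that VideoBlock defines get_video_block, is to alias the misspelled name at runtime; the import path below is an assumption and may need adjusting to the actual repository layout.

```python
# Hypothetical workaround: alias the misspelled method name so the existing
# call sites in ctcnet.py keep working. The import path is an assumption.
from src.models.videosubnetwork import VideoBlock

if not hasattr(VideoBlock, "get_block_block"):
    VideoBlock.get_block_block = VideoBlock.get_video_block
```

Editing the get_block_block call sites in ctcnet.py to get_video_block directly would be the cleaner permanent fix.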

mix.json

Hello, when I try to train the model, I get an error saying that mix.json is missing. How can I create the mix.json file?
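One way to bootstrap the file, assuming an Asteroid-style [wav_path, num_samples] schema (common in separation recipes, but not confirmed as this repository's exact format), is sketched below; the directory layout is hypothetical.

```python
# Hedged sketch: build mix.json as a list of [wav_path, num_samples] pairs.
# The schema follows common Asteroid-style recipes and is an assumption, not
# the repository's documented format; the directory layout is hypothetical.
import json
from pathlib import Path

import soundfile as sf

mix_dir = Path("data/tr/mix")  # hypothetical location of the training mixtures
entries = [[str(wav), sf.info(str(wav)).frames]
           for wav in sorted(mix_dir.glob("*.wav"))]

with open(mix_dir.parent / "mix.json", "w") as f:
    json.dump(entries, f, indent=2)
```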

def fuse in ctcnet.py

'VideoBlock' object has no attribute 'get_block_block'. Did you mean: 'get_concat_block'?

About downloading the datasets

Could you share links to the LRS2-2mix, LRS3-2mix, and VoxCeleb2-mix datasets? I saw they were removed because of permission issues. Thanks.

Error for pytorch-lightning

Thanks for sharing your excellent work! I ran into an error and would like to ask for help. When I run trainer.fit(system) in train_ctc.py, I get the error: The LightningModule.on_epoch_end hook was removed in v1.8. Please use LightningModule.on_<train/validation/test>_epoch_end instead.

But I can't find any code in train_ctc.py that would fix it. I think this is caused by a wrong version of pytorch-lightning, but my version is exactly the one listed in the README. Can you help me fix this problem?
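Two hedged workarounds, neither confirmed by the authors: pin the framework to an older release (pip install "pytorch-lightning<1.8"), or rename the removed hook to its stage-specific replacement in whichever LightningModule subclass defines it, roughly as follows.

```python
# Minimal sketch of the hook rename. The class name and body are placeholders;
# only the change from on_epoch_end to on_train_epoch_end reflects the
# Lightning >= 1.8 API.
import pytorch_lightning as pl

class System(pl.LightningModule):  # hypothetical stand-in for the training System class
    # def on_epoch_end(self):        # removed in Lightning 1.8
    #     ...
    def on_train_epoch_end(self):    # stage-specific replacement
        ...  # same epoch-end logic as before
```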

Alternative to Baidu Drive

Hi,

Can you please provide an alternative to the Baidu Drive link? Or share the code used to generate the test sets (LRS2/LRS3/Vox2)?

Thanks
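While waiting for an official link, here is a rough sketch of how 2-speaker mixtures are commonly created, by summing two utterances at a random SNR; the SNR range and file paths are assumptions, not the paper's recipe.

```python
# Rough sketch, not the official recipe: mix two utterances at a random SNR
# in [-5, 5] dB. File paths and the SNR range are assumptions.
import numpy as np
import soundfile as sf

def mix_pair(wav_a: str, wav_b: str, out_path: str,
             rng=np.random.default_rng(0)) -> None:
    a, sr = sf.read(wav_a)
    b, sr_b = sf.read(wav_b)
    assert sr == sr_b, "sampling rates must match"
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    # Scale speaker B so that A is snr_db louder than B in the mixture
    snr_db = rng.uniform(-5.0, 5.0)
    scale = np.sqrt(np.mean(a ** 2) / (np.mean(b ** 2) * 10 ** (snr_db / 10) + 1e-8))
    sf.write(out_path, a + scale * b, sr)

# Example usage (hypothetical paths):
# mix_pair("lrs2/s1/0001.wav", "lrs2/s2/0001.wav", "lrs2/mix/0001.wav")
```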

LightningModule.on_epoch_end was removed in v1.8

Hello, when I try to train the model, I get the error: The LightningModule.on_epoch_end hook was removed in v1.8. How can I use LightningModule.on_<train/validation/test>_epoch_end instead?

train.py missing & custom data training

Hi @JusperLee, thank you for your amazing work!

After taking a look at README.md and the files in this repository, I could not find the train.py file. I am also wondering whether it is possible to train the model on audio data only (mixed & separated), without videos.

Multi-channel Data

Hello,

I am interested in using your model for a project focused on multi-channel source separation. I was wondering if you could provide any guidance, best practices, or documentation that would help me get started effectively? Thanks!
