markhershey / audiodeepfakedetection

Stars: 59 · Watchers: 3 · Forks: 17 · Size: 201.23 MB

SUTD 50.039 Deep Learning Course Project (2022 Spring)

Home Page: https://markhh.com/AudioDeepFakeDetection/

License: MIT License

Python 59.16% HTML 17.56% TeX 23.28%

audio-deepfake-detection deepfake-detection audio deep-learning

audiodeepfakedetection's Introduction

Hi there 👋

🔭 I’m currently working on computer vision and machine learning.

Fun projects:

arxiv-dl: command-line tool to download papers from arXiv.org, CVF Open Access.
dotfiles: automatic dotfiles setup + machine configuration script for macOS/Ubuntu. [Highly Recommended!]
mxshell: centralized status monitoring for multiple Linux workstations
pc-builds: my personal desktop computer build history.
gmail-paylah: a simple Python script to extract transaction details from email receipts. Supports Fave, PayLah, and Grab.
hdb-price-bar-chart-race: bar chart race with d3.js to visualize housing resale price data from 2012 to 2023.

audiodeepfakedetection's People

Contributors

Stargazers

Watchers

Forkers

jamestiotio simrit1 yihe1003 jatin2020-24 lesamo inc0mple playerberny12 suryapratap9 thenumanahmed shahjahan0275 adityakansara8 maximusarthur anmol2059 vignesh2004vasu imvision2341 noori03 ooinoing

audiodeepfakedetection's Issues

while running train with debug error

audioread==3.0.1
certifi==2023.11.17
cffi==1.16.0
charset-normalizer==3.3.2
cmake==3.27.7
colorlog==6.7.0
contourpy==1.2.0
cycler==0.12.1
decorator==5.1.1
filelock==3.13.1
fonttools==4.45.0
fsspec==2023.10.0
idna==3.4
Jinja2==3.1.2
joblib==1.3.2
kiwisolver==1.4.5
lazy_loader==0.3
librosa==0.10.1
lit==17.0.5
llvmlite==0.41.1
MarkupSafe==2.1.3
matplotlib==3.8.2
mpmath==1.3.0
msgpack==1.0.7
networkx==3.2.1
numba==0.58.1
numpy==1.26.2
nvidia-cublas-cu11==11.10.3.66
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu11==8.5.0.96
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu11==10.9.0.58
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu11==10.2.10.91
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu11==11.7.4.91
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu11==2.14.3
nvidia-nccl-cu12==2.18.1
nvidia-nvjitlink-cu12==12.3.101
nvidia-nvtx-cu11==11.7.91
nvidia-nvtx-cu12==12.1.105
packaging==23.2
Pillow==10.1.0
platformdirs==4.0.0
pooch==1.8.0
puts==0.0.8
pycparser==2.21
pyparsing==3.1.1
python-dateutil==2.8.2
requests==2.31.0
scikit-learn==1.3.2
scipy==1.11.4
six==1.16.0
soundfile==0.12.1
soxr==0.3.7
sympy==1.12
threadpoolctl==3.2.0
torch==2.0.1
torchaudio==2.0.2
torchinfo==1.8.0
triton==2.0.0
typing_extensions==4.8.0
urllib3==2.1.0

These are the modules I have installed, and I am getting this error:

Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: cannot open shared object file: No such file or directory

This error appeared after I downgraded torchaudio to 2.0.2.

With torchaudio 2.1.1 I had a different error while training:

RuntimeError: apply_effects_file requires sox extension which is not available. Please refer to the stacktrace above for how to resolve this.

How can I resolve these two errors?
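
The first message says the dynamic loader could not find libcuda.so, which is installed with the NVIDIA driver rather than with the pip packages listed above. A minimal sketch for checking from Python whether that library is visible, assuming a Linux machine. The helper name cuda_driver_visible is made up for illustration, and the lookup rules of ctypes.util.find_library vary by platform, so treat a negative result as a hint rather than proof:

```python
import ctypes.util


def cuda_driver_visible() -> bool:
    """Return True if the loader can locate the NVIDIA driver library (libcuda)."""
    # find_library returns a file name such as "libcuda.so.1", or None if not found.
    return ctypes.util.find_library("cuda") is not None


# Hypothetical device choice: fall back to the CPU when no driver library is found.
device = "cuda" if cuda_driver_visible() else "cpu"
print(f"libcuda visible: {cuda_driver_visible()} -> device: {device}")
```

If the check reports that libcuda is not visible, the machine probably has no NVIDIA driver installed (or it is not on the loader path), and running on the CPU avoids the cuDNN load entirely.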

Can I use my own voice clip?

I was wondering whether, after training the model on my data, I can use my own voice clip to check if it is detected as fake or not.

Issues with eval_one Function and Guidance on Using lfcc ShallowCNN Model

Hi,

I've made some modifications to the code to ensure compatibility with Windows, specifically addressing the Sox dependency with torchaudio by using Sox independently. Other than that, I've kept the original setup intact.

However, I've encountered an issue when using the eval_one function. Every audio input, whether genuine or fake, is consistently classified as fake (output always equals 1). This behavior is observed even when testing on authentic audio files.

I'd like to understand the correct procedure to utilize the lfcc ShallowCNN model that you've provided for evaluating audio files. Since I don't have an NVIDIA GPU, training is quite time-intensive for me, and I'm keen on testing the pre-trained model you've developed on new RVC voice-generated samples and others.

Any guidance on how to successfully test the model would be greatly appreciated. Thank you!

evaluate_error

When I run the evaluation code, it shows this error:

2023-11-23 12:08:00,216 - ERROR - 'bool' object is not callable
Traceback (most recent call last):
  File "/home/pradeep/AudioDeepFakeDetection/train.py", line 593, in main
    experiment(
  File "/home/pradeep/AudioDeepFakeDetection/train.py", line 430, in experiment
    eval_only(
TypeError: 'bool' object is not callable
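
A 'bool' object is not callable error at a call such as eval_only( means that, at that point, the name eval_only refers to a boolean rather than a function. One way this can happen is a boolean parameter that has the same name as the function. The following is a minimal reproduction of that pattern, not a diagnosis of train.py; every name other than eval_only is hypothetical:

```python
def eval_only(model_name: str) -> str:
    """Stand-in for an evaluation routine (hypothetical body)."""
    return f"evaluated {model_name}"


def experiment_broken(model_name: str, eval_only: bool = False):
    # The parameter shadows the module-level function of the same name,
    # so the call below invokes a bool and raises TypeError.
    if eval_only:
        return eval_only(model_name)


def experiment_fixed(model_name: str, evaluate_only: bool = False):
    # Renaming the flag leaves the function name free to be called.
    if evaluate_only:
        return eval_only(model_name)


try:
    experiment_broken("demo", eval_only=True)
except TypeError as err:
    error_message = str(err)
    print(error_message)

result = experiment_fixed("demo", evaluate_only=True)
print(result)
```

If the cause is the same here, renaming either the flag or the function so the two no longer collide should remove the error.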

.

In the train.py script I get the error: eval() missing 1 required positional argument: 'fake_dir'. How can I solve it? I have tried many things, but nothing works.
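
The message means the function was called with fewer positional arguments than its signature requires. A minimal reproduction under an assumed signature: only the parameter name fake_dir comes from the error message, while evaluate, real_dir, and the paths are hypothetical:

```python
def evaluate(real_dir: str, fake_dir: str) -> str:
    """Stand-in with an assumed two-directory signature."""
    return f"real={real_dir}, fake={fake_dir}"


try:
    evaluate("data/real")  # only one argument supplied
except TypeError as err:
    missing_arg_error = str(err)
    print(missing_arg_error)

# Passing both directories by keyword makes the call explicit.
summary = evaluate(real_dir="data/real", fake_dir="data/fake")
print(summary)
```

Checking the call site against the function definition, and passing arguments by keyword, usually shows which value was left out.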

Inference a single audio file using the PyTorch model

Run inference on a single audio file using the PyTorch model and add it to the GUI.

Given the model and an audio file, print real or fake voice

I have been trying to use your model from Google Drive (best.pt) and to preprocess the audio for it, but I keep getting different errors whenever I change something, such as:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x241152 and 15104x128)
RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [1, 1, 1, 40, 64600]

I am trying to do something just like your website where given the audio it displays real or fake. Is there any code or reference?
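
Both messages are shape mismatches. In the first, the flattened features have 241152 values while the linear layer expects 15104, which typically means the input has a different length from what the model was trained on. In the second, conv2d receives a 5D input, one dimension more than the 4D batch it accepts. A minimal sketch of the two adjustments in plain Python, with lists and shape tuples standing in for tensors; the helper names and the target length are assumptions, not values taken from the repository:

```python
def pad_or_trim(samples: list, target_len: int) -> list:
    """Force a waveform to a fixed number of samples."""
    if len(samples) >= target_len:
        return samples[:target_len]  # trim the tail
    return samples + [0.0] * (target_len - len(samples))  # zero-pad the end


def drop_extra_leading_dims(shape: tuple, rank: int = 4) -> tuple:
    """Remove leading size-1 dimensions until the shape has `rank` dims."""
    shape = tuple(shape)
    while len(shape) > rank and shape[0] == 1:
        shape = shape[1:]
    return shape


TARGET_LEN = 8  # hypothetical; use the length the model was trained on
short_clip = pad_or_trim([0.1, 0.2, 0.3], TARGET_LEN)
long_clip = pad_or_trim([0.5] * 20, TARGET_LEN)
print(len(short_clip), len(long_clip))

fixed_shape = drop_extra_leading_dims((1, 1, 1, 40, 64600))
print(fixed_shape)
```

With PyTorch tensors, the extra leading dimension is usually removed with squeeze(0), and the fixed length is applied to the waveform before feature extraction so that the flattened size matches the linear layer.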
