audiodeepfakedetection's Introduction

Hi there 👋

  • 🔭 I'm currently working on computer vision and machine learning.

Fun projects:

  • arxiv-dl: command-line tool to download papers from arXiv.org, CVF Open Access.
  • dotfiles: automatic dotfiles setup + machine configuration script for macOS/Ubuntu. [Highly Recommended!]
  • mxshell: centralized status monitoring for multiple Linux workstations
  • pc-builds: my personal desktop computer build history.
  • gmail-paylah: a simple Python script to extract transaction details from email receipts. Supports Fave, PayLah, and Grab.
  • hdb-price-bar-chart-race: bar chart race with d3.js to visualize housing resale price data from 2012 to 2023.

audiodeepfakedetection's People

Contributors

jamestiotio, madhu-balaji-01, markhershey, shsr2001

audiodeepfakedetection's Issues

while running train with debug error

audioread==3.0.1
certifi==2023.11.17
cffi==1.16.0
charset-normalizer==3.3.2
cmake==3.27.7
colorlog==6.7.0
contourpy==1.2.0
cycler==0.12.1
decorator==5.1.1
filelock==3.13.1
fonttools==4.45.0
fsspec==2023.10.0
idna==3.4
Jinja2==3.1.2
joblib==1.3.2
kiwisolver==1.4.5
lazy_loader==0.3
librosa==0.10.1
lit==17.0.5
llvmlite==0.41.1
MarkupSafe==2.1.3
matplotlib==3.8.2
mpmath==1.3.0
msgpack==1.0.7
networkx==3.2.1
numba==0.58.1
numpy==1.26.2
nvidia-cublas-cu11==11.10.3.66
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu11==8.5.0.96
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu11==10.9.0.58
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu11==10.2.10.91
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu11==11.7.4.91
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu11==2.14.3
nvidia-nccl-cu12==2.18.1
nvidia-nvjitlink-cu12==12.3.101
nvidia-nvtx-cu11==11.7.91
nvidia-nvtx-cu12==12.1.105
packaging==23.2
Pillow==10.1.0
platformdirs==4.0.0
pooch==1.8.0
puts==0.0.8
pycparser==2.21
pyparsing==3.1.1
python-dateutil==2.8.2
requests==2.31.0
scikit-learn==1.3.2
scipy==1.11.4
six==1.16.0
soundfile==0.12.1
soxr==0.3.7
sympy==1.12
threadpoolctl==3.2.0
torch==2.0.1
torchaudio==2.0.2
torchinfo==1.8.0
triton==2.0.0
typing_extensions==4.8.0
urllib3==2.1.0

These are the installed modules, and I am getting this error:
Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: cannot open shared object file: No such file or directory
This error started appearing after I downgraded torchaudio to 2.0.2.

I had a different error while training with torchaudio 2.1.1:
raise RuntimeError(
RuntimeError: apply_effects_file requires sox extension which is not available. Please refer to the stacktrace above for how to resolve this.
How can I resolve these two errors?
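
Both messages point at native libraries rather than Python packages. libcuda.so is installed by the NVIDIA driver, not by pip, and the sox message suggests that torchaudio could not find a usable libsox. The package list also pins both cu11 and cu12 builds of the NVIDIA wheels, which may be left over from switching torch versions. The following is a minimal diagnostic sketch using only the standard library. The helper names `find_shared_libs` and `cuda_suffixes` are made up for illustration, and `ctypes.util.find_library` gives only a hint about what the system loader can see, so treat the output as a starting point rather than a verdict.

```python
import ctypes.util
import re


def find_shared_libs(names):
    """Map each library name to what the system lookup reports (or None)."""
    return {name: ctypes.util.find_library(name) for name in names}


def cuda_suffixes(freeze_lines):
    """Collect the CUDA suffixes (cu11, cu12, ...) of pinned nvidia-* packages."""
    suffixes = set()
    for line in freeze_lines:
        match = re.match(r"nvidia-[a-z0-9-]+-(cu\d+)==", line.strip())
        if match:
            suffixes.add(match.group(1))
    return suffixes


if __name__ == "__main__":
    # "cuda" corresponds to libcuda.so (driver), "sox" to libsox.
    for name, found in find_shared_libs(["cuda", "sox"]).items():
        print(f"lib{name}: {found or 'not found'}")

    # A few lines copied from the package list in this issue.
    sample = [
        "nvidia-cudnn-cu11==8.5.0.96",
        "nvidia-cudnn-cu12==8.9.2.26",
        "torch==2.0.1",
    ]
    print("CUDA wheel families:", sorted(cuda_suffixes(sample)))
```

If libcuda is reported as not found, the machine likely has no NVIDIA driver visible to the process (for example, a CPU-only host or a container started without GPU access). In that case, running on CPU or installing the driver are the usual options. More than one CUDA wheel family in the same environment is a hint that a fresh virtual environment may be worth trying.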

Can I use my own voice clip?

After I have trained the model on my data, is it possible to use my own voice clip and detect whether it is fake or not?

Issues with eval_one Function and Guidance on Using lfcc ShallowCNN Model

Hi,

I've made some modifications to the code to ensure compatibility with Windows, specifically addressing the Sox dependency with torchaudio by using Sox independently. Other than that, I've kept the original setup intact.

However, I've encountered an issue when using the eval_one function. Every audio input, whether genuine or fake, is consistently classified as fake (output always equals 1). This behavior is observed even when testing on authentic audio files.

I'd like to understand the correct procedure to utilize the lfcc ShallowCNN model that you've provided for evaluating audio files. Since I don't have an NVIDIA GPU, training is quite time-intensive for me, and I'm keen on testing the pre-trained model you've developed on new RVC voice-generated samples and others.

Any guidance on how to successfully test the model would be greatly appreciated. Thank you!

evaluate_error

When I run the evaluation code, it shows this error:

2023-11-23 12:08:00,216 - ERROR - 'bool' object is not callable
Traceback (most recent call last):
  File "/home/pradeep/AudioDeepFakeDetection/train.py", line 593, in main
    experiment(
  File "/home/pradeep/AudioDeepFakeDetection/train.py", line 430, in experiment
    eval_only(
TypeError: 'bool' object is not callable
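
The traceback shows `eval_only(` being called inside `experiment`, and Python reports that the name refers to a bool at that point. One common way to get this is a boolean flag that shares its name with a function. The sketch below reproduces that pattern in isolation. It assumes this is what happens in train.py, which the traceback alone cannot confirm, and all function bodies here are made-up stand-ins.

```python
def eval_only(model_name):
    """Stand-in for an evaluation helper; only the name matters here."""
    return f"evaluated {model_name}"


def experiment_shadowed(eval_only=True):
    # The parameter `eval_only` is a bool and hides the function above,
    # so the call below raises TypeError: 'bool' object is not callable.
    if eval_only:
        return eval_only("demo")
    return None


def experiment_fixed(run_eval_only=True):
    # A distinct flag name keeps the flag and the function apart.
    if run_eval_only:
        return eval_only("demo")
    return None


if __name__ == "__main__":
    try:
        experiment_shadowed()
    except TypeError as exc:
        print("shadowed:", exc)
    print("fixed:", experiment_fixed())
```

If the same pattern appears in the real code, renaming either the flag or the function should remove the error.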

Given the model and an audio file, print real or fake voice

I have been trying to use your model from Google Drive (best.pt) and to preprocess audio for it, but I keep getting different errors whenever I change something, such as:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x241152 and 15104x128)
RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [1, 1, 1, 40, 64600]

I am trying to do something like your website, where it displays real or fake for a given audio file. Is there any code or reference?
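
The two messages describe shape mismatches that can be read off directly. The conv2d error reports a 5-dimensional input where at most 4 dimensions (batch, channel, height, width) are accepted, so the input carries one singleton dimension too many. The matmul error reports 241152 flattened features arriving at a linear layer whose weight expects 15104. The sketch below works on plain shape tuples only to illustrate the bookkeeping. The helper names are hypothetical, and it does not claim to know the feature size the pre-trained model actually expects.

```python
def drop_extra_leading_ones(shape, max_rank=4):
    """Remove leading size-1 dimensions until the shape has at most max_rank dims."""
    dims = list(shape)
    while len(dims) > max_rank and dims[0] == 1:
        dims.pop(0)
    if len(dims) > max_rank:
        raise ValueError(f"cannot reduce {tuple(shape)} to rank {max_rank}")
    return tuple(dims)


def linear_input_matches(flat_features, in_features):
    """True when the flattened feature count equals the linear layer's input size."""
    return flat_features == in_features


if __name__ == "__main__":
    # Shape taken from the conv2d error message above.
    print(drop_extra_leading_ones((1, 1, 1, 40, 64600)))
    # Sizes taken from the matmul error message above.
    print(linear_input_matches(241152, 15104))
```

With real tensors, the equivalent of the first helper is to drop one of the added batch or channel dimensions before calling the model. The second mismatch usually means the feature map fed to the network has a different size than the one used in training, so the preprocessing (clip length and feature settings) would need to match the training configuration.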
