torchvggish's People

Contributors

dfan, harritaylor, stevenguh

torchvggish's Issues

Modify torchvggish.vggish_params.EXAMPLE_HOP_SECONDS after (or before) model load?

I would like to modify the global variable torchvggish.vggish_params.EXAMPLE_HOP_SECONDS after (or before) loading the model.

However, I cannot import torchvggish.vggish_params because the package is not installed on my system and it is tricky to install: there is no PyPI package and no setup.py file.

What would be the simplest way to modify torchvggish.vggish_params.EXAMPLE_HOP_SECONDS?
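
One possible approach, sketched under assumptions: if the package has already been imported by the loader (for example by `torch.hub.load`), its submodules are usually still registered in `sys.modules`, so the constant can be overwritten there without a separate install. Whether the new value takes effect depends on the library reading the constant at call time rather than copying it at import, which is not confirmed here. The helper name `set_module_constant` and the stand-in module below are hypothetical and exist only so the snippet runs on its own.

```python
import sys
import types


def set_module_constant(module_name, attr, value):
    """Overwrite a module-level constant on an already-imported module.

    Returns the previous value so it can be restored later.
    """
    module = sys.modules.get(module_name)
    if module is None:
        raise KeyError(f"{module_name} has not been imported yet")
    previous = getattr(module, attr)
    setattr(module, attr, value)
    return previous


# Stand-in for the real module, so the sketch runs without torchvggish.
_demo = types.ModuleType("demo_vggish_params")
_demo.EXAMPLE_HOP_SECONDS = 0.96
sys.modules["demo_vggish_params"] = _demo

old_hop = set_module_constant("demo_vggish_params", "EXAMPLE_HOP_SECONDS", 0.5)
print(old_hop, _demo.EXAMPLE_HOP_SECONDS)
```

With the real package, the call would presumably be `set_module_constant("torchvggish.vggish_params", "EXAMPLE_HOP_SECONDS", 0.5)` once the model has been loaded.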

Original VGGish vs. this implementation

Doesn't the original TensorFlow implementation have only four convolution layers and two FC layers? This one has six and three. Why the difference? How can the embeddings be identical then?

How to use this code for training on my own dataset?

Hello, I have some .wav files, and I want to train a classification model on my own dataset.

How can I use this code? Should I extract embeddings and train a sequence model? Is it possible to fine-tune the VGGish feature extractor while training the classifier?

I would appreciate any advice. Thank you!

The URL for the VGGish model weights no longer works

I am using the VGGish model as part of my own model to extract features from input audio. However, I cannot open the URL you posted on GitHub. Could you please update the URL or tell me how to load the weights of the pretrained VGGish model?

Thanks a lot.

Output size

Hello,

I tried this implementation following the Usage section.
The output tensor has size [19, 128].
Do I need to fuse the output tensor to convert it from [19, 128] to [1, 128]?
Is the audio embedding obtained per audio file or per batch?

In my understanding, it is obtained per audio file.
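
As a rough illustration of why a single file yields several rows: VGGish-style front ends cut the log-mel spectrogram into fixed-length examples, and each example gets its own 128-dimensional embedding. The sketch below assumes the default parameters of the original VGGish release (16 kHz audio, 25 ms STFT window, 10 ms hop, 0.96 s examples with a 0.96 s hop). Mean pooling is only one common way to reduce [N, 128] to a single vector, not something this repository prescribes. The function names are hypothetical.

```python
def num_examples(duration_seconds,
                 sample_rate=16000,
                 stft_window_seconds=0.025,
                 stft_hop_seconds=0.010,
                 example_window_seconds=0.96,
                 example_hop_seconds=0.96):
    """Estimate how many examples (output rows) a clip produces."""
    num_samples = int(round(duration_seconds * sample_rate))
    window = int(round(sample_rate * stft_window_seconds))  # 400 samples
    hop = int(round(sample_rate * stft_hop_seconds))        # 160 samples
    if num_samples < window:
        return 0
    num_frames = 1 + (num_samples - window) // hop          # spectrogram frames
    frames_per_second = 1.0 / stft_hop_seconds
    ex_window = int(round(example_window_seconds * frames_per_second))  # 96 frames
    ex_hop = int(round(example_hop_seconds * frames_per_second))        # 96 frames
    if num_frames < ex_window:
        return 0
    return 1 + (num_frames - ex_window) // ex_hop


def mean_pool(embeddings):
    """Average a list of equal-length embedding rows into one row."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


print(num_examples(18.5))                    # 19 rows for an 18.5 s clip
print(mean_pool([[1.0, 2.0], [3.0, 4.0]]))   # [2.0, 3.0]
```

Under these assumptions, a [19, 128] output corresponds to a clip of roughly 18 to 19 seconds, one row per 0.96 s example.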

How to feed batch inputs to the model?

Hi, how can I feed batch inputs to the model? I only know how to feed a single audio file. Can you tell me how to feed a batch? Thanks.
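
One way to batch several files, sketched under assumptions: since each file is first turned into a stack of fixed-size examples, the stacks from several files can be concatenated, passed through the network in a single call, and the resulting rows split back per file. The names `embed_many` and `embed_fn` are hypothetical; `embed_fn` stands in for the model's forward pass on preprocessed examples. With tensors, the concatenate and split steps would typically be `torch.cat` and `torch.split`.

```python
def embed_many(per_file_examples, embed_fn):
    """Run embed_fn once over the examples of several files.

    per_file_examples: list of lists, one inner list of examples per file.
    embed_fn: callable mapping a list of examples to a list of embeddings
              of the same length (stand-in for the model forward pass).
    Returns a list with one list of embeddings per input file.
    """
    counts = [len(examples) for examples in per_file_examples]
    flat = [example for examples in per_file_examples for example in examples]
    flat_embeddings = embed_fn(flat)
    if len(flat_embeddings) != len(flat):
        raise ValueError("embed_fn must return one embedding per example")

    # Split the flat output back into per-file chunks.
    result, start = [], 0
    for count in counts:
        result.append(flat_embeddings[start:start + count])
        start += count
    return result


# Toy stand-in: the "embedding" of an example is just its doubled value.
calls = []

def fake_model(batch):
    calls.append(len(batch))
    return [2 * x for x in batch]

out = embed_many([[1, 2, 3], [10], [7, 8]], fake_model)
print(out, calls)
```

The stand-in model is called once with all six examples, and the output is regrouped into three per-file lists.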

How to use this model

Hi, thanks for your contribution and for sharing. I ran into some issues when I tried to use it locally. Can you provide an example of how to use it? In particular, how do I load the model locally? I need to use it without an Internet connection.

Pre-activation as output of VGGish

Hello there,

When comparing this code to the one in tensorflow/models, I found that the two implementations use different layers as the output of the VGGish model (if the activation is considered a separate layer).

Yours:

nn.ReLU(True))

google's: https://github.com/tensorflow/models/blob/f32dea32e3e9d3de7ed13c9b16dc7a8fea3bd73d/research/audioset/vggish/vggish_slim.py#L104-L106 (activation_fn=None)

Also, it's mentioned in README

Note that the embedding layer does not include a final non-linear activation, so the embedding value is pre-activation

Changing the output layer of VGGish in your implementation to the pre-activation one (without ReLU) makes the embeddings (almost) equal in both cases, raw and PCA'ed.

Thanks for porting though, great work!

Question about VGGISH_WEIGHTS

Thank you for your code. Are the VGGISH_WEIGHTS at the path in the code purely an adaptation of Google's checkpoint, or are they the result of your own retraining?

How to load a raw audio file

Can you provide an example of how to load a raw audio file and use VGGish as a feature extractor in PyTorch?

PCA post-processing removes gradient from embeddings

Because the TensorFlow code uses NumPy to apply PCA to the output embeddings, gradients cannot flow through the post-processing step when torch-vggish is embedded in another network (admittedly a relatively small use case). It would be useful to reimplement the PCA post-processing so that it operates on Torch tensors.
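
For reference, the post-processing in the original VGGish release is, as far as recalled here, a mean subtraction, a matrix projection, a clip to a fixed range, and an 8-bit quantization. The plain-Python sketch below walks through that arithmetic on toy numbers. The clip range of -2.0 to 2.0 and the function name `pca_postprocess` are assumptions, and the toy matrix is not the real PCA matrix. Expressed with tensor operations, the first three steps would map to a subtraction, `torch.matmul`, and `torch.clamp`, all of which support autograd; the final cast to integers is the step that does not.

```python
def pca_postprocess(embedding, pca_matrix, pca_means,
                    clip_min=-2.0, clip_max=2.0, quantize=True):
    """Apply mean subtraction, projection, clipping and optional quantization."""
    centered = [e - m for e, m in zip(embedding, pca_means)]
    projected = [sum(w * c for w, c in zip(row, centered)) for row in pca_matrix]
    clipped = [min(clip_max, max(clip_min, p)) for p in projected]
    if not quantize:
        return clipped  # real-valued output; a tensor version stays differentiable
    scale = 255.0 / (clip_max - clip_min)
    return [int((c - clip_min) * scale) for c in clipped]  # integers in 0..255


# Toy 2-D example with hypothetical parameters.
matrix = [[1.0, 0.0], [0.0, 3.0]]
means = [0.5, 0.5]
print(pca_postprocess([1.5, 2.5], matrix, means, quantize=False))  # [1.0, 2.0]
print(pca_postprocess([1.5, 2.5], matrix, means))                  # [191, 255]
```

Skipping the quantization step, as with `quantize=False` here, is one way to keep the output usable inside a larger trainable network.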

Difference between the PyTorch VGGish and the TensorFlow VGGish

Hi @harritaylor,
I fed piano.wav into the TensorFlow VGGish, but the PCA embedding differs from the one produced by the PyTorch VGGish. Did you verify the output after the conversion?

import traceback

import vggish_input
import vggish_postprocess

# pca_params is assumed to be defined elsewhere in the poster's code
# (it is the argument passed to vggish_postprocess.Postprocessor).

def get_vggish_input(self, wav_file):
    try:
        # Convert the WAV file into a batch of log-mel examples.
        examples_batch = vggish_input.wavfile_to_examples(wav_file)
        # Prepare a postprocessor to munge the model embeddings.
        pproc = vggish_postprocess.Postprocessor(pca_params)
        return examples_batch, pproc
    except Exception:
        traceback.print_exc()
    return None, None

def get_features(self, examples_batch, pproc):
    try:
        # Run inference and postprocessing.
        [embedding_batch] = self.sess.run(
            [self.embedding_tensor],
            feed_dict={self.features_tensor: examples_batch})
        postprocessed_batch = pproc.postprocess(embedding_batch)
        # cv2.imwrite("test.bmp", postprocessed_batch)
        return postprocessed_batch
    except Exception:
        traceback.print_exc()
    return None

Provide better documentation

Instead of notebooks, it would be better to provide simple documentation in the README, as the interface is now significantly slimmed down.

URL Error

Hello,

I just encountered this error today. Everything worked fine yesterday and now when I try to use the vggish embeddings I get this error:
urllib.error.URLError: <urlopen error [Errno 11001] getaddrinfo failed>

This is what I've tried and it worked before: vggish_model = torch.hub.load('harritaylor/torchvggish', 'vggish')

I am using Windows 10, PyCharm, and Python 3.8.

Here is the full traceback in case that helps:

  File "C:\Program Files\Python38\lib\urllib\request.py", line 1319, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "C:\Program Files\Python38\lib\http\client.py", line 1230, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\Program Files\Python38\lib\http\client.py", line 1276, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "C:\Program Files\Python38\lib\http\client.py", line 1225, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\Program Files\Python38\lib\http\client.py", line 1004, in _send_output
    self.send(msg)
  File "C:\Program Files\Python38\lib\http\client.py", line 944, in send
    self.connect()
  File "C:\Program Files\Python38\lib\http\client.py", line 1392, in connect
    super().connect()
  File "C:\Program Files\Python38\lib\http\client.py", line 915, in connect
    self.sock = self._create_connection(
  File "C:\Program Files\Python38\lib\socket.py", line 787, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "C:\Program Files\Python38\lib\socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/Bachelor Arbeit/dcase-2022-baseline-main/clotho_preprocessing.py", line 99, in <module>
    preprocess_dataset(config)
  File "D:/Bachelor Arbeit/dcase-2022-baseline-main/clotho_preprocessing.py", line 16, in preprocess_dataset
    vggish_model = torch.hub.load('harritaylor/torchvggish', 'vggish')
  File "C:\Users\tincu\AppData\Roaming\Python\Python38\site-packages\torch\hub.py", line 539, in load
    repo_or_dir = _get_cache_or_reload(repo_or_dir, force_reload, trust_repo, "load",
  File "C:\Users\tincu\AppData\Roaming\Python\Python38\site-packages\torch\hub.py", line 180, in _get_cache_or_reload
    repo_owner, repo_name, ref = _parse_repo_info(github)
  File "C:\Users\tincu\AppData\Roaming\Python\Python38\site-packages\torch\hub.py", line 134, in _parse_repo_info
    with urlopen(f"https://github.com/{repo_owner}/{repo_name}/tree/main/"):
  File "C:\Program Files\Python38\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Program Files\Python38\lib\urllib\request.py", line 525, in open
    response = self._open(req, data)
  File "C:\Program Files\Python38\lib\urllib\request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "C:\Program Files\Python38\lib\urllib\request.py", line 502, in _call_chain
    result = func(*args)
  File "C:\Program Files\Python38\lib\urllib\request.py", line 1362, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "C:\Program Files\Python38\lib\urllib\request.py", line 1322, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 11001] getaddrinfo failed>

Process finished with exit code 1
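
The `getaddrinfo failed` error means the host name could not be resolved, so the loader never reached GitHub. A possible workaround, sketched under assumptions, is to fall back to a local copy of the repository when the network call fails. `torch.hub.load` accepts `source='local'` together with a directory path. The wrapper name `load_with_fallback`, the injected `loader` argument, and the example path are hypothetical, and the pretrained weights may still need to be present in the local cache for a fully offline run.

```python
import urllib.error


def load_with_fallback(loader, repo, entry, local_dir):
    """Try the online hub first; on a network error, load from local_dir.

    loader is expected to behave like torch.hub.load and is passed in
    so that this sketch can be exercised without torch or a network.
    """
    try:
        return loader(repo, entry)
    except urllib.error.URLError:
        # DNS or connection failure: use a previously downloaded checkout.
        return loader(local_dir, entry, source="local")


# Stand-in loader that simulates an offline machine.
def fake_loader(repo_or_dir, entry, source="github"):
    if source == "github":
        raise urllib.error.URLError("getaddrinfo failed")
    return f"{entry} loaded from {repo_or_dir}"


model = load_with_fallback(fake_loader, "harritaylor/torchvggish", "vggish",
                           "/path/to/torchvggish")
print(model)
```

In real use, `loader` would be `torch.hub.load` and `local_dir` a directory containing a checkout of the repository with its `hubconf.py`.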

GPU version support?

Thank you for your work. Could you add a GPU support option to the torch.hub entry point? Another question: must the embedding size be fixed at 128, or is there a way to obtain 2048 dimensions?
