anwarvic / speaker-recognition

This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1

Python 64.46% Makefile 0.88% HTML 7.26% Java 8.81% M4 5.51% C 5.49% MATLAB 0.08% C++ 7.51%
sidekit speaker-recognition speaker-verification speaker-identification gmm gmm-ubm i-vector ubm identity-verification identity-vector

speaker-recognition's Issues

ValueError: need at least one array to concatenate

Hi!
I get the following error when training the model:

File ".\ubm.py", line 202, in <module>
ubm.train()
File ".\ubm.py", line 50, in train
iterations=(1, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8)
File "C:\Users\jrodenas\Desktop\SpeakerRecognition\Speaker-Recognition\sidekit\mixture.py", line 672, in EM_split
self._init(features_server, feature_list, num_thread)
File "C:\Users\jrodenas\Desktop\SpeakerRecognition\Speaker-Recognition\sidekit\mixture.py", line 629, in _init
features = features_server.stack_features_parallel(feature_list, num_thread=num_thread)
File "C:\Users\jrodenas\Desktop\SpeakerRecognition\Speaker-Recognition\sidekit\features_server.py", line 666, in stack_features_parallel
return numpy.concatenate(output, axis=0)
ValueError: need at least one array to concatenate

I ran data_init first, and its output was created correctly. Then I ran the feature extraction and, finally, the UBM training.
How can I solve this?
Thank you in advance!
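
The traceback ends with numpy.concatenate receiving an empty list, which suggests that no feature arrays were loaded at all. One way to catch this earlier is to check the feature directory before training. This is a minimal sketch, assuming features are stored as .h5 files in one directory; the helper name list_feature_shows is hypothetical and not part of sidekit or this repo.

```python
import os


def list_feature_shows(feat_dir, extension=".h5"):
    """Return feature-file names without their extension; fail early if none exist."""
    if not os.path.isdir(feat_dir):
        raise FileNotFoundError("Feature directory not found: {}".format(feat_dir))
    shows = sorted(
        name[: -len(extension)]
        for name in os.listdir(feat_dir)
        if name.endswith(extension)
    )
    if not shows:
        # An empty list here would later surface as the numpy.concatenate error.
        raise ValueError(
            "No '{}' files in {}; run feature extraction first".format(extension, feat_dir)
        )
    return shows
```

If this raises, the problem is likely in the feature extraction step or in the configured paths rather than in the UBM training itself.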

Feature Server

How can I use this library? I am stuck on the parameters for the feature server, and I don't see your implementation of sidekit.

EM_split train commented out code

Hi,

Are we supposed to uncomment this snippet so that it executes properly?

if i > 0:
    # gain = llk[-1] - llk[-2]
    # if gain < llk_gain:
    #     logging.debug(
    #         'EM (break) distrib_nb: %d %i/%d gain: %f -- %s, %d',
    #         self.mu.shape[0], i + 1, it, gain, self.name,
    #         len(cep))
    #     break
    # else:
    #     logging.debug(
    #         'EM (continu) distrib_nb: %d %i/%d gain: %f -- %s, %d',
    #         self.mu.shape[0], i + 1, it, gain, self.name,
    #         len(cep))
    #     break
    pass
else:
    # logging.debug(
    #     'EM (start) distrib_nb: %d %i/%i llk: %f -- %s, %d',
    #     self.mu.shape[0], i + 1, it, llk[-1],
    #     self.name, len(cep))
    pass
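
For context, the first commented-out branch tests the log-likelihood gain between two consecutive EM iterations and stops when it drops below llk_gain. With the block commented out, both branches of the snippet only execute pass, so no early stop happens there. The sketch below shows such an early-stopping test in isolation; run_em and em_step are hypothetical names, and this is not the sidekit implementation.

```python
def run_em(em_step, max_iter, llk_gain=0.01):
    """Run up to max_iter EM steps, stopping once the log-likelihood gain is small.

    em_step is any callable that performs one EM iteration and returns the
    resulting log-likelihood (a stand-in for the real training step).
    """
    llk = []
    for i in range(max_iter):
        llk.append(em_step())
        if i > 0:
            gain = llk[-1] - llk[-2]
            if gain < llk_gain:
                break  # converged: further iterations barely improve the fit
    return llk
```

Whether the check should be re-enabled in the repo's copy of sidekit is a question for the maintainer; the sketch only illustrates what the check does.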

First issue

import os
import sidekit
import numpy as np
import warnings
warnings.filterwarnings("ignore")
import logging
logging.basicConfig(level=logging.INFO)

from model_interface import SidekitModel



class UBM(SidekitModel):
    """Universal Background Model"""
    
    def __init__(self, conf_filepath):
        super().__init__(conf_filepath)
        # Number of Gaussian distributions
        self.NUM_GAUSSIANS = self.conf['num_gaussians']


    def train(self, SAVE=True):
        """
        This method is used to train our UBM model by doing the following:
        - Create a FeaturesServer for the enroll features
        - Use the EM algorithm to train our UBM over the enroll features
        - Create a StatServer to save the trained parameters
        - If the SAVE argument is True (which is the default), save that
          StatServer.
        Args:
            SAVE (boolean): if True, then it will save the StatServer. If False,
               then the StatServer will be discarded.
        """
        #SEE: https://projets-lium.univ-lemans.fr/sidekit/tutorial/ubmTraining.html
        #D:\Merged_Arabic_Corpus_of_Isolated_Words\Allah\feat\enroll
        train_list = os.listdir(os.path.join(self.BASE_DIR, "feat", "enroll"))
        for i in range(len(train_list)):
            train_list[i] = train_list[i].split(".h5")[0]
        server = self.createFeatureServer("enroll")
        logging.info("Training...")
        ubm = sidekit.Mixture()
        # Set the model name
        ubm.name = "ubm_{}.h5".format(self.NUM_GAUSSIANS) 
        # Expectation-Maximization estimation of the Mixture parameters.
        ubm.EM_split(
            features_server=server, #sidekit.FeaturesServer used to load data
            feature_list=train_list, #list of feature files to train the model
            distrib_nb=self.NUM_GAUSSIANS, #number of Gaussian distributions
            num_thread=1, # number of parallel processes
            save_partial=True, # if False, it only saves the last model
            iterations=(1, 2)
            )
            # -> 2 iterations of EM with 2    distributions
            # -> 2 iterations of EM with 4    distributions
            # -> 4 iterations of EM with 8    distributions
            # -> 4 iterations of EM with 16   distributions
            # -> 4 iterations of EM with 32   distributions
            # -> 4 iterations of EM with 64   distributions
            # -> 8 iterations of EM with 128  distributions
            # -> 8 iterations of EM with 256  distributions
            # -> 8 iterations of EM with 512  distributions
            # -> 8 iterations of EM with 1024 distributions
        model_dir = os.path.join(self.BASE_DIR, "ubm")
        logging.info("Saving the model {} at {}".format(ubm.name, model_dir))
        ubm.write(os.path.join(model_dir, ubm.name))

        # Read idmap for the enrolling data
        enroll_idmap = sidekit.IdMap.read(os.path.join(self.BASE_DIR, "task", "enroll_idmap.h5"))
        # Create Statistic Server to store/process the enrollment data
        enroll_stat = sidekit.StatServer(statserver_file_name=enroll_idmap,
                                         ubm=ubm)
        logging.debug(enroll_stat)

        server.feature_filename_structure = os.path.join(self.BASE_DIR, "feat", "{}.h5")
        # Compute the sufficient statistics for a list of sessions whose indices are segIndices.
        #BUG: don't use self.NUM_THREADS when assigning num_thread, as it's prone to race conditions
        enroll_stat.accumulate_stat(ubm=ubm,
                                    feature_server=server,
                                    seg_indices=range(enroll_stat.segset.shape[0])
                                   )
        if SAVE:
            # Save the status of the enroll data
            filename = "enroll_stat_{}.h5".format(self.NUM_GAUSSIANS)
            enroll_stat.write(os.path.join(self.BASE_DIR, "stat", filename))


if __name__ == "__main__":
    conf_filename = "conf.yaml"
    ubm = UBM(conf_filename)
    ubm.train()
#     ubm.evaluate()
#     ubm.plotDETcurve()
#     print( "Accuracy: {}%".format(ubm.getAccuracy()) )

I ran the code up to this point, using sidekit installed from pip and without changing any of the code.
It gives me this error:

Exception                                 Traceback (most recent call last)
<ipython-input-4-80b9baf34ccf> in <module>
    202     conf_filename = "conf.yaml"
    203     ubm = UBM(conf_filename)
--> 204     ubm.train()
    205 #     ubm.evaluate()
    206 #     ubm.plotDETcurve()

<ipython-input-4-80b9baf34ccf> in train(self, SAVE)
     77         enroll_stat.accumulate_stat(ubm=ubm,
     78                                     feature_server=server,
---> 79                                     seg_indices=range(enroll_stat.segset.shape[0])
     80                                    )
     81         if SAVE:

~\Anaconda3\lib\site-packages\sidekit\sidekit_wrappers.py in wrapper(*args, **kwargs)
    227         else:
    228             logging.debug("No Parallel processing with this module")
--> 229             func(*args, **kwargs)
    230 
    231     return wrapper

~\Anaconda3\lib\site-packages\sidekit\statserver.py in accumulate_stat(self, ubm, feature_server, seg_indices, channel_extension, num_thread)
    705             show = show[:show.rfind(channel_extension[channel])]
    706 
--> 707             cep, vad = feature_server.load(show, channel=channel)
    708             stop = vad.shape[0] if self.stop[idx] is None else min(self.stop[idx], vad.shape[0])
    709             logging.info('{} start: {} stop: {}'.format(show, self.start[idx], stop))

~\Anaconda3\lib\site-packages\sidekit\features_server.py in load(self, show, channel, input_feature_filename, label, start, stop)
    437                                                    input_feature_filename=feature_filename,
    438                                                    label=label,
--> 439                                                    start=start, stop=stop)
    440         else:
    441             logging.info('Extract tandem features from multiple sources')

~\Anaconda3\lib\site-packages\sidekit\features_server.py in get_features(self, show, channel, input_feature_filename, label, start, stop)
    482                                                                  label=label,
    483                                                                  start=start, stop=stop,
--> 484                                                                  global_cmvn=self.global_cmvn)
    485         # Post-process the features and return the features and vad label
    486         if global_cmvn:

~\Anaconda3\lib\site-packages\sidekit\frontend\io.py in read_hdf5_segment(file_handler, show, dataset_list, label, start, stop, global_cmvn)
    592 
    593     if show not in h5f:
--> 594         raise Exception('show {} is not in the HDF5 file'.format(show))
    595 
    596     # Get the selected segment

Exception: show enroll/S01.01.digits.wav is not in the HDF5 file

Running data_init.py

Why can't I find the converted audio in \audio\data? I have installed sox.
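
One thing worth ruling out is that sox is installed but not visible on the PATH of the Python process, which is a common situation on Windows. This is a minimal sketch of that check, assuming the conversion step calls the sox executable by name (an assumption, since the script is not shown here). The helper name find_tool is hypothetical.

```python
import shutil


def find_tool(name="sox"):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("sox")
    if path is None:
        print("sox is not on PATH; scripts that call it by name cannot run it")
    else:
        print("sox found at", path)
```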

HDF5 Exception

When I try to run model training with "python ubm.py" in step 5, I get "Exception: show enroll/S01.01.digits.wav is not in the HDF5 file". The error is raised at the following line: seg_indices=range(enroll_stat.segset.shape[0]). All the data are prepared as follows; ubm_16.h5 is under the ubm directory.
data
├── audio
│   ├── data
│   ├── enroll
│   └── test
├── feat
│   ├── enroll
│   └── test
├── feat.zip
├── task
│   ├── enroll_idmap.h5
│   ├── plda_idmap.h5
│   ├── test_idmap.h5
│   ├── test_ndx.h5
│   ├── test_trials.txt
│   └── tv_idmap.h5
└── ubm
    └── ubm_16.h5
Can anyone help me?
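
The exception says that the requested show name does not match any name stored in the feature file. A quick way to narrow this down is to list the names in the HDF5 file and compare them with the names being requested. The sketch below uses h5py for the listing; the helpers list_hdf5_names, missing_shows and suggest_match are hypothetical, and the idea that a ".wav" suffix is the mismatch is only a guess based on the error message.

```python
def list_hdf5_names(path):
    """Collect every group and dataset name in an HDF5 file (requires h5py)."""
    import h5py  # imported lazily so the comparison helpers work without it

    names = []
    with h5py.File(path, "r") as handle:
        handle.visit(names.append)  # append returns None, so the walk continues
    return names


def missing_shows(requested, available):
    """Return the requested show names that are absent from the available names."""
    available = set(available)
    return [show for show in requested if show not in available]


def suggest_match(show, available):
    """Suggest available names with the same base name, ignoring a '.wav' suffix."""
    def stem(name):
        return name.rsplit("/", 1)[-1].split(".wav")[0]

    return [name for name in available if stem(name) == stem(show)]
```

If suggest_match returns a name that differs from the requested one only by its suffix or folder prefix, the idmap and the feature files were probably generated with different naming conventions.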

Not found enroll_idmap.h5

I downloaded your prepared features and then tried to continue with step 4 by running ubm.py.
In the process, I got the following error:

INFO:root:Feature-Server is created
INFO:root:Training...
Iteration #1: 100%|██████████| 1/1 [00:04<00:00, 4.51s/it]
Iteration #2: 100%|██████████| 2/2 [00:23<00:00, 11.74s/it]
Iteration #2: 100%|██████████| 2/2 [00:23<00:00, 11.88s/it]
Iteration #4: 100%|██████████| 4/4 [00:48<00:00, 12.08s/it]
Iteration #4: 100%|██████████| 4/4 [00:52<00:00, 13.18s/it]
Iteration #4: 100%|██████████| 4/4 [00:50<00:00, 12.64s/it]
INFO:root:Saving the model ubm_64.h5 at /home/algernone/Speaker-Recognition/outpath/ubm
Traceback (most recent call last):
  File "ubm.py", line 202, in <module>
    ubm.train()
  File "ubm.py", line 67, in train
    enroll_idmap = sidekit.IdMap.read(os.path.join(self.BASE_DIR, "task", "enroll_idmap.h5"))
  File "/home/algernone/Speaker-Recognition/sidekit/bosaris/idmap.py", line 281, in read
    with h5py.File(input_file_name, "r") as f:
  File "/home/algernone/anaconda3/envs/scienv/lib/python3.6/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/home/algernone/anaconda3/envs/scienv/lib/python3.6/site-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = '/home/algernone/Speaker-Recognition/outpath/task/enroll_idmap.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

So, to get the file (enroll_idmap.h5), do I have to run all the steps, or can it be obtained another way?
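
Since training only fails after the UBM has been fitted, it can save time to check for the task files up front. This is a minimal sketch, assuming the task directory layout shown in the "HDF5 Exception" issue on this page. The helper name missing_task_files is hypothetical.

```python
import os

# File names taken from the task directory listing in the "HDF5 Exception" issue.
TASK_FILES = (
    "enroll_idmap.h5",
    "plda_idmap.h5",
    "test_idmap.h5",
    "test_ndx.h5",
    "tv_idmap.h5",
)


def missing_task_files(base_dir, names=TASK_FILES):
    """Return the expected task files that do not exist under base_dir/task."""
    task_dir = os.path.join(base_dir, "task")
    return [name for name in names if not os.path.isfile(os.path.join(task_dir, name))]
```

Calling missing_task_files with the configured output directory before training gives an immediate list of what still has to be generated.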

createFeatureServer() method is ambiguous

When evaluating the model, you call createFeatureServer() without an argument. In the SidekitModel class, the implementation then uses:

feat_dir = os.path.join(self.BASE_DIR, "feat")

But the feat folder contains two folders, enroll and test, doesn't it? Each one contains feature files (file.h5). So in this case, which feature files does the FeaturesServer object read when it runs: the enroll features, the test features, or both? Thanks.
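
One possible reading, based on the ubm.py code quoted in the "First issue" above, is that the server is given a filename template such as "{}.h5" under feat, and the sub-folder comes from the show name itself (for example "enroll/..." or "test/..."). The sketch below only illustrates that string substitution. resolve_feature_path is a hypothetical helper, and whether evaluate() relies on this behavior is an assumption.

```python
import os


def resolve_feature_path(base_dir, show, template="{}.h5"):
    """Build the feature-file path for a show name using a '{}' filename template."""
    return os.path.join(base_dir, "feat", template.format(show))


# A show name that carries its folder prefix selects the matching sub-folder.
enroll_path = resolve_feature_path("data", "enroll/S01")
test_path = resolve_feature_path("data", "test/S01")
```

Under this reading, a single server rooted at feat can serve both enroll and test files, depending on the prefix in each show name.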

Using Kaldi model

Hi,
I am asking you this question since you are familiar with sidekit.
Is it possible to use Kaldi models in sidekit?
For example, I want to use the PLDA model from this model (https://kaldi-asr.org/models/m8) in sidekit.
That is, I want to extract i-vectors and use sidekit's PLDA scoring with the Kaldi PLDA model (not train a PLDA model in sidekit).
Do you have any idea how I can export a Kaldi PLDA model to sidekit?

Best regards

Test on a new speaker

When I test on a speaker who is not among the 50 enrolled speakers (an audio file from an unseen speaker), the system returns the nearest enrolled speaker. How can I calculate a threshold so that the output is "unknown" when the speaker is not in the data?
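
A common approach to this open-set problem is to pick a score threshold from held-out trials and reject any test utterance whose best score falls below it. The sketch below picks the threshold where the false-accept and false-reject rates are closest (an approximation of the equal error rate point). The function names and the example scores are hypothetical, and the scores are assumed to be "higher means more similar".

```python
def pick_threshold(target_scores, nontarget_scores):
    """Return the candidate threshold where false-accept and false-reject rates are closest."""
    candidates = sorted(set(target_scores) | set(nontarget_scores))
    best_thr, best_gap = None, float("inf")
    for thr in candidates:
        # Impostor trials accepted at this threshold.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        # Genuine trials rejected at this threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_thr, best_gap = thr, gap
    return best_thr


def identify(scores, threshold):
    """Return the best-scoring speaker, or 'unknown' if that score is below threshold."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"
```

The threshold should be estimated on speakers that are not used for the final test; otherwise the rejection rate on truly unseen speakers will look better than it is.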

How to add a new speaker without retraining

Hello, I would like to know how to handle a new speaker who is not in the data. I read about this in the paper "Speaker Verification Using Adapted Gaussian Mixture Models", which describes obtaining the final UBM by combining individual subpopulation models. That is, instead of adding the new speaker's audio files to the data and training the model from scratch, the paper seems to say that the first UBM (trained before the new files were added) can be combined with a second model (for the new speaker). Is this correct? If so, how can I do it using sidekit or another method? Retraining the model from scratch does not seem reasonable.
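
The same paper also describes deriving each speaker model from a fixed UBM by MAP adaptation, so enrolling a new speaker does not require retraining the UBM. Below is a minimal pure-Python sketch of mean-only MAP adaptation for a diagonal-covariance GMM, using the relevance-factor form alpha = n / (n + r). The function names are hypothetical and this is not sidekit's API.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Return speaker-adapted means; UBM weights and variances are left unchanged."""
    k, dim = len(means), len(means[0])
    counts = [0.0] * k                         # soft frame count per component
    sums = [[0.0] * dim for _ in range(k)]     # posterior-weighted frame sums
    for x in frames:
        logp = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        top = max(logp)                        # subtract the max for stability
        post = [math.exp(lp - top) for lp in logp]
        total = sum(post)
        for c in range(k):
            gamma = post[c] / total
            counts[c] += gamma
            for d in range(dim):
                sums[c][d] += gamma * x[d]
    adapted = []
    for c in range(k):
        alpha = counts[c] / (counts[c] + relevance)
        # Components that saw no data keep the UBM mean.
        expect = [s / counts[c] for s in sums[c]] if counts[c] > 0 else means[c]
        adapted.append([alpha * e + (1.0 - alpha) * m for e, m in zip(expect, means[c])])
    return adapted
```

With this scheme, components that the new speaker's data rarely visits stay close to the UBM, while well-observed components move toward the speaker's own statistics.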

Accuracy metrics

What about your accuracy metrics?
Did you check how accurate your approach is?
