speech-processing-hmm-digits-recognition's Introduction

Speech Processing CS 566

Assignment 05 (HMM)

Roll No: 214101058 MTech CSE'23 IITG | Vijay Purohit


Input: 
	*	input_lamda/* = Contains: PRESENTLY USED "Lambda Model for Each Digit"
		**	      = Contains: "Universe.csv" of Cepstral Coefficients of Training Files
		**	      = Contains: "Codebook" file to be read.
	*	input_live_voice_data/*	= Contains: Live Recording Files, 
						Their Observation Sequences, 
						Their Test Result Using Model (input or newly converged)
					 (These files can be deleted, but keep the folders.)
		**	input_live_voice_data/TRAINING	= Contains: Digit Recordings generated Using Application for Training Purpose 
		**	input_live_voice_data/TESTING	= Contains: Digit Recordings generated Using Application for Testing Purpose
		*** 	When finished, copy manually:
				input_live_voice_data/TRAINING/* --> input_voice_training_data/*
				input_live_voice_data/TESTING/* --> input_voice_testing_data/*
	*	input_voice_training_data/*	= Contains: Training Utterance Recordings for Input into model
	*	input_voice_testing_data/*	= Contains: Test Utterance Recordings for Input into model
	* 	RecordingModule/*		= Contains: Recording Module Files

Output:  
	*	output/*	= Contains:	Result Analysis of Converged Model for Each Digit.
		**	output/Models/*	= Contains: Newly Generated Model using Input Training Files.
	*	output_voice_recordings_analysis_files	= Contains: Recording Analysis files which 
							show Frames used, Samples used, STE Marker, Cepstral Coefficients etc
							For Files of Input Training, Input Testing, Live Recordings
	*	output_voice_recordings_normalised_segregated	= Contains: Segregated Speech Part using Start and End Marker
							For Files of Input Training, Input Testing, Live Recordings
Debug Variables:	
	* 	segregate_speech :	True: segregate speech data using the start and end markers and write it to its output folder (output_voice_recordings_normalised_segregated). 
	*	segregate_Live_speech :	True: segregate Live Recording data using the start and end markers and write it to its output folder (output_voice_recordings_normalised_segregated). 
	*	showCoefficientsInFile :	True: show the coefficient values (R, A and C) of each frame in its analysis files (output_voice_recordings_analysis_files).
	*	showAlphaBetaPstarInConsole :	True: show alpha and beta probabilities in the console for each observation sequence (also saved to files in (output/)).
	*	showStateSeqAlphaBetaInFileForEachObsAfterConverge :	True: save each utterance's alpha and beta probabilities and state sequence to a file in (output/).

FILES:
  • main_hmm.cpp = Main File Contains Menu
  • hmm_functions.h = Contains HMM Functions and Solution
  • observation_sequence.h = Observation Sequence Functions
If the error "PlaySound() is not identified" occurs, do:
  • Right Click Project Name in Solution Explorer
  • Select Properties --> Linker --> Input
  • Select Additional Dependencies --> Edit
  • Add the name "winmm.lib"
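
As a possible alternative to editing the project properties, the Microsoft compiler also accepts a `#pragma comment(lib, ...)` directive in the source file that asks the linker for the same library. The sketch below assumes a Windows build with Visual C++. The wrapper name play_wav_blocking is hypothetical and is not part of this project, and the non-Windows branch is only a stub so the snippet compiles elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>
// MSVC-specific: link against winmm.lib without touching the project settings.
#pragma comment(lib, "winmm.lib")
#endif

// Hypothetical wrapper: plays a .wav file and blocks until playback ends.
// Returns true when PlaySound reports success, false otherwise.
bool play_wav_blocking(const char* path) {
#ifdef _WIN32
    // SND_NODEFAULT: do not fall back to the system default sound on failure.
    return PlaySoundA(path, NULL, SND_FILENAME | SND_SYNC | SND_NODEFAULT) != 0;
#else
    (void)path;    // PlaySound is a Windows API; nothing to do on other systems.
    return false;
#endif
}
```

Either approach should resolve the same missing-library problem; the pragma simply keeps the dependency next to the code that needs it.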

Instructions to execute Code.

  1. Open it in Visual Studio 2010.
    • Main file: main_hmm.cpp
  2. Compile it and Run. Console window will show up.
    • Interact With Menu
      • Output will be shown on the Console.
      • Detailed Output *.txt will be present in their respective folder.
  3. Take care:
    • Generate the respective observation sequence (Training/Testing) before converging or testing.
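
To make step 3 more concrete, the sketch below shows one common way an observation sequence can be derived: each frame's cepstral vector is mapped to the index of the nearest codebook entry. This README does not say how observation_sequence.h actually does it, so treat everything here as an assumption, including the function names nearest_codeword and observation_sequence, the 0-based indices, and the plain squared Euclidean distance (the project may use a weighted distance instead).

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;

// Hypothetical helper: index of the codebook entry closest to `frame`
// under squared Euclidean distance, or -1 if the codebook is empty.
int nearest_codeword(const Vec& frame, const std::vector<Vec>& codebook) {
    int best = -1;
    double best_dist = 0.0;
    for (std::size_t k = 0; k < codebook.size(); ++k) {
        double d = 0.0;
        for (std::size_t c = 0; c < frame.size() && c < codebook[k].size(); ++c) {
            const double diff = frame[c] - codebook[k][c];
            d += diff * diff;
        }
        if (best < 0 || d < best_dist) {
            best = static_cast<int>(k);
            best_dist = d;
        }
    }
    return best;
}

// Hypothetical helper: one codebook index per frame, in frame order.
std::vector<int> observation_sequence(const std::vector<Vec>& frames,
                                      const std::vector<Vec>& codebook) {
    std::vector<int> obs;
    obs.reserve(frames.size());
    for (std::size_t t = 0; t < frames.size(); ++t)
        obs.push_back(nearest_codeword(frames[t], codebook));
    return obs;
}
```

Under this reading, the sequence depends on both the recordings and the codebook, which is why it has to exist before a model is converged or tested.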

THE END.
