LPC_Formant_Analysis

Use of Linear Predictive Coding for Formant Analysis of Concurrent Vowels

This project studies how noise affects the formant representations of single and concurrent vowels. The dataset contains single and concurrent vowels at two fundamental frequencies, 100 Hz and 126 Hz. Formant frequencies of the clean vowels were first estimated with Linear Predictive Coding (LPC). Speech Spectrum Shaped Noise was then generated from a set of speech signals and added to the vowels at three different SNR levels, and the formant frequencies were re-estimated for each noisy signal. The analysis examines how the formants change with SNR level and attempts to identify which vowels, single or concurrent, are most susceptible to noise.
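The noise-mixing step can be sketched as follows. This is an illustrative Python example, not the project's own code: the function names, the white Gaussian noise (a stand-in for Speech Spectrum Shaped Noise), the two-sinusoid test signal, and the SNR values of 0, 5, and 10 dB are all assumptions made for the sketch. The idea is to scale the noise so that the ratio of signal power to noise power matches the target SNR before adding it.

```python
import math
import random


def mean_power(x):
    """Average power (mean of squared samples)."""
    return sum(v * v for v in x) / len(x)


def measured_snr_db(signal, noise):
    """SNR in dB: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))


def add_noise_at_snr(signal, noise, snr_db):
    """Scale `noise` to reach `snr_db` relative to `signal`, then add it.

    Returns (noisy_signal, scaled_noise).
    """
    if len(noise) < len(signal):
        raise ValueError("noise must be at least as long as the signal")
    noise = noise[:len(signal)]
    # Target noise power is P_signal / 10^(SNR/10); gain is an amplitude factor.
    target_noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    scaled = [gain * v for v in noise]
    noisy = [s + n for s, n in zip(signal, scaled)]
    return noisy, scaled


# Hypothetical stand-in for a concurrent vowel: two sinusoids at the two
# fundamental frequencies mentioned above (100 Hz and 126 Hz).
fs = 8000  # assumed sampling rate in Hz
clean = [
    math.sin(2 * math.pi * 100 * n / fs) + math.sin(2 * math.pi * 126 * n / fs)
    for n in range(fs)
]

# White Gaussian noise as a placeholder for speech-spectrum-shaped noise.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]

for target in (0.0, 5.0, 10.0):  # assumed SNR levels in dB
    noisy, scaled = add_noise_at_snr(clean, noise, target)
    print(f"target {target:5.1f} dB, achieved {measured_snr_db(clean, scaled):5.1f} dB")
```

Because the gain is computed from the measured powers of the two sequences, the achieved SNR matches the target for whatever noise is supplied, so a spectrally shaped noise sequence can be passed in place of the white noise without changing the function.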

Authors

Taru Kapoor, Kshitij Khandelwal, Anubhav Sachan

Introduction

The general character of the speech signal varies at the phoneme rate, on the order of 10 phonemes per second, while the detailed time variations of the speech waveform occur at a much higher rate. In other words, the vocal tract configuration changes slowly relative to the waveform itself. Sounds produced in the vocal tract are shaped in the frequency domain by the vocal tract's frequency response. The resonance frequencies that result from a particular configuration of the articulators are instrumental in forming the sound of a given phoneme; these resonance frequencies are called the formant frequencies of the sound. They appear as local maxima (peaks) in the spectral envelope of a speech signal. Linear predictive coding is well suited to identifying them. LPC analysis of a speech signal yields a prediction polynomial A(z) in z^(-1), and the vocal tract is modeled by the all-pole filter 1/A(z). The zeros of A(z) that lie on or close to the unit circle, which are the poles of this filter, correspond to the formants: a zero at angle θ (in radians) corresponds to a frequency of θ·fs/(2π) Hz, where fs is the sampling rate. Equivalently, the peaks of the LPC spectrum 1/|A(e^(jω))| lie at the formant frequencies.
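The LPC procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's own code: the function names are hypothetical, the test signal is the impulse response of a synthetic two-resonance filter (700 Hz and 1200 Hz, 100 Hz bandwidth, 8 kHz sampling rate) rather than a real vowel, and formants are read off as peaks of the LPC spectrum. It uses the autocorrelation method with the Levinson-Durbin recursion; real speech frames would normally be windowed first and analyzed with a higher model order.

```python
import cmath
import math


def resonator_coeffs(formants_hz, bandwidth_hz, fs):
    """Build A(z) as a product of second-order sections, one per resonance."""
    a = [1.0]
    r = math.exp(-math.pi * bandwidth_hz / fs)  # pole radius from bandwidth
    for f in formants_hz:
        section = [1.0, -2.0 * r * math.cos(2.0 * math.pi * f / fs), r * r]
        out = [0.0] * (len(a) + 2)
        for i, ai in enumerate(a):
            for j, sj in enumerate(section):
                out[i + j] += ai * sj
        a = out
    return a


def all_pole_filter(a, excitation):
    """Filter the excitation through 1/A(z)."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * y[n - k]
        y.append(acc)
    return y


def autocorrelation(x, max_lag):
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for [1, a1, ..., ap] of A(z) = 1 + sum_k a_k z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def lpc_spectrum_peaks(a, fs, n_points=2048):
    """Frequencies (Hz) of local maxima of 1/|A(e^{jw})| between 0 and fs/2."""
    mags = []
    for m in range(n_points):
        w = math.pi * m / n_points
        A = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        mags.append(1.0 / abs(A))
    return [
        m * fs / (2.0 * n_points)
        for m in range(1, n_points - 1)
        if mags[m] > mags[m - 1] and mags[m] >= mags[m + 1]
    ]


fs = 8000  # assumed sampling rate in Hz
true_a = resonator_coeffs([700.0, 1200.0], 100.0, fs)
frame = all_pole_filter(true_a, [1.0] + [0.0] * 399)  # impulse response
order = 4  # two resonances -> two conjugate pole pairs
est_a = levinson_durbin(autocorrelation(frame, order), order)
formants = lpc_spectrum_peaks(est_a, fs)
print([round(f) for f in formants])
```

Peak picking on the LPC spectrum is only one option. Finding the roots of A(z) and converting their angles to frequencies is the other approach mentioned above, and it needs a polynomial root finder, which the Python standard library does not provide.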

Acknowledgement

The authors thank Dr. Ananthakrishna Chintanpalli, Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science, Pilani, for the opportunity to undertake this project, for imparting the signal processing knowledge it required, and for his guidance and clarifications.

The authors would also like to express their sincere gratitude to Mr. Harshavardhan S, Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science, Pilani for his support and guidance.

