Marcel TUDOR
14.07.2013, 18:33
http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1227085&isnumber=27542&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D1227085%26isnumber%3D27542
"Using Neural Networks and LPCC to Improve Speech Recognition"
Linear Predictive Coding (LPC), a powerful speech analysis technique, is very useful for encoding speech at a low bit rate and provides extremely accurate estimates of speech parameters. It is based on the assumption that the speech signal is produced by a buzzer at the end of a tube: the glottis produces the buzz, characterized by its intensity and frequency, and the vocal tract forms the tube, characterized by its resonance frequencies (formants). According to Calliope (1989), this model is very efficient for voiced (vocalic) regions. It is less efficient for transient, unvoiced, or non-stationary regions, according to L. Rabiner and B.-H. Juang (1993). A Radial Basis Function network, using sets of LPC coefficients as input, is able to recognize a set of phonemes pronounced by different speakers with a satisfactory recognition rate.
Published in:
Signals, Circuits and Systems, 2003. SCS 2003. International Symposium on (http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=8698) (Volume:2 )
Date of Conference: 10-11 July 2003
Page(s): 445 - 448 vol.2
Print ISBN: 0-7803-7979-9
INSPEC Accession Number: 7938084
Digital Object Identifier: 10.1109/SCS.2003.1227085 (http://dx.doi.org/10.1109/SCS.2003.1227085)
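For anyone who wants to see what the "buzzer at the end of a tube" model in the abstract looks like in practice, here is a minimal sketch of LPC analysis using the autocorrelation method with the Levinson-Durbin recursion. This is a common textbook formulation and not necessarily the procedure used in the paper; the function names lpc and all_pole_impulse_response, the predictor order, and the test coefficients are assumptions chosen for illustration.

```python
import numpy as np

def lpc(frame, order):
    """Estimate LPC coefficients for one frame (autocorrelation method).

    Returns (a, err), where a[0] == 1 and the prediction error is
    e[n] = x[n] + a[1]*x[n-1] + ... + a[order]*x[n-order],
    and err is the residual (prediction error) energy.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation lags r[0..order]
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])

    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop early
        # Reflection coefficient for step i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        # Levinson-Durbin update of the predictor coefficients
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err

def all_pole_impulse_response(a, length):
    """Impulse response of the all-pole 'tube' filter 1/A(z) (illustrative helper)."""
    h = np.zeros(length)
    for t in range(length):
        acc = 1.0 if t == 0 else 0.0  # unit impulse as excitation
        for j in range(1, len(a)):
            if t - j >= 0:
                acc -= a[j] * h[t - j]
        h[t] = acc
    return h

if __name__ == "__main__":
    # Hypothetical second-order resonator standing in for a vocal tract
    true_a = np.array([1.0, -1.3, 0.6])
    h = all_pole_impulse_response(true_a, 200)
    est_a, err = lpc(h, order=2)
    print(est_a, err)
```

On the synthetic signal above, the estimated coefficients should come out close to the ones used to generate it, since the signal really is the output of an all-pole filter. On real speech the fit is only approximate, which matches the abstract's point that the model works best on voiced regions. The resulting coefficient vector per frame is the kind of feature set that could be fed to a classifier such as the RBF network mentioned in the abstract.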