TY - JOUR
T1 - Vocal-tract modeling
T2 - Fractional elongation of segment lengths in a waveguide model with half-sample delays
AU - Mathur, Siddharth
AU - Story, Brad H.
AU - Rodríguez, Jeffrey J.
N1 - Funding Information:
Manuscript received January 27, 2004; revised April 25, 2005. This work was supported by NIDCD Grant R01-DC04789. This paper was presented in part at IEEE-ISSPIT ’03. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. T. Dutoit. S. Mathur is with Nellymoser, Inc., Arlington, MA 02476 USA (e-mail: [email protected]). B. H. Story is with the Department of Speech and Hearing Sciences, University of Arizona, Tucson, AZ 85721 USA (e-mail: [email protected]). J. J. Rodríguez is with the Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/TSA.2005.858550. Fig. 1. Cross-sectional areas for the 44 equal-length cylindrical segments in a neutral vowel. They are labeled A_1 (glottal end) through A_44 (lip end).
PY - 2006/9
Y1 - 2006/9
N2 - Digital waveguide models are commonly used for simulating vocal-tract acoustics based on physiological data. In particular, waveguide models with half-sample delays are known to be well suited for speech production research. This paper presents enhancements to such a model, aimed at improved accuracy in mapping physiological vocal-tract data (shape and length of the airway) to waveguide parameters. The enhancements allow the length of the vocal tract to be continuously varied, thus enabling more realistic synthesis. This is achieved by smoothly varying the individual segment lengths of a piecewise-cylindrical representation of the airway, without altering the system sampling frequency. Fractional-delay filters are used for spatial interpolation of the digital waveguide model. The algorithms are validated by modeling the protrusion of lips, lowering of larynx and lengthening of intermediate segments for a static vowel shape.
AB - Digital waveguide models are commonly used for simulating vocal-tract acoustics based on physiological data. In particular, waveguide models with half-sample delays are known to be well suited for speech production research. This paper presents enhancements to such a model, aimed at improved accuracy in mapping physiological vocal-tract data (shape and length of the airway) to waveguide parameters. The enhancements allow the length of the vocal tract to be continuously varied, thus enabling more realistic synthesis. This is achieved by smoothly varying the individual segment lengths of a piecewise-cylindrical representation of the airway, without altering the system sampling frequency. Fractional-delay filters are used for spatial interpolation of the digital waveguide model. The algorithms are validated by modeling the protrusion of lips, lowering of larynx and lengthening of intermediate segments for a static vowel shape.
KW - Articulatory speech synthesis
KW - Digital waveguides
KW - Vocal tract modeling
UR - http://www.scopus.com/inward/record.url?scp=34047258319&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34047258319&partnerID=8YFLogxK
U2 - 10.1109/TSA.2005.858550
DO - 10.1109/TSA.2005.858550
M3 - Article
AN - SCOPUS:34047258319
SN - 1558-7916
VL - 14
SP - 1754
EP - 1762
JO - IEEE Transactions on Audio, Speech and Language Processing
JF - IEEE Transactions on Audio, Speech and Language Processing
IS - 5
ER -