A psychoacoustic model of the voice source spectrum is proposed. The model is characterized by four spectral slope parameters, each a difference in amplitude between two harmonics: between the first and second harmonics (H1-H2), between the second and fourth harmonics (H2-H4), between the fourth harmonic and the harmonic nearest 2 kHz in frequency (H4-2 kHz), and between the harmonic nearest 2 kHz and the harmonic nearest 5 kHz (2 kHz-5 kHz). As a step toward model validation, experiments were conducted to establish the acoustic and perceptual independence of these parameters. In experiment 1, the model was fit to a large number of voice sources. Results showed that the parameters are predictable from one another, but that these relationships are due to overall spectral roll-off. Two additional experiments addressed the perceptual independence of the source parameters. Listener sensitivity to H1-H2, H2-H4, and H4-2 kHz did not change as a function of the slope of an adjacent component, suggesting that sensitivity to these components is robust. Listener sensitivity to changes in spectral slope from 2 kHz to 5 kHz depended on complex interactions between spectral slope, spectral noise levels, and H4-2 kHz. It is concluded that the four parameters represent non-redundant acoustic and perceptual aspects of voice quality.
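The four parameter definitions can be made concrete with a minimal sketch. It assumes harmonic amplitudes have already been measured in dB and are indexed by harmonic number, and it takes each parameter as the lower harmonic's amplitude minus the higher harmonic's amplitude; that sign convention, the function names `nearest_harmonic` and `source_slope_parameters`, and the synthetic spectrum in the usage lines are all assumptions made for illustration, not details taken from the study.

```python
import math


def nearest_harmonic(f0_hz, target_hz):
    """Return the 1-based number of the harmonic of f0_hz closest to target_hz."""
    # int(x + 0.5) rounds to the nearest integer for positive x; never below 1.
    return max(1, int(target_hz / f0_hz + 0.5))


def source_slope_parameters(f0_hz, harmonic_db):
    """Compute the four spectral slope parameters, in dB.

    harmonic_db[k - 1] is the amplitude in dB of harmonic k (hypothetical layout).
    Each value is (lower harmonic) minus (higher harmonic): an assumed convention.
    """
    k_2k = nearest_harmonic(f0_hz, 2000.0)
    k_5k = nearest_harmonic(f0_hz, 5000.0)
    if len(harmonic_db) < max(4, k_5k):
        raise ValueError("spectrum does not reach the harmonic nearest 5 kHz")

    def amp(k):
        return harmonic_db[k - 1]

    # Note: for very high f0 the harmonic nearest 2 kHz can coincide with H4,
    # in which case H4-2 kHz is trivially zero.
    return {
        "H1-H2": amp(1) - amp(2),
        "H2-H4": amp(2) - amp(4),
        "H4-2kHz": amp(4) - amp(k_2k),
        "2kHz-5kHz": amp(k_2k) - amp(k_5k),
    }


# Synthetic example only: f0 = 100 Hz with a uniform -6 dB/octave roll-off.
f0 = 100.0
spectrum = [-6.0 * math.log2(k) for k in range(1, 51)]
params = source_slope_parameters(f0, spectrum)
for name, value in params.items():
    print(f"{name}: {value:.2f} dB")
```

With a single uniform roll-off like this one, all four values follow from the overall slope alone, which is the kind of shared dependence on spectral roll-off that the abstract reports for experiment 1.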
ASJC Scopus subject areas
- Arts and Humanities (miscellaneous)
- Acoustics and Ultrasonics