Search results

Number of results: 2

Abstract

Phantom sources are known to be perceived similarly to real sound sources, but with some differences. One of these differences is an increase in the perceived source width. This article discusses the perception, measurement, and modeling of source width for frontal phantom sources with different symmetrical arrangements of up to three active loudspeakers. The perceived source width is evaluated on the basis of a listening test. The test results are compared to technical measures that are applied in room acoustics: the interaural cross-correlation coefficient (IACC) and the lateral energy fraction (LF). An adaptation of the latter measure makes it possible to predict the results by considering simultaneous sound incidence. Finally, a simple model is presented for predicting the perceived source width; it requires no acoustic measurements, as it is based solely on the loudspeaker directions and gains.
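
For readers unfamiliar with the IACC named in the abstract, a minimal sketch follows. It assumes the common room-acoustics definition (the maximum absolute value of the normalized cross-correlation between the two ear signals for lags within about ±1 ms). The function name `iacc`, its parameters, and the choice to correlate over the whole signal are illustrative assumptions; the article's exact time window and measurement procedure are not given in the abstract.

```python
import math

def iacc(left, right, fs, max_lag_ms=1.0):
    """Sketch of an interaural cross-correlation coefficient.

    left, right: equal-length sequences of ear-signal samples (assumed inputs)
    fs: sampling rate in Hz
    max_lag_ms: lag range searched on either side of zero (assumed 1 ms)
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    n = len(left)
    # Normalize by the geometric mean of the two signal energies.
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if norm == 0.0:
        return 0.0
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        # Sum left[i] * right[i + lag] over the overlapping samples only.
        start = max(0, -lag)
        stop = min(n, n - lag)
        acc = sum(left[i] * right[i + lag] for i in range(start, stop))
        best = max(best, abs(acc) / norm)
    return best
```

Under this definition, identical or purely time-shifted ear signals give a value near 1, while uncorrelated ear signals give a value near 0; in room acoustics, lower values are generally associated with a wider perceived source.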

Authors and Affiliations

Matthias Frank

Abstract

Glottal waveform models have long been employed to improve the quality of speech synthesis. This paper presents a new approach to modeling the glottal flow. The model is based on three control volumes that sequentially strike a one-mass, two-spring system and generate a glottal pulse. The first, second, and third control volumes represent the opening, closing, and closed phases of the vocal folds, respectively. The masses of the three control volumes and the size of the first one are the four parameters that define the shape, pitch, and amplitude of the glottal pulse. The model may be viewed as a parametric approach governed by second-order differential equations rather than analytical functions, and it is very flexible for designing a glottal pulse. A comparison of the glottal pulse generated by the present model with those generated by the Rosenberg, LF (Liljencrants-Fant), and mucosal wave propagation models shows that it appropriately represents the opening, closing, and closed phases of the vocal fold oscillation, which supports the validity of the model. The numerical solution of the present model has been found to be very efficient compared to its analytical solution and to two other well-known parametric models, Rosenberg++ and LF. The accuracy of the numerical solution has been illustrated by comparison with the analytical solution. The accuracy improves as the size of the first control volume increases and may decrease insignificantly as the mass of any of the control volumes increases. Two experiments with the present model support its successful implementation as a voice source in speech synthesis. The model thus lends itself as an efficient, accurate, and realistic voice source for real-time speech production.
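
As a point of reference for the comparison mentioned in the abstract, the sketch below generates one period of a Rosenberg-style glottal pulse in its commonly cited trigonometric form (a raised-cosine opening phase, a quarter-cosine closing phase, and a zero-valued closed phase). This is not the control-volume model proposed in the paper, and the abstract does not say which Rosenberg variant the authors used; the function name `rosenberg_pulse` and the default phase fractions are illustrative assumptions.

```python
import math

def rosenberg_pulse(n_samples, open_frac=0.4, close_frac=0.16):
    """One period of a trigonometric Rosenberg-style glottal pulse.

    n_samples: samples per pitch period
    open_frac, close_frac: fractions of the period spent in the opening
        and closing phases (assumed defaults, not taken from the paper)
    """
    n_open = int(round(open_frac * n_samples))
    n_close = int(round(close_frac * n_samples))
    if n_open <= 0 or n_close <= 0 or n_open + n_close > n_samples:
        raise ValueError("phase fractions do not fit within the period")
    pulse = []
    for n in range(n_samples):
        if n < n_open:
            # Opening phase: flow rises smoothly from 0 towards 1.
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n_open)))
        elif n < n_open + n_close:
            # Closing phase: flow falls from 1 towards 0.
            pulse.append(math.cos(math.pi * (n - n_open) / (2.0 * n_close)))
        else:
            # Closed phase: no flow.
            pulse.append(0.0)
    return pulse
```

Repeating the returned period at the desired pitch gives a simple voice-source signal; the three branches correspond to the opening, closing, and closed phases that the abstract uses as its basis for comparison.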

Authors and Affiliations

Tahir Qureshi
Khalid Syed
