Voice Conversion Based on Hybrid SVR and GMM

Peng Song; Yun Jin; Li Zhao; Cairong Zou

Details

Title

Voice Conversion Based on Hybrid SVR and GMM

Journal title

Archives of Acoustics

Yearbook

2012

Volume

vol. 37

Issue

No 2

Authors

Peng Song; Yun Jin; Li Zhao; Cairong Zou

Keywords

voice conversion; support vector regression; Gaussian mixture model; F0 prediction; speaker-specific information

Divisions of PAS

Technical Sciences

Coverage

143-149

Publisher

Polish Academy of Sciences, Institute of Fundamental Technological Research, Committee on Acoustics

Date

2012

Type

Articles

Identifier

DOI: 10.2478/v10168-012-0020-9

Source

Archives of Acoustics; 2012; vol. 37; No 2; 143-149
