Search results

Number of results: 3

Abstract

Courts in Poland, as in most countries in the world, allow a person to be identified on the basis of his or her voice using the so-called voice presentation method, i.e., the auditory method. This method is used when no sound recording exists, the perpetrator of the criminal act was masked, and the victim heard only his or her voice. However, psychologists, forensic acousticians, and researchers in auditory perception and forensic science more broadly have described many cases in which such testimony led to misjudgement. This paper presents the results of an experiment designed to investigate, in a Polish-language setting, the extent to which the passage of time impairs the correct identification of a person. The study showed that 31 days after the speaker’s voice was first heard, the correct identification rate was 30% for a female voice and 40% for a male voice.

Authors and Affiliations

Stefan Brachmański 1
Bartosz Hus 1
Piotr Staroniewicz 1

  1. Faculty of Electronics, Photonics and Microsystems, Department of Acoustics, Multimedia and Signal Processing, Wrocław University of Science and Technology

Abstract

Biometrics provide an alternative to passwords and PINs for authentication, and the emergence of machine learning algorithms offers an easy and economical solution to authentication problems. The phases of a speaker verification protocol are training, enrollment of speakers, and evaluation of an unknown voice. In this paper, we address text-independent speaker verification using a Siamese convolutional network. Siamese networks are twin networks with shared weights; by training them, a feature space can easily be learnt in which similar observations are placed in proximity. Features extracted by the Siamese network can then be classified using difference or correlation measures. We have implemented a customized scoring scheme that combines the Siamese network’s capability of applying distance measures with convolutional learning. Experiments on cross-language audio from multilingual speakers confirm the capability of our architecture to handle gender-, age- and language-independent speaker verification. Moreover, our Siamese network, SpeakerNet, provided better results than existing speaker verification approaches, decreasing the equal error rate to 0.02.
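
The core idea described above, twin convolutional encoders with shared weights whose outputs are compared by a distance measure, can be sketched as follows. This is a minimal illustration only: it assumes log-mel spectrogram inputs, a toy two-layer CNN, and cosine similarity as the verification score; the layer sizes and scoring rule are assumptions for illustration, not the SpeakerNet configuration reported in the paper.

# Minimal sketch of a Siamese speaker-verification scorer (illustrative only).
# Assumes inputs shaped (batch, 1, n_mels, n_frames); the architecture and
# score are hypothetical, not the paper's SpeakerNet.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, embedding_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),            # global pooling -> (batch, 64, 1, 1)
        )
        self.fc = nn.Linear(64, embedding_dim)

    def forward(self, x):
        h = self.conv(x).flatten(1)
        return F.normalize(self.fc(h), dim=1)   # unit-length speaker embedding

def verification_score(encoder, utt_a, utt_b):
    """Shared-weight twin pass: a higher score suggests the same speaker."""
    emb_a, emb_b = encoder(utt_a), encoder(utt_b)
    return F.cosine_similarity(emb_a, emb_b)    # in [-1, 1]; threshold to decide

# Example with random stand-in "spectrograms"; both utterances pass through
# the same encoder instance, so the twin branches share weights by construction.
encoder = SiameseEncoder()
a, b = torch.randn(1, 1, 40, 200), torch.randn(1, 1, 40, 200)
print(verification_score(encoder, a, b).item())

In practice the encoder would be trained with a pairwise objective (e.g. a contrastive or same/different loss) so that same-speaker pairs score higher than different-speaker pairs; the paper's customized scoring scheme replaces the plain cosine threshold used in this sketch.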


Authors and Affiliations

Hafsa Habib
Huma Tauseef
Muhammad Abuzar Fahiem
Saima Farhan
Ghousia Usman

Abstract

This work aims to further compensate for feature sparsity and insufficiently discriminative acoustic features in existing short-duration speaker recognition. To address this issue, we propose the Bark-scaled Gauss and linear filter bank superposition cepstral coefficients (BGLCC) and the multidimensional central difference (MDCD) acoustic feature extraction method. The Bark-scaled Gauss filter bank focuses on low-frequency information, while the linear filter bank is uniformly distributed; their superposition therefore yields richer and more discriminative acoustic features from short-duration audio signals. In addition, the multidimensional central difference method better captures the dynamic features of speakers, improving the performance of short-utterance speaker verification. Extensive experiments are conducted on short-duration text-independent speaker verification datasets generated from the VoxCeleb, SITW, and NIST SRE corpora, which contain speech samples of diverse lengths and scenarios. The results demonstrate that the proposed method outperforms existing acoustic feature extraction approaches by at least 10% on the test set. The ablation experiments further illustrate that our proposed approaches achieve substantial improvements over prior methods.
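
The dynamic-feature part of the pipeline described above lends itself to a short illustration. The sketch below computes central-difference (delta) features over a matrix of static cepstral coefficients and stacks first- and second-order differences onto the static features; the window width and the stacking scheme are assumptions for illustration, not the exact MDCD or BGLCC formulation from the paper.

# Illustrative sketch: central-difference dynamic features over static cepstra.
# The static features stand in for BGLCC-style coefficients; the window width n
# and the delta/delta-delta stacking are assumed, not taken from the paper.
import numpy as np

def central_difference(feats, n=2):
    """Delta features via a symmetric central difference.

    feats: (num_frames, num_coeffs) static features.
    Returns an array of the same shape.
    """
    padded = np.pad(feats, ((n, n), (0, 0)), mode="edge")
    num_frames = len(feats)
    delta = np.zeros_like(feats, dtype=float)
    for k in range(1, n + 1):
        delta += k * (padded[n + k:n + k + num_frames] - padded[n - k:n - k + num_frames])
    return delta / (2 * sum(k * k for k in range(1, n + 1)))

# Stack static, first- and second-order differences into one vector per frame,
# a common way to capture speaker dynamics from short utterances.
static = np.random.randn(300, 20)                    # stand-in for 20 cepstral coeffs
delta = central_difference(static)
delta2 = central_difference(delta)
dynamic_feats = np.hstack([static, delta, delta2])   # shape (300, 60)
print(dynamic_feats.shape)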

Authors and Affiliations

Yunfei Zi 1
Shengwu Xiong 1

  1. School of Computer and Artificial Intelligence, Wuhan University of Technology, Wuhan, China
