Search results

Number of results: 3

Abstract

A robust and highly imperceptible audio watermarking technique is presented to secure the electronic patient records of patients affected by Parkinson’s Disease (PD). The proposed DCT-SVD based watermarking technique introduces minimal changes in the speech, so that the accuracy of classifying PD-affected speech versus healthy speech is retained. To achieve high imperceptibility, the voiced part of the speech is used for embedding the watermark. The proposed technique is shown to be robust to common signal processing attacks. Its practicability is tested by creating an Android application to record and watermark the speech signal. The PD-affected speech is classified using a Support Vector Machine (SVM) classifier on a cloud server.
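
The abstract does not say how the watermark bits are written into the DCT-SVD representation, so the following is only a minimal sketch of one common variant, not the authors' method. It assumes a frame of 64 samples, an orthonormal DCT-II, and quantization index modulation of the largest singular value; the function names, the `step` size and the `side` length are all hypothetical choices made for illustration. Voiced-frame selection, which the abstract mentions, is left out.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x is the DCT, C.T @ y inverts it."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def embed_bit(frame, bit, step=0.05, side=8):
    """Embed one bit (0 or 1) in a frame of side*side samples.

    Assumed scheme: the largest singular value of the DCT coefficient
    matrix is moved to the centre of a quantization bin whose index
    parity equals the bit.
    """
    c = dct_matrix(frame.size)
    coeffs = (c @ frame).reshape(side, side)
    u, s, vt = np.linalg.svd(coeffs)
    # Round up so the modified value stays the largest singular value.
    q = int(np.ceil(s[0] / step))
    if q % 2 != bit:
        q += 1
    s[0] = (q + 0.5) * step
    marked = (u * s) @ vt  # same as u @ diag(s) @ vt
    return c.T @ marked.reshape(-1)

def extract_bit(frame, step=0.05, side=8):
    """Read the bit back from the parity of the quantization bin index."""
    c = dct_matrix(frame.size)
    s = np.linalg.svd((c @ frame).reshape(side, side), compute_uv=False)
    return int(np.floor(s[0] / step)) % 2
```

In this sketch a smaller `step` changes the frame less but leaves less margin against distortion of the marked signal, which is the usual imperceptibility versus robustness trade-off.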


Authors and Affiliations

Aniruddha Kanhe
Aghila Gnanasekaran

Authors and Affiliations

Rana M. Nassar (1)
Ashraf A. M. Khalaf (1)
Ghada M. El-Banby (2)
Fathi E. Abd El-Samie (3, 4)
Aziza I. Hussein (5)
Walid El-Shafai (3, 6)

  1. Department of Electrical Engineering, Faculty of Engineering, Minia University, Minia 61111, Egypt
  2. Department of Industrial Electronics and Control Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf 32952, Egypt
  3. Department of Electronics and Electrical Communications Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf 32952, Egypt
  4. Department of Information Technology, College of Computer and Information Sciences, Princess Nourah Bint Abdurrahman University, Riyadh 84428, Saudi Arabia
  5. Electrical and Computer Engineering Department, Effat University, Jeddah, Kingdom of Saudi Arabia
  6. Security Engineering Laboratory, Department of Computer Science, Prince Sultan University, Riyadh 11586, Saudi Arabia

Abstract

In building speech-recognition-based applications, robustness to different noisy background conditions is an important challenge. In this paper, a bimodal approach is proposed to improve the robustness of a Hindi speech recognition system. The importance of different types of visual features is also studied for an audio-visual automatic speech recognition (AVASR) system under diverse noisy audio conditions. Four sets of visual features are reported, based on the Two-Dimensional Discrete Cosine Transform (2D-DCT), Principal Component Analysis (PCA), the Two-Dimensional Discrete Wavelet Transform followed by DCT (2D-DWT-DCT), and the Two-Dimensional Discrete Wavelet Transform followed by PCA (2D-DWT-PCA). The audio features are extracted using Mel Frequency Cepstral Coefficients (MFCC) followed by static and dynamic features. Overall, 48 features, i.e. 39 audio features and 9 visual features, are used for measuring the performance of the AVASR system. The performance of the AVASR system is also evaluated on noisy speech signals generated using the NOISEX database, for signal-to-noise ratios (SNR) from 30 dB to −10 dB, using the Aligarh Muslim University Audio Visual (AMUAV) Hindi corpus. The AMUAV corpus is a high-quality audio-visual database of continuous Hindi speech, consisting of Hindi sentences spoken by different subjects.
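
The abstract does not describe how the noisy test signals were produced beyond the SNR range, so the sketch below only shows the standard way of scaling a noise recording so that the mixture reaches a target SNR. The function names are hypothetical, and any noise sequence of the same length as the speech can stand in for a NOISEX recording.

```python
import math

def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so that the speech-to-noise power
    ratio of the mixture equals snr_db (in decibels).

    Assumes both sequences have the same length and non-zero power.
    """
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    # Solve p_speech / (gain**2 * p_noise) = 10 ** (snr_db / 10) for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [x + gain * n for x, n in zip(speech, noise)]

def measured_snr_db(speech, noisy):
    """SNR in dB of a mixture, given the clean speech it was built from."""
    p_speech = sum(x * x for x in speech)
    p_resid = sum((y - x) ** 2 for x, y in zip(speech, noisy))
    return 10 * math.log10(p_speech / p_resid)
```

Calling `mix_at_snr` once per SNR value, for example at 30, 20, 10, 0 and −10 dB, gives a set of progressively noisier copies of the same utterance for evaluation.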

Authors and Affiliations

Prashant Upadhyaya
Omar Farooq
M.R. Abidi
Priyanka Varshney
