Search results

Number of results: 2

Abstract

A robust and highly imperceptible audio watermarking technique is presented to secure the electronic patient records of patients affected by Parkinson’s Disease (PD). The proposed DCT-SVD based watermarking technique introduces minimal changes to the speech signal, so that the accuracy of classifying PD-affected speech versus healthy speech is retained. To achieve high imperceptibility, the watermark is embedded in the voiced part of the speech. The technique is shown to be robust to common signal processing attacks. Its practicability is tested by creating an Android application that records and watermarks the speech signal. PD-affected speech is classified using a Support Vector Machine (SVM) classifier on a cloud server.
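The abstract does not describe the embedding rule, so the following is only a minimal sketch of the general DCT-plus-singular-value idea, not the authors' method. It assumes one bit per frame, treats the frame's DCT coefficients as an N×1 matrix (whose only singular value equals its Euclidean norm), and hides the bit by quantizing that value. The function names, the `step` parameter, and the quantization scheme are all hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    return [
        math.sqrt((1.0 if k == 0 else 2.0) / n)
        * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        for k in range(n)
    ]


def idct(c):
    """Inverse of the orthonormal DCT-II above (a DCT-III)."""
    n = len(c)
    return [
        sum(
            math.sqrt((1.0 if k == 0 else 2.0) / n)
            * c[k]
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k in range(n)
        )
        for i in range(n)
    ]


def singular_value(coeffs):
    """Single singular value of the N x 1 coefficient matrix (its norm)."""
    return math.sqrt(sum(c * c for c in coeffs))


def embed_bit(frame, bit, step=0.05):
    """Hide one bit in a frame by quantizing the singular value (assumed scheme)."""
    coeffs = dct(frame)
    sigma = singular_value(coeffs)
    if sigma == 0.0:
        raise ValueError("cannot embed in a silent frame")
    # Bit 0 lands in the lower half of a quantization cell, bit 1 in the upper half.
    target = step * (math.floor(sigma / step) + (0.75 if bit else 0.25))
    return idct([c * target / sigma for c in coeffs])


def extract_bit(frame, step=0.05):
    """Recover the bit from the position of the singular value inside its cell."""
    sigma = singular_value(dct(frame))
    return 1 if (sigma % step) >= step / 2 else 0
```

Because the orthonormal DCT preserves the norm, rescaling the coefficients changes every sample by the same small factor, which is what keeps the distortion low in this toy version. A larger `step` would make the bit survive stronger distortion at the cost of a more audible change.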

Abstract

In building speech recognition based applications, robustness to different noisy background conditions is an important challenge. In this paper, a bimodal approach is proposed to improve the robustness of a Hindi speech recognition system. The importance of different types of visual features is also studied for an audio-visual automatic speech recognition (AVASR) system under diverse noisy audio conditions. Four sets of visual features are reported, based on the Two-Dimensional Discrete Cosine Transform (2D-DCT), Principal Component Analysis (PCA), the Two-Dimensional Discrete Wavelet Transform followed by DCT (2D-DWT-DCT), and the Two-Dimensional Discrete Wavelet Transform followed by PCA (2D-DWT-PCA). The audio features are extracted using Mel Frequency Cepstral Coefficients (MFCC) followed by static and dynamic features. Overall, 48 features (39 audio features and 9 visual features) are used for measuring the performance of the AVASR system. The performance of the AVASR system is also evaluated on noisy speech signals generated using the NOISEX database at different signal-to-noise ratios (SNR: 30 dB to −10 dB), using the Aligarh Muslim University Audio Visual (AMUAV) Hindi corpus. The AMUAV corpus is a high-quality audio-visual database of continuous Hindi speech, consisting of Hindi sentences spoken by different subjects.
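Generating noisy speech at a chosen SNR usually comes down to scaling the noise before adding it. The sketch below shows that arithmetic under the standard definition SNR = 10·log10(P_speech / P_noise). The abstract does not say how the mixing was done, so the function names and the plain-list signal representation are assumptions.

```python
import math


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    noise_power = power(noise)
    if noise_power == 0.0:
        raise ValueError("noise is silent")
    # Solve 10*log10(P_speech / (gain^2 * P_noise)) = snr_db for gain.
    gain = math.sqrt(power(speech) / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR of a mixture, computed from the clean speech and the residual."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))
```

Calling `mix_at_snr(speech, noise, snr_db)` for values between 30 and −10 covers the range quoted in the abstract; the intermediate levels used in the paper are not stated there.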
