Search for: [References = "Hinton G (2006), A fast learning algorithm for deep belief nets, Neural Computation, 18, 1527, doi.org/10.1162/neco.2006.18.7.1527"]

Search results

Number of results: 2

Deep Belief Neural Networks and Bidirectional Long-Short Term Memory Hybrid for Speech Recognition

Łukasz Brocki, Krzysztof Marasek

Archives of Acoustics | 2015 | vol. 40 | No 2 | 191-195 | DOI: 10.1515/aoa-2015-0021

Keywords: deep belief neural networks; long-short term memory; bidirectional recurrent neural networks; speech recognition; large vocabulary continuous speech recognition

Abstract

This paper describes a Deep Belief Neural Network (DBNN) and Bidirectional Long-Short Term Memory (LSTM) hybrid used as an acoustic model for Speech Recognition. Many independent researchers have demonstrated that DBNNs exhibit superior performance to other known machine learning frameworks in terms of speech recognition accuracy. Their superiority comes from the fact that these are deep learning networks. However, a trained DBNN is simply a feed-forward network with no internal memory, unlike Recurrent Neural Networks (RNNs), which are Turing complete and do possess internal memory, thus allowing them to make use of longer context. In this paper, an experiment is performed to make a hybrid of a DBNN with an advanced bidirectional RNN used to process its output. Results show that the use of the new DBNN-BLSTM hybrid as the acoustic model for Large Vocabulary Continuous Speech Recognition (LVCSR) increases word recognition accuracy. However, the new model has many parameters and in some cases it may suffer performance issues in real-time applications.

Laughter Classification Using Deep Rectifier Neural Networks with a Minimal Feature Subset

Gábor Gosztolya, András Beke, Tilda Neuberger, László Tóth

Archives of Acoustics | 2016 | vol. 41 | No 4 | 669-682 | DOI: 10.1515/aoa-2016-0064

Keywords: speech recognition; speech technology; computational paralinguistics; laughter detection; deep neural networks

Abstract

Laughter is one of the most important paralinguistic events, and it has specific roles in human conversation. The automatic detection of laughter occurrences in human speech can aid automatic speech recognition systems as well as some paralinguistic tasks such as emotion detection. In this study we apply Deep Neural Networks (DNN) for laughter detection, as this technology is nowadays considered state-of-the-art in similar tasks like phoneme identification. We carry out our experiments using two corpora containing spontaneous speech in two languages (Hungarian and English). Also, since it is reasonable to expect that not all frequency regions are required for efficient laughter detection, we perform feature selection to find a sufficient feature subset.
