Search results

Number of results: 5

Abstract

At present, most existing target detection algorithms use region proposals to search for the target in the image. The most effective region proposal methods usually require thousands of target prediction areas to achieve a high recall rate, which lowers detection efficiency. Even though recent region proposal network approaches have yielded good results using only hundreds of proposals, they still struggle with small objects and precise localization, mainly because they use coarse features. Therefore, we propose a new method that extracts more efficient global and multi-scale features to improve target detection performance. Because feature maps under continuous convolution lose the resolution required to detect small objects as they gain deeper semantic information, we use rolling convolution (RC) to maintain the high resolution of low-level feature maps and explore objects in greater detail, even without a structure dedicated to combining the features of multiple convolutional layers. Furthermore, we use a recurrent neural network of multiple gated recurrent units (GRUs) on top of the convolutional layer to highlight useful global context locations that assist in detecting objects. In experiments on benchmark datasets, the proposed method achieved 78.2% mAP on PASCAL VOC 2007 and 72.3% mAP on PASCAL VOC 2012. These experiments indicate that the method reaches a more advanced level of detection.
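As a rough illustration of the gated recurrent unit mentioned in the abstract above, the following is a minimal scalar sketch of one GRU update. It is not the authors' network: the weight names (`wz`, `uz`, `bz`, and so on) are hypothetical, the state is a single number rather than a feature map, and the blending convention `(1 - z) * h + z * candidate` is one of two common conventions (some libraries swap the roles of `z` and `1 - z`).

```python
import math


def sigmoid(v):
    """Logistic function used by both GRU gates."""
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h, w):
    """One scalar GRU update.

    x: current input, h: previous hidden state,
    w: dict of hypothetical scalar weights and biases.
    """
    # Update gate: how much of the new candidate to let in.
    z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])
    # Reset gate: how much of the old state feeds the candidate.
    r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])
    # Candidate state built from the input and the reset-scaled old state.
    candidate = math.tanh(w["wh"] * x + w["uh"] * (r * h) + w["bh"])
    # Blend old state and candidate (convention assumed here).
    return (1.0 - z) * h + z * candidate


def gru_run(xs, h0, w):
    """Apply gru_step over a sequence and return all hidden states."""
    states = []
    h = h0
    for x in xs:
        h = gru_step(x, h, w)
        states.append(h)
    return states


ZERO_WEIGHTS = {k: 0.0 for k in
                ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}

if __name__ == "__main__":
    # With all-zero weights both gates are 0.5 and the candidate is 0,
    # so each step simply halves the hidden state.
    print(gru_run([1.0, 1.0, 1.0], 0.8, ZERO_WEIGHTS))
```

In a detector of the kind described, such units would run over vectors taken from convolutional feature maps; the scalar form only shows how the gates trade off old state against new input.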

Authors and Affiliations

WenQing Huang
MingZhu Huang
YaMing Wang

Abstract

The paper examines the use of a Convolutional Bidirectional Recurrent Neural Network (CBRNN) for the problem of quality measurement in music content. The key contribution of this approach, compared to existing research, is that the examined model is evaluated on detecting acoustic anomalies without requiring a reference (clean) signal. Since real music content may include instrumental sounds, speech, singing voice, and various audio effects, it is more complex to analyze than clean speech or artificial signals, especially without a comparison to known reference content. The presented results may be treated as a proof of concept, since the paper covers only some specific types of artefacts (examples of quantization defect, missing sound, distortion of gain characteristics, and extra noise). However, the described model can easily be extended to detect other impairments or used as a pre-trained model in other transfer learning processes. Several experiments were performed to examine the model's efficiency and are reported in the paper. The raw audio samples were transformed into Mel-scaled spectrograms and passed as input to the model, first on their own and then together with additional features (Zero Crossing Rate, Spectral Contrast). According to the obtained results, overall accuracy increases significantly (by 10.1%) when Spectral Contrast information is provided together with the Mel-scaled spectrograms. The paper also examines the influence of recurrent layers on the effectiveness of the artefact classification task.
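Of the additional features named in the abstract above, Zero Crossing Rate is simple enough to sketch directly. The function below is a minimal illustration, not the paper's feature pipeline: the function name is hypothetical, zero is treated as non-negative, and the count is normalized by the number of adjacent sample pairs (other tools normalize by frame length instead).

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero is treated as non-negative (an assumption; conventions vary).
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1
        for prev, cur in zip(samples, samples[1:])
        if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(samples) - 1)


if __name__ == "__main__":
    frame = [0.5, -0.2, 0.1, 0.3, -0.4]
    # Signs are +, -, +, +, - so three of the four pairs cross zero.
    print(zero_crossing_rate(frame))  # 0.75
```

Noisy or distorted frames tend to cross zero more often than clean tonal ones, which is one reason such a feature might be tried alongside spectrograms.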

Authors and Affiliations

Kamila Organiściak
Józef Borkowski

Abstract

In this work, two robust zeroing neural network (RZNN) models are presented for fast online solution of the dynamic Sylvester equation (DSE), each introducing a novel power-versatile activation function (PVAF). Unlike most zeroing neural network (ZNN) models activated by recently reported activation functions (AFs), both PVAF-based RZNN models achieve predefined-time convergence in environments polluted by noise and disturbance. Compared with exponentially convergent and finite-time convergent ZNN models, the most important improvement of the proposed RZNN models is their fixed-time convergence. Their effectiveness and stability are analyzed in theory and demonstrated through numerical and experimental examples.
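The general ZNN idea behind the abstract above can be sketched on a scalar problem. This is a simplified illustration under several assumptions, not the paper's RZNN: the equation is taken in the form a(t)x(t) + x(t)b(t) = c(t), the design rule is de/dt = -gamma * phi(e) for the residual e, integration uses plain forward Euler, and the two activations shown (linear and a sign-bi-power stand-in) are not the PVAFs proposed by the authors. All function and parameter names are hypothetical.

```python
import math


def linear(e):
    """Linear activation: gives exponential decay of the residual."""
    return e


def sign_bi_power(e, p=0.5):
    """Stand-in nonlinear activation (not the paper's PVAF)."""
    s = (e > 0) - (e < 0)
    return s * (abs(e) ** p + abs(e) ** (1.0 / p))


def solve_scalar_dse(a, da, b, db, c, dc, x0,
                     gamma=10.0, dt=1e-3, t_end=2.0, activation=linear):
    """Track x(t) with a(t)x + x b(t) = c(t) using the ZNN design rule.

    a, b, c are functions of t; da, db, dc are their time derivatives.
    Returns the final state x and the final time t.
    """
    x, t = x0, 0.0
    for _ in range(int(round(t_end / dt))):
        s = a(t) + b(t)
        e = s * x - c(t)  # residual to be driven to zero
        # From de/dt = (da + db) x + (a + b) dx/dt - dc = -gamma * phi(e):
        dx = (-gamma * activation(e) - (da(t) + db(t)) * x + dc(t)) / s
        x += dt * dx
        t += dt
    return x, t


def residual(a, b, c, x, t):
    """Left-hand side minus right-hand side at time t."""
    return (a(t) + b(t)) * x - c(t)


# Hypothetical time-varying coefficients; a(t) + b(t) stays positive.
def a_fn(t): return 2.0 + math.sin(t)
def da_fn(t): return math.cos(t)
def b_fn(t): return 2.0 + math.cos(t)
def db_fn(t): return -math.sin(t)
def c_fn(t): return math.cos(t)
def dc_fn(t): return -math.sin(t)


if __name__ == "__main__":
    x, t = solve_scalar_dse(a_fn, da_fn, b_fn, db_fn, c_fn, dc_fn, x0=1.0)
    print("final residual:", residual(a_fn, b_fn, c_fn, x, t))
```

With the linear activation the residual only decays exponentially; the abstract's claim is that its PVAFs give predefined-time convergence even under noise, which this sketch does not attempt to reproduce.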

Authors and Affiliations

Peng Zhou (1)
Mingtao Tan (2)

  1. College of Electronic Information and Automation, Guilin University of Aerospace Technology, Guilin 541004, China
  2. School of Computer and Electrical Engineering, Hunan University of Arts and Science, Changde 415000, China

Abstract

We propose an approach to indirectly learn the Web Ontology Language OWL 2 property characteristics as an explanation for a deep recurrent neural network (RNN). The input is a knowledge graph represented in the Resource Description Framework (RDF), and the output is a set of scored axioms representing the characteristics. The proposed method is capable of learning all the characteristics included in OWL 2: functional, inverse functional, reflexive and irreflexive, symmetric and asymmetric, and transitive. We report and discuss an experimental evaluation on DBpedia 2016-10, showing that the proposed approach has advantages over a simple counting baseline.
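The abstract above does not spell out its counting baseline, so the following is only a guess at what such a baseline could look like for two of the listed characteristics. Triples are plain `(subject, property, object)` tuples, and the function names, scoring rules, and sample data are all hypothetical.

```python
from collections import defaultdict


def symmetry_score(triples, prop):
    """Share of (s, o) pairs for prop whose reverse (o, s) also occurs."""
    pairs = {(s, o) for s, p, o in triples if p == prop}
    if not pairs:
        return 0.0
    return sum(1 for s, o in pairs if (o, s) in pairs) / len(pairs)


def functional_score(triples, prop):
    """Share of subjects that have exactly one distinct object for prop."""
    objects_by_subject = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            objects_by_subject[s].add(o)
    if not objects_by_subject:
        return 0.0
    single = sum(1 for objs in objects_by_subject.values() if len(objs) == 1)
    return single / len(objects_by_subject)


# Made-up toy graph, for illustration only.
SAMPLE_TRIPLES = [
    ("alice", "spouse", "bob"),
    ("bob", "spouse", "alice"),
    ("carol", "spouse", "dave"),
    ("alice", "birthPlace", "Paris"),
    ("bob", "birthPlace", "Rome"),
    ("bob", "birthPlace", "Lazio"),
]

if __name__ == "__main__":
    print(symmetry_score(SAMPLE_TRIPLES, "spouse"))
    print(functional_score(SAMPLE_TRIPLES, "birthPlace"))
```

Scores of this kind reflect only what is explicitly stated in the graph, so missing reverse triples lower the symmetry score even for a property that is symmetric by intent.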

Authors and Affiliations

J. Potoniec

Abstract

Safety and security have always been a top priority in people’s lives, and having a surveillance system at home keeps people and their property more secure. In this paper, an audio surveillance system is proposed that performs both the detection and the localization of audio (sound) events. The combined task of detecting and localizing audio events is known as Sound Event Localization and Detection (SELD). SELD in this work is performed by a Convolutional Recurrent Neural Network (CRNN) architecture. The CRNN is a stack of convolutional neural network (CNN), recurrent neural network (RNN), and fully connected neural network (FNN) layers. It takes multichannel audio as input, extracts features, and performs the detection and localization of the input audio events in parallel. The SELD results obtained by the CRNN with gated recurrent units (GRU) and with long short-term memory (LSTM) units are compared and discussed in this paper. The CRNN with LSTM units gives a 75% F1 score and 82.8% frame recall for one overlapping sound. Therefore, the proposed audio surveillance system that uses LSTM units produces better detection and overall performance for one overlapping sound.
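The two metrics quoted in the abstract above can be sketched as follows. The F1 score is the standard harmonic mean of precision and recall. For frame recall, the sketch assumes the definition common in SELD work (the share of frames in which the estimated number of active sources equals the reference number); whether the paper uses exactly this definition is an assumption. Function names and sample numbers are hypothetical.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def frame_recall(ref_counts, est_counts):
    """Share of frames where the estimated source count matches the reference.

    ref_counts, est_counts: per-frame numbers of active sources.
    """
    if len(ref_counts) != len(est_counts):
        raise ValueError("reference and estimate must cover the same frames")
    if not ref_counts:
        return 0.0
    matches = sum(1 for r, e in zip(ref_counts, est_counts) if r == e)
    return matches / len(ref_counts)


if __name__ == "__main__":
    # Made-up counts: 3 true positives, 1 false positive, 1 false negative.
    print(f1_score(3, 1, 1))  # 0.75
    # Made-up per-frame source counts; three of four frames agree.
    print(frame_recall([1, 1, 0, 1], [1, 0, 0, 1]))  # 0.75
```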

Authors and Affiliations

V. S. Suruthhi (1)
V. Smita (1)
Rolant Gini J. (1)
K.I. Ramachandran (2)

  1. Department of Electronics and Communication Engineering, Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, India
  2. Centre for Computational Engineering & Networking (CEN), Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, India
