Deep learning vs feature engineering in the assessment of voice signals for diagnosis in Parkinson’s disease

Journal title

Bulletin of the Polish Academy of Sciences: Technical Sciences








Majda-Zdancewicz, Ewelina : Faculty of Electronics, Military University of Technology, ul. Gen. Sylwestra Kaliskiego 2, 00-908 Warsaw, Poland ; Potulska-Chromik, Anna : Department of Neurology, Medical University of Warsaw, ul. Banacha 1a, 02-097 Warsaw, Poland ; Jakubowski, Jacek : Faculty of Electronics, Military University of Technology, ul. Gen. Sylwestra Kaliskiego 2, 00-908 Warsaw, Poland ; Nojszewska, Monika : Department of Neurology, Medical University of Warsaw, ul. Banacha 1a, 02-097 Warsaw, Poland ; Kostera-Pruszczyk, Anna : Department of Neurology, Medical University of Warsaw, ul. Banacha 1a, 02-097 Warsaw, Poland



voice processing ; Parkinson’s disease ; non-linear analysis ; convolutional networks

Divisions of PAS

Technical Sciences




  1.  Y.D. Kumar and A.M. Prasad, “MEMS accelerometer system for tremor analysis”, Int. J Adv. Eng. Global Technol. 2(5), 685‒693 (2014).
  2.  P. Pierleoni, “A Smart Inertial System for 24h Monitoring and Classification of Tremor and Freezing of Gait in Parkinson’s Disease”, IEEE Sens. J. 19(23), 11612‒11623 (2019).
  3.  W. Pawlukowska, K. Honczarenko, and M. Gołąb-Janowska, “Nature of speech disorders in Parkinson disease”, Pol. Neurol. Neurosurg. 47(3), 263‒269 (2013), [in Polish].
  4.  S.A. Factor, Parkinson’s Disease: Diagnosis & Clinical Management, 2nd Edition, 2002.
  5.  R. Chiaramonte and M. Bonfiglio, “Acoustic analysis of voice in Parkinson’s disease: a systematic review of voice disability and meta-analysis of studies”, Rev. Neurologia 70(11), 393‒405 (2020).
  6.  J. Mekyska et al., “Robust and complex approach of pathological speech signal analysis”, Neurocomputing 167, 94‒111 (2015).
  7.  B. Erdogdu Sakar, G. Serbes, and C. Sakar, “Analyzing the effectiveness of vocal features in early telediagnosis of Parkinson’s disease”, PLoS One 12(8) (2017).
  8.  L. Berus, S. Klancnik, M. Brezocnik, and M. Ficko, “Classifying Parkinson’s Disease Based on Acoustic Measures Using Artificial Neural Networks”, Sensors (Basel) 19(1), 16 (2019).
  9.  L. Jeancolas et al., “Automatic detection of early stages of Parkinson’s disease through acoustic voice analysis with mel-frequency cepstral coefficients”, 2017 International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), 2017, pp. 1‒6.
  10.  D.A. Rahn, M. Chou, J.J. Jiang, and Y. Zhang, “Phonatory impairment in Parkinson’s disease: evidence from nonlinear dynamic analysis and perturbation analysis”, J. Voice 21, 64‒71 (2007).
  11.  J. Kurek, B. Świderski, S. Osowski, M. Kruk, and W. Barhoumi, “Deep learning versus classical neural approach to mammogram recognition”, Bull. Pol. Acad. Sci. Tech. Sci. 66(6), 831‒840 (2018).
  12.  S. Sivaranjini and C.M. Sujatha, “Deep learning based diagnosis of Parkinson’s disease using convolutional neural network”, Multimed. Tools Appl. 79, 15467–15479 (2020).
  13.  M. Wodziński, A. Skalski, D. Hemmerling, J.R. Orozco-Arroyave, and E. Noth, “Deep Learning Approach to Parkinson’s Disease Detection Using Voice Recordings and Convolutional Neural Network Dedicated to Image Classification”, 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2019, pp. 717‒720.
  14.  J. Chmielińska, K. Białek, A. Potulska-Chromik, J. Jakubowski, E. Majda-Zdancewicz, M. Nojszewska, A. Kostera-Pruszczyk and A. Dobrowolski, “Multimodal data acquisition set for objective assessment of Parkinson’s disease”, Proc. SPIE 11442, Radioelectronic Systems Conference 2019, 114420F (2020).
  15.  M. Kuhn and K. Johnson, Applied predictive modeling, New York: Springer, 2013.
  16.  P. Liang, C. Deng, J. Wu, Z. Yang, and J. Zhu, “Intelligent Fault Diagnosis of Rolling Element Bearing Based on Convolutional Neural Network and Frequency Spectrograms”, 2019 IEEE International Conference on Prognostics and Health Management (ICPHM), San Francisco, USA, 2019, pp. 1‒5.
  17.  M.S. Wibawa, I.M.D. Maysanjaya, N.K.D.P. Novianti, and P.N. Crisnapati, “Abnormal Heart Rhythm Detection Based on Spectrogram of Heart Sound using Convolutional Neural Network”, 2018 6th International Conference on Cyber and IT Service Management (CITSM), Parapat, Indonesia, 2019, pp. 1‒4.
  18.  M. Curilem, J.P. Canário, L. Franco, and R.A. Rios, “Using CNN To Classify Spectrograms of Seismic Events From Llaima Volcano (Chile)”, 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 2018, pp. 1‒8.
  19.  D. Rethage, J. Pons and X. Serra, “A Wavenet for Speech Denoising”, 2018 IEEE Int. Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018, pp. 5069‒5073.
  20.  A. Krizhevsky, I. Sutskever, and G.E. Hinton, “Imagenet classification with deep convolutional neural networks”, Neural Information Processing Systems, 2012.
  21.  J. Jakubowski and J. Chmielińska, “Detection of driver fatigue symptoms using transfer learning”, Bull. Pol. Acad. Sci. Tech. Sci. 66(6), 869‒874 (2018).
  22.  A. Benba, A. Jilbab, and A. Hammouch, “Voice analysis for detecting persons with Parkinson’s disease using MFCC and VQ”, International conference on circuits, systems and signal processing (ICCSSP’14), Russia, 2014.
  23.  E. Niebudek-Bogusz, J. Grygiel, P. Strumiłło, and M. Śliwińska-Kowalska, “Nonlinear acoustic analysis in the evaluation of occupational voice disorders”, Occupational Medicine, 64(1), 29–35 (2013), [in Polish].
  24.  E. Majda and A.P. Dobrowolski, “Modeling and optimization of the feature generator for speaker recognition systems”, Electr. Rev. 88(12A), 131‒136 (2012).
  25.  Y. Maryn, N. Roy, M. De Bodt, P.B. van Cauwenberge, and P. Corthals, “Acoustic measurement of overall voice quality: a meta-analysis”, J. Acoust. Soc. Am. 126(5), 2619‒2634 (2009), doi: 10.1121/1.3224706.
  26.  E. Niebudek-Bogusz, J. Grygiel, P. Strumiłło, and M. Śliwińska-Kowalska, “Mel cepstral analysis of voice in patients with vocal nodules”, Otorhinolaryngology 10(4), 176‒181 (2011), [in Polish].
  27.  A. Krysiak, “Language, speech and communication disorders in Parkinson’s disease”, Neuropsychiatr. Neuropsychol. 6(1), 36–42 (2011), [in Polish].
  28.  F. Alías, J.C. Socoró, and X. Sevillano, “A Review of Physical and Perceptual Feature Extraction Techniques for Speech, Music and Environmental Sounds”, Appl. Sci. 6(5), 143 (2016).
  29.  X. Valero and F. Alias, “Gammatone Cepstral Coefficients: Biologically Inspired Features for Non-Speech Audio Classification”, IEEE Trans. Multimedia 14(6), 1684‒1689 (2012).
  30.  M. Slaney, “An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank”, Apple Computer Technical Report #35, 1993.
  31.  D.M. Agrawal, H.B. Sailor, M.H. Soni, and H.A. Patil, “Novel TEO-based gammatone features for environmental sound classification”, 2017 25th European Signal Processing Conference (EUSIPCO), Kos, Greece, 2017, pp. 1809‒1813.
  32.  S. Russell and P. Norvig, Artificial intelligence – a modern approach, Upper Saddle River: Pearson Education, 2010.
  33.  A. Chatzimparmpas, R.M. Martins, K. Kucher, and A. Kerren, “StackGenVis: Alignment of Data, Algorithms, and Models for Stacking Ensemble Learning Using Performance Metrics”, IEEE Transactions on Visualization and Computer Graphics 27(2), 1547‒1557 (2021), doi: 10.1109/TVCG.2020.3030352.






DOI: 10.24425/bpasts.2021.137347


Bulletin of the Polish Academy of Sciences: Technical Sciences; Early Access; e137347