Modeling and Predicting the Changes in Hearing Loss of Workers with the Use of a Neural Network Data Mining Algorithm: A Field Study



Introduction
Exposure to noise is a highly prevalent occupational health risk factor. Occupational noise exposure is usually characterized by a frequency-weighted sound level, normalized to an 8-hour working day (LEX,8h). It is estimated that over 60 million people around the world are exposed to workplace noise exceeding the permissible limit of 85 dB for an eight-hour working day (Zare et al., 2019). Exposure to excessive noise levels affects workers' health and causes various occupational safety hazards (Nassiri et al., 2016; Safari Variani et al., 2018). One of the most adverse health effects of noise exposure is noise-induced hearing loss (NIHL).
Worldwide, NIHL is a highly prevalent occupational hearing hazard. The World Health Organization (WHO) has estimated that over 12% of the world population is affected by NIHL and that NIHL is the second most common cause of hearing loss among older adults (Zare et al., 2019). In the USA, 7.4-10.2 million industrial workers are at risk of NIHL. In Sweden, 100 million US dollars is spent annually on compensation for the hearing-impaired (Ahmed et al., 2001). In Malaysia, 26.9% of industry workers have a hearing loss in the 3000-6000 Hz frequency range (Leong, 2003), and in Iran, at least 1 million workers are subjected to excessive noise exposure (Golmohammadi et al., 2006).
In industry, noise pollution control is mainly based on the measurement of LEX,8h (Golmohammadi et al., 2006). The estimation of occupational health hazards caused by noise exposure involves a variety of factors, such as the worker's age and duration of work experience, the type of noise, the noise exposure duration, the noise frequency spectrum, the number of workers affected by noise, and the duration of the working shift (Zamanian et al., 2012; 2013; Tajic et al., 2008). Knowledge of the interrelationships between individual hearing risk factors greatly helps in designing and undertaking hearing protection activities in the workplace. The present study had the following main objectives: (1) to identify the predictor factors of hearing loss in the industry, (2) to determine the hearing loss of workers across both ears, (3) to model and predict the changes in hearing loss with the use of neural network algorithms, and (4) to assess the accuracy and error rates of the hearing loss models developed in the study.
The study was conducted with the use of data mining and artificial neural network modelling. Data mining (DM) is the process of extracting valid, reliable information from databases and transforming the information of interest into a form suitable for use in a given application or activity. Data mining is the analysis step of the Knowledge Discovery in Databases (KDD) process (Badr et al., 2009). Artificial neural networks are computational information-processing paradigms modeled after the human brain, designed to recognize patterns and relationships between the components and parameters of the system under investigation (Kohzadi et al., 1995). The advantage of artificial neural networks is that they learn directly from the data, without any need to estimate the statistical characteristics of the data. Neural networks make it possible to uncover the relationship between a set of inputs and outputs and to predict the output corresponding to a given input, without any initial assumptions or prior knowledge about the relationships between the studied parameters (Golabi et al., 2013).

Objective of the study
The study was a cross-sectional investigation aimed at monitoring and predicting, with the use of artificial neural network modeling, the development of hearing loss of workers in a mineral company and an industrial company. After determining the factors influencing hearing loss, we sought to determine the impact and the weight of each individual factor. The successive stages of the study were as follows: 1) selection of predictor factors for hearing loss modeling, 2) audiometric testing of both ears, [...]
The subjects in each group were categorized into three ranges of work experience duration: less than 10 years, 10-20 years, and more than 20 years of service (Majumder, Sharma, 2014).

Modeling the development of hearing loss on the basis of an artificial neural network

Input and output encoding
The robustness of neural networks to unforeseen pattern variations in new data sets is regarded as their strength. Their downside, however, is that attribute values must be encoded in a standardized way: all attributes, including the categorical ones, must take values ranging from 0 to 1. Eq. (3) gives the calculation, a min-max normalization (Larose, Larose, 2014):
$$X^{*} = \frac{X - \min(X)}{\max(X) - \min(X)} \tag{3}$$
When the classes are clearly ordered, a single output node can be used. For instance, one can imagine classifying reading ability in elementary schools on the basis of a collection of student attributes. The successive reading level categories may be defined as follows: first grade: 0 ≤ output < 0.25, second grade: 0.25 ≤ output < 0.50, third grade: 0.50 ≤ output < 0.75, fourth grade: output ≥ 0.75 (Larose, Larose, 2014).
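The encoding step and the single-output-node classification described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`min_max_normalize`, `reading_grade`) and the sample ages are assumptions made for the example, not part of the study, and the normalization shown is the standard min-max form.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the 0-1 range (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute has zero range; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def reading_grade(output):
    """Map a single output-node value to one of four ordered classes."""
    if output < 0.25:
        return 1
    if output < 0.50:
        return 2
    if output < 0.75:
        return 3
    return 4


# Hypothetical ages in years, used only for illustration.
ages = [25, 30, 45, 60]
normalized_ages = min_max_normalize(ages)

# Hypothetical network outputs, including two values on class boundaries.
grades = [reading_grade(o) for o in (0.10, 0.25, 0.60, 0.75)]
```

The smallest value maps to 0 and the largest to 1, so every attribute enters the network on the same scale regardless of its original units.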

Neural networks for estimation and prediction
Since neural networks produce a continuous output, they are typically used for estimation and prediction. The network output is converted back to the original scale using Eq. (4):
$$\text{prediction} = \text{output} \times (\text{data range}) + \text{minimum} \tag{4}$$
where output is the neural network output in the (0, 1) range, data range is the range of the original attribute values on the non-normalized scale, and minimum is the lowest attribute value on the non-normalized scale (Larose, Larose, 2014).
A neural network comprises a layered, feedforward, completely connected network of artificial neurons, or nodes. The feedforward nature of the network restricts it to a single direction of flow, with no looping or cycling permitted. The neural network consists of two or more layers; nonetheless, the majority of networks comprise three layers: an input layer, a hidden layer, and an output layer. There may be more than one hidden layer. In most cases, however, networks contain only one hidden layer, which is sufficient for most applications. Furthermore, the neural network is completely connected, which means that each node in a particular layer is connected to each node in the next layer. On the other hand, the nodes of the same layer are not connected to each other. Each connection between nodes has an associated weight (e.g. $W_{1A}$). At the initialization stage, the weights are given random values between 0 and 1 (Larose, Larose, 2014).
The number and the type of data set attributes commonly determine the number of input nodes. The number of hidden layers and the number of nodes in every hidden layer can be set by the user. Depending on the classification task, the output layer may have more than one node (McCullagh, 2010; Larose, Larose, 2014). The neural network structure is displayed in Fig. 1. The power and the flexibility of the network are related to the number of nodes in the hidden layer. A large number of nodes in the hidden layer may cause overfitting, in which the network memorizes the training set at the expense of generalizability to the validation set. If overfitting occurs, one may reduce the number of nodes in the hidden layer. On the other hand, when the training accuracy is too low, the number of nodes in the hidden layer may be increased (Larose, Larose, 2014).
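As a small illustration of how the quantities named above combine, the following Python sketch converts a network output back to the original scale. The function name `denormalize` and the numeric values are hypothetical and serve only to show the arithmetic.

```python
def denormalize(output, minimum, data_range):
    """Convert a network output in the 0-1 range back to the original scale."""
    return output * data_range + minimum


# Hypothetical attribute with a minimum of 10 and a range of 60 (i.e. 10 to 70).
prediction = denormalize(0.5, minimum=10.0, data_range=60.0)
```

An output of 0 returns the minimum of the original scale, and an output of 1 returns the maximum (minimum plus data range).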
Data set inputs (e.g. attribute values) are fed into the input layer and passed on to the hidden layer with no processing. As a result, the input layer nodes do not have the same node structure as the hidden layer and output layer nodes. The node inputs and the connection weights are combined into a single scalar value, known as net, through a combination function, which is typically a summation (Eq. (5)) (Larose, Larose, 2014):
$$\mathrm{net}_j = \sum_{i} W_{ij} x_{ij} = W_{0j} x_{0j} + W_{1j} x_{1j} + \cdots + W_{Ij} x_{Ij} \tag{5}$$
where $x_{ij}$ indicates the i-th input to node j, $W_{ij}$ is the weight associated with the i-th input to node j, and there are I + 1 inputs to node j.
It is worth mentioning that the inputs from upstream nodes are denoted by $x_1, x_2, \ldots, x_I$, whereas $x_0$ is a constant input which conventionally takes the value of 1 and is analogous to the constant term in regression models. Therefore, every hidden or output layer node (represented by j) has an extra input equal to a particular weight, $W_{0j} x_{0j} = W_{0j}$ (e.g. $W_{0B}$ for node B). Eq. (6) presents the calculation (Larose, Larose, 2014).
In biological neurons, signals are transmitted between two neurons only if the combination of inputs to a neuron exceeds a certain threshold level, which results in firing of the neuron. The firing response is not necessarily linearly related to the increment in input stimulation. This behavior of biological neurons is simulated in artificial neural networks by a nonlinear activation function. The sigmoid function (Eq. (7)) is the most typical activation function (Krogh, Vedelsby, 1995; Larose, Larose, 2014):
$$y = \frac{1}{1 + e^{-x}} \tag{7}$$
where e is the base of the natural logarithm.
Before $\mathrm{net}_Z$ is computed, the node contribution must be evaluated, as shown in Eq. (8). In the next step, node Z combines the outputs from nodes A and B through $\mathrm{net}_Z$, a weighted sum that uses the weights of the connections between these nodes, as shown in Eq. (9) (Larose, Larose, 2014).
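The forward pass described above (a summation combination function followed by a sigmoid activation, first in hidden nodes A and B and then in output node Z) can be sketched in Python as follows. This is a minimal sketch, not the implementation used by IBM SPSS Modeler; the function names and all weight and input values are illustrative assumptions.

```python
import math


def sigmoid(x):
    """Sigmoid activation: maps any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


def net_input(weights, inputs):
    """Combination function: weighted sum of the inputs plus the constant term.

    weights[0] is the weight of the constant input x_0 = 1;
    weights[1:] pair with the upstream inputs.
    """
    return weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))


def forward(inputs, hidden_weights, output_weights):
    """One forward pass through a network with a single hidden layer."""
    hidden_outputs = [sigmoid(net_input(w, inputs)) for w in hidden_weights]
    # The output node receives sigmoid outputs, not raw attribute values.
    return sigmoid(net_input(output_weights, hidden_outputs))


# Illustrative (assumed) normalized inputs and weights for a 3-2-1 network.
inputs = [0.4, 0.2, 0.7]
hidden_weights = [
    [0.5, 0.6, 0.8, 0.6],  # node A: constant weight, then weights for inputs 1-3
    [0.7, 0.9, 0.8, 0.4],  # node B
]
output_weights = [0.5, 0.9, 0.9]  # node Z: constant weight, then weights from A and B

net_a = net_input(hidden_weights[0], inputs)
network_output = forward(inputs, hidden_weights, output_weights)
```

Because the final activation is a sigmoid, `network_output` always lies strictly between 0 and 1 and must be rescaled before it can be read on the original attribute scale.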
It should be noted that the inputs $x_i$ to node Z are not data attribute values but outputs of the sigmoid functions of the upstream nodes.
The accuracy and the error rates of the models were determined from the confusion matrix. The confusion matrix is a square matrix whose dimensions equal the number of output factor groups. The main diagonal of the matrix represents the percentage of cases predicted correctly. According to Eq. (10), the model accuracy is the ratio of correctly predicted cases to the total number of cases.
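The accuracy calculation from a confusion matrix can be illustrated with a short Python sketch. The matrix below holds hypothetical case counts (not percentages, and not data from this study), with rows as observed groups and columns as predicted groups; the function name `accuracy` is likewise an assumption.

```python
def accuracy(confusion):
    """Share of cases on the main diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no cases")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical counts for three output groups (rows: observed, columns: predicted).
matrix = [
    [18, 2, 0],
    [1, 15, 4],
    [0, 3, 7],
]
model_accuracy = accuracy(matrix)
error_rate = 1.0 - model_accuracy
```

In this sketch the error rate is simply the complement of the accuracy, i.e. the share of cases that fall off the main diagonal.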

Data analysis
The data were analyzed with the use of SPSS software, version 18. The mean, standard deviation, and correlation coefficient were calculated, and the relationships between variables were examined by linear regression and a paired t-test. The hearing loss changes were modeled with IBM SPSS Modeler 18.0 software.
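For readers without access to SPSS, the correlation coefficient and the linear regression line mentioned above can be reproduced with a few lines of Python. The sketch below is not the SPSS procedure itself; it uses the textbook Pearson and least-squares formulas, and the age and hearing loss values are hypothetical, chosen only to illustrate the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical ages (years) and hearing loss values (dB), for illustration only.
ages = [30, 40, 50, 60]
hearing_loss = [10, 14, 19, 25]

r = pearson_r(ages, hearing_loss)
slope, intercept = fit_line(ages, hearing_loss)
```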

Age and work experience duration
The age and work experience duration of the workers tested in the study are shown in Table 1. Table 3 shows the correlation between age, duration of work experience, and hearing loss, determined by linear regression for the three groups. The data indicate that the correlations between age and hearing loss, and between duration of work experience and hearing loss, were statistically significant in Groups 1 and 2.

Modeling of hearing loss changes
Five different models (Models 1 to 5) of hearing loss changes were used for hearing loss prediction and modeling. Model 1 (people working in the office) was used for Group 1, Model 2 for Group 2, Model 3 for Group 3, Model 4 (people working in the operating area) for Groups 2 and 3, and Model 5 for Groups 1, 2, and 3. The modeling results obtained with the use of IBM SPSS Modeler 18.0 are shown in Figs 3-7. For a given model, the plots show the percentage weight of each predictor factor of hearing loss. Tables 4-8 present the confusion matrices of the neural network algorithm for the hearing loss models.

Discussion and conclusions
The results shown in Table 3 indicate that the correlations between age and hearing loss, and between duration of work experience and hearing loss, were statistically significant in Groups 1 and 2. The data in Fig. 2 show that there was a significant relationship between the A-weighted sound pressure level of exposure and hearing loss across Groups 1-3.
The data presented in Figs 3-7 show the weight of individual predictor factors in the development of hearing loss in Models 1-5. In Model 1 (LAeq < 70 dB), the hearing threshold at 8 kHz, with a 38% weight, had the maximum impact on hearing loss, while age and the hearing threshold at 1 kHz (6% weight) were the least influential factors (Fig. 3). The prediction accuracy of the neural network algorithm was 100% (Table 4). In Model 2 (LAeq 70-80 dB), the factor with the maximum weight was the hearing threshold at 4 kHz (19%), while age and the threshold at 500 Hz (7% weight) had the minimum impact (Fig. 4). The prediction accuracy of the neural network algorithm was 100% (Table 5). In Model 3 (LAeq > 85 dB), the threshold at 4 kHz, with a 20% weight, had the maximum impact, while duration of work experience had the minimum impact, with a 6% weight (Fig. 5). In this model, the prediction accuracy of the neural network algorithm was 80% (Table 6). In Model 4, determined for Groups 2 (LAeq 70-80 dB) and 3 (LAeq > 85 dB), the maximum impact was observed for the hearing thresholds at 4 kHz and 2 kHz, with 18% weights, and the minimum impact (3% weight) was found for duration of work experience and noise (Fig. 6). The prediction accuracy of the neural network algorithm was 98% (Table 7). In Model 5, determined for Groups 1-3, the threshold at 4 kHz had the maximum impact, with a weight of 18%, and the least impact (1% weight) was found for age (Fig. 7). The prediction accuracy of the neural network algorithm was 99.3% (Table 8).
The present findings, demonstrating the effects of individual predictor factors on the development of hearing loss in industry workers, are in agreement with published studies of hearing loss in steel industry workers, which indicate that hearing loss increases with sound exposure level and duration of workers' service (Golmohammadi et al., 2001; Masumi et al., 2008). Golmohammadi et al. (2006) studied the effect of noise on the development of hearing loss of workers in the stone cutting industry in Iran and reported that maximum hearing loss was observed in a frequency range around 4000 Hz. Zare et al. (2019) used the C5 algorithm to determine the weight of factors affecting hearing loss, determined from audiometric data, in three groups of workers classified on the basis of the exposure sound pressure level. The factor with the highest weight in a group of machinery workers was the hearing threshold at 4 kHz.
The high prediction accuracy of the algorithms applied in the present study is in agreement with reported studies. For example, in a study conducted to predict hearing loss symptoms from audiometry, using the FP-Growth (Frequent Pattern Growth) algorithm as a feature extraction technique, Noma et al. (2013) reported that the error rate ranged from 0 to 5.4%. In a recent study, Zare et al. (2019) obtained prediction accuracy from 94% to 100% for different models. Majumder and Sharma (2014) used machine learning and data classification models to investigate hearing hazards of professional drivers. The study was conducted with the use of unsupervised (Expectation Maximization, k-means, Learning Vector Quantization, Self-Organizing Map) and supervised (Naïve Bayes, Instance Based, Back Propagation Network, Radial Basis Function) learning techniques. The results demonstrated that all the techniques, except the Radial Basis Function classifier, showed high performance in terms of classification accuracy (Majumder, Sharma, 2014). Nawi et al. (2011) applied a Gradient Descent with Adaptive Momentum (GDAM) algorithm to predict noise-induced hearing loss in workers, using age, duration of work experience, and noise exposure as the main factors involved in hearing loss. The accuracy obtained in the present study in the prediction of hearing loss is close to that reported by Nawi et al. (2011).
The present study added new data to a large body of investigations of hearing hazards in the industry. Most published studies on the modeling of hearing loss were based on audiometric data. The findings reported here show that neural data mining classification algorithms can be an effective tool for hearing hazard identification and can greatly help in designing and conducting hearing conservation programs in the industry.

2.5.3. Assessment of the accuracy and the error rates of the models

Figure 2. Correlation and linear regression between LAeq and hearing loss for all participants in Groups 1-3.

Table 1. Age and work experience duration of workers in Groups 1-3.

Table 5. Confusion matrix of data determined by the neural network algorithm in Model 2.

Table 6. Confusion matrix of data determined by the neural network algorithm in Model 3.

Table 7. Confusion matrix of data determined by the neural network algorithm in Model 4.

Table 8. Confusion matrix of data determined by the neural network algorithm in Model 5.