An Acoustic Study of the Emphatic Occlusive [ṭ] in School-Going Children with Cleft Palate or Cleft Lip

The aim of this acoustic study is to analyse the phoneme [ṭ] produced by schoolchildren who have undergone surgery for cleft palate or cleft lip, in order to examine their vocal characteristics, to provide speech therapists with concrete analyses of voice and speech, to support them effectively, and to prevent serious consequences for the children's psychological and academic development. The study was motivated mainly by the difficulties that Algerian schoolchildren with clefts encounter in pronouncing this phoneme. Several acoustic parameters were investigated: the fundamental frequency F0, the first three formants F1, F2, and F3, the energy E0, the Voice Onset Time (VOT), the duration of the sequence [CV], and the duration [V] of the subsequent vowel [a]. Further parameters that are important in the field of pathological speech were also used, namely the degree of disturbance of F0 (jitter), the degree of disturbance of intensity (shimmer), and the HNR (Harmonics to Noise Ratio). The results revealed disturbance in the values of F1, F2, and F3 and stability in the values of F0. Another important finding is the increase in the value of the VOT, due to the difficulties in controlling the successive closure and release of the plosive.


Introduction
The present paper addresses the acoustic and articulatory analysis of a disorder of the phonatory system, namely facial cleft, which affects a large number of people worldwide.
Facial clefts are frequent defects, affecting one child per thousand in Europe. This frequency varies across ethnicities (3.6/1000 for American Indians against 0.3/1000 for African Americans), geographic origins, the socio-economic level of the parents, and sex (cleft lip affects boys more often, while cleft palate is more common among girls) (Andreu, 2013).
Cleft lip, whether or not associated with a cleft palate, is twice as frequent in boys, and division of the palate is twice as frequent in girls. Globally, the prevalence is twice as high among Asians and half as high among Africans as in the European population (Sante Publique France, 2019).
This study is significant because it provides concrete data that can support patient care, notably for children who are challenged at school and require early support to prevent school failure.
Nowadays, it is recognized that speech-language pathologists see a strong need for visualizing speech in order to correlate objective, concrete data with auditory analysis, which is often subjective. The usefulness of speech visualization is further accentuated by the rapid development of new computer technologies and the growing number of speech analysis programs available on the internet. By providing means of retrieving relevant acoustic parameters, computer-aided analysis allows researchers to obtain detailed accounts of the nature and degree of speech disorders.
To carry out this work, a corpus of 261 audio files containing pathological cases was recorded in a hospital environment (the hospital of Beni Messous, Algiers). These recordings were subsequently compared with those of healthy counterparts, taken as the group norm. Several acoustico-articulatory parameters were extracted in order to characterize the cleft palate.
Although previous studies conducted in Algerian hospitals used acoustic analysis to assess pathological speech, they did not consider the study of cleft palates (Ferrat, Guerti, 2012

State of the art
Numerous studies have examined speech production in subjects with palatal divisions; the following is an account of research in that area.
V. Veau, a distinguished surgeon, was the first to write numerous books on the cleft palate and harelip. Drawing on the cineradiographic method, G. Kieffer conducted a particularly fascinating study on the articulation of the consonants of French in two pathological subjects (one of whom presented a cleft palate) compared with two healthy speakers. He noted articulatory changes (type and place of articulation), stop durations that did not comply with the norm, and differences in mode (occlusions in the constrictives). Borel-Maisonny, who defined the terminology of clefts, presented X-ray films on which variations in the position of the tongue inside the oral cavity were clearly visible. Subsequently, many authors have attempted to describe the communication deficits of children (and adults) with cleft palate. They identified articulatory errors, such as defects resulting from reduced sensation following palate surgery, velopharyngeal insufficiencies, and the remaining anomalies of the hard palate (Béchet, 2011).
More recent works have analysed the voice and speech of patients with cleft palate using a range of techniques, including spectral analysis, disturbance analysis and formant analysis.
Villafuerte-Gonzalez et al. (2015), as well as Aydınlı et al. (2016), found that the fundamental frequency, the jitter and the shimmer are significantly different in patients with a repaired cleft palate compared to children without speech, language or voice disorders. In addition, Yang et al. (2018), using Long-Term Average Spectra (LTAS) measurements, found no significant difference between the control group and the pathological group in the low-frequency region (boys: 0-2720 Hz; girls: 0-2240 Hz), but a considerable difference between the control group and the clinical group in the mid-frequency region (boys: 2720-4000 Hz; girls: 2240-4000 Hz). Moreover, Segura-Hernández et al. (2019) reported no significant difference in the value of the mean fundamental frequency (F0) between the control group and the clinical group. After the surgical intervention, jitter and shimmer did not show any difference across the groups. By contrast, Attuluri et al. (2017) found increased jitter and shimmer measurements and reduced HNR (Harmonics to Noise Ratio) measurements in children with cleft palate compared to normal children. Likewise, Zajac and Linville (1989) and Lewis et al. (1993) reported that speakers with cleft palate showed greater frequency perturbation (jitter) than the control group.

What is Cleft Palate (CP)?
Cleft Palate (CP) is a pathology of the phonatory system that consists of the incomplete closure of the palate (an absence of substance in the roof of the mouth), leading to communication between the interior of the mouth (oral cavity) and the interior of the nose (nasal cavity), as shown in Fig. 1. CP occurs at the very beginning of pregnancy, from the 5th to the 7th week of intrauterine life, and results from the absent or insufficient fusion of the different facial buds (Devisse et al., 2012). During the formation of the face, small masses of tissue move toward the nose and mouth area and join to form the upper lip, nose and palate. If the joining of these tissues is delayed, they may fail to fuse, resulting in diverse forms of more or less marked clefts: labial, labio-alveolar, palato-velar, unilateral or bilateral, complete or incomplete.
CP can vary in width and length to a greater or lesser degree (Fig. 2). It can involve the nasal sill, the upper lip, the alveolar ridge, the palatal vault and the velum. Everything takes place as if the palate were closing like a zipper that can get stuck at any point between the retro-incisor papilla, which marks the front of the secondary palate, and the back of the uvula. For some children, the cleft may cover only a portion of the back of the palate. For other children, it may be a complete separation, from the back to the front part of the oral cavity, also affecting all of the palate down to the uvula. The causes of CP are not clearly established, yet heredity and genetics could be crucial factors, since the risk is higher in families with a history of CP. Other factors such as smoking, heavy drinking, and taking certain medications in early pregnancy also appear to increase the risk of CP. For some children with cleft palate, the fundamental mechanism of velar incompetence is mainly confined to the anatomical changes in the muscles of the velum caused by the cleft.
Surgery between 9 and 18 months of age (depending on the surgical technique) is indispensable to repair a CP. This timing is chosen so that the child can tolerate the operation, so that the growth of the jaw is not slowed, so that the child can acquire correct speech, and so that nasal regurgitation is reduced (Fig. 3). In the absence of intervention, a prosthesis can support the development of intelligible speech. The objective of velopalatal division surgery is to restore the anatomy and function of the palate by closing the opening between the palate and the nasal cavity. In addition, the intervention aims to reconstruct the palate, to join the muscles, and to create a palate of sufficient length for the child to speak and eat effectively.
Even with early surgical intervention, the majority of children with cleft palate exhibit common errors and have speech that is typical of cleft palate (Ghafari et al., 2015).
The peculiarity of the Arabic language is the presence of the rear glottal phonemes. When performing emphatic consonants, the tongue does not cover or overlap the palate, yet it bends and curves to form a hollow in which the voice is pressed (Hadj-Salah, 1979).
At the articulatory level, emphatic phonemes are produced by narrowing the pharyngeal cavity and hollowing out the tongue with a backward movement of its root, which increases the volume of the oral cavity.
Within the framework of our work, we mainly explored the pronunciation of the emphatic phoneme [ṭ]. This phoneme is classified as an unaspirated, unvoiced, emphatic alveo-dental plosive. Its articulation involves a retraction of the root of the tongue, which causes pharyngeal constriction, together with a lowering and hollowing of the back of the tongue, which increases the volume of the oral cavity. Acoustically, the emphatic phoneme [ṭ] appears on the spectrogram as a burst bar with a duration of about 20-30 ms (longer with long vowels): a very brief, relatively energetic acoustic event in the form of a vertical bar preceded by a period of silence. It is often followed by a very short release of energy, just before a vowel. With

An acoustic analysis of pathological speech (PS)
Cleft Lip and Palate (CLP) affects resonance, voice, and speech. The most frequently observed resonance and speech disorders have been described in several reports, which have also addressed the acoustic abnormalities in the voice of patients with CLP (Segura-Hernández et al., 2019).
Acoustic speech analysis is becoming increasingly useful in the diagnosis of voice disorders and laryngological pathologies (Teixeira, Fernandes, 2015). The objective of this acoustic analysis is to extract the acoustic parameters that will allow us to compare the recordings of pathological cases with normal cases and to identify the acoustic and articulatory characteristics of facial clefts.
Our acoustic analysis covers two groups of schoolchildren with facial clefts: the first comprises speakers currently undergoing speech therapy, and the second comprises speakers who, according to their speech therapists, have been treated in the speech therapy service and have completed their rehabilitation. The main objective of this analysis is the physico-acoustic characterization of this type of PS, leading to better knowledge of the parameters that discriminate it.
This analysis was carried out using the Praat software (Boersma, Weenink, 2012). The acoustic signal was segmented manually in Praat, according to the shape of the signal and the corresponding spectrogram. The audio files were used to retrieve the following acoustic parameters: VOT, [CV], [V], F0, F1, F2, F3, E0, jitter, shimmer, and HNR.

The recording conditions
The present work was carried out in collaboration with the Infantile Surgery Service at Beni-Messous hospital in Algiers, in which the acoustic recordings of children with different facial clefts took place.
Access to patients was subject to constraints: recordings could only take place at least one month after surgery. They were made in the speech therapists' consultation room, once the children and their parents had consented. The children came early in the morning to see three different practitioners (the surgeon, the ENT specialist and the speech therapist).
The recordings of the control speakers were made under two conditions. First, pupils aged between 5 and 11, from the "November, 1st, 1954 School" in Hammamet, Algiers, were recorded in a small quiet room, with the permission of the school administration and the children's parents. Second, other speakers of different ages (children were accompanied by their parents) were recorded in a quiet room at the University of Algiers 2 under the same recording conditions.

Used material
To obtain good-quality recordings that are reliable and free of parasitic oscillation, a TASCAM DR-05 portable recorder was used to record the audio sequences in 16-bit .wav format with a sampling frequency of 44100 Hz (44.1 kHz).
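As an illustration only, the stated recording format can be checked programmatically before analysis. The sketch below uses Python's standard `wave` module; the helper names (`describe_wav`, `matches_stated_format`, `make_silent_wav`) are hypothetical and not part of the study's workflow, and the in-memory silent file merely stands in for a real recording.

```python
import io
import wave


def describe_wav(source):
    """Return the basic format of a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": 8 * wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def matches_stated_format(info):
    """True if the file is 16-bit and sampled at 44100 Hz, as stated above."""
    return info["bits_per_sample"] == 16 and info["sample_rate_hz"] == 44100


def make_silent_wav(n_frames, rate=44100, sampwidth=2, channels=1):
    """Build a silent WAV in memory (a stand-in for a real recording)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sampwidth)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (n_frames * sampwidth * channels))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    info = describe_wav(make_silent_wav(4410))
    print(info, matches_stated_format(info))
```

With real data, `describe_wav` would be called on the path of each recorded file instead of the synthetic buffer.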

The selected Corpus
Owing to their cleft palate, pathological speakers suffer from velar insufficiency, which makes it difficult for them to utter certain sounds, more specifically the occlusives. Cleft palate prevents these speakers from reaching the oral pressure necessary for the production of those sounds. Within this category of sounds, the scope of the research was narrowed to the emphatic phoneme [ṭ] during and after speech therapy. It is worth recalling that subjects with cleft palate use various compensatory articulations, each of which involves articulatory strategies adapted to the speaker's anatomy.
The corpus of the study consists of six words containing the emphatic phoneme [ṭ] in word-initial position, followed by the short vowel [a] or the long vowel [ā] (Table 1). Each word of the corpus was repeated at least three times by each speaker during the recording.

Speakers
The participants were divided into a clinical group of seventeen (17) schoolchildren with clefts (pathological speakers, Ps) and a control group of twenty-seven (27) schoolchildren (control speakers, Cs).
The pathological speakers (Ps) were selected by speech therapists on the basis of their progress in rehabilitation. They fall into two groups: eleven (11) speakers undergoing speech therapy and six (6) who have completed their speech therapy. They are also classified according to their type of facial cleft (Table 2).
The control speakers (Cs) were healthy speakers with no known history of speech production or perception disorders. In addition, the Cs were selected for their good speech production.
The classification of clefts adopted here is that of V. Veau (Fezari et al., 2014), under which three types of cleft palate were examined: clefts of the soft and hard palate up to the incisive foramen (7 Ps), clefts of the soft and hard palate extending unilaterally through the alveolus (3 Ps), and clefts of the soft and hard palate extending bilaterally through the alveolus (7 Ps) (Table 2, Fig. 4).

The extraction of acoustic parameters
The recorded sound files were manually segmented using the Praat analysis tool (Boersma, Weenink, 2012). These files were used for the extraction of acoustic parameters. The corpus of audio recordings was then divided into two sub-corpora (Table 3):
• the first consists of words containing the emphatic phoneme [ṭ] followed by the short vowel [a], comprising 358 sound files;
• the second consists of words containing the emphatic phoneme [ṭ] followed by the long vowel [ā], comprising 368 sound files.
We studied the following acoustic parameters: the fundamental frequency F0, which is related to the vibration of the vocal cords; the formants, which characterize the influence that the sound undergoes as it passes through the cavities of the speech system; the duration of sounds, which reflects air flow and fluency of speech; and the energy (intensity). We also used other parameters that are important in the field of PS, namely the degree of disturbance of F0 (jitter), the degree of disturbance of intensity (shimmer) and the HNR (Harmonics to Noise Ratio). These three parameters are usually measured on sustained vowels, and values above a certain threshold are considered indicative of PS.
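To make the three perturbation measures concrete, the following minimal sketch computes them from already-extracted cycle data. It assumes the common "local" definitions (mean absolute difference between consecutive cycles, divided by the mean, in percent) and expresses HNR as the ratio of periodic to noise energy in dB. The function names and input values are illustrative assumptions; this is not Praat's implementation, which works on the signal itself.

```python
import math


def local_perturbation(values):
    """Mean absolute difference between consecutive values,
    relative to the mean value, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def jitter_local(periods_s):
    """Cycle-to-cycle variation of the glottal period (disturbance of F0)."""
    return local_perturbation(periods_s)


def shimmer_local(peak_amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (disturbance of intensity)."""
    return local_perturbation(peak_amplitudes)


def hnr_db(harmonic_fraction):
    """HNR in dB, given the fraction of signal energy that is periodic (0 < f < 1)."""
    if not 0.0 < harmonic_fraction < 1.0:
        raise ValueError("harmonic_fraction must lie strictly between 0 and 1")
    return 10.0 * math.log10(harmonic_fraction / (1.0 - harmonic_fraction))


if __name__ == "__main__":
    # Illustrative values only, not measurements from the study.
    periods = [0.0040, 0.0041, 0.0040, 0.0041]
    amplitudes = [0.50, 0.52, 0.50, 0.52]
    print(jitter_local(periods), shimmer_local(amplitudes), hnr_db(0.99))
```

A perfectly regular voice gives zero jitter and shimmer, and an HNR of 0 dB corresponds to equal energy in the periodic and noise components.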
To study duration, we measured, for each consonant-vowel sequence [CV], the duration [CV], the VOT (voice onset time), which is one of the main indices for measuring occlusives, and the duration [V] of the vowel [a] that follows the emphatic consonant. According to D. Klatt, the VOT is defined as the time from the release of the consonant until the appearance of the stable formant structure of the subsequent vowel (Fauth et al., 2014). The measure chosen therefore runs from the consonantal burst to the beginning of the stable formant structure of the following vowel. The absolute duration of the vowel was measured between VVO (vocalic voiced onset) and VVT (vocalic voiced termination), in other words, between the beginning and the end of a clearly defined vowel structure.
In our work, for each occurrence, the duration [V] (ms) of the vowel [a] following the consonant [ṭ] was computed, taking as landmarks the appearance and disappearance of a clearly defined formant structure. For the voiceless occlusive [ṭ], we measured the duration of [CV] (ms) and the VOT (ms), which runs from the acoustic burst, due to the release, to the appearance of the clear formant structure of the subsequent vowel (Table 4).
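The duration measures described above reduce to differences between manually placed time landmarks. The sketch below is one possible way to derive them, under stated assumptions: the `Landmarks` class and its field names are hypothetical, and the `consonant_onset` landmark used as the start of [CV] is an assumption, since the text does not state where the [CV] interval begins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Landmarks:
    """Manually placed time points for one [CV] token, in seconds."""
    consonant_onset: float  # assumed start of the [CV] sequence (hypothetical)
    burst: float            # acoustic burst at the release of the occlusive
    vowel_onset: float      # VVO: first clear formant structure of the vowel
    vowel_offset: float     # VVT: end of the clear formant structure


def durations_ms(marks):
    """Return VOT, [V] and [CV] durations in milliseconds."""
    times = [marks.consonant_onset, marks.burst,
             marks.vowel_onset, marks.vowel_offset]
    if times != sorted(times):
        raise ValueError("landmarks must be in chronological order")
    return {
        "VOT": 1000.0 * (marks.vowel_onset - marks.burst),
        "V": 1000.0 * (marks.vowel_offset - marks.vowel_onset),
        "CV": 1000.0 * (marks.vowel_offset - marks.consonant_onset),
    }


if __name__ == "__main__":
    # Illustrative landmark values only, not taken from the recordings.
    token = Landmarks(consonant_onset=0.100, burst=0.150,
                      vowel_onset=0.175, vowel_offset=0.300)
    print(durations_ms(token))
```

Checking the chronological order of the landmarks catches the most common annotation slip (swapped boundaries) before any statistics are computed.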
For the Cs group and the Ps group (distributed according to the three types of cleft), the values obtained for each acoustic parameter studied (Table 4) are presented in the form of means ± standard deviation (Table 5).
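A minimal sketch of this summary step is given below. It assumes the sample standard deviation (n − 1 denominator), which the text does not specify; the function name and the input values are illustrative only.

```python
import statistics


def mean_sd(values, digits=1):
    """Format a list of measurements as 'mean ± standard deviation'."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    return f"{mean:.{digits}f} ± {sd:.{digits}f}"


if __name__ == "__main__":
    # Illustrative VOT values in ms, not data from the study.
    print(mean_sd([20.0, 22.0, 24.0]))
```

If the population standard deviation is intended instead, `statistics.pstdev` can be substituted for `statistics.stdev`.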
In Table 5, we noticed that:
• the VOT is larger for pathological speech (PS), and significantly larger for the long vowel (LV);
• the durations [CV] and [V] are very high for PS;
• E0 is almost identical between PS and normal speech (NS);
• in line with the literature, the values of F0 are greater than 250 Hz and are almost identical between PS and NS;
• for cleft type 2, the three formants F1, F2, and F3 are well above the norm for both types of vowel, LV and short vowel (SV);
• for cleft type 4, the two formants F1 and F2 are lower than in NS for both types of vowel, while F3 is clearly higher in the case of the SV and lower in the case of the LV;
• for cleft type 3, we see a distinctly high F1 and a slight difference in F2 and F3 for the SV; for the LV, the formant F2 is almost stable, and F3 is lower than in NS;
• the shimmer is less than 3.81% for NS and is significantly higher than this standard for the three types of facial cleft;
• the jitter is clearly less than 1.04%, except for cleft type 4, where it reaches the limit of the pathological threshold used as reference in Praat;
• the HNR does not differ significantly for any of the three types of facial cleft; it is slightly lower than in normal cases.
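The comparison against the two reference thresholds quoted above (1.04% for jitter, 3.81% for shimmer) can be sketched as follows. The function name is hypothetical, the strict "greater than" comparison is an assumption, and the example values are illustrative rather than taken from Table 5.

```python
# Reference thresholds quoted in the observations above, in percent.
JITTER_THRESHOLD_PERCENT = 1.04
SHIMMER_THRESHOLD_PERCENT = 3.81


def flag_perturbation(jitter_percent, shimmer_percent):
    """Report which perturbation measures exceed their reference threshold.

    A value exactly at the threshold is not flagged (assumption).
    """
    return {
        "jitter_above_threshold": jitter_percent > JITTER_THRESHOLD_PERCENT,
        "shimmer_above_threshold": shimmer_percent > SHIMMER_THRESHOLD_PERCENT,
    }


if __name__ == "__main__":
    # Illustrative values only.
    print(flag_perturbation(jitter_percent=0.80, shimmer_percent=5.20))
```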

Results and discussions
The analysis of the values of the acoustic parameters used within the framework of our work reveals some essential conclusions.
The values of the formants F1, F2, and F3 are disturbed compared to the standardized values. This can be explained by the volumes and shapes of the oral cavity and the labio-dental cavity, which change considerably before and after surgery in speakers affected by cleft palate.
However, the values of the fundamental frequency F0 are practically stable and remain within the norm reported in the literature. This is explained by the fact that the vocal cords are not affected by cleft palates, which manifest themselves at the level of the oral cavity and the labio-dental cavity.
Because of the difficulties in controlling the closure of occlusives and their subsequent release, the noise of the burst at the release of the occlusive influences the following vowel, delaying the onset of a clear formant structure and thereby lengthening the VOT.
As some studies demonstrate, the burst of the occlusive is not clear, and the constrictive corresponding to the ideal occlusive at the articulatory level is often produced following this occlusive, resulting in a long VOT (Béchet et
Another interesting aspect observed in the study is the increase in the value of the VOT among pathological speakers with the three types of cleft, due to the difficulties in controlling the closure of the occlusives and the subsequent release. Indeed, prolonged segments may be due to an active strategy to increase oral air pressure and/or improve the perceptual accuracy of speech segments (Eshghi et al., 2017).
The difficulties in closure control were also reflected in the durations of [CV] and of the following vowel [V], notably for facial cleft types 2 and 4.
Nasal loss persists and is significant regardless of the type of cleft; nasal breath becomes audible and more or less masks speech.

Conclusion
This study evaluated the main vocal characteristics of schoolchildren with cleft palate. The results demonstrate the presence of various disorders in the articulation of the phoneme [ṭ] by children with clefts, such as disturbances at the level of the formants and the prolongation of the durations of the phonemes compared to their healthy counterparts. Rehabilitation that takes these difficulties into account will allow better care for patients, especially children at school.
We have come to the conclusion that pathological speech becomes more intelligible after rehabilitation. A significant improvement was identified in the placement of the tongue during the production of the phoneme [ṭ], as the motor speech system was able to adapt to the change in the peripheral structure (transformation of the operated area). The speakers adopted various compensatory strategies, but the adaptability is specific to each subject, which yields varied responses from the speakers.
We also observed that children with facial cleft who have completed their rehabilitation fail to reach the articulatory target of the emphatic Arabic consonant [ṭ]. Each of them establishes their own target according to the nature of the reconstruction of the palate, which would therefore correspond to an individual strategy.
The results of this acoustic analysis were satisfactory and indicate that acoustic analysis is a reliable method for evaluating rehabilitation in patients with cleft lip and palate. It thus provides speech therapists with valuable guidance to help their patients effectively manage the articulatory structures in their speech production.

Fig. 2. Simple cleft palate (own material): a) division of the uvula, b) total division of the velum, c) division of the velum and the palatal vault.
[a] and [ā], the concentration is in the region of F1-F2 between 1500 and 2400 Hz. In the context of [ṭ], the F2 of [a] is around 1150-1250 Hz and that of [ā] is at 1050-1150 Hz, which significantly differentiates [a] from [ā]. Both [a] and [ā] have an F1 around 650 Hz (Djoudi, 1991).

* Classification of the facial cleft: type 2 - clefts of the soft and hard palate, up to the incisive foramen; type 3 - clefts of the soft and hard palate extending unilaterally through the alveolus; type 4 - clefts of the soft and hard palate extending bilaterally through the alveolus.
** Re-education in progress (−), re-education completed (+).

Fig. 5. Absence of burst in the realization of the word [ṭā'ira] by a pathological speaker.

Table 1. The corpus used.

Table 2. All pathological speakers, according to the type of facial cleft, selected for the different experiments of this study.

Table 3. Number of sound files recorded in pathological speech (PS) and normal speech (NS).

Table 4. Example of the acoustic analysis of recordings of a pathological speaker (cleft type 4).

Table 5. Values of the acoustic parameters for the three facial cleft types of the Ps and for the Ns.