This article presents some key events in the founding of the Polish Section of the Audio Engineering Society. In addition, it outlines the history of the International Symposia on Sound Engineering and Mastering. The papers contained in this issue are also briefly reviewed.
The paper presents a comparative study of music features derived from audio recordings: the same music pieces representing different music genres, excerpts performed by different musicians, and songs performed by a musician whose style evolved over time. First, the origin and background of the division into music genres are briefly presented. Then, several objective parameters of an audio signal that are easy to interpret in terms of perceptual relevance are recalled. In the study, parameter values were extracted from music excerpts, gathered, and compared to determine to what extent they are similar within the songs of the same performer or within samples representing the same piece.
Recently, the rapid advancement of the IT industry has resulted in significant changes in audio-system configurations; in particular, the audio over internet protocol (AoIP) network-based audio-transmission technology has received favourable evaluations in this field. Applying AoIP in a certain section of the multiple-cable zone is advantageous because the installation cost is lower than that of existing systems, and the original sound is transmitted without any distortion. The existing AoIP-based technology, however, cannot control the audio-signal characteristics of every device and can only transmit multiple audio signals through a network. In this paper, the proposed Audio Network & Control Hierarchy Over peer-to-peer (Anchor) system enables all audio equipment to send and receive signals via a data network, and the receiving device can mix the signals of different IPs. Accordingly, it was possible to improve the system-application flexibility by simplifying the audio-system configuration. The research results confirmed that the audio signals from different IPs were received, mixed, and output without errors. It is expected that Anchor will become a standard for audio-network protocols.
In the early days, consumption of multimedia content related to audio signals was only possible in a stationary manner. The music player was located at home, with a necessary physical drive. An alternative for an individual was to attend a live performance at a concert hall or host a private concert at home. In short, audio-visual effects were reserved for a narrow group of recipients. Today, thanks to portable players, vision and sound are at last available to everyone. Finally, thanks to multimedia streaming platforms, every music piece or video, e.g. from one’s favourite artist or band, can be viewed anytime and anywhere. The background or status of an individual is no longer an issue. Each person who is connected to the global network can have access to the same resources. This paper focuses on the consumption of multimedia content using mobile devices. It describes a year-to-year user case study carried out between 2015 and 2019, and the development of current trends related to the expectations of modern users. The goal of this study is to aid policymakers, as well as providers, when it comes to designing and evaluating systems and services.
In this paper, a new lifting wavelet domain audio watermarking algorithm based on the statistical characteristics of sub-band coefficients is proposed. First, the original audio signal was segmented and each segment was divided into two sections. Then, the Barker code was used for synchronization, the LWT (lifting wavelet transform) was performed on each section, and a synchronization code and a watermark were embedded into the first section and the second section, respectively, by modifying the statistical average value of the sub-band coefficients. The embedding strength was determined adaptively according to the auditory masking property. Experiments show that the embedded watermark is more robust against common signal processing attacks than existing LWT-based algorithms and, in particular, can resist random cropping.
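The idea of carrying a bit in the average value of sub-band coefficients can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm of the paper: it uses a single Haar lifting step as a stand-in for the LWT, a fixed step `delta` instead of a masking-adaptive embedding strength, and a parity-of-quantised-mean rule that is hypothetical. All function names are invented for the example.

```python
import numpy as np

def haar_lift(x):
    """One Haar lifting step: split, predict (detail), update (approximation)."""
    even, odd = x[0::2], x[1::2]
    detail = odd - even
    approx = even + detail / 2.0
    return approx, detail

def haar_unlift(approx, detail):
    """Exact inverse of haar_lift."""
    even = approx - detail / 2.0
    odd = detail + even
    x = np.empty(even.size + odd.size)
    x[0::2] = even
    x[1::2] = odd
    return x

def embed_bit(section, bit, delta=0.05):
    """Shift the approximation coefficients so that their mean lands on a
    multiple of delta whose parity equals the bit (assumed rule).
    The section must have an even number of samples."""
    approx, detail = haar_lift(np.asarray(section, dtype=float))
    mean = approx.mean()
    target = delta * (2.0 * np.round((mean / delta - bit) / 2.0) + bit)
    return haar_unlift(approx + (target - mean), detail)

def extract_bit(section, delta=0.05):
    """Read the bit back from the parity of the quantised mean."""
    approx, _ = haar_lift(np.asarray(section, dtype=float))
    return int(np.round(approx.mean() / delta)) % 2

rng = np.random.default_rng(0)
section = rng.uniform(-0.5, 0.5, 512)  # synthetic audio section, even length
for bit in (0, 1):
    marked = embed_bit(section, bit)
    noisy = marked + rng.normal(0.0, 0.005, marked.size)  # mild additive noise
    print(bit, extract_bit(marked), extract_bit(noisy))
```

Because the bit lives in an average over many coefficients, zero-mean disturbances largely cancel out, which is the intuition behind statistics-based embedding. Shifting every approximation coefficient by the same amount is the crudest way to change the mean; a practical scheme would distribute the change according to a perceptual model.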
Biography and scientific achievements of Professors Marianna Sankiewicz-Budzyński and Gustaw K.E. Budzyński - Founders of the Polish Audio Engineering.
The MDCT and IntMDCT algorithms are widely used in audio coding. The Integer MDCT (IntMDCT) is derived from the Modified Discrete Cosine Transform (MDCT) by means of a lifting scheme or rounding operations. It retains the properties of the MDCT and offers excellent invertibility and a good spectral mean. In this paper we discuss audio codecs such as AAC and FLAC that use the MDCT and IntMDCT algorithms, and determine which algorithm yields the better compression ratio (CR). The aim of this work is to hybridize lossy and lossless audio coding so as to achieve a reduced bit rate with better sound quality. The audio quality is assessed by subjective and objective testing, in terms of MOS (Mean Opinion Score), ABX tests, and objective methods such as PEAQ (Perceptual Evaluation of Audio Quality) with its ODG (Objective Difference Grade). The performance measures, namely the compression ratio (CR) and sound pressure level (SPL), are estimated.
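For reference, the compression ratio used as a performance measure is simply the uncompressed size divided by the compressed size. A minimal sketch follows; the file sizes in it are made-up illustrative numbers, not results from the paper, and the helper names are hypothetical.

```python
def pcm_size_bytes(duration_s, sample_rate_hz, channels, bits_per_sample):
    """Size of uncompressed linear PCM audio in bytes."""
    return duration_s * sample_rate_hz * channels * bits_per_sample / 8

def compression_ratio(original_bytes, compressed_bytes):
    """CR = original size / compressed size (larger means more compression)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes

def bitrate_kbps(size_bytes, duration_s):
    """Average bit rate of a stream in kilobits per second."""
    return size_bytes * 8 / duration_s / 1000

# 10 s of CD-quality stereo audio (44.1 kHz, 16 bit).
original = pcm_size_bytes(10, 44100, 2, 16)

# Hypothetical compressed sizes, chosen only to illustrate the arithmetic.
lossless_bytes = 1_058_400
lossy_bytes = 160_000

print(compression_ratio(original, lossless_bytes))
print(compression_ratio(original, lossy_bytes), bitrate_kbps(lossy_bytes, 10))
```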
Field programmable analog arrays (FPAA), thanks to their flexibility and reconfigurability, give designers entirely new possibilities in analog circuit design. The number of both academic projects on FPAA and applications of commercially available programmable devices is still growing. This paper explores the properties and parameters of the two most popular FPAA circuits: the AnadigmVortex AN221E04 and the AnadigmApex AN231E04 from the Anadigm company. The research conducted by the authors led to the discovery of some undocumented features of these devices. Several applications for audio processing were built and tested. The results show that these circuits can be used in medium-demanding audio applications. Thanks to dynamic reconfigurability, they also make it possible to build a universal analog audio signal processor. These circuits can also act as a versatile platform for rapid prototyping and educational purposes.
The paper examines the use of a Convolutional Bidirectional Recurrent Neural Network (CBRNN) for the problem of quality measurement in music content. The key contribution of this approach, compared to the existing research, is that the examined model is evaluated in terms of detecting acoustic anomalies without the requirement to provide a reference (clean) signal. Since real music content may include different kinds of instrumental sounds, speech and singing voice, or different audio effects, it is more complex to analyse than clean speech or artificial signals, especially without a comparison to known reference content. The presented results might be treated as a proof of concept, since only some specific types of artefacts are covered in this paper (examples of quantization defect, missing sound, distortion of gain characteristics, extra noise sound). However, the described model can be easily expanded to detect other impairments or used as a pre-trained model for other transfer learning processes. To examine the model efficiency, several experiments have been performed and reported in the paper. The raw audio samples were transformed into Mel-scaled spectrograms and passed as input to the model, first independently, then along with additional features (Zero Crossing Rate, Spectral Contrast). According to the obtained results, there is a significant increase in overall accuracy (by 10.1%) if Spectral Contrast information is provided together with Mel-scaled spectrograms. The paper also examines the influence of recurrent layers on the effectiveness of the artefact classification task.
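Of the additional features mentioned, the Zero Crossing Rate is the simplest to compute: it is the fraction of adjacent sample pairs whose signs differ, evaluated frame by frame. The sketch below is a plain NumPy illustration; the frame length, hop size and function name are assumptions, not the settings used in the paper.

```python
import numpy as np

def zero_crossing_rate(x, frame_length=1024, hop=512):
    """Frame-wise zero crossing rate: share of adjacent sample pairs in each
    frame whose signs differ. Returns one value per complete frame."""
    negative = np.signbit(np.asarray(x, dtype=float))
    rates = []
    for start in range(0, len(negative) - frame_length + 1, hop):
        frame = negative[start:start + frame_length]
        rates.append(np.mean(frame[1:] != frame[:-1]))
    return np.array(rates)

# A low-frequency tone crosses zero rarely, white noise crosses it often.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 100 * t + 0.1)
noise = np.random.default_rng(0).standard_normal(sample_rate)

print(zero_crossing_rate(tone).mean(), zero_crossing_rate(noise).mean())
```

A per-frame feature vector of this kind can be stacked alongside the spectrogram frames, provided both use the same hop size.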
Independent Component Analysis (ICA) can be used for single-channel audio separation if a mixed signal is transformed into the time-frequency domain and the resulting matrix of magnitude coefficients is processed by ICA. Previous works used only frequency (spectral) vectors and the Kullback-Leibler distance measure for this task. New decomposition bases are proposed: time vectors and time-frequency components. The applicability of several different measures of distance between components is analysed. An algorithm for clustering of components is presented. It was tested on mixes of two and three sounds. The perceptual quality of separation obtained with the proposed distance measures was evaluated by listening tests, indicating the "beta" and "correlation" measures as the most appropriate. The "Euclidean" distance is shown to be appropriate for sounds with varying amplitudes. The perceptual effect of the amount of variance used was also evaluated.
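To make the notion of a distance between components concrete, the sketch below computes three common measures and a pairwise distance matrix that a clustering step could consume. The exact definitions used in the paper are not given in the abstract, so these formulas are assumptions: a symmetrised Kullback-Leibler divergence on normalised magnitudes, one minus the Pearson correlation, and the ordinary Euclidean norm. The "beta" measure is not reproduced here.

```python
import numpy as np

def euclidean_distance(a, b):
    """Ordinary Euclidean distance; sensitive to amplitude differences."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))

def correlation_distance(a, b):
    """1 - Pearson correlation; ignores offset and positive scaling."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    a0, b0 = a - a.mean(), b - b.mean()
    denom = np.linalg.norm(a0) * np.linalg.norm(b0)
    if denom == 0.0:
        return 1.0  # correlation undefined for a constant vector
    return float(1.0 - np.dot(a0, b0) / denom)

def symmetric_kl_distance(a, b, eps=1e-12):
    """Symmetrised Kullback-Leibler divergence between magnitude vectors
    normalised to sum to one (eps avoids log of zero)."""
    p = np.abs(np.asarray(a, dtype=float)) + eps
    q = np.abs(np.asarray(b, dtype=float)) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum((p - q) * np.log(p / q)))

def pairwise_distances(components, metric):
    """Symmetric matrix of distances between all pairs of components."""
    n = len(components)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = metric(components[i], components[j])
    return d

# Three toy spectral vectors: the second is a louder copy of the first.
components = [np.array([1.0, 4.0, 2.0, 0.5]),
              np.array([3.0, 12.0, 6.0, 1.5]),
              np.array([0.2, 0.1, 3.0, 5.0])]
print(pairwise_distances(components, correlation_distance).round(3))
```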
This paper reviews parametric audio coders and discusses novel technologies introduced in a low-complexity, low-power audio decoder and music synthesizer platform developed by the authors. The decoder uses a parametric coding scheme based on the MPEG-4 Parametric Audio standard. In order to keep the complexity low, most of the processing is performed in the parametric domain. This parametric processing includes pitch and tempo shifting, volume adjustment, selection of psychoacoustically relevant components for synthesis, and stereo image creation. The decoder allows for good quality 44.1 kHz stereo audio streaming at 24 kbps. The synthesizer matches the audio quality of industry-standard sample-based synthesizers while using a soundbank with a twenty times smaller memory footprint. The presented decoder/synthesizer is designed for low-power mobile platforms and supports music streaming, ringtone synthesis, gaming and remixing applications.
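Processing in the parametric domain is cheap because pitch, tempo and volume changes reduce to arithmetic on a handful of parameters rather than on audio samples. The sketch below illustrates this with a deliberately simplified representation, a list of sinusoidal partials with a time stamp, frequency and amplitude. This data layout and the function names are hypothetical and far simpler than the actual MPEG-4 Parametric Audio bitstream.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Partial:
    """One sinusoidal component at one time instant (simplified model)."""
    time_s: float
    freq_hz: float
    amp: float

def shift_pitch(partials, semitones):
    """Scale every frequency by 2^(semitones/12); timing is untouched."""
    ratio = 2.0 ** (semitones / 12.0)
    return [Partial(p.time_s, p.freq_hz * ratio, p.amp) for p in partials]

def change_tempo(partials, factor):
    """Play 'factor' times faster by compressing time stamps; pitch is untouched."""
    if factor <= 0:
        raise ValueError("tempo factor must be positive")
    return [Partial(p.time_s / factor, p.freq_hz, p.amp) for p in partials]

def adjust_volume(partials, gain_db):
    """Apply a gain in decibels to every amplitude."""
    gain = 10.0 ** (gain_db / 20.0)
    return [Partial(p.time_s, p.freq_hz, p.amp * gain) for p in partials]

track = [Partial(0.0, 440.0, 1.0), Partial(1.0, 660.0, 0.5)]
octave_up = shift_pitch(track, 12)
double_speed = change_tempo(track, 2.0)
quieter = adjust_volume(track, -20.0)
print(octave_up[0].freq_hz, double_speed[1].time_s, quieter[0].amp)
```

In a sample-based pipeline the same pitch or tempo change would require resampling or time-stretching of the waveform, which is considerably more expensive.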
This paper addresses the problem of tampering detection and discusses methods used for authenticity analysis of digital audio recordings. The presented approach is based on frame offset measurement in audio files compressed and decoded using perceptual audio coding algorithms that employ the modified discrete cosine transform (MDCT). The minimum values of the total number of active MDCT coefficients occur for frame shifts equal to multiples of the applied window length. Any modification of an audio file, including cutting or pasting part of a recording, disturbs this regularity. In this study, the frame-offset checking algorithm previously described in the literature is extended by using each of the four types of analysis windows commonly applied in the majority of MDCT-based encoders. To enhance the robustness of the method, additional histogram analysis is performed by detecting the presence of small-value spectral components. Moreover, the maximum values of nonzero spectral coefficients are computed to create a gating function for the results obtained with the previous algorithm. This solution radically reduces the number of false detections of forgeries. The influence of compression algorithm parameters on forgery detection is presented using the AAC and Ogg Vorbis encoders as examples. The effectiveness of the tampering detection algorithms proposed in this paper is tested on a predefined music database and compared graphically using ROC-like curves.
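The basic frame-offset regularity can be reproduced with a toy codec. The sketch below is an illustration under strong assumptions, not the detector described in the paper: it uses a single sine window, an orthonormal MDCT, a uniform quantiser with step `delta` in place of a perceptual encoder, and a fixed activity threshold. It reports the fraction (rather than the raw count) of active coefficients so that offsets yielding slightly different frame counts remain comparable. All names are invented for the example.

```python
import numpy as np

def mdct_basis(n):
    """Orthonormal sine-windowed MDCT basis, shape (n, 2n)."""
    t = np.arange(2 * n) + 0.5
    k = np.arange(n) + 0.5
    window = np.sin(np.pi * t / (2 * n))
    return np.sqrt(2.0 / n) * window * np.cos(np.pi / n * np.outer(k, t + n / 2.0))

def analyse(x, n, offset=0):
    """MDCT of 50%-overlapping frames starting at the given sample offset."""
    count = (len(x) - offset) // n - 1
    frames = np.stack([x[offset + i * n: offset + (i + 2) * n] for i in range(count)])
    return frames @ mdct_basis(n).T

def synthesise(coeffs, n):
    """Inverse MDCT with overlap-add."""
    frames = coeffs @ mdct_basis(n)
    out = np.zeros((len(coeffs) + 1) * n)
    for i, frame in enumerate(frames):
        out[i * n: (i + 2) * n] += frame
    return out

def active_fraction(x, n, offset, threshold=1e-6):
    """Share of MDCT coefficients whose magnitude exceeds the threshold."""
    return float(np.mean(np.abs(analyse(x, n, offset)) > threshold))

def estimate_offset(x, n):
    """Offset in [0, n) at which the fewest coefficients are active."""
    return int(np.argmin([active_fraction(x, n, o) for o in range(n)]))

n, delta = 32, 1.0
x = np.random.default_rng(0).standard_normal(64 * n)
# Toy "codec": quantise the MDCT coefficients, then decode.
decoded = synthesise(delta * np.round(analyse(x, n) / delta), n)

print(estimate_offset(decoded, n))      # frame grid of the intact file
print(estimate_offset(decoded[5:], n))  # grid after removing 5 samples
```

When analysis frames line up with the original coding frames, the coefficients that the quantiser set to zero reappear as zeros, so the active fraction drops sharply; at any other offset almost every coefficient is nonzero. Running the estimate over successive portions of a recording and looking for a jump in the detected offset is the essence of frame-offset tampering detection.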
In the age of digital media, delivering broadcast content to customers at an acceptable level of quality is one of the most challenging tasks. The most important factor is the efficient use of available resources, including bandwidth. Appropriate management of the digital multiplex is essential for both economic and technical reasons. In this paper we describe transmission quality measurements in the DAB+ broadcast system. We provide a methodology for analysing parameters and factors related to the efficiency and reliability of a digital radio link. We describe a laboratory test stand that can be used for transmission quality assessment at a regional and national level.