Media Lab

Department of Electrical Engineering, National Taipei University.


The design of an audio system with dynamic sweet spot adjustment by head movement tracking

    Multi-channel audio suits home theater systems, letting people enjoy surround sound like that heard at the cinema. However, a multi-channel system is difficult to install in many Asian homes, where living space is often more limited than in European or American homes. The first step of this proposal is therefore to downmix multi-channel signals to stereo without losing the spatial effect. For the conversion, a symmetric solution and low-order infinite impulse response (IIR) filters are adopted to lower the computational cost and, potentially, the size of the integrated chip. The main idea is to use a crosstalk cancellation system to predict the crosstalk signals that arise when three-dimensional audio is played through a pair of loudspeakers and then eliminate their effects, so that listeners can perceive surround audio.
    The major novelty of the proposed system is its integration of audio signal processing and robotics. The system provides a dynamic listening area for users: robotic arms automatically change the loudspeakers’ positions and directions, thereby generating a sweet spot where the listener’s head is located.
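    The crosstalk cancellation idea can be illustrated at a single frequency. The sketch below is not the project's filter design: it assumes a symmetric listening setup in which each loudspeaker reaches the near ear through a response `h_ipsi` and the far ear through `h_contra`, and the numeric values of both are hypothetical. The canceller is simply the inverse of that symmetric 2x2 matrix.

```python
import cmath


def symmetric_canceller(h_ipsi, h_contra):
    """Return (direct, cross) canceller gains at one frequency.

    The acoustic paths form the symmetric matrix
    [[h_ipsi, h_contra], [h_contra, h_ipsi]]; the canceller is its inverse.
    """
    det = h_ipsi * h_ipsi - h_contra * h_contra
    if abs(det) < 1e-12:
        raise ValueError("acoustic matrix is singular at this frequency")
    return h_ipsi / det, -h_contra / det


def simulate_ears(left, right, h_ipsi, h_contra):
    """Pre-filter a binaural pair, then propagate it to the two ears."""
    c_direct, c_cross = symmetric_canceller(h_ipsi, h_contra)
    # Loudspeaker feeds after crosstalk cancellation.
    spk_left = c_direct * left + c_cross * right
    spk_right = c_cross * left + c_direct * right
    # Acoustic propagation, including the unwanted cross paths.
    ear_left = h_ipsi * spk_left + h_contra * spk_right
    ear_right = h_contra * spk_left + h_ipsi * spk_right
    return ear_left, ear_right


# Hypothetical path responses: the cross path is attenuated and
# phase-delayed relative to the direct path.
H_IPSI = 1.0 + 0.0j
H_CONTRA = 0.6 * cmath.exp(-1j * 0.8)

# A signal intended for the left ear only.
ear_l, ear_r = simulate_ears(1.0 + 0.0j, 0.0j, H_IPSI, H_CONTRA)
```

    With the canceller in place, the left-ear signal arrives at the left ear unchanged and the right ear receives essentially nothing. A practical system would repeat this inversion across frequency and approximate the result with low-order filters.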


Applications of Machine Learning in Healthcare

    Physicians see a large number of patients every day and have very limited time to attend to each of them. Methods that judge cardiac status quickly and accurately therefore play a vital role in diagnosing arrhythmia at an early stage. Arrhythmia is a general term for heartbeat abnormalities and covers a variety of abnormal patterns. These patterns can be classified into three categories: heart rhythm, chronic heart rhythm, and irregular contraction.
    An electrocardiogram (ECG) is a graph of the heart's electrical activity over time. It reflects an individual's state of health and is helpful in disease diagnosis. This work examines the application of curve fitting, based on Fourier series analysis, to ECG signals. Each cardiac cycle is approximated by a Fourier series model, and the resulting fit is used to judge arrhythmias. The study presents efficient methods for signal identification and ECG classification based on the fitting parameters.
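    As a rough illustration of Fourier series curve fitting, the sketch below fits a truncated series to one uniformly sampled cycle. It is an assumption-laden example rather than the study's method: the function names are hypothetical, the input is a synthetic waveform instead of real ECG data, and the coefficients are obtained by projection, which matches the least-squares fit only when the samples cover exactly one period at uniform spacing.

```python
import math


def fit_fourier_series(samples, order):
    """Fit a0 + sum_k (a_k cos + b_k sin) to one uniformly sampled cycle."""
    n = len(samples)
    if order < 1 or order >= n / 2:
        raise ValueError("order must satisfy 1 <= order < len(samples) / 2")
    a0 = sum(samples) / n
    cos_coeffs, sin_coeffs = [], []
    for k in range(1, order + 1):
        w = 2.0 * math.pi * k / n
        cos_coeffs.append(2.0 / n * sum(x * math.cos(w * i) for i, x in enumerate(samples)))
        sin_coeffs.append(2.0 / n * sum(x * math.sin(w * i) for i, x in enumerate(samples)))
    return a0, cos_coeffs, sin_coeffs


def evaluate(a0, cos_coeffs, sin_coeffs, phase):
    """Evaluate the fitted series at a phase in [0, 1) of the cycle."""
    value = a0
    for k, (a_k, b_k) in enumerate(zip(cos_coeffs, sin_coeffs), start=1):
        angle = 2.0 * math.pi * k * phase
        value += a_k * math.cos(angle) + b_k * math.sin(angle)
    return value


def rms_residual(samples, a0, cos_coeffs, sin_coeffs):
    """Root-mean-square error between the samples and the fitted series."""
    n = len(samples)
    errors = [x - evaluate(a0, cos_coeffs, sin_coeffs, i / n) for i, x in enumerate(samples)]
    return math.sqrt(sum(e * e for e in errors) / n)


# Synthetic stand-in for one cardiac cycle (not real ECG data).
N = 64
cycle = [0.5 + math.cos(2 * math.pi * i / N) + 0.3 * math.sin(4 * math.pi * i / N)
         for i in range(N)]
a0, a, b = fit_fourier_series(cycle, order=3)
residual = rms_residual(cycle, a0, a, b)
```

    The fitted coefficients and the residual are the kind of fitting parameters that could then be passed to a classifier.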


Artificial Intelligence in Financial Markets

    With advances in information technology and the development of big data, manual operation is unlikely to remain a smart choice for stock market investing. Instead, computer-based investment models are expected to give investors more accurate strategic analysis and more effective investment decisions than human judgment alone. This project aims to improve investor profits by mining critical information from stock data, thereby supporting big data analysis. We used the R language to compute technical indicators for the stock market and then applied those indicators to prediction. The proposed R package includes several analysis toolkits, covering trend line indicators, W-type reversal patterns, V-type reversal patterns, and bull or bear market detection. The simulation results suggest that the developed R package can accurately present the tendency of the price and enhance the return on investment.
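    To make the notion of a trend indicator concrete, here is a minimal sketch of a moving-average comparison. It is written in Python for illustration and does not reproduce the project's R package; the function names, the window lengths, and the rule for labeling a bull or bear trend are all assumptions.

```python
def simple_moving_average(prices, window):
    """Return the moving averages of `prices` over a sliding window."""
    if window <= 0 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    return [sum(prices[i - window:i]) / window
            for i in range(window, len(prices) + 1)]


def trend_label(prices, short_window, long_window):
    """Label the latest trend by comparing a short and a long average.

    Assumed heuristic: a short-term average above the long-term average
    suggests a rising ("bull") trend, below it a falling ("bear") trend.
    """
    short_avg = simple_moving_average(prices, short_window)[-1]
    long_avg = simple_moving_average(prices, long_window)[-1]
    if short_avg > long_avg:
        return "bull"
    if short_avg < long_avg:
        return "bear"
    return "flat"


# Hypothetical closing prices.
rising = [1, 2, 3, 4, 5, 6]
falling = [6, 5, 4, 3, 2, 1]
rising_sma = simple_moving_average(rising, 3)
rising_trend = trend_label(rising, 2, 3)
falling_trend = trend_label(falling, 2, 3)
```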

Spherical Harmonics

Usage of sound field synthesis in the construction of a U-learning environment in a smart city

    E-learning is learning on a computer connected to the Internet. Because of its flexibility, users can learn at home at any time. M-learning uses mobile devices, connected through wireless networks, applications, or platforms, to access digital content. Both E-learning and M-learning require users to study through electronic devices such as personal computers, laptops, smartphones, or tablets. U-learning, on the other hand, integrates equipment, digital content, and the environment to build a highly efficient learning space. Users do not have to carry or be equipped with any device, an advantage that suits smart city development. This study aims to combine image processing and sound synthesis to provide an interactive learning environment in which users can learn anywhere and at any time without personal devices. We intend to use a background subtraction algorithm to detect the position of a user so that the playback system can select the corresponding digital voice content. The localization of the virtual sound source also follows the user’s movement, thereby providing a high degree of interaction between the user and the proposed learning system. Support for multiple users will also be developed. The proposed system can function as a virtual narrator at a museum in smart city applications.
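    The position-detection step can be sketched with the simplest form of background subtraction: compare each frame with a stored empty-scene image and take the centroid of the pixels that changed. This is an illustrative sketch only. The grayscale frames are plain nested lists, and the threshold, the function names, and the mapping from column to content zone are hypothetical rather than taken from the study.

```python
def detect_position(background, frame, threshold=30):
    """Return the (row, col) centroid of changed pixels, or None."""
    row_sum = col_sum = count = 0
    for r, (bg_row, fr_row) in enumerate(zip(background, frame)):
        for c, (bg, fr) in enumerate(zip(bg_row, fr_row)):
            # A pixel is foreground if it differs enough from the background.
            if abs(fr - bg) > threshold:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None
    return row_sum / count, col_sum / count


def select_zone(col, width, n_zones):
    """Map a horizontal position to one of n_zones equal-width zones."""
    return min(int(col * n_zones / width), n_zones - 1)


# Tiny 4x4 grayscale example: the user occupies two pixels in column 2.
background = [[0] * 4 for _ in range(4)]
frame = [row[:] for row in background]
frame[1][2] = 200
frame[2][2] = 200

position = detect_position(background, frame)
zone = select_zone(position[1], width=4, n_zones=2)
```

    The zone index could then select which voice content to play and where to place the virtual sound source.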


Digital content creation and virtual acoustics reproduction for virtual reality application

    In this study, we propose a binaural system for synthesizing and reproducing virtual auditory space in real time, and we also show how to capture digital audio and video content. The study covers high-resolution head-related transfer function (HRTF) measurement and customization, motion-tracked binaural (MTB) system design, pinna model simulation, and digital content creation. Existing HRTF databases are not sufficient for rendering virtual acoustic space in an interactive environment, so we will conduct high-resolution measurements for virtual reality applications. Because HRTFs vary from person to person, it is difficult to select an appropriate HRTF dataset for a given user from existing databases. We therefore propose to use k-centers clustering to find the potential best-fit HRTF dataset for a specific user. For the MTB system, audio quality is the first concern because the system can render only a two-dimensional auditory space. A pinna model will therefore be used so that the system can represent three-dimensional sound space. We aim to produce two types of auditory systems: one for high-performance computers and the other for low-cost mobile devices. Finally, sample digital video content will be captured with a 360-degree camera and combined with our proposed system and mixing algorithm to reproduce full three-dimensional auditory-visual virtual reality.
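    One way to picture the k-centers step is the greedy farthest-first heuristic, a standard approximation for the k-center problem. The sketch below is an assumption about how such a selection could work, not the study's procedure: each HRTF dataset is reduced to a hypothetical feature vector, representative datasets are chosen as centers, and a user is matched to the nearest center.

```python
import math


def k_centers(points, k, first=0):
    """Pick k center indices by greedy farthest-first traversal."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [first]
    # Distance from every point to its nearest chosen center so far.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from all current centers.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        nearest = [min(d, math.dist(p, points[nxt]))
                   for d, p in zip(nearest, points)]
    return centers


def best_fit(user, points, centers):
    """Return the index of the center closest to the user's features."""
    return min(centers, key=lambda i: math.dist(user, points[i]))


# Hypothetical 2-D feature vectors, one per HRTF dataset.
datasets = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
centers = k_centers(datasets, k=3)
match = best_fit((9, 10), datasets, centers)
```

    Keeping only a few representative datasets makes it practical to let a user audition each candidate, at the cost of a coarser match.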


Audio interfaces for mobile devices and artificial intelligence in effects units

    With the advent of modern techniques, a growing number of mobile devices offer powerful functions. This proposal aims to develop audio effects units in software. The proposed system is intended to process 48 kHz digital audio signals in real time on a tablet, a smartphone, or a notebook. The built-in sound effects include echo, reverberation, modulation, flanging, chorus, loop, and equalization. Unlike conventional hardware effects units, the system is designed on the basis of stereophonic sound, which enhances externalization. In addition to these popular sound effects, we are developing artificial intelligence for noise reduction, a function that is especially helpful for singers performing on a busy street.
    We also intend to build an audio interface that connects a mobile device to an electric instrument, such as an electric guitar, an electric piano, or an electric bass, and provides impedance matching and voltage balance. Conventional audio interfaces are designed for personal computers and support only USB or IEEE 1394. Our proposed interface, on the other hand, feeds the electric instrument signal into the 3.5 mm jack of a smartphone or a tablet. The developed interface is expected to be light, low-power, and low-noise, and therefore suitable for street musicians.
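    Of the effects listed, echo is the simplest to sketch: the output is the input plus a delayed, attenuated copy of itself. The example below is an offline illustration under stated assumptions, not the proposed real-time implementation; the function name and parameter values are hypothetical, and only the 48 kHz sample rate comes from the text.

```python
SAMPLE_RATE = 48000  # samples per second, as stated in the proposal


def echo(samples, delay_s, decay, sample_rate=SAMPLE_RATE):
    """Add one delayed copy of the signal, scaled by `decay`.

    Implements y[n] = x[n] + decay * x[n - D], where D is the delay in
    samples. The output is D samples longer than the input so that the
    tail of the echo is not cut off.
    """
    delay = int(round(delay_s * sample_rate))
    if delay <= 0:
        raise ValueError("delay must be at least one sample")
    out = list(samples) + [0.0] * delay
    for i, x in enumerate(samples):
        out[i + delay] += decay * x
    return out


# A unit impulse at a toy sample rate makes the echo easy to see.
impulse_response = echo([1.0, 0.0, 0.0, 0.0], delay_s=0.5, decay=0.5, sample_rate=4)

# One second of silence at 48 kHz with a 250 ms echo tail.
padded = echo([0.0] * SAMPLE_RATE, delay_s=0.25, decay=0.4)
```

    A real-time version would process audio block by block with a circular delay buffer instead of holding the whole signal in memory.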
