TY - JOUR
AU - Moradi, Ashkan
AB - Voice casting is one of the most crucial aspects of dubbing and localization, which consists of adapting audio-visual content from one language and culture to another. Dubbing is widely used in various industries, such as entertainment, education, marketing, and gaming, to reach global audiences and increase customer satisfaction. Choosing the right voice actor is not a simple task: it requires consideration of many aspects of the source material (in the original language) and the target material (in the dubbed language), such as language characteristics, the acoustic characteristics of the characters, age, emotions, and gender. An automated system can be a powerful tool that supports directors and managers in the selection process. Such systems rely on the acoustic characteristics of the speakers and use machine learning methods to achieve reliable results. This paper introduces a novel method for nonlinearly mapping voices between languages using encoder-decoder neural networks and phonetic-acoustic features. It details the architecture of the proposed automatic voice casting (AVC) system and reports experimental results comparing it with baseline frameworks. In this study, the proposed AVC system is applied to English-to-Farsi voice casting in closed map situations. The experiments are conducted using the English to Persian voice-casting (E2PCast) dataset. The baseline system, based on siamese neural networks (SNN) with x-vector input, had a 4-best character accuracy of 99.25% and an accuracy of 96.32% on the test data. Our proposed system, acoustic–phonetic encoder-decoder mapping (APEDM), which takes an x-vector and a phonetic vector as inputs and applies SNN for similarity scoring, showed the best results, achieving 98.98% accuracy and a 4-best character accuracy of 100% on the test data.
TI - APEDM: a new voice casting system using acoustic–phonetic encoder-decoder mapping
JF - Multimedia Tools and Applications
DO - 10.1007/s11042-024-20496-1
DA - 2024-12-23
UR - https://www.deepdyve.com/lp/springer-journals/apedm-a-new-voice-casting-system-using-acoustic-phonetic-encoder-jck70fx8Li
SP - 1
EP - 25
VL - OnlineFirst
IS - 
DP - DeepDyve
ER -