journal article
Tan, Sihan; Khan, Nabeela; An, Zhaoyi; Ando, Yoshitaka; Kawakami, Rei; Nakadai, Kazuhiro
doi: 10.1080/01691864.2024.2442721; pmid: N/A
Technology that supports human communication through sign language may address a growing social need and is interesting from an engineering perspective, since it involves multimodal information processing with potential applications in robotics. Recent advances in deep learning and generative AI have markedly improved the performance of image and natural language processing. This paper surveys sign language recognition, translation, and generation, and dedicates a section to research using large language models, a novel approach to sign language processing (SLP). We also review currently available datasets, focusing on their applicability to SLP. Key findings include the limitations of gloss-based approaches in capturing non-verbal cues and the wide variability across datasets, both of which impede the development of robust SLP systems. Additionally, we identify inconsistencies in evaluation metrics, emphasizing the need for standardized approaches that account for the nuances of both sign and spoken languages. Finally, we evaluate existing datasets, assessing their relevance and potential to advance SLP research.
doi: 10.1080/01691864.2024.2407130; pmid: N/A
In this paper, we present a variant of our previous research on the multi-goal path-finding problem, focusing on finding a feasible, closed path that visits a sequence of goals in an environment with obstacles. The newly proposed method, Segmentation & Regression v2 (S&Reg v2), employs multi-task learning networks to generate regions and to estimate the lengths of local paths between pairs of goals. Importantly, the estimates serve as edge weights of a complete graph from which the visiting sequence is computed. The path-finding process is then executed following that sequence, and the predicted region works as a sampling domain to speed up the search. A hybrid sampler is designed by combining a uniform domain with the region domain, ensuring successful samples even if the region is disconnected. In addition, a selection rule is introduced to balance the sampling domains during different stages of the search. A proof of probabilistic completeness of the S&Reg v2 method is given. Simulations verify the superior performance of the S&Reg v2 method, demonstrating a reduction in calculation time ranging from 3.9% to 13.0%. Furthermore, a practical scenario validates the reliability of S&Reg v2, achieving a 15.0% improvement in success rate and a 9.7% reduction in calculation time.
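The two ideas in this abstract, ordering goals on a complete graph weighted by estimated path lengths and drawing samples from a mix of a predicted region and a uniform domain, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the brute-force tour search, the grid-cell representation of the region, the fixed mixing probability `p_region`, and the names `closed_tour` and `hybrid_sample` are all assumptions made for illustration. The abstract does not state which solver or selection rule S&Reg v2 uses.

```python
import itertools
import random


def closed_tour(weights):
    """Return the cheapest closed visiting order over a complete graph.

    weights[i][j] stands in for the estimated local path length between
    goals i and j. Brute force is an assumption for illustration and is
    only practical for a handful of goals.
    """
    n = len(weights)
    best_order, best_cost = None, float("inf")
    # Fix goal 0 as the start so rotations of the same tour are not re-counted.
    for perm in itertools.permutations(range(1, n)):
        order = (0,) + perm
        cost = sum(weights[order[k]][order[(k + 1) % n]] for k in range(n))
        if cost < best_cost:
            best_order, best_cost = order, cost
    return list(best_order), best_cost


def hybrid_sample(region_cells, bounds, p_region, rng):
    """Draw one 2D sample from a hybrid domain.

    With probability p_region the sample comes from a predicted region,
    represented here (hypothetically) as a list of unit grid cells given by
    their lower-left corners. Otherwise it is drawn uniformly from the whole
    workspace, so sampling still succeeds when the region is empty or
    disconnected.
    """
    if region_cells and rng.random() < p_region:
        cx, cy = rng.choice(region_cells)
        return (cx + rng.random(), cy + rng.random())
    (xmin, xmax), (ymin, ymax) = bounds
    return (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))


if __name__ == "__main__":
    # Four goals on a unit square; diagonals are given a longer estimate.
    w = [
        [0.0, 1.0, 1.5, 1.0],
        [1.0, 0.0, 1.0, 1.5],
        [1.5, 1.0, 0.0, 1.0],
        [1.0, 1.5, 1.0, 0.0],
    ]
    print(closed_tour(w))
    rng = random.Random(0)
    print(hybrid_sample([(2, 3)], ((0.0, 10.0), (0.0, 10.0)), 0.7, rng))
```

In this sketch a planner would call `hybrid_sample` inside its sampling loop for each consecutive pair of goals in the order returned by `closed_tour`. A stage-dependent selection rule, as the abstract describes, would vary `p_region` over the course of the search rather than keep it fixed.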
Augustine Ajibo, Chinenye; Ishi, Carlos Toshinori; Ishiguro, Hiroshi
doi: 10.1080/01691864.2024.2398554; pmid: N/A
This study investigates how the perception of persuasive behaviors (polite, logical, displeased, angry) of an android robot is affected by the context of a violation (affecting oneself or others) and by subject traits, such as compliance awareness (CA) and agreeableness (AG). We conducted a video-based experiment with a mixed design and 98 participants from the US, and used a three-way mixed analysis of variance to investigate the impact of persuasive types and situation types (as within-subject factors) and of the subject trait groups (CA or AG, as between-subject factors) on the subjective impressions of the android robot's persuasive behaviors. Results showed that more negative behaviors (anger and displeasure) are appraised as more appropriate and effective for persuading a violator in situations where the violation affects others, while no clear preference was found in situations where the violation affects only oneself. Regarding the subject traits, participants with higher CA and lower AG would be willing to adhere to any persuasive behavior, while their counterparts would dislike being persuaded through negative behaviors by the robot. These findings can be considered in future studies to develop cognitive models for generating situation-aware behaviors in social robots.
Ma, Ruidong; Liu, Yanan; Graf, Erich W.; Oyekan, John
doi: 10.1080/01691864.2024.2407115; pmid: N/A
The Assemble-To-Order (ATO) strategy is becoming increasingly prevalent in the manufacturing sector due to the high demand for high-volume personalised and customised goods. Human-Robot Collaborative (HRC) systems are increasingly being investigated as a way to combine the dexterity of human hands with the ability of robots to carry massive loads. However, current HRC systems struggle to adapt dynamically to varying human actions and cluttered workspaces. In this paper, we propose a novel neural network framework that integrates a Graph Neural Network (GNN) and a Long Short-Term Memory (LSTM) network for adaptive response during HRC scenarios. Our framework enables a robot to interpret human actions and generate detailed action plans while dealing with objects in a cluttered workspace, thereby addressing the challenges of dynamic human-robot collaboration. Experimental results demonstrate improvements in assembly efficiency and flexibility, making our approach the first integration of iterative grasping and flexible HRC within a unified neural network architecture.
García, Gonzalo A.; Pérez, Guillermo; Laycock-Narayan, Rohan K.; Levinson, Leigh; Amores, J. Gabriel; Alvarez-Benito, Gloria; Castro-Malet, Manuel; Castaño-Ocaña, Mario; López-González de Quevedo, Marta J.; Durán-Viñuelas, Ricardo; Gomez, Randy; Šabanović, Selma
doi: 10.1080/01691864.2024.2415093; pmid: N/A
Haru4Kids (H4K) is a system that emulates the physical, social, family-oriented robot Haru, designed with the goal of cohabitating with children in their homes for extended periods of time. In a previous experiment [Garcia GA, Perez G, Levinson L, et al. Living with Haru4Kids: study on children's activity and engagement in a family-robot cohabitation scenario. In: 2023 IEEE ROMAN; 2023 Aug. p. 1428–1435], seven families kept H4K in their homes for two weeks. Throughout this period of cohabitation, we collected child-robot interaction data, including images that were later hand-annotated to estimate user engagement. In the present work, we used a novel AI-based, four-stage framework available from Roboflow to automatically estimate children's level of engagement from their inferred emotions. We studied in depth the performance and behaviour of that framework on our dataset of users' pictures and characterized its response in order to understand its advantages and limitations, including the technique used to translate emotions into engagement levels. We also tested a different approach to that mapping, using a machine learning technique based on a Support Vector Machine (SVM). The framework yielded promising results ‘off-the-shelf’: 0.47–0.68 accuracy and 0.46–0.70 F1 using the original mapping, and 0.39–0.75 and 0.37–0.78 respectively using SVM. Therefore, we propose this emotion-based approach for engagement estimation from pictures.
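The accuracy and F1 figures reported above compare predicted engagement levels against hand annotations. A minimal Python sketch of how such scores can be computed follows. It is not the authors' evaluation code: the abstract does not say how F1 was averaged across classes, so macro averaging is an assumption here, and the label names and the function names `accuracy` and `macro_f1` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the annotated label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical engagement labels, for illustration only.
    annotated = ["low", "high", "high", "low"]
    predicted = ["low", "high", "low", "low"]
    print(accuracy(annotated, predicted))
    print(macro_f1(annotated, predicted))
```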