TY - JOUR
AU - Zhang, Jianjun
AB - Abstract

A training simulator is an efficient and innovative tool for helping users learn professional skills, owing to its convenience and safety. However, complex human–computer interaction is one of the main disadvantages that limit its effectiveness in safety training, especially for the rescue of a railway accident, which requires collaboration. By designing a set of task-specific hand gestures, we developed a training simulator for the recovery of a railway accident that helps rescuers learn and practice rescue skills in a life-like environment and gain firsthand experience. To test the validity of our training simulator, a user experiment was designed to compare it with a controller-based simulator in a between-groups study with 51 participants, focusing on different aspects of effectiveness. The results demonstrate that the hand gesture-based controller can be more efficient and usable for complex interactions than the traditional hand-held controller.

RESEARCH HIGHLIGHTS

An interactive, multi-user, immersive crane training simulator is described.
A hand gesture-based interaction is proposed to assist rescuers in the safety training of railway accidents.
The training simulator encourages the crane operator and the signaler to collaborate in the recovery work of railway accidents.
The proposed training simulator sets a good example for developing similar crane-training simulators in the future.

1. INTRODUCTION

Railway accidents severely threaten the safe running of trains and cause economic loss and human death. After an accident occurs, the inadequate rescue experience and skills of many participating rescuers often lead to low rescue efficiency, chaos and even casualties. Therefore, improving rescue knowledge and professional skills before taking part in an on-site rescue is critical in reducing railway accident-related injuries and deaths.
Through rescue training in actual railway accidents, inexperienced rescuers can improve their ability to deal with a railway accident. However, on-site rescue training for inexperienced rescuers is impractical because of its enormous social and economic costs and high risks. A virtual reality (VR) training simulator, as an efficient training medium, provides a safe, low-cost way to view, hear and interact with realistic scenes through input devices in order to carry out training operations (Park et al., 2006).

In railway accidents, rescuers generally use a railway crane to restore traffic. VR crane-training simulators have been investigated by a number of researchers. For instance, to overcome the spatial and temporal limitations of traditional teaching, Sang et al. (2016) built an interactive truck crane simulation based on VR and carried out simulation experiments on the crane’s movements. Juang et al. (2013) introduced a virtual crane-training simulator that applies kinematic vision and stereoscopic vision to increase the effectiveness of safety training. To reduce incidents during operations, Peteira et al. (2011) proposed a virtual environment to improve the accuracy of operators in handling industrial cranes.

Although VR has already been used extensively in crane training simulators, few studies have focused on the human–computer interaction (HCI) of a simulator. In most crane training simulators, trainees are obliged to use joysticks, controllers or keyboards to interact with the simulator (Erra et al., 2018). Such interaction modes have inherent disadvantages: (i) the interactive device is delicate, so it must be handled carefully. (ii) It can be broken during complex operations.
(iii) Users need prior experience to control these interactive devices, which can lead to poor efficiency for new users.

On the other hand, crane operations are collaborative tasks that involve multiple crane operators and signalers. When crane operators cannot see the obstacles in a workspace, they rely completely on instructions from signalers to operate safely. Although signalers play a critical role in the safety training, they are usually poorly trained and have little experience in collaborating with crane operators in real lifting tasks (Fang and Teizer, 2014).

In our previous work (Xu et al., 2018), the training provided by a controller-based training simulator was compared with the traditional on-site training approach through interviews and surveys. The findings demonstrated better training effects for the controller-based training simulator in the railway accident recovery process than for the traditional training method. However, because that study relied on an interactive device, its limitations are obvious. Firstly, the complex interaction of the previously tested crane training system limits the smoothness and effectiveness of railway accident training. Secondly, our previous work focused on training railway crane operators and did not take the collaboration of operators in the rescue into consideration.

The concept of HCI was proposed in the 1980s and focused on using knowledge from cognitive and computer sciences to improve the usability of computers (Card et al., 1983). With the development of computer technology, research interest in HCI has turned to developing new design methods. As an alternative interactive method, gestures can be used as a communication tool between computers and humans; a gesture is not only an ornament of spoken language but also an essential component of the language itself (Rautaray and Agrawal, 2015).
Gesture recognition technology receives great attention in HCI (Zafrulla et al., 2011; Panwar, 2012; Meng et al., 2013; Harshitha et al., 2014). Compared with traditional interactive tools such as a mouse, hand-held controllers, joysticks or other input devices, hand gestures provide a unique user experience with respect to effectiveness, performance and ease of use. Currently, the design and development of gesture-based simulators are supported mainly by two kinds of devices, namely vision-based devices and contact-based devices (Rautaray and Agrawal, 2015).

Vision-based devices rely on the video sequence captured by one or several cameras to interpret and analyze the motion. Vision-based gesture recognition has made great progress owing to the development of cameras and of image and video compression technologies. Wu et al. (2016) introduced a virtual farming object interaction system based on cloud computing and the somatosensory technology of Leap Motion. The design combined advanced techniques, such as cloud-side calculation and gesture interaction control, to support crop species selection, morphological changes, crop growth, pause and shadow generation. Strazdins et al. (2017) developed a gesture-based simulator using Kinect that allows participants to use natural gestures in the crane operation workplace.

Contact-based devices recognize gestures from specific part(s) of the human body using physical tracking devices or wearable sensors attached to the user’s body. For instance, Pouke et al. (2012) combined an eye tracker with mid-air gesture interaction, using an accelerometer sensor attached to the user’s hand to perform gesture control. An automatic hand gesture recognition simulator has also been developed for augmented reality, in which the differences between static and dynamic gestures are addressed (Reifinger et al., 2007). The applications of gesture recognition simulators are well investigated.
However, few studies have focused on the HCI experience of crane training simulators, especially for railway cranes in railway accident rescue. Given these considerations, and in order to improve the effectiveness and performance of rescuer operations, this study uses hand gestures to enable cooperation between operators and signalers at a railway accident site. In the case study, a user experiment is designed to test the efficiency and usability of our simulator compared with a controller-based simulator.

2. ARCHITECTURE AND METHODS

2.1. Overall architecture of the simulator

The overall architecture of our simulator is shown in Fig. 1. The simulator comprises three main parts: the input module, the motion control module and the output module.

FIGURE 1. Overall architecture of the simulator.

(i) The input module is mainly responsible for processing the sensor data captured by the HTC Vive and passing it to the motion control module. (ii) The motion control module triggers the motion of the railway crane and the signaler’s actions. (iii) Finally, the scenario is updated in the output module.

On the hardware side, the HTC Vive, launched by HTC Co. and Valve Co., is used to offer immersive experiences, with advantages in convenience, low cost and accuracy. The trainee can interact with the virtual environment by hand gestures. To offer a better interactive experience, our simulator also provides a multiplayer mode, which enables the crane operator and the signaler to work collaboratively.

2.2. Hand gesture-based control

2.2.1. Gesture mapping

User interface design is an important part of HCI (Rezazadeh et al., 2011). In particular, designing intuitive and natural hand gestures requires various considerations, such as users’ physical limitations and the recognition accuracy of input devices.
The actions of a railway crane, such as swinging, luffing, hoisting, extending and traveling, as shown in Fig. 2, are frequently applied in the recovery process after railway accidents. In the proposed training simulator, these actions are triggered and performed by a set of hand gestures rather than by joysticks or controllers.

FIGURE 2. The basic motions of a railway crane.

To build a set of hand gestures for railway rescue, a hand gesture vocabulary is designed according to the following principles, as illustrated in Fig. 3. Each action of the crane is associated with one hand gesture that people use in daily life. For example, in Fig. 3a, when the operator keeps the arms by the side with the upper arm and the lower arm forming an angle of 90 degrees, the actions of the crane are suspended; the user makes this gesture whenever the railway crane has to stop. To control the traveling of the crane, the user extends the right arm forward/backward slowly, as shown in Fig. 3b, and the position of the crane in the virtual scenario moves accordingly. Similarly, in Fig. 3c, when the user moves the right arm upward/downward, the crane luffs the boom. In Fig. 3d, when the user moves the right arm leftward/rightward, the crane swings. Likewise, moving the left arm forward/backward makes the crane extend/retract the boom, as shown in Fig. 3e, and moving the left arm upward/downward makes the crane hoist/lower the rope, as shown in Fig. 3f.

FIGURE 3. Predefined hand gestures: (a) ‘Stop’ hand gesture, (b) ‘Traveling’ hand gesture, (c) ‘Luffing’ hand gesture, (d) ‘Swinging’ hand gesture, (e) ‘Extending’ hand gesture and (f) ‘Hoisting’ hand gesture.

2.2.2. Motion control mechanism

The motion control module is designed to provide interactions between the users and the railway crane. It is responsible for mapping the user’s hand gesture to the actions of the railway crane. The user’s hand gesture is tracked by the HTC Vive controller. In Fig. 4, when the controller moves into the non-trigger zone following the hand gesture (Fig. 3a), a ‘neutral’ state is triggered, which means that the railway crane suspends its current actions. When the controller moves into the trigger zone following the hand gestures (Fig. 3b–f), the controller triggers the corresponding predefined actions of the railway crane in the virtual scenario.

FIGURE 4. The schematic design of controllers.

Algorithm I, which is based on Fig. 4, is proposed to identify the state transition. The size of the non-trigger zone is limited to $2\Delta l$, and the position values of the controllers are compared with the predefined threshold ($2\Delta l$) to obtain the state of the controller. As stated in Fig. 4, the position of the headset is used as the world coordinate, and the position value of the trigger zone $(x_0, y_0, z_0)$ is predefined as the local coordinate. The position value of the controller $(x, y, z)$ changes with the hand gesture. If the controller is triggered, then the predefined actions of the railway crane are triggered by the hand gesture.

Algorithm I.
// The controller is tracked and the position value $(x, y, z)$ is transmitted to the computer:
LET $x_0 - \Delta l$ […] $y > y_0 + \Delta l$ or $y < y_0 - \Delta l$
      or $x > x_0 + \Delta l$ or $x < x_0 - \Delta l$
      or $z > z_0 + \Delta l$ or $z < z_0 - \Delta l$ THEN
    The predefined actions of a railway crane are triggered
    ELSE
    The actions of a railway crane are in neutral
  ELSE
    IF $|x - x_0| > \Delta l$ and $|y - y_0| > \Delta l$ and $|z - z_0| > \Delta l$ THEN
    The controller is invalid
    ELSE
    Changing the hand gesture
    UNTIL the controller is triggered
End of Algorithm I

Moreover, a state transition model is illustrated in Fig. 5 to better explain the interaction between the hand gesture and the controller, as well as the actions of the railway crane. The HTC Vive controller has two states: ‘neutral’ or ‘trigger’. The state transition is driven by changes in the input conditions, which depend on the hand gestures.
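As a rough illustration of the two-state logic and of the gesture vocabulary in Fig. 3, the Python sketch below classifies a tracked controller position as ‘neutral’ or ‘trigger’ and looks up the associated crane action. It is a minimal sketch under stated assumptions, not the simulator's implementation: the axis convention (x sideways, y vertical, z forward), the names `ACTIONS`, `dominant_axis` and `crane_command`, the rule of picking the axis with the largest offset when several exceed $\Delta l$, and all numeric values are hypothetical.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical axis convention (not given in the text):
# x = leftward/rightward, y = upward/downward, z = forward/backward.
# The (arm, axis) -> action pairs follow the gestures of Fig. 3b-f.
ACTIONS = {
    ("right", "z"): "traveling",
    ("right", "y"): "luffing",
    ("right", "x"): "swinging",
    ("left", "z"): "extending",
    ("left", "y"): "hoisting",
}


def dominant_axis(pos: Vec3, center: Vec3, delta_l: float) -> Optional[Tuple[str, int]]:
    """Return (axis, direction) for the largest offset from the zone center,
    or None when every offset stays within delta_l (non-trigger zone)."""
    offsets = {axis: p - c for axis, p, c in zip("xyz", pos, center)}
    axis, offset = max(offsets.items(), key=lambda item: abs(item[1]))
    if abs(offset) <= delta_l:
        return None
    return axis, (1 if offset > 0 else -1)


def crane_command(hand: str, pos: Vec3, center: Vec3, delta_l: float):
    """Map one controller sample to (state, action, direction)."""
    hit = dominant_axis(pos, center, delta_l)
    if hit is None:
        return ("neutral", None, 0)
    axis, direction = hit
    action = ACTIONS.get((hand, axis))
    if action is None:
        # Gesture not in the vocabulary (e.g. left arm moved sideways).
        return ("neutral", None, 0)
    return ("trigger", action, direction)


if __name__ == "__main__":
    center = (0.0, 0.0, 0.0)   # assumed zone center (x0, y0, z0)
    delta_l = 0.1              # assumed half-width of the non-trigger zone
    print(crane_command("right", (0.0, 0.0, 0.3), center, delta_l))
    print(crane_command("left", (0.0, -0.4, 0.1), center, delta_l))
```

Treating unmapped gestures as ‘neutral’ is a conservative choice made for this sketch: an unrecognized movement suspends the crane rather than triggering an unintended action.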
Once the controller is activated, a connection between the controller and the simulator is established through the application program interface provided by the HTC Vive. Based on this connection, the state of the controller (neutral/trigger) switches according to the hand gesture and in turn triggers the actions of the railway crane in the virtual scenario.

FIGURE 5. Concept model of state transition.

Figure 6 shows the motion control results of a railway crane using hand gestures. The results indicate that our defined hand gestures are capable of controlling the actions of the virtual railway crane accurately and in real time using the HTC Vive controller.

FIGURE 6. Illustration of different motion phases of a railway crane: (a and b) moving the crane on the track, (c) luffing the boom, (d) extending the boom, (e) swinging the base and (f) hoisting the accident vehicle.

2.3. Hand gesture track

2.3.1. Hand signals in railway crane operations

Crane lifting is a collaborative task that involves crane operators and signalers. Crane operators often cannot see the scene very well because of obstructions in the workspace. They have to rely on signals or instructions (Fig. 7) (Sui, 1998) from signalers to operate safely (Fang and Teizer, 2014). Good cooperation between the signaler and the operator is crucial in preventing crane accidents.

FIGURE 7. Hand signals in the railway rescue workplace.

2.3.2. Upper body motion tracking

To offer multiplayer interaction in our simulator, the signaler’s full body is brought into the virtual environment. Firstly, the HTC Vive allows any form of motion behavior to be tracked in a virtual environment to enrich the user experience.
Hence, we can generate plausible hand motions for the avatar that represents the signaler in virtual space, using an efficient inverse kinematics algorithm (Tolani et al., 2000). An inverse kinematics (IK) algorithm automatically calculates the angles and positions of the joints between two known endpoints. For example, a 3D human avatar model with bones is pre-designed, and the hand-held controller is set as the parent object of the arm. The IK algorithm can instantly update the angles and positions of the arm joints, using the input locations of the two controllers as the targets of the avatar’s hands (Tan et al., 2017). Figure 8 demonstrates tracking of the signaler’s motion with IK.

FIGURE 8. Signalers’ motion tracking results.

3. EXPERIMENTS

3.1. Experiment setup

The aim of the experiment is to test three hypotheses: (i) our simulator is more effective than the controller-based training simulator; (ii) our simulator supports the collaboration between the operator and the signaler in the recovery process of a railway accident; and (iii) the hand gesture-based control is easier to use than controller-based interaction.

The experiment method of our simulator is presented in Fig. 9. One participant controlled the railway crane, and the other, acting as the signaler, manipulated the virtual avatar to assist the operator in conducting rescue operations. The simulator requires only one set of HTC Vive. We equipped the operator with a head-mounted display (HMD) and a hand-held controller. The signaler was provided with the Vive tracker, but not with an HMD. In this case the signaler could not be immersed in the virtual scenario and had to stand facing the instructor computer and make signals accordingly. The instructor computer displays the performance of the operator and the signaler.
FIGURE 9. Crane operator collaborating with the signaler in the railway accident scenario: the first picture is rendered from a third user’s perspective, and the following pictures show the user’s actual view in the virtual railway accident scenario.

The virtual environment was developed to mimic the real railway accident environment that the operator and signaler will encounter on-site. Here we used the same railway accident scenario, collision detection and collision response as in the previous system (Xu et al., 2018).

3.2. Participants

In this study, 51 students from the college, aged between 22 and 26 (mean: 24.2; SD: 1.49), volunteered for the experiment. They were randomly assigned to two groups. A total of 26 participants were assigned to the hand-gesture group; they performed in pairs, one as the signaler and the other as the operator. The remaining 25 participants were assigned to the controller group. All of them had little or no training experience of railway crane operations and crane signals.

Before the experiment, the participants filled in a questionnaire designed to assess their knowledge of HCI and VR. All of the participants knew of hand gesture track technology and VR technology (mean: 4.05; SD: 0.64). They had used hand gesture control (mean: 2.11; SD: 0.68). All participants had experience with the keyboard, mouse or controller as HCI tools (mean: 4.5; SD: 0.54).

3.3. Procedure

The procedure in this experiment includes three steps. Firstly, all of the participants were asked to try the two interactions, hand gesture-based control and controller-based control, to help them become familiar with each configuration before the evaluation study. Specifically, there is no signaler in the controller-based group.
Our previous work focused on training railway crane operators and did not take the signaler in the rescue into consideration. Hence, we only measured the performance of the signaler in the hand-gesture group. Secondly, the 51 participants were randomly assigned to the two groups. Finally, they operated the two control methods at the same time.

As shown in Fig. 10, the experiment scenario is a railway accident rescue, and it is used to test our hypotheses. After a railway accident occurs, railway cranes are required to clear the track and resume services as quickly as possible. When the vehicle is in good condition, it should be lifted back onto the track; see Step1. If a considerable length of track or the vehicle is damaged, the vehicle should be lifted away from the track and placed on the empty space near the track, as shown in Step2.

FIGURE 10. Rescue operation in railway accident site. Step1: clearing the derailed vehicle in railway accident rescue. Step2: re-railing work in railway accident rescue.

3.4. Participants performance measures

Performance data from the hand-gesture group and the controller group were compared to assess the validity of our proposed simulator. A selected range of performance measures was considered, including the task time, the error count and the System Usability Scale (SUS) score (Bangor et al., 2009).

We measured the total time it took participants to complete the above re-railing task. The timer started when the railway crane drove in Step1 and ended when the derailed train was lifted to the target destination near the track in Step2 in Fig. 10. In the experiment, wrong operations or wrong gestures are defined as errors. For instance, if the railway crane collides with the lifting load, or the lifting load collides with an obstacle, this is counted as an error.
To keep track of the time and errors, an observer followed the whole experiment on the instructor computer and recorded the completion time and the number of errors.

After the experiment, the participants were given a SUS questionnaire to give feedback on their overall experience. The SUS questionnaire is one of the most widely used tools for evaluating the perceived usability of a system or product. It consists of 10 questions rated on a 5-point Likert scale from 1 (strongly disagree) to 5 (strongly agree). The odd-numbered questions are positively worded, and the even-numbered questions are negatively worded. A score from 0 to 100 can be calculated from the 10 ratings as a numeric evaluation of the subjective assessment. To obtain the SUS score, the contribution of each odd-numbered question is the rating minus 1, and the contribution of each even-numbered question is 5 minus the rating. This guarantees that a high contribution always indicates a positive evaluation. The sum of the contributions is then multiplied by 2.5 to yield the final score. Previous studies confirmed that, even when the sample size is small, the SUS questionnaire still provides a valid result (Bangor et al., 2009).

4. RESULTS

4.1. Time and error count

Figure 11 shows the average completion time for each participant in the hand-gesture group and the controller group; the data are normally distributed in each group. There is a significant difference between the hand-gesture group and the controller group (t(48) = 9.546; P < 0.001). This implies that participants took more time to clear the derailed vehicle in the controller-based control group (mean = 15.39; SD = 1.07) than in the hand gesture-based control group (mean = 12.6; SD = 1.00).

FIGURE 11. Completion time for the rescue operation training.

For hand-held controller-based simulators, we speculate that there are some complex buttons to control the railway crane.
The participants in the controller group had to spend more time recalling how to operate it. Control was not smooth and tended to overshoot when participants, confused by the many buttons, made incorrect operations during the experiment. When this occurred, the participant needed to correct the mistake, which cost more time. By contrast, in our proposed simulator, each action of the crane is associated with one hand gesture, which yields a series of natural and intuitive gestures that people use in daily life. Participants could conduct the task naturally and smoothly using the hand gesture method and thus needed less time: the average completion time is 18.1% shorter than with the controller method. If we define efficiency by the average time cost, our simulator is more effective than the controller-based training simulator.

Figure 12 summarizes the errors made by participants in the hand-gesture group (mean = 8.96; SD = 1.61) and the controller group (mean = 12.9; SD = 1.44); the error counts are normally distributed in each group. The difference is statistically significant (Fig. 12a; t(48) = 8.9; P < 0.001). The group with the longer task completion time also has the higher error count, which suggests that the more errors participants made, the longer they took to complete the task. In the controller group, participants were frequently confused by the buttons on the hand-held controller, so mistakes were easier to make. Participants in the hand-gesture group also made some errors. We observed that these errors occurred because of the instability of hand gesture tracking. Participants mentioned that the hand-gesture control method is not very accurate, which we discuss in detail in the Discussion section.

FIGURE 12. Error count participants made in experiments.
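The SUS scoring rule described in Section 3.4 and the 18.1% reduction in completion time reported above can be checked with a few lines of Python. This is a minimal sketch for illustration only; the function names `sus_score` and `relative_reduction` are hypothetical and are not taken from the study, and the example ratings are invented.

```python
from typing import Sequence


def sus_score(ratings: Sequence[int]) -> float:
    """Compute a 0-100 SUS score from ten ratings on a 1-5 scale.

    Odd-numbered questions contribute (rating - 1); even-numbered
    questions contribute (5 - rating); the sum is multiplied by 2.5.
    """
    if len(ratings) != 10 or any(r < 1 or r > 5 for r in ratings):
        raise ValueError("SUS needs exactly ten ratings between 1 and 5")
    total = 0
    for number, rating in enumerate(ratings, start=1):
        total += (rating - 1) if number % 2 == 1 else (5 - rating)
    return total * 2.5


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` is smaller than `baseline`."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Invented ratings for one respondent (questions 1..10).
    print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
    # Mean completion times reported in Section 4.1.
    print(round(relative_reduction(15.39, 12.6) * 100, 1))
```

With the reported means, the reduction is (15.39 − 12.6) / 15.39, which rounds to 18.1%.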
4.2. Accuracy evaluation of railway crane signals

Good collaboration between the crane operator and the signaler improves the progress of the crane operation. This is especially true in railway accident rescue, so it is important to keep the signalers’ actions accurate. To assess the performance of the signaler thoroughly, we compared every signal performed by the participants in the hand-gesture group with the standard signals, rating it on a 5-point scale from 1 (completely wrong) to 5 (perfect). Table 1 presents the ratings given by crane operation experts to every gesture in the hand-gesture group. The mean scores range from 3.6 to 4.12 out of 5, which means that participants performed well using hand gestures and that the signaler gave correct instructions to the operator in the study.

TABLE 1. Accuracy evaluation of railway crane signals.

Signal            Mean    SD
Hoisting raise    3.6     0.89
Hoisting lower    3.64    0.84
Swinging left     3.8     0.84
Luffing up        3.8     0.89
Stop              4.12    0.81

Moreover, the above results show that the operators in the hand-gesture group performed better than those in the controller group. One reason could be that, when operators cannot see the scene very well because of obstructions in the workspace, they rely on the signaler to operate safely and quickly, which reduces both time and errors. However, Table 1 shows that the performance in this study is not perfect.
This is probably because all the participants had little experience with crane signals; the way they understood railway crane signals varied considerably, resulting in wrong hand gestures. Moreover, the hand gesture in the real world may be inconsistent with the avatar’s hand gesture, because the IK algorithm only provides relatively accurate hand animations for small-scale hand movements. It cannot guarantee accuracy for gross movements (Tan et al., 2017).

4.3. User feedback

After the experiments, the participants filled in the SUS questionnaire. The results are shown in Fig. 13. A longer distance from the coordinate axis indicates a bigger advantage of the proposed simulator. In all cases, the hand gesture-based simulator performs much better than the hand-held controller-based simulator. For questions 1 and 9, which relate to the engagement of the simulator, the hand gesture-based control is superior to the controller-based control. For questions 3 and 5, which investigate the simplicity of the instructions of the training simulator, instructions given with the hand gesture-based control are perceived as simpler than those with the controller-based control.

FIGURE 13. Evaluation results of controller and hand gesture method.

Using the scoring method described above, we obtained the SUS scores of the hand-gesture group and the controller group, which are reported in Table 2.

TABLE 2. SUS score in hand-gesture group and controller group.

Group           Mean    SD
Hand-gesture    70.4    4.3
Controller      60.4    5.7

Bangor et al.
(2009) pointed out that when the SUS score is <50, the application is considered unacceptable to the users; when the SUS score is >50 but <70, the application is within the critical range that users can accept. The higher the score, the more usable the application. Table 2 shows that the hand-gesture group gave a higher mean SUS score (70.4) than the controller group (60.4), so our proposed simulator appears more useful to the participants.

5. DISCUSSION

The results of the study confirmed the hypotheses. Participants who used the hand gesture-based simulator completed the training significantly faster and with fewer errors than participants who used the controller-based simulator. This result is important because completing railway accident rescues efficiently matters in live emergency rescue operations. A few participants in the hand-gesture group reported that the proposed simulator requires physical effort from the hands and arms, which may cause fatigue or numbness, especially for repeated gestures and mid-air gestures. Nevertheless, participants still showed interest in hand-gesture interaction. According to the time and error measurements, users can learn and practice the required skills quickly and accurately in railway accident rescue and recovery through this kind of virtual interactive training. In addition, the results of the SUS questionnaire indicate that the participants agreed that the proposed system would facilitate active learning, student motivation and engagement. More importantly, the participants believed that the proposed training simulator is more useful than the hand-held controller-based simulator.

A particular feature of our simulator is that it encourages the crane operator and the signaler to collaborate in the recovery work of railway accidents. The operator controls the railway crane, and the signaler gives instructions to help the operator complete the operation.
In Table 1, the average score is very high, which means that participants performed the standard railway crane signals well using hand gestures. The operator and the signaler could therefore ‘communicate’ effectively. The application of hand gesture tracking technology in our training simulator could enhance the efficiency of rescue training. This study has a number of limitations. A big challenge of hand gesture-based control lies in the simulator’s accuracy and control fidelity. In our experiment, participants mentioned that the hand-gesture control method is not very accurate. For example, we observed that some participants extended the arm forward or moved it forward, which could suspend the actions of the railway crane in the virtual scenario. Hence, the accuracy of the hand-gesture tracking technology in our simulator should be further investigated and improved. In addition, each movement of the crane is associated with one hand gesture, drawn from a series of natural and intuitive gestures that humans use in daily life. This should make the gesture-based control easy for participants to learn and understand, and thus reduce the likelihood of errors. However, a few participants mentioned that the hand-gesture control is not very friendly and that the differences between the left and right hand are sometimes not very obvious, so participants can easily become confused and make errors. This is because participants have different hand-gesture habits or physical preferences. Better gestures could be defined in the future, for example, moving two hands at the same time to travel the whole vehicle but a single hand to extend the arm. Another limitation is the possibility of ‘negative transfer’ from the simulator’s interactive devices to real operations because of differences between the design of the interactive devices and real controls.
In general, there is a focus on fidelity, with the belief that training has to be as close to the real thing as possible to be effective and to prevent negative transfer. Fidelity can be measured, but deciding what to measure and how to measure it is not a trivial task. While correspondence between the training environment and the real environment is necessary, selecting the critical features to represent in the training environment depends upon both the task and the trainee’s level of expertise. Further study of fidelity will be carried out to improve our simulator in the future. In addition to the limitations mentioned above, another concern is that we did not provide an immersive virtual scenario for the signaler. According to a recent report (Skarredghost, 2018), it is possible to use multiple headsets in the same room. We will integrate this technology into our simulator in a future study to provide a better user experience for the signaler. In the experiment, the simulator was evaluated by young participants who had no experience in railway accident rescue and no knowledge of the working conditions in a railway accident. In the future, we will carry out experiments with participants who have rich experience in major railway accident rescue to see whether the knowledge of participants makes a difference. Furthermore, we can carry out more experiments comparing our simulator with other interaction methods, such as a keyboard, joystick or other input devices, to give a comprehensive analysis.

6. Conclusions

In this study, we proposed a hand gesture-based interaction to assist rescuers in training for railway accident rescue. To evaluate its validity, an experiment was designed to measure a range of behaviours and performances of participants. The results demonstrate that our hand gesture-based simulator has advantages over the controller-based ones in terms of efficiency and usability. Our simulator also supports multi-user interactions.
Therefore, it can simulate the collaborations between operators and signalers. Furthermore, this study sets a good example for developing similar crane training simulators in the future.

Funding

National Natural Science Foundation of China (51405402); Independent Research Project of the State Key Laboratory of Traction Power.

References

Bangor, A., Kortum, P. and Miller, J. (2009) Determining what individual SUS scores mean: adding an adjective rating scale. J. Usability Stud., 4, 114–123.
Card, S. K., Moran, T. P. and Newell, A. (1983) The keystroke-level model for user performance time with interactive systems. Commun. ACM, 23, 396–410.
Chan, J. C. P., Leung, H., Tang, J. K. T. and Komura, T. (2011) A virtual reality dance training system using motion capture technology. IEEE Trans. Learn. Technol., 4, 187–195.
Erra, U., Malandrino, D. and Pepe, L. (2018) Virtual reality interfaces for interacting with three-dimensional graphs. Int. J. Hum. Comput. Interaction, 12, 1–14.
Fang, Y. and Teizer, J. (2014) A multi-user virtual 3d training environment to advance collaboration among crane operator and ground personnel in blind lifts. J. Agron. Crop Sci., 200, 261–272.
Harshitha, R., Syed, I. A. and Srivasthava, S. (2014) HCI using hand gesture recognition for digital sand model. In IEEE second int. conf. on image information processing, IEEE, pp. 453–457.
Juang, J. R., Hung, W. H. and Kang, S. C. (2013) Simcrane3d+: a crane simulator with kinematic and stereoscopic vision. Adv. Eng. Inform., 27, 506–518.
Meng, Z. Y., Pan, J. S., Tseng, K. K. and Zheng, W. (2013) Dominant points based hand finger counting for recognition under skin color extraction in hand gesture control system. In Sixth int. conf. on genetic and evolutional computing, IEEE, pp. 364–367.
Panwar, M. (2012) Hand gesture recognition based on shape parameters. In Int. conf. on computing, communication and applications, IEEE, pp. 1–6.
Park, C. H., Jang, G. and Chai, Y. H. (2006) Development of a virtual reality training system for live-line workers. Int. J. Hum. Comput. Interaction, 20, 285–303.
Peteira, I., Pla-Castells, M. and Gamón, M. A. (2011) Using virtual reality for increasing safety in handling cranes: a presence study. 9th int. industrial simulation conf., EUROSIS, 73–80.
Pouke, M., Karhu, A., Hickey, S. and Arhippainen, L. (2012) Gaze tracking and non-touch gesture based interaction method for mobile 3D virtual spaces. Ozchi, 44, 505–512.
Rautaray, S. S. and Agrawal, A. (2015) Vision Based Hand Gesture Recognition for Human Computer Interaction: A Survey. Kluwer Academic Publishers.
Reifinger, S., Wallhoff, F., Ablassmeier, M., Poitschke, T. and Rigoll, G. (2007) Static and dynamic hand-gesture recognition for augmented reality applications. In Human–Computer Interaction. HCI Intelligent Multimodal Interaction Environments. Springer, Berlin Heidelberg.
Rezazadeh, I. M., Wang, X., Firoozabadi, M. and Golpayegani, M. R. H. (2011) Using affective human–machine interface to increase the operation performance in virtual construction crane training system: a novel approach. Automat. Constr., 20, 289–298.
Sang, Y., Zhu, Y., Zhao, H. and Tang, M. (2016) Study on an interactive truck crane simulation platform based on virtual reality technology. IJDET, 14, 64–78.
Skarredghost. (2018) SteamVR tricks: how to maximize your game area and use multiple vives in the same room. https://skarredghost.com/2018/01/16/steamvr-tricks-maximize-game-area-use-multiple-vives-room/ (accessed April 25, 2018).
Strazdins, G., Pedersen, B. S., Zhang, H. and Major, P. (2017) Virtual reality using gesture recognition for deck operation training. Oceans, 1–6.
Sui, S. and L. (1998) Railway Accident Rescue. China Railway Press.
Tan, Z., Hu, Y. and Xu, K. (2017) Virtual reality based immersive telepresence system for remote conversation and collaboration. Int. workshop on next generation computer animation techniques, Springer, 234–247.
Tolani, D., Goswami, A. and Badler, N. I. (2000) Real-time inverse kinematics techniques for anthropomorphic limbs. Graph. Models, 62, 353–388.
Wu, F., Ding, Y., Ding, W. and Xie, T. (2016) Design of human computer interaction system of virtual crops based on leap motion. Trans. Chin. Soc. Agric. Eng., 32, 144–151.
Xu, J., Tang, Z., Yuan, X., Nie, Y., Ma, Z., Wei, X. and Zhang, J. (2018) A VR-based the emergency rescue training system of railway accident. Entertain. Comput., 27, 23–31.
Zafrulla, Z., Brashear, H., Starner, T., Hamilton, H. and Presti, P. (2011) American Sign Language recognition with the kinect. In Int. conf. on multimodal interfaces, ICMI 2011, Alicante, Spain, November, ACM, pp. 279–286.

© The Author(s) 2019. Published by Oxford University Press on behalf of The British Computer Society.
All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) TI - Hand Gesture-Based Virtual Reality Training Simulator for Collaboration Rescue of a Railway Accident JF - Interacting with Computers DO - 10.1093/iwc/iwz037 DA - 2019-04-23 UR - https://www.deepdyve.com/lp/oxford-university-press/hand-gesture-based-virtual-reality-training-simulator-for-Wk0o50zwYi DP - DeepDyve ER -