Altarawneh, E. and Jenkin, M. Leveraging cloud-based tools to talk with robots. Proc. 16th Int. Conf. on Informatics in Control, Automation and Robotics (ICINCO), Prague, Czech Republic, July 2019. Although there have been significant advances in human-machine interaction systems in recent years, cloud-based advances are not easily integrated into autonomous machines. Here we describe a toolkit that supports interactive avatar animation and modeling for human-computer interaction. The avatar toolkit utilizes cloud-based speech-to-text software that provides active listening by detecting sound and reducing noise, a cloud-based AI to generate appropriate textual responses to user queries, and a cloud-based text-to-speech engine to generate utterances for this text. This output is combined with a cloud-based 3D avatar animation synchronized to the spoken response. Generated text responses are embedded within an XML structure that allows the nature of the avatar animation to be tuned to simulate different emotional states. An expression package controls the avatar's facial expressions. Latency is minimized and obscured through parallel processing in the cloud and an idle loop process that animates the avatar between utterances.
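The abstract describes the toolkit's pipeline only at a high level; the sketch below is an illustrative Python rendition of that flow, not the authors' implementation. The functions transcribe, generate_reply, synthesize_speech, and animate_idle are hypothetical stand-ins for the cloud speech-to-text, conversational-AI, and text-to-speech services and the idle animation, and the XML emotion attribute is an assumed markup convention.

```python
# Illustrative sketch only: the service calls below are placeholders, not the
# toolkit's actual cloud APIs.
import time
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def transcribe(audio_chunk: bytes) -> str:
    """Placeholder for a cloud speech-to-text request."""
    return "what is the weather like"


def generate_reply(user_text: str) -> str:
    """Placeholder for a cloud conversational-AI request."""
    return "It looks sunny outside."


def synthesize_speech(text: str) -> bytes:
    """Placeholder for a cloud text-to-speech request."""
    return b"<audio-bytes>"


def animate_idle() -> None:
    """Placeholder for one frame of the avatar's idle animation."""
    time.sleep(0.01)


def wrap_with_emotion(text: str, emotion: str = "neutral") -> str:
    """Embed the reply in an XML envelope tagging the desired avatar emotion (assumed schema)."""
    utterance = ET.Element("utterance", attrib={"emotion": emotion})
    utterance.text = text
    return ET.tostring(utterance, encoding="unicode")


def handle_turn(audio_chunk: bytes) -> tuple[str, bytes]:
    user_text = transcribe(audio_chunk)
    reply = generate_reply(user_text)
    markup = wrap_with_emotion(reply, emotion="happy")
    # Launch the TTS request in the background and keep the avatar in its idle
    # animation until the audio arrives, hiding the network latency.
    with ThreadPoolExecutor(max_workers=1) as pool:
        audio_future = pool.submit(synthesize_speech, reply)
        while not audio_future.done():
            animate_idle()
        return markup, audio_future.result()


if __name__ == "__main__":
    markup, audio = handle_turn(b"<microphone-bytes>")
    print(markup)
```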
Codd-Downey, R. and Jenkin, M. Finding divers with SCUBANet. Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), Montreal, Canada, May 2019. Robot-diver communication underwater is complicated by the attenuation of RF signals, the complexities of the environment in terms of deploying interaction devices, and issues related to the cognitive loading of human operators. Humans operating underwater have developed a simple yet effective strategy for diver-diver communication based on the visual recognition of gestures. Can a similar approach be effective for diver-robot communication? Here we
present experiments with SCUBANet, an underwater detection dataset of body parts associated with diver-robot communication. Given the nature of standard diver gestures, we concentrate on diver recognition, in particular diver body-head-hand localization, and examine the feasibility of a CNN-based approach to this problem. Such data-driven approaches typically require an appropriately annotated dataset. The SCUBANet dataset contains images of object classes commonly encountered during human-robot communication underwater. Object classes are labeled using per-instance bounding boxes. Annotations were created through crowdsourcing via a web-based interface to ease deployment. We provide baseline performance on diver and diver-component recognition and localization using transfer learning on three widely available pre-trained models.
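The abstract does not name the specific pre-trained detectors used; the sketch below is one illustrative example of the transfer-learning recipe it describes, adapting torchvision's pre-trained Faster R-CNN by swapping its classification head and fine-tuning on per-instance bounding-box annotations. The class list (background plus diver body, head, hand), image sizes, and hyperparameters are assumptions, not details from the paper.

```python
# Illustrative transfer-learning sketch (not the paper's exact setup): fine-tune a
# COCO-pre-trained Faster R-CNN on bounding-box annotations such as SCUBANet's.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 4  # assumed: background + diver body + head + hand

# Start from a pre-trained detector and replace its classification head.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

# One illustrative optimisation step; in practice `images` and `targets` would come
# from a DataLoader over the dataset's per-instance boxes and labels.
model.train()
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
images = [torch.rand(3, 480, 640)]
targets = [{"boxes": torch.tensor([[50.0, 60.0, 200.0, 220.0]]),
            "labels": torch.tensor([1])}]
loss_dict = model(images, targets)  # in train mode the model returns a dict of losses
loss = sum(loss_dict.values())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```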
Kapralos, B., Kanev, K., Jenkin, M. and Dubrowski, A. Immersive technologies and multimodal interactions in biomedical engineering and augmented medical simulations and training. Proc. 5th Int. Symp. towards the Future of Advanced Researches in Shizuoka University, Hamamatsu, Japan, March 6, 2019. Despite the growing popularity of virtual simulations, there are a number of issues that must be addressed before their use becomes widespread within the medical teaching curriculum. More specifically, the majority of available virtual simulations are restricted to cognitive skill development, given the complexities and costs associated with the high-end haptic rendering required to simulate the sense of touch inherent in medical technical skills
development. Although several low-cost, low-fidelity gaming-based haptic devices are currently available, they are rather restrictive and cannot provide the higher level of fidelity and range of motion (degrees of freedom) required to realistically simulate many medical procedures. One issue pertains to fidelity, that is, how realistic the virtual environment must be in order to ensure effective learning, while another pertains to multimodal interactions. The
perception of fidelity is influenced by multimodal interactions, which has potentially significant implications for designers and developers of virtual simulations, given that with current technology we cannot faithfully recreate a real-world scenario with equal consideration of all of the senses. We have formed an interdisciplinary team of researchers comprising experts in computer science, biomedical engineering, medicine/healthcare, and education that is examining the application of immersive technologies and virtual simulation to biomedical engineering, with an emphasis on medical simulation. Our work to date has seen the development of various virtual simulations, including one for interprofessional critical care education and another for eye fundus examination training. Aside from their use as educational tools, the two platforms are being used as a testbed to investigate various aspects of fidelity and multimodal
interactions (visuals, sound, and haptic cues). Our goal is to develop a greater understanding of fidelity and multimodal interactions, and to examine whether we can take advantage of any perceptual effects to ultimately develop virtual simulations that are more effective with respect to learning and cost. Through a series of user studies, our work to date has methodically examined the direct effect of sound on the perception of visual fidelity and its relationship to task
performance. We have also begun examining whether sound can be used
to increase the perceived haptic fidelity of low-end, consumer-level haptic devices, allowing such devices (coupled with appropriate auditory cues) to be used in applications that require higher fidelity, at a fraction of the associated cost. Here we provide a brief overview of the two platforms and the experiments conducted to date.