Hogue, A., German, A. and Jenkin, M., Underwater environment reconstruction using stereo and inertial data. Proc. IEEE SMC, Montreal, 2007.
The underwater environment presents many challenges for robotic sensing, including highly variable lighting, the presence of dynamic objects, and the six degree of freedom (6DOF) 3D environment. Yet in spite of these challenges the aquatic environment presents many real and practical applications for robotic sensors. A common requirement of many of these tasks is the need to construct accurate 3D representations of structures in the environment. In order to address this requirement we have developed a stereo vision-inertial sensing device that we have successfully deployed to reconstruct complex 3D structures in both the aquatic and terrestrial domains. The sensor temporally combines 3D information obtained using stereo vision algorithms with a 3D inertial sensor. The resulting point cloud is then converted to a volumetric representation and a textured polygonal mesh is extracted for later processing. Recent underwater reconstructions of wrecks and coral obtained with the sensor are presented.
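The point-cloud-to-volume step of the pipeline described above can be illustrated with a minimal Python sketch. This is a hedged illustration under assumed parameters (5 cm voxels, grid origin at the cloud minimum), not the authors' implementation, and it omits the inertial registration and mesh-extraction stages.

    import numpy as np

    def voxelize(points, voxel_size=0.05, origin=None):
        """Accumulate an N x 3 point cloud into a sparse voxel occupancy map."""
        points = np.asarray(points, dtype=float)
        if origin is None:
            origin = points.min(axis=0)  # assumed grid origin: the cloud's minimum corner
        indices = np.floor((points - origin) / voxel_size).astype(int)
        occupancy = {}
        for idx in map(tuple, indices):
            occupancy[idx] = occupancy.get(idx, 0) + 1  # count stereo points per voxel
        return occupancy, origin

    # Example: three points, two of which fall in the same 5 cm voxel
    grid, origin = voxelize([[0.01, 0.02, 0.03], [0.02, 0.01, 0.04], [0.30, 0.02, 0.03]])
    print(len(grid), "occupied voxels")

A textured mesh could then be extracted from such an occupancy volume with a standard surface-extraction algorithm such as marching cubes.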
Lappe, M., Jenkin, M., and Harris, L., Visual odometry by leaky integration. Proc. VSS 2007, Sarasota, FL. J. of Vision, 7: 147a, 2007.
Visual motion can be a cue to travel distance when the motion signals are integrated. Previous work has given conflicting results on the precision of travel distance estimation from visual motion: Frenz and Lappe reported underestimation, Redlick, Jenkin and Harris overestimation of travel distance. In a collaborative study we resolved the conflict by tracing it to differences in the tasks given to the subjects. Self-motion was visually stimulated in an immersive virtual environment. Subjects completed two tasks in separate blocks. They either had to report the distance traveled from the start of the movement, as in earlier studies of Frenz and Lappe, or they had to report when they had reached a predetermined target position, as in earlier studies by Redlick et al. Consistent with both earlier studies, underestimation of travel distance occurred when the task required judgment of distance from the starting position, and overestimation of travel distance occurred when the task required judgment of the remaining distance to the previewed target position. Based on these results we developed a leaky integrator model that explains both effects with a single mechanism. In this model, a state variable, either the distance from start or the distance to target, is updated during the movement by integration over the space covered by the movement. Travel distance mis-estimation occurs because the integration leaks and because the transformation of visual motion to travel distance involves a gain factor. Mis-estimates in both tasks can be explained with the same leak rate and gain in both conditions.
Lappe, M., Jenkin, M., and Harris, L., Travel distance estimation from visual motion by leaky path integration, Exp. Brain Res., 180: 35-48, 2007.
Visual motion can be a cue to travel distance when the motion signals are integrated. Distance estimates from visually simulated self-motion are imprecise, however. Previous work in our labs has given conflicting results on the imprecision: experiments by Frenz and Lappe had suggested a general underestimation of travel distance, while results from Redlick, Jenkin and Harris had shown an overestimation of travel distance. Here we describe a collaborative study that resolves the conflict by tracing it to differences in the tasks given to the subjects. With an identical set of subjects and identical visual motion simulation we show that underestimation of travel distance occurs when the task involves a judgment of distance from the starting position, and that overestimation of travel distance occurs when the task requires a judgment of the remaining distance to a particular target position. We present a leaky integrator model that explains both effects with a single mechanism. In this leaky integrator model we introduce the idea that, depending on the task, either the distance from the start or the distance to the target is used as a state variable. The state variable is updated during the movement by integration over the space covered by the movement, rather than over time. In this model, travel distance mis-estimation occurs because the integration leaks and because the transformation of visual motion to travel distance involves a gain factor. Mis-estimates in both tasks can be explained with the same leak rate and gain in both conditions. Our results thus suggest that observers do not simply integrate travel distance and then relate it to the task. Instead, the internally represented variable is either distance from the origin or distance to the goal, whichever is relevant.
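A generic form of such a leaky path integrator, written here as a hedged reconstruction consistent with the description above rather than as the paper's exact notation, updates the estimated distance from the start D per unit of physical distance travelled x, with gain k and leak rate \alpha:

    \frac{dD}{dx} = k - \alpha D(x), \qquad D(0) = 0
    \;\Longrightarrow\; D(x) = \frac{k}{\alpha}\left(1 - e^{-\alpha x}\right)

Because the solution saturates at k/\alpha, distance from the start is increasingly underestimated as the path grows; applying the same gain and leak to a distance-to-target state variable drives that variable to zero before the target is physically reached, which is consistent with the complementary overestimation reported for the move-to-target task.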
Jenkin, M., Hogue, A., German, A., Gil, S., Topol, A., and Wilson, S., Underwater surface recovery and segmentation, Proc. IEEE Int. Conf. on Cognitive Informatics (ICCI), Lake Tahoe, 2007.
The underwater environment presents many challenges for robotic sensing, including highly variable lighting and the presence of dynamic objects such as fish and suspended particulate matter. The dynamic six-degree-of-freedom nature of the environment presents further challenges due to unpredictable external forces such as current and surge. Despite these challenges the aquatic environment presents many real and practical applications for robotic systems. A common requirement of many of these tasks is the need to construct accurate 3D representations of specific environmental structures. In order to address these needs we have developed a stereo vision-inertial sensing device that has been successfully deployed to reconstruct complex 3D structures in both the aquatic and terrestrial domains. The sensor combines 3D information, obtained using stereo vision algorithms, with 3DOF inertial data to construct 3D models of the environment. The resulting model representation is then converted to a textured polygonal mesh for later processing. Semi-automatic tools have been developed to aid in the processing of these representations. Reconstruction and segmentation of coral and other underwater structures obtained with the sensor are presented.
Borzenko, O., Lesperance, Y. and Jenkin, M., INVICON: A toolkit for knowledge-based control of vision systems, CRV 2007.
To perform as desired in a dynamic environment, a vision system must adapt to a variety of operating conditions by selecting vision modules, tuning their parameters, and controlling image acquisition. Knowledge-based (KB) controller-agents that reason over explicitly represented knowledge and interact with their environment can be used for this task; however, the lack of a unifying methodology and development tools makes KB controllers difficult to create, maintain, and reuse. This paper presents the INVICON toolkit, based on the IndiGolog agent programming language with elements from control theory. It provides a basic methodology, a vision module declaration template, a suite of control components, and support tools for KB controller development. We have evaluated INVICON in two case studies that involved controlling vision-based pose estimation systems. The case studies show that INVICON reduces the effort needed to build KB controllers for challenging domains and improves their flexibility and robustness.
Kapralos, B., Jenkin, M. and Milios, E., Diffraction modeling for interactive virtual acoustical environments, GRAPP 2007.
Since the dimensions of many of the objects and surfaces encountered in our daily lives are within an order of magnitude of the wavelength of audible sounds, diffraction is an elementary means of sound propagation. Despite its importance in the real world, diffraction effects are often overlooked by acoustical modeling methods, leading to a degradation in immersion or presence. This paper describes an acoustical diffraction method based on the Huygens-Fresnel principle. The method is simple and efficient, allowing it to be incorporated in interactive acoustical environments, including virtual environments. Experimental results are presented that illustrate the performance and effectiveness of the method and its conformance to theoretical diffraction models.
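For reference, the standard optical statement of the Huygens-Fresnel principle on which such approaches build (not the paper's specific acoustical formulation) expresses the field at an observation point P as a superposition of secondary wavelets emitted from a wavefront or aperture \Sigma:

    U(P) = \frac{1}{i\lambda} \iint_{\Sigma} U(Q)\, \frac{e^{ikr}}{r}\, K(\chi)\, dS

where U(Q) is the field at surface element Q, r is the distance from Q to P, k = 2\pi/\lambda, and K(\chi) is an obliquity factor that attenuates wavelets emitted away from the forward direction. An acoustical adaptation works with the much longer wavelengths of audible sound, which is why diffraction is so prominent at audible frequencies.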
Harris, L. and Jenkin, M. (Eds.) Computational Vision in Neural and Machine Systems. Cambridge University Press, 2007.
Computational vision deals with the underlying mathematical and computational models for how visual information is processed. Whether the processing is biological or machine, there are fundamental questions related to how the information is processed. How should information be represented? How should information be transduced in order to highlight features of interest while suppressing noise and other artefacts of the image capture process? Computational Vision in Neural and Machine Systems addresses these and other questions in 13 chapters, divided into three sections that span biological and computational systems: dynamical systems; attention, motion, and eye movements; and stereo vision. The editors have brought together the best and brightest minds in the field of computational vision, combining research from both biology and computing and enhancing the developing synergy between the computational and biological visual modelling communities. The book is aimed at researchers and graduate students in computational or biological vision, neuroscience, and psychology.
Ripsman, A., Jasiobedzki, P. and Jenkin, M. Specular planar target surface recovery via coded target stereopsis, in M. Jenkin and L. Harris (Eds.) Computational Vision in Neural and Machine Systems, Cambridge University Press, 2007.
A key challenge facing computer vision systems is the presence of highly specular surfaces. Such surfaces present a challenge to conventional vision systems due to their reflective nature, which can mask the true structure of the surface and lead to incorrect measurements when traditional computer vision algorithms are used. Here we present a novel approach to reconstructing the local surface structure of highly specular, planar objects. The local surface structure of an object is obtained by utilizing the specular nature of the surface directly, rather than by making the common assumption that specularities are rare or do not exist. The basis of the technique presented here is to project controlled light patterns onto the object's surface and to infer local surface structure by examining the interaction of the illuminant and the surface. The approach can be used to recover both specular and diffuse surface regions. Experimental results demonstrate the effectiveness of the approach.
Jenkin, H. L., Barnett-Cowan, M., Islam, A., Mazour, E., Dyde, R. T., Sanderson, J., Jenkin, M. R., and Harris, L. R., The effect of tilt on the perceptual upright, Proc. ECVP 2007, Arezzo. Perception, 36, ECVP Abstract Supplement, 2007.
The perceptual upright (PU), the orientation in which an object is most easily and naturally recognized, is determined by combining visual, gravity and body cues. Recognizing a character whose identity depends on its orientation can be used to assess the PU. For example, the letter 'p' when rotated 180 degrees becomes the letter 'd'. The transitions from p to d and d to p, when averaged, define the PU. This is the oriented character recognition task (OCHART). The PU can be predicted from the weighted vector sum of the orientations of the visual background, gravity, and the body. Observers completed OCHART at several body tilts in roll. The PU measured at some body tilts (e.g., 45 degrees) was not accurately predicted by this simple model. One possible explanation is that the nervous system's assessment of the relative weights and directions of vision, gravity and the body required to determine the PU may depend on the internal representation of body tilt and the orientation of the eyes in the head.
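The weighted-vector-sum prediction referred to above can be sketched as follows; the cue directions are represented in the roll plane and the weights are illustrative placeholders rather than the study's fitted values.

    import numpy as np

    def predicted_upright(visual_deg, gravity_deg, body_deg,
                          w_visual=0.25, w_gravity=0.20, w_body=0.55):
        """Predict the perceptual upright (degrees in the roll plane) as the
        direction of the weighted vector sum of the visual, gravity, and
        body-axis cues. The weights here are illustrative, not fitted values."""
        angles = np.radians([visual_deg, gravity_deg, body_deg])
        weights = np.array([w_visual, w_gravity, w_body])
        cues = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # unit cue vectors
        summed = (weights[:, None] * cues).sum(axis=0)
        return np.degrees(np.arctan2(summed[1], summed[0]))

    # Example: upright scene and gravity, body rolled 45 degrees
    print(predicted_upright(0.0, 0.0, 45.0))

Under such a model the prediction holds only if the weights are fixed; the tilt-dependent deviations reported above are what motivate letting the weights, or the represented cue directions, depend on body tilt and eye orientation.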
Hogue, A., Gill, S. and Jenkin, M. Automated avatar creation for 3D games. Proc. Futureplay 2007, Toronto, Canada.
Immersion is a key factor in video game and virtual reality simulation environments. A user's presence in a virtual environment is highly dependent on their identification with their avatar. Creating more realistic looking avatars thus enables a higher level of presence. Current video games allow character customizability via techniques such as hue adjustment for stock models, or the ability to select from a variety of physical features, clothing and accessories in existing player models. Occasionally a user-uploadable facial texture is available for avatar customization. We propose a dramatic leap forward in avatar customization through the use of an inexpensive, non-invasive, portable stereo video camera to extract and model the geometry of real objects, including people, and to then use these textured 3D models to drive avatar creation. The system described here generates the 3D textured and normal-mapped geometry of a personalized photorealistic user avatar suitable for animation and real-time gaming applications.
Jenkin, H. L., Zacher, J. E., Jenkin, M. R., Oman, C. M. and Harris, L. R. (2007). Effect of field of view on the Levitation Illusion. J. Vestibular Research, 17: 271-277.
Supine subjects inside a furnished room in which both they and the room are pitched 90 degrees backwards may experience themselves and the room as upright relative to gravity. This effect is known as the levitation illusion because observers report that their arms feel weightless when extended, and objects hanging in the room seem to "levitate". This illusion is an extreme example of a visually induced illusion of static tilt. Visually induced tilt illusions are commonly experienced in wide-screen movie theatres, flight simulators, and immersive virtual reality systems. For technical reasons an observer's field of view is often constrained in these environments. No studies have documented the effect of field-of-view (FOV) restriction on the incidence of the levitation illusion. Preliminary findings suggest that when concurrently manipulating the FOV and observer position within an environment, the incidence of levitation illusions depends not only on the field of view but also on the visible scene content.