2nd Enactive Workshop - Lecture Descriptions

May 25-27 2006 - McGill University - Montreal, QC Canada




Four new haptic devices from the McGill University Haptics Laboratory
Vincent Hayward

In the past two years, the Haptics Lab has been working on several new devices: a handheld vibrotactile device ("MicroTactus"), a contact location display ("Morpheotron"), a planar direct-drive force-feedback device ("Pantograph"), and a high-density distributed tactile display ("StreSS"), each targeted at a specific approach to mediating haptic interaction. In this presentation, I will discuss recent enhancements to each of these devices, and what drives performance and provides effectiveness in each case.




Perception is enactive; cognition is not
Cedrick T. Bonnet¹, Thomas A. Stoffregen¹, Benoit G. Bardy²
¹ University of Minnesota, ² University of Montpellier

In two experiments, we independently varied the degree of cognitive and perceptual difficulty of supra-postural tasks. Increases in perceptual difficulty in Experiment 1 tended to be correlated with decreases in the variability of postural sway, consistent with our hypothesis of functional integration of postural control with supra-postural tasks. In Experiment 2, sway variability was not influenced by changes in the cognitive difficulty of tasks when perceptual difficulty was held constant. We predicted that sway would be related to the difficulty of visual tasks, with reductions in sway during more difficult visual tasks. We further predicted that in the absence of visual demand, variations in the difficulty of cognitive tasks would not influence sway variability.


Method

Subjects removed their shoes and stood with their feet together, 100 cm from a computer screen that was adapted to their height. Postural sway was recorded using a 3-df magnetic tracking system (Flock of Birds; Ascension Technology, Inc.), sampled at 25 Hz. The receiver was attached at approximately the seventh cervical vertebra. In Experiment 1, subjects held a computer mouse in their preferred hand.

Experiment 1 had two experimental conditions with three trials in each condition: In the signal detection task, subjects were presented with pairs of vertical lines, one neutral (lines of equal height, 1.95°) and one critical (lines of 1.95° and 2.12°). They were asked to click the mouse each time they saw a critical signal. In the mental arithmetic task, subjects were asked to count backward by 3 from a randomly assigned number, as fast and accurately as possible. After each condition, participants completed the NASA-TLX, evaluating the mental workload of the condition.

In Experiment 2, there were two conditions with four trials in each condition: the hard mental arithmetic task (the same as in Experiment 1) and an easy mental arithmetic task (subjects were instructed to add 2 iteratively). Subjects again completed the NASA-TLX after each condition.


Results

Mental arithmetic performance. In Experiment 1, the mean accuracy was 87 % with a mean response rate of 27 per trial. In Experiment 2, the accuracy for the hard task was 91 % with a response rate of 28 responses per minute. For the easy task, the mean accuracy was 99 % at 46 responses per minute. The means of both accuracy (t(14) = 1.941, p < .05) and response rate (t(14) = 8.088, p < .05) were significantly different, indicating that the easy task counting was more accurate and faster.

Subjective workload. In Experiment 1, the mean overall workload scores for the detection and arithmetic conditions were 59.3 and 60.1; these means did not differ, t(11) = -.114, p > .05. In Experiment 2, the easy and hard tasks were rated as having overall workload scores of 38.6 and 61.4 respectively. The difference was significant, t(14) = -4.080, p < .05.

Postural motion. The dependent variables were the standard deviations of torso position in the AP and ML axes. In Experiment 1, sway variability in the AP axis was significantly lower during the signal detection task than during the mental arithmetic task (t(11) = 1.988, p < .05), as predicted. In Experiment 2, there were no significant differences in sway between the easy and hard mental arithmetic conditions in the AP axis (t(14) = -.512, p > .05) or in the ML axis (t(14) = .46, p > .05).
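
The dependent variable described above can be sketched in a few lines. This is only an illustration: the abstract does not say how the t statistics were computed, so the paired-samples test below is an assumption, and the function names and any data passed to them are hypothetical.

```python
import statistics


def sway_variability(positions_cm):
    """Sample standard deviation of a 1-D torso position series (one axis, AP or ML)."""
    return statistics.stdev(positions_cm)


def paired_t(condition_a, condition_b):
    """Paired-samples t statistic and degrees of freedom.

    Assumption: each subject contributes one value per condition,
    so the two lists have equal length and matching order.
    """
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    return mean_d / (sd_d / n ** 0.5), n - 1
```

With one sway-variability value per subject and condition, `paired_t` returns a statistic with n - 1 degrees of freedom, which matches the form of the reported values (e.g. t(11) for twelve subjects) if the test was indeed paired.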


Discussion

We believe that stance can be modulated in ways that facilitate perceptual contact with the environment (in this case, visual performance). In Experiment 1, the signal detection and mental arithmetic tasks did not differ in subjective mental workload, but sway was reduced during signal detection. In Experiment 2, significant differences in mental workload (between hard and easy mental arithmetic tasks) had no effect on sway. We interpret these results as indicating that changes in sway were functionally related to the degree of oculomotor stability required in the signal detection task. That is, we believe that postural movements were modulated enactively to optimize the acquisition of visual information needed to perform the signal detection task.




Enactive exploration of postural coordination dynamics
Benoît G. Bardy¹,², Elise Faugloire³, Omar Merhi³ & Thomas A. Stoffregen³
¹ Efficiency and Deficiency Laboratory, University of Montpellier-1
² Institut Universitaire de France
³ School of Kinesiology, University of Minnesota, Minneapolis, USA

Postural coordination in goal-directed stance typically exhibits two coordinative states, i.e., in phase (ankle-hip relative phase Φrel ≈ 20°) and anti-phase (Φrel ≈ 180°), together with self-organized properties such as bifurcation, hysteresis, critical fluctuations, and critical slowing down [Bardy, 2004; Bardy et al., 2002]. Here we explored the complete range of postural coordination by asking standing participants to execute 16 different ankle-hip relative phase patterns (from 0° to 337.5°). Each coordination mode was tested with and without real time visual feedback. First, visual feedback was provided via a Lissajous figure in which the instantaneous discrepancy between the requested relative phase and the actual relative phase was plotted as a real time trajectory. Second, participants attempted to produce the requested coordination without visual feedback.

We found a general influence of the requested relative phase on performed coordination. Our measures capturing the coordination (Φrel, SDΦrel, CE, and AE) indicated that participants did not accurately produce relative phases between 270° and 360° (0°), with constant errors as high as -100°. There was a tendency to overestimate relative phase for requested relative phases below the range [157.5° - 202.5°], and to underestimate relative phase above that range, as expressed by the negative slope of constant error over the scan. Thus, in this study there was a unique attractor around anti-phase, that is, a unique coordinative state which attracted the other patterns toward its relative phase value. This finding is complemented by better performance (smaller CE and AE) when the requested relative phase involved the lead of hip movement (relative to conditions in which the ankle led), illustrating the asymmetrical nature of postural dynamics. Our results thus do not support a direct correspondence between constrained postural dynamics, that is, the time-related postural behavior that emerges out of a coalescence of constraints [Bardy et al., 2002], and imposed dynamics, that is, the postural behavior that is specified by instructions or environmental information [Faugloire et al., 2005]. The task-specific behavior of the postural system may be adequately exploited by the central nervous system by appropriately modulating, under specific task-related circumstances, the order parameter for the coordination. The consequences of these results for training and rehabilitation will be detailed in the presentation.
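
As an illustration of the error measures mentioned above, the sketch below computes constant error (CE) and absolute error (AE) for relative phase, which is a circular quantity. The definitions used here (CE as the mean signed error, AE as the mean unsigned error, with errors wrapped to [-180°, 180°)) are common conventions and are assumed rather than taken from the abstract; the even 22.5° spacing of the 16 requested patterns is likewise inferred from the stated range.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def constant_and_absolute_error(requested, performed):
    """Return (CE, AE) in degrees for one requested relative phase.

    requested: requested relative phase in degrees.
    performed: list of performed relative phases in degrees.
    A positive CE means the performed phase overshoots the request.
    """
    errors = [wrap_deg(p - requested) for p in performed]
    ce = sum(errors) / len(errors)
    ae = sum(abs(e) for e in errors) / len(errors)
    return ce, ae


# Assumption: 16 evenly spaced patterns covering 0 deg to 337.5 deg.
REQUESTED_PATTERNS = [i * 22.5 for i in range(16)]
```

Wrapping matters near 0°/360°: a performed phase of 350° for a requested 0° is an error of -10°, not +350°.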


References
Bardy, B. G., Postural coordination dynamics in standing humans. In V.K. Jirsa, J.A.S. Kelso (Eds.), Coordination dynamics: Issues and trends, Springer, Berlin, 2004, pp. 103-121.
Bardy, B.G., Oullier, O., Bootsma, R.J., and Stoffregen, T.A., Dynamics of human postural transitions, J. Exp. Psychol. Hum. Percept. Perform., 28 (2002) 499-514.
Faugloire, E., Bardy, B.G., Merhi, O., & Stoffregen, T.A. Exploring coordination dynamics of the postural system with real-time visual feedback. Neuroscience Letters, 374 (2005) 136-141.

Acknowledgments: Supported by Enactive Interfaces (IST contract #002114).




Study of Visuo-Haptic Spatial Misalignment
Pierre Davy, HyungSeok Kim, Nadia Magnenat-Thalmann
MIRALab, University of Geneva, {davy | kim | thalmann}@miralab.unige.ch

Abstract:

The haptic and visual channels are key sensory channels in most of our daily interactions. In a multimodal environment, the interplay of different modalities can have either positive or negative effects on perception and performance. Accurate knowledge of cross-modal interaction between the haptic and visual channels is therefore essential for designing efficient visuo-haptic applications. Among the many possible interactions between the two modalities, we are interested in spatial alignment, which characterizes the spatial properties of the mapping between the haptic and visual workspaces.

The purpose of this paper is to study how misalignment between the haptic and visual workspaces affects the user. By misalignment we mean both positional and directional misalignment. In positional misalignment, the two workspaces are displaced parallel to each other in physical space. In directional misalignment, one workspace is rotated relative to the other.
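
The two kinds of misalignment can be pictured as a translation and a rotation applied to the haptic workspace before it is shown visually. The sketch below is only an illustration under stated assumptions: the function name, the choice of the vertical (z) axis for the rotation, and the units are hypothetical and not taken from the study.

```python
import math


def misalign(point, offset=(0.0, 0.0, 0.0), yaw_deg=0.0):
    """Map a haptic-workspace point (x, y, z) into a misaligned visual workspace.

    offset:  positional misalignment (pure translation).
    yaw_deg: directional misalignment, here a rotation about the z axis
             (assumed for simplicity; any rotation axis could be used).
    """
    x, y, z = point
    a = math.radians(yaw_deg)
    xr = x * math.cos(a) - y * math.sin(a)
    yr = x * math.sin(a) + y * math.cos(a)
    return (xr + offset[0], yr + offset[1], z + offset[2])
```

For example, `misalign((1.0, 0.0, 0.0), yaw_deg=90.0)` moves a point on the x axis onto the y axis, while `misalign(p, offset=(0.1, 0.0, 0.0))` shifts every point by the same amount without changing directions.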

We noticed that, depending on the misalignment, users adapt easily to some misaligned setups but not to others. This is because the user builds a coherent representation of space from the different information he or she receives. The force from the haptic device gives one 3D representation through the sense of touch, while the displayed scene gives another through the sense of vision. The user then merges these two kinds of information to build a single 3D representation of the virtual world.

The purpose of the proposed experiments is to study the limitations and particularities of this reconstruction of a common representation of the 3D world. Different levels of alignment will be presented to the user: nearly matched, 'orthogonally' misaligned, and completely misaligned.

User performance is measured by the time taken to complete a reconstruction task. The task uses an application developed to display and touch a piece of textile. A torn piece of textile is presented, and the user has to reconstruct it by selecting a point and then gluing it to the corresponding place, while the scene is simulated and displayed in both the haptic and visual sensory channels.

The reconstruction task is performed by 20 to 30 users, and each user's performance, i.e. the time she or he takes to finish the task, is analyzed against the level of misalignment. This approach makes it possible first to compare performance across conditions, then to characterize perception as a function of the level of misalignment, and finally to identify a threshold, if any, in the representation of the 3D world. In interpreting the results, special attention is paid to users' improvement over the course of the experiment, in order to collect information on the learning curve at each misalignment level.




A Design Approach to Identify Management Models for the Exploration of the ENACTIVE Results
Elisabetta Sani, Carlo Alberto Avizzano, Antonio Frisoli, Massimo Bergamasco
Perceptual Robotics Laboratory - Scuola Superiore Sant'Anna
Piazza Martiri della Libertà, 33
Pisa, 56127, Italy
E-mail (e.sani, carlo, antony, bergamasco)@sssup.it

Networks of Excellence (NoEs) are new actors in the European research system, aimed at strengthening scientific and technological excellence by integrating the critical mass of resources and expertise at the European level. NoEs are complex entities whose main goals are to overcome the fragmentation of European research, to set up durable and structured partnerships, and to spread scientific excellence beyond the boundaries of their partnerships.

They have posed interesting questions regarding the exploitation of the achieved results, and they may represent interesting platforms of study from which relevant results can be derived and extended to other contexts.

NoEs cannot act as "private clubs" keeping the acquired knowledge within their boundaries, but on the contrary have to provide an improvement to the research community by disseminating and exploiting their own results.

The NoE has to protect the knowledge generated within itself and at the same time evaluate the potential impact of such knowledge by developing a plan for its use and dissemination.

The ENACTIVE Network of Excellence is an interesting case study for analyzing the factors that are essential to implementing a proper exploitation strategy in large research communities.

The aim of this paper is to analyze the key factors needed to implement a successful exploitation and dissemination strategy in the context of a broad research network, and to identify how exploitation results can be determined by the actions undertaken by consortium partners, acting as a network.

The analysis of the current status of the ENACTIVE network policy can help to identify the future steps towards a successful continuation of the exploitation process, even after the conclusion of the Community's funding period.

Networked research, such as that addressed within the European Networks of Excellence, requires specific new models of exploitation that take into account the specific interests of distributed and administratively unrelated partners.

In order to maximize the impact of the research, dissemination and exploitation actions are usually planned in parallel, which raises several questions about the proper monitoring of these Networks. The most important of these are:
How to evaluate the efforts and related exploitation results? How to estimate the status of dissemination? How to plan new policies for sustaining exploitation after the end of the Networks? Which factors can determine success for exploitation and dissemination?

This paper focuses its analysis on the above topics, with specific attention given to the ENACTIVE Network of Excellence. The results of this analysis will act as an internal stimulus, focusing attention on the specific aspects and factors that play a relevant role.

The paper presents the results of a survey based on specific questionnaires administered to several partners of the ENACTIVE Network. All questions are focused on the ENACTIVE research and customized to the partners' expertise in order to identify which factors may improve expectations for dissemination and exploitation.

The questionnaires highlight the actions performed by each partner for transferring and disseminating the knowledge acquired within the Network, and try to quantify the results obtained. A preliminary model for the transfer of achieved results is derived from the partners' best practices in dissemination and knowledge transfer. The identified key factors have been mapped at the Network level in order to identify an exploitation policy to adopt in the long term.




A Bimodal Audio/Optical Integrated System for Real-Time Localization and Tracking of a Live Performer
Riccardo Marogna, Cosmo Trestino, Barbara Mazzarino, Gualtiero Volpe, Antonio Camurri, Giovanni De Poli

Abstract

In this paper we present an integrated system for multimodal 3D tracking of a performer during an artistic event. This system integrates two tracking algorithms: one audio and one optical. The audio tracking algorithm was developed by DEI-CSC, University of Padova. The optical tracking algorithm was developed by DIST-InfoMus Lab, University of Genova. The two algorithms are integrated using the EyesWeb software platform (www.eyesweb.org), which is responsible for the synchronisation and combination of the data flows.

1. Audio Localization System

The audio system extracts the Direction Of Arrival (DOA) through Time Difference Of Arrival (TDOA) estimation, implementing an Acoustical Transfer Functions ratio (ATF-ratio) estimation algorithm [1]. The system includes a 4-microphone uniform linear array; a set of independent TDOA estimations is therefore obtained, and realizable delay vectors [2] are used for the extraction of the DOA. The TDOA estimation procedure can be summarized as follows:
  1. Acquisition: the system starts acquiring the four received signals for M=10 consecutive frames of N=512 samples each (Fs = 16 kHz) and computing the received mean power Ps. If Ps>Pn (a pre-evaluated Power Threshold) the system considers the presence of a source signal and proceeds with the evaluation stage;

  2. Evaluation: the system extracts a realizable set of TDOA through the ATF-ratio estimation algorithm and thus the DOA. Hence, the system is able to produce a good DOA estimation in 320 ms.
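
The 320 ms figure follows directly from the acquisition parameters: M × N / Fs = 10 × 512 / 16000 s = 0.32 s. The sketch below reproduces that arithmetic and the power gate of the acquisition step. It is a minimal illustration: the function names are hypothetical, and computing the mean power over a single list of samples is an assumption, since the text does not say how the four channels are combined.

```python
FRAMES = 10            # M in the text
FRAME_LEN = 512        # N in the text (samples per frame)
SAMPLE_RATE = 16_000   # Fs in Hz


def acquisition_time_ms(frames=FRAMES, frame_len=FRAME_LEN, fs=SAMPLE_RATE):
    """Duration of the acquisition window in milliseconds."""
    return 1000.0 * frames * frame_len / fs


def mean_power(samples):
    """Mean power Ps of a block of samples."""
    return sum(s * s for s in samples) / len(samples)


def source_present(samples, power_threshold):
    """True when Ps exceeds the pre-evaluated threshold Pn."""
    return mean_power(samples) > power_threshold
```

Only blocks for which `source_present` is true would be passed on to the evaluation stage.
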
In a tracking scenario, we assume that the performer's position will change slowly, that is, consecutive DOA estimations will be close (in space) to each other. Hence, the source localization algorithm can be divided into three stages:
  1. clamping stage: the system performs a first unconstrained DOA estimation, then updates a vector of "candidate DOA" for the next estimation;

  2. tracking stage: while the performer keeps on speaking (or singing), next estimations will be constrained by the vector, that is, the algorithm will search for a set of TDOA which is in accordance with a candidate DOA;

  3. unclamping stage: if the performer stops speaking (or singing) for a while, the system loses the performer's position and returns to stage 1.
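
The three stages above behave like a small state machine, which can be sketched as follows. This is a simplified illustration, not the authors' implementation: the class name, the `estimate` callback, the use of a single candidate DOA instead of a vector of candidates, and the silence timeout expressed in blocks are all assumptions.

```python
from typing import Callable, Optional


class DoaTracker:
    """Clamping / tracking / unclamping logic for DOA estimation."""

    def __init__(self, estimate: Callable[[Optional[float]], float],
                 max_silent_blocks: int = 3):
        # estimate(None) performs an unconstrained DOA estimation;
        # estimate(candidate) performs one constrained by the candidate DOA.
        self.estimate = estimate
        self.max_silent_blocks = max_silent_blocks
        self.state = "clamping"
        self.candidate: Optional[float] = None
        self.silent_blocks = 0

    def step(self, source_active: bool) -> Optional[float]:
        """Process one acquisition block; return a DOA or None if silent."""
        if not source_active:
            self.silent_blocks += 1
            if self.state == "tracking" and self.silent_blocks >= self.max_silent_blocks:
                # Unclamping: the position is lost, go back to stage 1.
                self.state = "clamping"
                self.candidate = None
            return None
        self.silent_blocks = 0
        if self.state == "clamping":
            doa = self.estimate(None)            # first, unconstrained estimate
            self.state = "tracking"
        else:
            doa = self.estimate(self.candidate)  # constrained by the last DOA
        self.candidate = doa
        return doa
```

Short pauses (fewer than `max_silent_blocks` silent blocks) keep the tracker in the tracking stage, so the next estimate is still constrained by the last known DOA.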

2. Tracking System

For optical tracking we use LaserTracker technology for the EyesWeb software platform [3]. A prototype of the Laser Tracker was developed at the end of the EU-IST project MEGA. The system is composed of a portable radio emitter/optical reflector and infrared landmark emitters. It is connected to a PC through a serial port and requires a preliminary calibration session. During calibration it is possible to set the number of landmarks in the tracking system and to associate with each of them its world coordinates and the angle (in radians) between the positive X-axis and the landmark zero position (the zero step of the encoder). It is also possible to display the positive direction of the landmark angle and the data received from the tracker hardware. The calibration generates an external file to be loaded into the final EyesWeb application for tracking. The tracking system was consolidated during the last year and has been used in live performances where high tracking precision is required in noisy environments.

Acknowledgments
We thank our colleague Matteo Ricchetti for the development of the optical tracking system.

References
[1] S. Gannot, D. Burshtein, E. Weinstein, "Signal enhancement using beamforming and non-stationarity with application to speech", IEEE Trans. Signal Processing, 49 (8) (2001).
[2] S.M. Griebel, M.S. Brandstein, "Microphone Array Source Localization Using Realizable Delay Vectors", IEEE Workshop on applications of Signal Processing to Audio and Acoustics, New York, October 2001.
[3] A. Camurri, G. De Poli, M. Leman, G. Volpe (2005), "Toward Communicating Expressiveness and Affect in Multimodal Interactive Systems for Performing Art and Cultural Applications", IEEE Multimedia Magazine, 12(1): 43-53, IEEE CS Press, 2005.




Exemplary Enactive Tasks and Associated Technological Bottlenecks
Collective paper, Co-ordinated by Annie Luciani (INPG)
Charlotte Magnusson (ULUND), Marcello Carrozzino (PERCRO), Joan De Boeck (LUCEDM), Ignazio Mansa (CEIT), Carsten Preusche (DLR), Gunnar Jansson (UPPSALA), HyungSeok Kim (UNIGE), Ian Summers (UNEXE), Annie Luciani (INPG), Armen Katchatourov (COSTECH), Cosmo Trestino (DEI)

Keywords: Navigation tasks. Object identification tasks. Selection tasks. Spatially-oriented manipulation tasks. Physically-oriented manipulation tasks. Functional complexity. Environments. 3D scenes. Instruments. Object-near-hand. Object-within-hand.

Abstract: State-of-the-art reviews of haptic interface technology and of action-vision and action-audition cooperation in computer-mediated systems have shown that, despite intense development in these domains, we now face some critical issues: the increasing complexity of virtual scenes and increasing demands on temporal features (real-time simulation, latency and synchronization, variability of computer architectures, user demands, etc.). This paper presents:
(1) A theoretical grid for the analysis of Enactive platforms and Enactive tasks
(2) An analysis of the strengths, limitations and bottlenecks of the main relevant technological platforms among those used or developed by the Enactive partners
(3) A detailed analysis of exemplary Enactive tasks, which could be used as "significant materials" to drive the analysis of the technological bottlenecks. "Exemplary" has to be understood as "exemplary for detecting technological difficulties, technological limitations to overcome, and technological bottlenecks that are not solved today and that could be critical for the future of Enactive Interfaces, Enaction Interaction and Enactive Knowledge".
(4) A preliminary analysis, based on the previous theoretical grid, of the technological bottlenecks that have to be overcome in the near future of Enactive Interfaces.




Ssynth: a Real Time Additive Synthesizer With Flexible Control
V. Verfaille, J. Boissinot, Ph. Depalle, M. M. Wanderley
Sound Processing and Control Laboratory, McGill University

We developed a real-time additive synthesizer called Ssynth, with advanced and flexible control functionalities. This research is a further development of Escher, a system developed for studying gestural control in interpolation of digital musical instruments playing. By considering synthesis from the control viewpoint, in terms of both design and implementation, Ssynth makes it possible to generate good-quality sounds from an instrumental sound database with coherent control, providing interpolation and extrapolation of the musical playing of digital instruments.

For modularity in the design of digital musical instruments, Ssynth is composed of two parts: a set of Pd patches that implement the different mapping strategies and layers, and a synthesizer¹ implemented in C that can be compiled as a stand-alone program or as a Pd object, using the Pd scheduler to output audio. Two parameter conversion layers convert gesture data into higher-level parameters (fundamental frequency, intensity, and dynamics) and higher-level parameters into synthesis parameters. That way, the great number of additive synthesis control parameters is reduced to a smaller set. Additive frames are organized as a 3-dimensional mesh according to pitch (7 values), dynamics (3 values; related to loudness and brightness) and instrument (4 up to now). Morphing between N tones according to those parameters is provided in two steps: first, interpolated additive frames are computed as weighted combinations of pitch-shifted additive frames with different fundamental frequencies and dynamics of the same instrument; then a morpher weights the interpolated frames of several instruments. Morphing attacks and releases requires time-warping of the additive data to provide better timbre quality in the morphed sounds.
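
The two-step morphing over the pitch/dynamics/instrument mesh can be illustrated with plain linear weighting. The sketch below is not Ssynth's algorithm: it omits the pitch-shifting and time-warping mentioned above, represents a frame simply as a list of partial amplitudes, and uses hypothetical function names and bilinear weights chosen for illustration.

```python
def lerp_frames(frame_a, frame_b, t):
    """Linear interpolation between two frames of partial amplitudes."""
    return [(1.0 - t) * a + t * b for a, b in zip(frame_a, frame_b)]


def interpolate_instrument(mesh, pitch_pos, dyn_pos):
    """Step 1: interpolate within one instrument.

    mesh[p][d] is the frame at pitch index p and dynamics index d
    (e.g. 7 x 3 in the text); pitch_pos and dyn_pos are fractional,
    non-negative indices into that grid.
    """
    p0 = min(int(pitch_pos), len(mesh) - 2)
    d0 = min(int(dyn_pos), len(mesh[0]) - 2)
    tp, td = pitch_pos - p0, dyn_pos - d0
    low = lerp_frames(mesh[p0][d0], mesh[p0][d0 + 1], td)
    high = lerp_frames(mesh[p0 + 1][d0], mesh[p0 + 1][d0 + 1], td)
    return lerp_frames(low, high, tp)


def morph_instruments(frames, weights):
    """Step 2: weighted average of per-instrument interpolated frames."""
    total = sum(weights)
    n = len(frames[0])
    return [sum(w * f[i] for f, w in zip(frames, weights)) / total
            for i in range(n)]
```

A morph between, say, clarinet and oboe would call `interpolate_instrument` once per instrument mesh and then combine the two results with `morph_instruments`.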

The spectral envelope is a function of frequency² that simplifies the control of partial amplitudes in Ssynth and is useful for morphing sounds. Under gestural control, it may be necessary to convert one spectral envelope model into another that is better suited to providing a stable spectral envelope for a given control. Ssynth uses various models and conversion methods between the following spectral envelope models: formants, cepstrum, and the LPC class (autoregressive filter, correlation function and reflection coefficients).
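
One standard conversion within the LPC class is the Levinson-Durbin recursion, which turns a correlation function into autoregressive filter coefficients and yields the reflection coefficients along the way. The sketch below shows that textbook recursion as an example of such a conversion; whether Ssynth uses this particular method is not stated in the abstract, and sign conventions for reflection coefficients vary between texts.

```python
def levinson_durbin(r, order):
    """Convert autocorrelation values r[0..order] into LPC parameters.

    Returns (a, k, err) where
      a   : prediction-error filter coefficients, a[0] == 1.0,
            for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order
      k   : reflection coefficients, one per recursion step
      err : final prediction error power
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        k.append(ki)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + ki * a[i - j]
        new_a[i] = ki
        a = new_a
        err *= (1.0 - ki * ki)
    return a, k, err
```

For a first-order autoregressive process with correlation r = [1, 0.5, 0.25], a second-order fit recovers a[1] = -0.5 and a negligible second coefficient, as expected.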

The sound database contains additive analysis and spectral envelope models of woodwind and brass instrument tones (clarinet and oboe as in Escher, plus saxophone and trumpet) from the McGill master samples database. Based on all those components and techniques, Ssynth allows for interpolating and extrapolating the database, synthesizing polyphonic sound, and handling OSC control messages.


References
McAulay, R. J. and T. F. Quatieri (1986). Speech Analysis/Synthesis Based on a Sinusoidal Representation. IEEE Trans. on Acoustics, Speech, and Sig. Proc. 34(4), 744-54.
Schwarz, D. and X. Rodet (1999). Spectral envelope estimation and representation for sound analysis-synthesis. In Proc. Int. Comp. Music Conf., Beijing, pp. 351-4.

¹ It implements both first-order and third-order phase polynomial models (McAulay and Quatieri 1986), with scalar, vectorized and recursive formulations.
² Properties such as envelope fit and smoothness are added and tuned differently depending on the application (Schwarz and Rodet 1999).




Evaluation of Audio-Haptic Interaction By Means of Games
Marcello Carrozzino

Abstract

Multi-modal integration can greatly enhance the sense of presence and interaction with virtual environments. Multiple sensory feedback is typically used to complement and integrate information during the exploration of virtual environments.

The present work focuses on specific aspects of haptic-audio integration during manipulation, which is addressed within the ENACTIVE NoE research. Haptic interfaces are useful for representing continuous contacts, gross textures and the stiffness of objects. Sound, instead, performs well in representing impact impulses and continuous contact during fine texture exploration. However, the impact of integrating feedback modalities is hard to quantify, since it is related to the subjective interpretation of sensation.

In the proposed work we have defined a multi-modal game to explore how well the modalities integrate with each other, and what improvements the combined feedback can add over single-modality use. The game is based on the concept of the "card memory game", where the content of each card is represented by audio-haptic textures. Players should identify identical features and remove the related cards from the board. Game parameters are tracked during play and later used to estimate quality factors of integration. In the present paper we will discuss the system setup, the experiment design and preliminary results.




Sonification of Musicians' Ancillary Gestures
Vincent Verfaille, Oswald Quek, Marcelo Wanderley
Input Devices and Music Interaction Laboratory, McGill University

As an alternative to quantifying the different kinds of movements and presenting such information visually (e.g. with graphs or tables), sonification of such gestures provides a complementary way of analysing movements. Sonification is "the transformation of data relations into perceived relations in an acoustic signal for the purposes of facilitating communication or interpretation" [1]. It helps to reveal structures in data that are not at all obvious in traditional visual-only analysis [2], and is ideally suited to representing data sets with large numbers of changing variables and temporally complex information that may be blurred or missed by visual displays. Moreover, it reduces the cognitive load of the listener, who can then focus on important aspects of the data [1].

We recorded musical performances using video cameras and the high-accuracy Optotrak 3020 system (an optical movement tracker with active infra-red markers) to track musicians' movements, and used Matlab to visualize in detail various markers on the musicians' bodies. Three performers played Stravinsky's Three Pieces for Solo Clarinet several times in three expressive manners: normal, immobile and exaggerated [3]. Sonifications are semi-real-time, i.e. offline pre-processing is done in Matlab and the sonifications are played in real time in MaxMSP.

We sonified the following 4 ancillary gestures: the circular movement of the clarinet bell, the body weight transfer, the body curvature, and the knee bend of a musician. For each gesture we chose a synthesis technique with unique features that make it identifiable from the rest, so that 4 different gestures can be heard simultaneously. Risset's infinite glissandi sound [4] is used to sonify circular motions of the bell. Body weight transfer is sonified using a beat interference technique based on additive synthesis. The body curvature was mapped to the brightness of frequency modulation sounds [5]. Knee bending was sonified using white noise filtering, and mapped to the filter cut-off frequency (and hence to brightness). When building this sonification system, we built adequate mappings between gesture data and synthesis techniques, in order to build an efficient auditory scene. Preliminary observations indicate that sonifications of up to 4 musicians' ancillary gestures can be heard clearly. Sonification is a complementary tool to identify and qualitatively analyse musicians' ancillary movements. The next steps are to conduct a formal experiment, in order to reveal whether listeners can identify gestures, performers and performance manners, and to perform interactive sonifications, in order to provide more efficient mappings.
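
Two of the mappings described above (knee bend to filter cut-off, and weight transfer to a beat rate between two partials) can be sketched as simple range mappings. Everything numeric here is hypothetical: the abstract gives no value ranges, so the input ranges, frequency ranges and function names below are placeholders for illustration only.

```python
def map_range(value, in_lo, in_hi, out_lo, out_hi):
    """Linearly map value from [in_lo, in_hi] to [out_lo, out_hi], clamped."""
    t = (value - in_lo) / (in_hi - in_lo)
    t = max(0.0, min(1.0, t))
    return out_lo + t * (out_hi - out_lo)


def knee_bend_to_cutoff_hz(angle_deg):
    """Knee bend angle to noise-filter cut-off (assumed 0-60 deg to 200-4000 Hz)."""
    return map_range(angle_deg, 0.0, 60.0, 200.0, 4000.0)


def beat_pair_hz(base_hz, weight_shift):
    """Two partial frequencies whose difference is the audible beat rate.

    weight_shift is assumed normalized to [-1, 1]; a larger shift gives
    a faster beat (0 to 8 Hz here).
    """
    beat_hz = map_range(abs(weight_shift), 0.0, 1.0, 0.0, 8.0)
    return base_hz, base_hz + beat_hz
```

A deeper knee bend opens the filter and brightens the noise, while a larger weight shift detunes the second partial and speeds up the beating.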

References
[1] S. Barrass and G. Kramer, "Using sonification," Multimedia Systems, vol. 7, pp. 23-31, June 1999.
[2] S. Pauletto and A. Hunt, "Interactive sonification in two domains: helicopter flight analysis and physiotherapy movement analysis," in Proc. Int. Workshop on Interactive Sonification, Bielefeld, January 2004.
[3] M. M. Wanderley, B. W. Vines, N. Middleton, C. McKay, and W. Hatch, "Expressive movements of clarinetists: Quantification and musical considerations," Tech. Rep., MT2004-IDIM01, IDMIL, McGill, Oct. 2004.
[4] J. C. Risset, "Pitch control and pitch paradoxes demonstrated with computer-synthesized sounds," Jour. Ac. Soc. of Am., vol. 46, no. (A), pp. 88, 1969.
[5] J. Chowning, "The synthesis of complex audio spectra by means of frequency modulation," Comp. Music Journal, vol. 1, no. 2, pp. 46-54, 1977.




Two haptic-auditory applications for persons with visual impairments
Kirsten Rassmus-Gröhn, Joakim Eriksson, Charlotte Magnusson, ULUND

The present paper reports on two different applications developed at ULUND. Both applications are designed to allow the user to interact with a haptic-auditory virtual environment. Common problems in both applications concern the design of tools for creating an overview and tools for navigation.

The first application is a virtual haptic-audio drawing application for low vision and nonvision users. It is being developed in close collaboration with a user reference group of 5 blind/low vision school children. The objective of the prototype application is twofold. During the early development stages, it will be used as a research vehicle to investigate user interaction techniques and to do basic research on navigation strategies and helping tools. Later, the prototype will be tailored for use in schoolwork, and the final application should be usable in different school subjects. The application consists of a haptic "paper" which allows the creation and feeling of positive and negative relief. Sound feedback that depends on the cursor position (different in free space, on the paper and when drawing) is added to this. Up/down is mapped to pitch, while left/right corresponds to the stereo panning of the sound.
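
The position-to-sound mapping described above can be sketched as follows. The abstract only states that up/down maps to pitch and left/right to stereo panning; the normalized coordinates, the frequency range, the exponential pitch scale and the equal-power pan law used here are assumptions made for illustration.

```python
import math


def cursor_to_sound(x, y, f_low=220.0, f_high=880.0):
    """Map a normalized cursor position to (frequency_hz, left_gain, right_gain).

    x: 0.0 (far left) to 1.0 (far right) -> stereo pan
    y: 0.0 (bottom)   to 1.0 (top)       -> pitch
    """
    x = max(0.0, min(1.0, x))
    y = max(0.0, min(1.0, y))
    # Exponential mapping so equal cursor steps sound like equal pitch steps.
    freq = f_low * (f_high / f_low) ** y
    # Equal-power panning keeps perceived loudness roughly constant.
    angle = x * math.pi / 2.0
    return freq, math.cos(angle), math.sin(angle)
```

With these defaults, the centre of the paper gives 440 Hz at equal gain in both channels, and moving the cursor to the left edge sends the sound entirely to the left channel.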

The second application is an audio-haptic Pacman game. Pacman was an arcade game of the early 1980s, and it is perhaps the best remembered game from that period. Players guide Pacman around a maze eating dots, while avoiding four ghosts. We are developing a haptic version of Pacman, where the player relies on haptics and sound rather than visual input. Pacman can be guided around the maze using a Phantom device. With the help of 3D sound, the player can monitor the whereabouts of the ghosts. A previous version of a plain (no Pacman) haptic maze has been tested to obtain initial user feedback, and the game will be evaluated iteratively with our reference group of youths with various degrees of visual impairment.

The results from the above work indicate that a vertical work area may sometimes be preferred, although there are ergonomic issues to consider. Another common result is the importance of preventing the user from "falling off" or "getting lost". This indicates the need for some kind of constraining box or similar. Both applications also highlight the importance of active exploration (enactive interaction), since the haptic/auditory feedback will generally not result in an understanding of the environment if it is simply played back to the user. For the drawing application in particular we also have a set of preliminary results:




NiMMiT: a Notation for Modelling Multimodal interaction Techniques
Joan De Boeck, Chris Raymaekers, Karin Coninx

When developing applications, special attention must be paid to designing the user interface. This is especially true in the case of rich multimodal or enactive interfaces. Due to the inherently complex nature of these solutions, a lot of communication between designers, as well as several experimental prototypes, are necessary. This often results in a lot of writing and rewriting of program code.

In order to facilitate the design of such a rich user interface, we have developed a graphical notation which allows describing the user interaction by means of a high level diagram. The notation is high level, freeing the designer from most implementation issues, but on the other hand, it is detailed enough to allow an interpreter to automatically execute the diagram.

As we consider user interaction to be event driven and state driven, and we identify the need for data flow and hierarchical reuse of existing components, NiMMiT is built to support these features. Existing notations such as Coloured Petri-Nets [2], InTML [3], or UML [4] all have their strengths, but none of them offers an ideal solution to support the special needs of fully describing multimodal interaction.

In the presentation, we will show the basic primitives and illustrate our notation by means of a practical example. Voodoo Dolls [1] is a well-known two-handed interaction metaphor for 3D virtual environments, in which miniature representations of two virtual objects (one in each hand) are used to manipulate both objects with respect to each other. We will illustrate how this metaphor can be described and automatically executed by explaining the NiMMiT diagram and simultaneously showing a video assembly of the execution of the diagram.

As we have chosen to develop a new notation, based upon some principles of other existing notations, we will also elaborate in this presentation on the same practical example using the other related notations, such as Coloured Petri-Nets, InTML and UML2.

We end the presentation by formulating our conclusions and pointing out current and future benefits of the NiMMiT notation, such as simplifying automatic data capture while conducting a user experiment.

[1] Jeffrey S. Pierce, Brian C. Stearns, and Randy Pausch. Voodoo Dolls: Seamless interaction at multiple scales in virtual environments. In Proceedings of the Symposium on Interactive 3D Graphics, pages 141-145, Atlanta, GA, USA, April 26-28 1999.
[2] K. Jensen: An Introduction to the Theoretical Aspects of Coloured Petri Nets. In: J.W. de Bakker, W.-P. de Roever, G. Rozenberg (eds.): A Decade of Concurrency, Lecture Notes in Computer Science vol. 803, Springer-Verlag 1994, 230-272
[3] Figueroa, P., Green, M., and Hoover, H. (2002). InTml: A description language for VR applications. In Proceedings of Web3D'02, Arizona, USA.
[4] Ambler, S. (2004). Object Primer, The Agile Model-Driven Development with UML 2.0. Cambridge University Press.




A Basic Gesture and Motion Format for Virtual Reality Multisensory Applications
Annie Luciani, Matthieu Evrard, Damien Couroussé, Nicolas Castagné, Claude Cadoz, Jean-Loup Florens

Keywords:

File Format, Gesture, Movement, Motion Capture, Virtual Reality, Computer Animation, Multisensoriality

Abstract:

The question of encoding movements such as those produced by human gestures may become central in the coming years, given the growing importance of movement data exchanges between heterogeneous systems and applications (musical applications, 3D motion control, virtual reality interaction, etc.). For the past 20 years, various formats have been proposed for encoding movement, especially gestures. However, these formats were designed, to varying degrees, in the context of quite specific applications (character animation, motion capture, musical gesture, biomechanical concerns). The article introduces a new file format, called GMS (for 'Gesture and Motion Signal'), with the aim of being more low-level and generic, by defining the minimal features a format carrying movement/gesture information needs, rather than by gathering all the information generally given by the existing formats. The article argues that, given its growing presence in virtual reality situations, the "gesture signal" itself must be encoded, and that a specific format is needed. The proposed format features the inner properties of such signals: dimensionality, structural features, types of variables, and spatial and temporal properties. The article first reviews the various situations with multisensory virtual objects in which gesture controls intervene. The proposed format is then deduced, as a means to encode such versatile and variable "gestural and animated scenes".




Perception and Recognition of Sound Events and Sources
Stephen McAdams

A panorama of issues related to the role of sound source processing in various aspects of auditory perception and cognition will be presented in an integrated framework including results of past research and future directions of exploration. One of the primary benefits of audition for an organism sensitive to acoustic information is the detection, localization and recognition of mechanical events (actions) in the environment, as well as understanding what the significance of those events is in a given context (McAdams & Drake, 2002). It is therefore important to study aspects of sensory processing of the immediately available acoustic information, but also the processing and interpretation of information in the context of past experience, most often based on knowledge acquired implicitly about the nature of sound sources and their behavior through time. Initial processing must sort out from the acoustic pressure wave the information that originates from different sound sources. The nature and behavior of these sources provide acoustic cues that allow listeners to form perceptual representations of events (at times overlapping with other events) and of streams of events (at times interleaved with other streams), a set of processes referred to as auditory scene analysis. Once these event and stream representations are formed, other processes compute their auditory attributes such as pitch, spatial position, loudness, duration and timbre for events, or relations among these event attributes for streams. Many of these attributes are closely related to the mechanical properties of the sound sources and the way they were set into vibration, and thus serve as cues for source perception and recognition (McAdams, 1993). 
Quantitative relations can be established between relevant mechanical properties, the perceptual dimensions that represent them, and putative cues derivable from the acoustic signal that serve as a vehicle between the two (McAdams, Chaigne & Roussarie, 2004). Not all cues that listeners can perceive are used when the perceptual goal is to categorize or to identify aspects of the source such as its geometry or the materials from which it is made, suggesting that adult listeners have learned implicitly which cues are most reliable for particular perceptual tasks.

McAdams, S. (1993). Recognition of sound sources and events. In S. McAdams & E. Bigand (Eds.), Thinking in sound: The cognitive psychology of human audition (pp. 146-198). Oxford: Oxford University Press.
McAdams, S., & Drake, C. (2002). Auditory perception and cognition. In H. Pashler & S. Yantis (Eds.), Stevens' handbook of experimental psychology: Vol. 1. Sensation and perception (3rd ed., pp. 397-452). New York: Wiley.
McAdams, S., Chaigne, A., & Roussarie, V. (2004). The psychomechanics of simple sound sources: Material properties of impacted bars. Journal of the Acoustical Society of America, 115, 1306-1320.




Perception of accentuation in audio-visual speech
Nusseck, M., Cunningham, D.W., Wallraven, C. & Bülthoff, H.H.

Introduction:

In everyday speech, auditory and visual information are tightly coupled. Consistent with this, previous research has shown that facial and head motion can improve the intelligibility of speech (Massaro et al., 1996; Munhall et al., 2004; Saldana & Pisoni 1996). The multimodal nature of speech is particularly noticeable for emphatic speech, where it can be exceedingly difficult to produce the proper vocal stress patterns without producing the accompanying facial motion. Using a detection task, Swerts and Krahmer (2004) demonstrated that information about which word is emphasized exists in both the visual and acoustic modalities. It remains unclear as to what the differential roles of visual and auditory information are for the perception of emphasis intensity. Here, we validate a new methodology for acquiring, presenting, and studying verbal emphasis. Subsequently, we can use the newly established methodology to explore the perception and production of believable accentuation.

Experiment:

Participants were presented with a series of German sentences, in which a single word was emphasized. For each of the 10 base sentences, two factors were manipulated. First, the semantic category varied -- the accent-bearing word was either a verb, an adjective, or a noun. Second, the intensity of the emphasis was varied (no, low, and high). The participants' task was to rate the intensity of the emphasis using a 7-point Likert scale (with a value of 1 indicating weak and 7 strong). Each of the 70 sentences was recorded from 8 Germans (4 male and 4 female), yielding a total of 560 trials.

Results and Conclusion:

Overall, the results show that people can produce and recognize different levels of accentuation. All "high" emphasis sentences were rated as being more intense (5.2, on average) than the "low" emphasis sentences (4.1, on average). Both conditions were rated as more intense than the "no" emphasis sentences (1.9). Interestingly, "verb" sentences were rated as being more intense than either the "noun" or "adjective" sentences, which were remarkably similar. Critically, the pattern of intensity ratings was the same for each of the ten sentences, strongly suggesting that the effect was solely due to the semantic role of the emphasized word. We are currently employing this framework to examine more closely the multimodal production and perception of emphatic speech.

References:




Motor Signatures in a Collective Improvisational Dance Situation
Ludovic Marin, Johann Issartel, Marielle Cadopi
Efficiency and Deficiency Laboratory, University of Montpellier-1

Coordination between two (or more) people, called interpersonal coordination, is present all the time and in every situation as soon as two people have perceptual contact (tactile, visual, auditory, etc.). For example, when two people are walking together in the street while holding hands (or even just talking together), they immediately couple (co-ordinate) their gait by walking in-phase (Courtine et al., 2003). A similar phenomenon can be observed at the end of a performance or show, when all applause tends to be in-phase (Neda, 2000). To date, all interpersonal coordination studies have been based on specific time constraints, such as the end of the show in Neda's study, a specific stride cycle (Courtine et al.) or even a metronome time scale (Schmidt et al., 1990). No studies have examined what happens when there are no constraints in time. In our study we use an improvisational dance task between two dancers as the experimental situation because: 1) movements are not planned, 2) moves are made from moment to moment and 3) improvisation begins at the very beginning of the task. Two main questions arise: what kind of relationship emerges between two dancers involved in an improvisational task? And can we find a common motor signature within each pair of participants?

Method

Six pairs of expert dancers were seated facing each other and asked to move only their forearm in the sagittal plane. We asked them to move freely while abiding by the instruction to "be tuned in to each other". This instruction helped create an improvisational situation. They started and stopped moving when they wanted. The subjects' motor performance was recorded by electrogoniometers, and analyses were performed on the angle between the forearm and the table. We used the original method of the cross-wavelet transform (Issartel et al. 2005) to measure the temporal evolution of the common frequencies between the two dancers as well as the temporal evolution of the continuous relative phase between the experts.

Results and Discussion

The results revealed a common motor signature within each pair of dancers in an improvisational dance task. Two principal results support this: 1) a preferential frequency emerged between the two dancers, and 2) there was a regular alternation of in-phase and anti-phase behavior at this preferential frequency. These results showed that an improvisation, which is an artistic and symbolic task, does not produce random coordination or frequencies but spontaneous coordination and a preferential frequency (for a given pair), as also observed in the studies mentioned above (with specific constraints in time). Moreover, a third result can be observed. Together with the main shared frequency, the dancers each perform secondary frequencies in a particular structure: they perform a coordination, stop it and start again. This kind of motor behavior may represent expert coordination because of the fine and complex motor behavior involved.

Conclusion

In conclusion, a situation with no specific constraints, such as a dance improvisation, is definitely not a random or uncertain situation. There is a collective organization that structures improvisation and interpersonal coordination. This finding will help open the door to analyzing artistic tasks in the same way as traditional laboratory tasks.




Real-Time Postural Control with Smooth Collision Management
Manuel Peinado1, Ronan Boulic2, Daniel Meziat1
1 University of Alcala, {Manuel.Peinado, Daniel.Meziat}@uah.es
2 VRLAB, Ecole Polytechnique Fédérale de Lausanne, Ronan.Boulic@epfl.ch

Abstract: Real-time postural control of full-body 3D articulated figures is attracting renewed interest in the enactive interface community, as the user's body movement is felt to be the natural interaction channel for specifying virtual human postures. One of our goals is to reduce the time needed to evaluate the suitability of complex virtual prototypes with respect to various interaction tasks involving human beings. It is therefore necessary to handle potential collisions between the virtual human body and elements of the 3D environment. We present a first evaluation of a real-time Inverse Kinematics motion capture algorithm that integrates automatic collision management. Our collision management has an anticipation capacity thanks to the concept of a smooth collision zone surrounding the obstacles. As a consequence, the movement of body parts is damped in the direction of the obstacle while the movement is not altered in other directions (Fig. 1).

Inverse Kinematic Motion Capture: Our prior contribution to the real-time motion capture of the full user body posture [MBRT99] exploited a set of 14 sensors (one for position and orientation measurement and all others for orientation data only). The objective was to reflect the performer's posture over time as closely as possible (e.g. for constituting libraries of recorded movements). Now the trend is to reduce the number of sensors to improve user comfort while still recovering a believable posture with the help of IK constraints [PHWLBTM04]. It is ultimately expected that the user can be freed from any invasive system with a vision-based approach [BVUPSP05]. The present study exploits captured motion from the CMU database [CMU], on which we evaluate the computing cost of managing anticipated collisions within the Inverse Kinematics motion capture loop (Fig. 2).

Smoothing collisions with Inverse Kinematics: Our approach relies on the concept of observers introduced in [PBLM05]. An observer is a geometric primitive (point, sphere, segment, cylinder...) for which we detect when it enters a collision zone surrounding the obstacles while the movement is captured (Fig. 2c). If this is the case (Fig. 2e), the movement of the observer is altered by declaring it as a traditional IK effector and by re-evaluating the solution for the current time step. By construction the computing cost is higher than for regular IK, but the case study from Fig. 1 demonstrates its feasibility, with an average of 13 to 18 ms per time step including collision management.
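The damping behaviour of a smooth collision zone can be illustrated with a minimal sketch for a point observer and a spherical obstacle. It works directly on a velocity vector and does not reproduce the IK-effector mechanism of [PBLM05]; the function name, the spherical obstacle and the linear damping profile are assumptions made for illustration only.

```python
import math


def damp_velocity(position, velocity, obstacle_center, obstacle_radius, zone_width):
    """Damp the part of `velocity` that points toward a spherical obstacle.

    Outside the collision zone the velocity is returned unchanged. Inside it,
    the approaching component is scaled by (distance to surface / zone_width),
    an assumed linear profile; the tangential component is left untouched.
    """
    offset = [p - c for p, c in zip(position, obstacle_center)]
    dist_to_center = math.sqrt(sum(o * o for o in offset))
    if dist_to_center == 0.0:
        return [0.0 for _ in velocity]  # degenerate case: no direction defined
    distance = dist_to_center - obstacle_radius
    if distance >= zone_width:
        return list(velocity)  # observer is outside the zone
    # Unit vector pointing from the obstacle toward the observer.
    normal = [o / dist_to_center for o in offset]
    v_normal = sum(v * n for v, n in zip(velocity, normal))
    if v_normal >= 0.0:
        return list(velocity)  # moving away from or along the obstacle
    factor = max(distance, 0.0) / zone_width
    # Replace the approaching component v_normal by factor * v_normal.
    return [v + (factor - 1.0) * v_normal * n for v, n in zip(velocity, normal)]
```

For an observer halfway into the zone, the approaching component is halved while sideways motion is preserved, which matches the qualitative behaviour described in the abstract.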

References:
[MBRT99] Molet T., Boulic R., Rezzonico S., Thalmann D., "An architecture for immersive evaluation of complex human tasks", IEEE Transactions on Robotics and Automation, Special Section on Virtual Reality, Volume 15 (3), ISSN 1042-296X, June 1999
[PHWLBTM04] M. Peinado , B. Herbelin , M. Wanderley, B. Le Callennec, R. Boulic, D. Thalmann, D. Méziat, Towards Configurable Motion Capture with Prioritized Inverse Kinematics, Proc. of the third International Workshop on Virtual Rehabilitation (IVWR'04), Lausanne Sept. 16-17th 2004
[PBLM05] M. Peinado, R. Boulic, B. Le Callennec, D. Meziat, "Progressive Cartesian Inequality Constraints for the Inverse Kinematics Control of Articulated Chains", Short Presentation Proc. of Eurographics'05, Dublin Sept. 2005-08-17, ISSN 1017-4656
[BVUPSP05] R. Boulic, J.Varona, L. Unzueta, M. Peinado, A. Suescun, F. Perales "Real-Time IK Body Movement Recovery from Partial Vision Input", Proc. of the Second International Enactive Workshop, Genoa, November 2005
[CMU] http://mocap.cs.cmu.edu/




A Case Study on Qualitative Evaluation of Avatar Believability
Barbara Mazzarino1, Gualtiero Volpe1, Manuel Peinado3,
Ronan Boulic2, Marcelo Wanderley4, Antonio Camurri1
1 DIST {Barbara.Mazzarino|Gualtiero.Volpe|Antonio.Camurri}@unige.it
2 VRLAB, EPFL, Ronan.Boulic@epfl.ch 3 University of Alcala, Manuel.Peinado@uah.es
4 McGill University, Marcelo.Wanderley@mcgill.ca

Abstract: In this paper we present some preliminary results of applying qualitative motion analysis to a 3D reconstruction of an expressive movement. This joint work qualitatively evaluates a 3D movement reconstructed with Inverse Kinematics (IK) from partial sensor data [PHWLBTM04]. The input data for the reconstruction algorithm are obtained using an Optotrak system with sensors applied to half of the body of a clarinet player. We used the EyesWeb Gesture Processing Library [CMV04] to analyse the video sequences of both the 3D subject, reconstructed with real-time requirements or off-line, and the human player during the performance. The qualitative features obtained are used to compare the movement of the avatar with that of the real subject, and also to assess the impact of the real-time requirement on the resulting movements. The results of the analysis can be used to improve the believability of 3D reconstructed humans, since they focus on the global quality of the produced motion and highlight the areas (or movements) where a qualitative difference between real and virtual players is observed.

Qualitative Analysis and Discussion: The EyesWeb open platform is software developed to support research on multimodal analysis, with a special focus on expressive gesture. As such it is an appropriate complement to visual inspection for assessing movement properties of the IK-reconstructed movements (Fig. 1). The first major consequence of the real-time requirement (only one IK convergence step per 10 Hz movement sample vs 3 to 5 steps for the off-line result) is a low-pass filtering effect that can be observed both in the middle row of Fig. 1 and in the yellow curve of Fig. 2a. Although this amounts to a slight loss in expressivity, the yellow curve still reproduces the main inflexions of the original motion without introducing discontinuity (a frequent pitfall of inverse approaches). In addition, the fluidity, defined as the ratio between the Quantity of Motion (QoM) of the upper body part and that of the lower body part, shows that the motion of the lower part is not in agreement with the upper part, which indicates an artificial motion. Other elements of the analysis have shown that the equilibrium constraint is too strict compared to the musician's motion. These results suggest, firstly, that the number of IK convergence steps should not be too high, to avoid the artificial introduction of movement energy, and secondly, that the IK constraints should be adjusted to improve the movement fluidity.
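As a small illustration of the fluidity measure defined above, the ratio can be computed frame by frame once QoM values for the two body parts are available. The sketch below assumes the QoM series have already been extracted (for example by EyesWeb) and are passed in as plain lists; the function name and the handling of near-zero lower-body motion are assumptions, not part of the EyesWeb library.

```python
def fluidity_ratio(qom_upper, qom_lower, eps=1e-9):
    """Per-frame ratio of upper-body to lower-body Quantity of Motion.

    Frames in which the lower-body QoM does not exceed `eps` yield None,
    because the ratio is undefined there (an assumed convention).
    """
    if len(qom_upper) != len(qom_lower):
        raise ValueError("QoM series must have the same number of frames")
    return [u / l if l > eps else None for u, l in zip(qom_upper, qom_lower)]
```

Comparing the series obtained for the real player with the one obtained for the avatar would then show where the two body parts move out of proportion.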

References
[PHWLBTM04] M. Peinado , B. Herbelin , M. Wanderley, B. Le Callennec, R. Boulic, D. Thalmann, D. Méziat, Towards Configurable Motion Capture with Prioritized Inverse Kinematics, Proc. of the third International Workshop on Virtual Rehabilitation (IVWR'04), Lausanne Sept. 16-17th 2004
[CMV04] A. Camurri, B. Mazzarino, G. Volpe (2004), "Analysis of Expressive Gesture: The EyesWeb Expressive Gesture Processing Library", in A. Camurri, G. Volpe (Eds.), "Gesture-based Communication in Human-Computer Interaction", LNAI 2915, pp. 460-467, Springer Verlag, 2004




A 1DoF Haptic Lever Device Used in Train Simulators
Elixabete Bengoechea, CEIT, P° de Manuel, Lardizábal, 15, 20018 San Sebastián, Spain, ebengoechea@ceit.es
Emilio Sánchez, CEIT, TECNUN (University of Navarra), P° de Manuel, Lardizábal, 15, 20018 San Sebastián, Spain, esanchez@ceit.es
Jorge Juan Gil, CEIT, TECNUN (University of Navarra), P° de Manuel, Lardizábal, 15, 20018 San Sebastián, Spain, jjgil@ceit.es

Abstract: Haptics is a fairly new technology that can improve the way in which the user interacts with a computer or a machine. This technology is progressing quite fast. However, the number of applications in which commercial haptic devices can be found is still very low. Perhaps this is because the benefits of using haptic devices are often unclear. Another factor to take into account is the high price of haptic devices.

This presentation shows a real application in which a haptic device is not only recommended but required to fulfill the customer requirements at a reasonably low price. This application is the haptic lever used in train simulators.

In this context, haptic devices play an important role, since they can simulate not only the normal behavior of mechanisms that are not physically present in the simulator, but also system failures. This work presents a multipurpose low-cost haptic 1DoF lever developed by CEIT, successfully used in a train simulator commercialized by Lander Training Simulators. A brief review of tested contact models is also presented. Finally, some experimental results are depicted and compared to a cost-effective solution, using a dSPACE.





Haptic-Auditory Rendering and Perception of Contact Stiffness
Federico Avanzini

Abstract: This contribution presents an architecture for the synchronized rendering of auditory and haptic stimuli of impulsive and continuous contact. Haptic rendering is performed using a Phantom Omni and the OpenHaptics Toolkit, while sound rendering is performed using physically-based audio contact models that we have developed and implemented within the real-time platform Pure-Data. The two rendering pipelines exchange information through shared memory, thus ensuring low latency in the communication. The auditory and haptic modes are tightly coupled because they are controlled through the same physical parameters.

The proposed architecture has been used to experimentally assess relative contributions of haptic and auditory information to multisensory (i.e., bimodal) judgments of contact stiffness using a rigid probe.

The auditory stimuli were obtained using a physically-based audio model of impact, in which the colliding objects are described as modal resonators that interact through a non-linear impact force. The impact force can be controlled through a stiffness parameter, which influences the contact time of the impact. Previous studies have already indicated that this parameter has a major influence on the auditory perception of hardness/stiffness.
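The link between stiffness and contact time can be illustrated with a minimal numerical sketch. The abstract does not give the force law, so the code assumes an undamped Hertz-type force f = k * x^alpha acting on a single point mass; the mass, impact velocity, exponent and time step are hypothetical values, and the modal resonators of the actual model are not represented.

```python
def contact_time(stiffness, mass=0.01, impact_velocity=1.0, alpha=1.5, dt=1e-6):
    """Simulate a point mass hitting a surface; return contact duration in seconds.

    Assumed force law: f = stiffness * x**alpha while the compression x > 0.
    Integration uses a semi-implicit Euler scheme with a fixed step dt.
    """
    x = 0.0              # compression of the contact (m)
    v = impact_velocity  # velocity into the surface (m/s)
    t = 0.0
    while True:
        force = stiffness * x ** alpha if x > 0.0 else 0.0
        v -= (force / mass) * dt  # the contact force decelerates the mass
        x += v * dt
        t += dt
        if x <= 0.0:  # the mass has rebounded and left the surface
            return t
```

Under this assumed law, calling `contact_time(1e7)` returns a shorter duration than `contact_time(1e5)`: a stiffer contact produces a briefer impact, which is the dependence the paragraph above refers to.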

The experiment presented here used the following procedure: subjects had to tap on virtual surfaces, and were presented with audio-haptic stimuli (i.e., contact forces and impact sounds), one at a time. The stimuli were synthesized using different levels of haptic and auditory stiffness. Stiffness magnitude was estimated using an absolute magnitude-estimation procedure: subjects were asked to rate the surfaces on an arbitrary numerical scale, based on their perceived stiffness. The results indicate that when the haptic stiffnesses of the surfaces were the same, subjects consistently ranked the surfaces according to the auditory stimuli.




Preliminary Test in a Complex Virtual Dynamic Haptic Audio Environment
Charlotte Magnusson (ULUND), Annie Luciani (INPG), Damien Couroussé (ACROE), Roy Davies (ULUND), Jean-Loup Florens (ACROE)

This paper reports on an explorative test session performed at INPG during December 2005. During this session two different implementations of a virtual PebbleBox were explored freely. The purpose of this test was to explore the importance of different parameters in the different setups, as well as to gain an insight into the strengths and weaknesses of the different approaches.

The virtual PebbleBox-ULUND implementation [1] is based on available software tools. The haptic and visual part is based on OpenHaptics from SensAble (http://www.sensable.com/) in combination with OpenDynamicsEngine (http://www.ode.org/). The audio part uses the playback of recorded sound files with Direct3DSound, and is thus able to render spatial (3D) sound feedback. The haptic hardware used was the PHANToM desktop. The virtual world is three dimensional and consists of a box with a number of moving spheres inside.

The Virtual PebbleBox-INPG implementation used the TELLURIS simulation platform. The model was implemented using CORDIS-ANIMA [2], a particle-based physical modeling system. The virtual world was two-dimensional and consisted of a circular container with a number of moving circular objects inside. The movements of the objects (that is, the non-sounding parts of the model) were computed at 3 kHz. Each of the moving objects could produce a sound, for example after a collision, which was produced by its acoustical deformations. The sounding parts of the model were computed at 30 kHz, and the sound output was monophonic. A 2D graphical representation of the model was rendered at 50 Hz on a 21" CRT display. The haptic device used was an ERGOS [3] 3D stick constrained to a horizontal plane. The simulator provided high-frequency (3 kHz) communication with the haptic device, which allows very precise haptic feedback.

The virtual worlds were selected to put focus on the dynamic - haptic - audio properties of the environment, since this type of environment presents a true challenge for anyone exploring it without visual feedback.

Thus we were able to explore:

Both environments also contained common variables such as object size (with respect to the environment), mass/gravitation, manipulator object size and which type of contact generates a sound. All these influenced the experience of the test persons, but two main factors were identified as crucial for the understanding of such an environment.

The perception of action - response causality was highlighted by the fact that the more feedback occurred seemingly without "cause" (without user action), the harder the user found it to understand the environment. Thus local shape was reasonably easily understood, while for global understanding it was important to be able to follow an object: when an object can be followed, the feedback obtained is a direct result of the user's actions, whereas when objects slip away and collide with numerous other objects, the resulting indirect feedback is often hard to understand.

The consistency of feedback was highlighted by the influence of the sound feedback in particular on the experience, as well as by the fact that the presence or absence of haptic feedback for the container (box or circle) greatly influenced the results. This consistency involves not only consistency between channels (haptic/auditory) but also consistency with expectations: splashing sound generated throughout a whole volume is less believable, since splashing in real life tends to be a surface effect. Finally, the consistency of feedback was related to the quality of the haptic device: the higher quality of the ERGOS haptic device was cited as providing a more distinct perception of the local surface properties of each of the objects.

The present environment poses interesting questions about what it is that makes "an object" - how do we infer experienced feedback as coming from a specified object and how do we the great influence of vision on the understanding of the scene. This showed up not just directly as a result of the increased ability to obtain an overview, but also indirectly as it was observed to influence exploration patterns and thus also influencing the haptic - audio feedback obtained.

To conclude, environments of the present type show great promise for investigating the basic factors underlying believability in haptic - auditory applications, and highlight the need for further investigation of such environments.

Bibliography
[1] G. Essl, C. Magnusson, J. Eriksson, S. O'Modhrain, "Towards evaluation of performance, control of preference in physical and virtual sensorimotor integration", ENACTIVE 2005, the 2nd International Conference on Enactive Interfaces, Genoa, Italy, November 17-18, 2005.
[2] A. Luciani, S. Jimenez, J.-L. Florens, C. Cadoz, and O. Raoult, "Computational physics: a modeler-simulator for animated physical objects," in Proceedings of Eurographics'91 (F. Post and W. Barth, eds.), (Hofburg - Vienna, Austria), pp. 425-436, Elsevier Science Publishers B.V. (North-Holland), September 2-6 1991.
[3] J.-L. Florens, A. Luciani, C. Cadoz, and N. Castagné, "ERGOS: A multi-degrees of freedom and versatile force-feedback panoply," in Proceedings of Eurohaptics 2004 (M. Buss and M. Fritschi, eds.), (Munich, Germany), pp. 356-360, June 5-7 2004.




ENACTIVE and Internet Applications: A First Prototype of ENACTIVE Application that Interoperates Haptic Devices and the World Wide Web
Massimo Bergamasco, Carlo Alberto Avizzano, Emanuele Ruffaldi, Mirko Raspolli
Perceptual Robotics Laboratory - Scuola Superiore Sant'Anna
Piazza Martiri della Libertà, 33
Pisa, 56127, Italy
E-mail {bergamasco, carlo, pit, raspolli}@sssup.it

Abstract:
Internet access is strongly penalized by currently existing interaction paradigms. During the last decade, access to web resources has been extended to support graphics, audio and multimedia. However, in all existing cases the interaction of the user is limited to a simple Video Display Terminal and standard input devices.

Nowadays multimodal interfaces allow the user interaction in more engaging and natural fashions than the present available. However such interaction is limited to specific software or system configurations.
In the present paper we will discuss how these technologies can be enhanced in order to create a networked shared laboratory that operates on the hypertext transfer protocol.

We will show how this laboratory allows to create a larger platform to carry out virtual experiments by using resources distributed on the Web. According to this architecture a set of M-MMI (Multisensory Man Machine Interfaces) will be able to interoperate on a common platform. The platform will consist in a distributed architecture that includes computational and simulation resources as well as multisensorial display(audio, video and haptic).

A preliminary demonstrator based on physical models and haptic interaction will then be presented. In this Visuo-Haptic Scenario (a pool table), the user will be able to play, interacting physically and visually, with other players connected to the same Web location.

Finally, further applications and extensions will be discussed.




Movement Kinematics and Eye-Hand Synchronization in Rhythmical Fitts' Task
Denis Mottet1, Stefano Lazzari1 & Jean-Louis Vercher2
1 Efficiency and Deficiency Laboratory, University of Montpellier-1
2 Mouvement & Perception, University of the Mediterranean

We investigated human behaviour and capabilities regarding perception and action in one of the simplest human perceptual and motor skills: rhythmical pointing. We analysed hand kinematics, eye kinematics and eye-hand synchronization during a reciprocal Fitts' task, and their evolution when task difficulty (ID) varied. We manipulated the multi-sensory interactions between vision and other senses by comparing a situation where participants had to alternately move a pointer (and the eyes) to two targets with a similar situation where participants had to alternately capture an immobile pointer with two mobile targets, keeping the eyes fixed on the pointer placed in the centre of the scene (Figure 1).
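The abstract does not say how the index of difficulty (ID) was computed. As a minimal sketch, assuming the classic Fitts formulation ID = log2(2D/W), with D the distance between target centres and W the target width, the function below shows how ID grows as targets get narrower. The function name and the numerical values are illustrative assumptions, not taken from the study.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Fitts' index of difficulty in bits, assuming ID = log2(2D / W)."""
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return math.log2(2.0 * distance / width)


# Illustrative values only (not from the study): fixed distance, narrower targets.
for width in (8.0, 4.0, 2.0, 1.0):
    task_id = index_of_difficulty(16.0, width)
    print(f"D = 16, W = {width:.0f} -> ID = {task_id:.1f} bits")
```

Under this assumed formulation, halving the target width at a fixed distance adds one bit of difficulty, which is the sense in which "high ID" corresponds to high accuracy requirements in the results that follow.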

We found a similar speed-accuracy trade-off and similar movement kinematics in the mobile-eyes and immobile-eyes conditions, but participants passed from a discrete movement (for high ID values) to a continuous movement (for low ID values) at lower ID values and in a steeper fashion in the presence of ocular movements. These results confirm the abstract nature of the frame of reference in which the task is controlled and movement kinematics produced (i.e., that of the closing of the gap between the pointer and the target), while pointing out a subtle influence of eye movements on hand movement.

In the condition where both hand and eyes were mobile, we found that the visual system could not follow the rhythm when hand movement frequency was high (i.e., when accuracy requirements were low). This is explained by the high cost for the saccadic system of maintaining 1:1 control when movement frequency is high, and by the relatively low benefit of visual feedback at each target when precision requirements are low.

Taken together, these results suggest that the functional coupling between multi-modal perception and end-effector movement in Fitts' task relies primarily on visual feedback when accuracy demand is high, and primarily on non-visual feedback when frequency demand is high.




Visual And Haptic Perception of Object Elasticity in a Squeezing Virtual Event
Damien Couroussé, Gunnar Jansson, Jean-Loup Florens, Annie Luciani

Keywords: elasticity, stiffness, deformable object, haptic information, visual feedback

Abstract: Numerous studies have been performed on the human perception of an object's weight, elasticity or viscosity. Most of them, however, were based on discrimination tasks and used or simulated very simple objects with linear elasticity. In the experiment presented, we asked participants to make judgements on the elasticity of a deformable virtual object from haptic and visual information. We found that observers could make judgements on elasticity in an orderly way, and that all of the different stiffness values were correctly discriminated except the two lowest ones. We also found that visual information, when available, modified movement parameters such as movement amplitude and mean manipulation speed, but did not improve task performance.




Test of Three Different Audio-haptic Navigation Tools
Charlotte Magnusson (ULUND), Kirsten Rassmus-Gröhn (ULUND), Henrik Danielsson (University of Linköping), Håkan Eftring (ULUND)

This paper continues the work on navigational tools studied in two pilot tests [1], [2]. To further test the tools that were most popular in these pilot tests, and to investigate how different navigational tools may influence spatial perception, a more focused test was performed with 12 users (11 blindfolded sighted persons and one visually impaired person) during the autumn of 2005. In this test, audio feedback (using the ears in hand metaphor) was investigated together with haptic feedback in the shape of either a constant attractive force or a linear fixture. To test possible effects on spatial memory, a task of locating three targets and then reproducing their positions was chosen. The navigational tools were tested separately, but the two haptic tools were also tested in combination with the audio tool, resulting in five different test setups.
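To make the "constant attractive force" concrete, here is a minimal sketch of one way such a force could be computed: a vector of fixed magnitude pointing from the haptic cursor towards the target, independent of distance. The function name, the default magnitude and the dead zone around the target are hypothetical choices for illustration; the abstract does not describe the actual implementation.

```python
import math


def constant_attractive_force(cursor, target, magnitude=0.5, dead_zone=1e-3):
    """Return a force of fixed magnitude pointing from cursor towards target.

    magnitude (newtons) and dead_zone (metres) are hypothetical values,
    not parameters reported in the abstract.
    """
    direction = [t - c for c, t in zip(cursor, target)]
    distance = math.sqrt(sum(d * d for d in direction))
    if distance < dead_zone:
        # At the target: apply no force, which also avoids dividing by ~0.
        return [0.0] * len(direction)
    # Normalise the direction, then scale to the constant magnitude.
    return [magnitude * d / distance for d in direction]


# The force magnitude is the same whether the target is near or far.
near = constant_attractive_force((0.0, 0.0, 0.0), (0.03, 0.04, 0.0))
far = constant_attractive_force((0.0, 0.0, 0.0), (0.3, 0.4, 0.0))
print(near, far)
```

A linear fixture would differ in that it constrains the cursor to a line rather than pulling it with a constant force; it is not sketched here because the abstract gives no details of its design.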

The results of these tests confirm the usefulness of the constant, weak, radial attractive force (on its own or with 3D audio). For the fixture, which was also seen to be useful, the sound may have been more important since it provided directional feedback. As for spatial memory, there was no significant difference between the tools. The force showed a tendency to give better results on the recall of the number of different targets, but no such effect was seen for the distances or the fully correct object assignments. In previous tests we had seen a tendency for users to remember the environment better if they spent a longer time in it, but here spending a long time in the environment did not seem to help. Another possible effect that would tend to influence the results in the opposite direction is the number of times a user can "check back" or rehearse the object positions. Since no significant effect on recall was seen, it is possible that these two effects cancel each other out with the present test design. Even though the sound in this setup generated significantly longer completion times, the user comments indicate that the 3D sound (ears in hand) may enhance spatial understanding: it seems that this sound feedback may heighten the sense of immersion (we cannot say anything definite on this point, though, since immersion was not tested for). A factor that may have influenced the result is that in this test we used the same navigational sound for all objects. We chose this design because we wanted to force the users to actually locate the targets, but one of the advantages of sound is of course that it can be heard from a distance (i.e., it provides a possibility of accessing object information before actually reaching the object). This test also indicates that navigational feedback may interfere with the actual task, although this most likely depends on design as well as modality.

[1] C. Magnusson, K. Rassmus-Gröhn, "Audio haptic tools for navigation in non visual environments", ENACTIVE 2005, the 2nd International Conference on Enactive Interfaces, Genoa, Italy, November 17-18, 2005.
[2] C. Magnusson, "Audio haptic navigational tools for non-visual environments", First ENACTIVE Workshop, Pisa, March 21-22, 2005.




The Role of Tactile Augmentation of a PHANToM Force Feedback Device Studied on a Task of Goal-directed Displacement
Günnar Declerck, Charles Lenay
COSTECH Research Unit, Université de Technologie de Compiègne, France
gunnar.declerck@utc.fr, charles.lenay@utc.fr

Many investigations have shown that the main limitation of the PHANToM (Sensable Technologies), one of the most widely commercialized force feedback devices (FFDs) at present, with regard to natural haptic perception, is that it offers the user only one point of contact at a time with the virtual object ([1], [2] and [3]). This restriction, which may be compensated for to a large extent when vision is available, becomes especially critical when the PHANToM, used without vision, is the only source of information. It must be noted, however, that the degree of difficulty caused by restricting the interaction to a single point differs with the task: for instance, the perception of virtual textures with the PHANToM functions well [3]. But the situation is quite different when the PHANToM is used to perceive the form of 3D virtual objects: judgements of object form demand more sophisticated exploratory procedures than judgements of texture, which can be based on movements in only one dimension [4]. As shown by Lederman & Klatzky [5], while edges and textures may still be well perceived when the haptic interaction is restricted to a single point, the loss of spatially distributed cutaneous inputs critically impairs performance in spatial tasks that require the tracking of contours and the processing of very fine spatial patterns.

One solution that remedies these difficulties without increasing the number of simultaneous points of contact (which would require a device other than the PHANToM) and without adding vision may consist in augmenting the kinesthetic feedback with tactile feedback that provides the user with spatial information on the peripheral zone of the point of contact with the virtual object. If it is necessary to follow the constitutive spatial features (contours and surfaces) of a 3D object to correctly identify its form, a perception of the peripheral environment of the interaction point should help by guiding displacement along the spatial feature that is tracked.

We have developed a "tactile box" that embeds 2 Braille cells (4*4 piezoelectric pins) and can be attached to the stylus of the PHANToM. This device makes it possible to augment with tactile feedback the force feedback provided to the user when the virtual object is encountered. The point the user moves in the virtual environment is now integrated into a virtual body with a spatial extension, whose contact with the virtual objects activates the tactile pins of the cells (a kind of "tactile retina"). We studied the usability of the device with a task of goal-oriented displacement without vision, comparing the performance of subjects in reaching the target area in three conditions: (1) force feedback alone, (2) tactile feedback alone, (3) force feedback and tactile feedback combined. The virtual body controlled by the subject is set at one end of a virtual 3D bridge that has several changes of direction and that is "floating" in the virtual environment. The subject is blindfolded and is required to reach the other end of the virtual bridge while losing contact with the bridge as little as possible. Results show that the tactile augmentation of the PHANToM improves performance (success, rate of contact with the bridge, velocity).

Such a device, by allowing the artificial dissociation of kinesthetic and tactile sources of information, might offer a new experimental ground for studying the articulation between the tactile and the kinesthetic dimensions of the haptic system, and further for identifying more precisely why augmenting FFDs with tactile feedback is useful.

[1] Jansson, G. & Larsson, K. (2002). Identification of haptic virtual objects with different degrees of complexity. In S. A. Wall, B. Riedel, A. Crossan & M. R. McGee (Eds.), Eurohaptics 2002, Conference Proceedings, Edinburgh, July 2002 (pp. 57-60).
[2] Jansson, G. (2000). Basic issues concerning visually impaired people's use of haptic displays. In P. Sharkey, A. Cesarani, L. Pugnatti & A. Rizzo (Eds.), The 3rd International Conference on Disability, Virtual Reality and Associated Technologies - Proceedings, 23-25 September, Alghero, Sardinia, Italy (pp. 33-38). To appear also in International Journal of Virtual Reality.
[3] Jansson, G., Billberger, K., Petrie, H., Colwell, C., Kornbrot, D., Fänger, J., König, H., Hardwick, A. & Furner, S. (1999). Haptic virtual environments for blind people: Exploratory experiments with two devices. International Journal of Virtual Reality, 4, 10-20.
[4] Jansson, G. & Billberger, K. (1999). The PHANToM used without visual guidance. Proceedings of the First PHANToM Users Research Symposium.
[5] Lederman, S.J. & Klatzky, R.L. (2004). Haptic Identification of Common Objects: Effects of Constraining the Manual Exploration Process. Perception & Psychophysics, 66(4), 618-628.




Workshop on XVR - A Development Framework for Complex VR Applications
Franco Tecchia

XVR is a development framework for complex VR applications. It has been used at PERCRO over the past 8 years for a variety of projects dealing with real-time graphics and interaction, and it has been continuously updated to accommodate ever-evolving programming needs. XVR was later adopted by other groups as well, and today it offers a wide range of useful and practical functionality for controlling the many aspects of VR programming, including real-time graphics, sound, interaction, and support for the most common VR devices (trackers, displays, haptics and interaction devices). XVR's fundamental design goal is simplicity of use: every new programming construct needs to be simple, flexible and effective. This strict design philosophy has made XVR a platform able to accommodate the needs of novices as well as "professional" programming. This workshop will present the overall framework of XVR technology, showing how it can be used in a range of common situations, including high-quality graphical rendering, real-time physics and network programming.
