In the past two years, the Haptics Lab has been working on several new devices: a handheld vibrotactile device, a contact location display, a planar direct-drive force-feedback device, and a high-density distributed tactile display, named "MicroTactus", "Morpheotron", "Pantograph", and "StreSS", respectively. Each targets a specific approach to mediating haptic interaction. In this presentation, I will discuss recent enhancements to each of these devices, and what drives performance and effectiveness in each case.
In two experiments, we independently varied the degree of cognitive and perceptual difficulty of supra-postural tasks. Increases in perceptual difficulty in Experiment 1 tended to be correlated with decreases in the variability of postural sway, consistent with our hypothesis of functional integration of postural control with supra-postural tasks. In Experiment 2, sway variability was not influenced by changes in the cognitive difficulty of tasks when perceptual difficulty was held constant. We predicted that sway would be related to the difficulty of visual tasks, with reductions in sway during more difficult visual tasks. We further predicted that in the absence of visual demand, variations in the difficulty of cognitive tasks would not influence sway variability.
Subjects removed their shoes and stood with their feet together, 100 cm from a computer screen that was adapted to their height. Postural sway was recorded using a 3-df magnetic tracking system (Flock of Birds; Ascension Technology, Inc.), sampled at 25 Hz. The receiver was attached at approximately the seventh cervical vertebra. In Experiment 1, subjects held a computer mouse in their preferred hand.
Experiment 1 had two experimental conditions with three trials in each condition: In the signal detection task, subjects were presented with pairs of vertical lines, one neutral (lines of equal height, 1.95°) and one critical (lines of 1.95° and 2.12°). They were asked to click the mouse each time they saw a critical signal. In the mental arithmetic task, subjects were asked to count backward by 3 from a randomly assigned number, as fast and accurately as possible. After each condition, participants completed the NASA-TLX, evaluating the mental workload of the condition.
In Experiment 2, there were two conditions with four trials in each condition: The hard mental arithmetic task (same as in Experiment 1) and an easy mental arithmetic task (subjects were instructed iteratively to add 2). Subjects again completed the NASA-TLX after each condition.
Mental arithmetic performance. In Experiment 1, the mean accuracy was 87 % with a mean response rate of 27 per trial. In Experiment 2, the accuracy for the hard task was 91 % with a response rate of 28 responses per minute. For the easy task, the mean accuracy was 99 % at 46 responses per minute. The means of both accuracy (t(14) = 1.941, p < .05) and response rate (t(14) = 8.088, p < .05) were significantly different, indicating that the easy task counting was more accurate and faster.
Subjective workload. In Experiment 1, the mean overall workload scores for the detection and arithmetic conditions were 59.3 and 60.1; these means did not differ, t(11) = -.114, p > .05. In Experiment 2, the easy and hard tasks were rated as having overall workload scores of 38.6 and 61.4 respectively. The difference was significant, t(14) = -4.080, p < .05.
Postural motion. The dependent variables were the standard deviation of torso position in the AP and ML axes. In Experiment 1, sway variability in the AP axis was significantly lower during the signal detection task than during the mental arithmetic task (t(11) = 1.988, p < .05), as predicted. In Experiment 2, there were no significant differences in sway between the easy and hard mental arithmetic conditions in the AP axis (t(14) = -.512, p > .05) or in the ML axis (t(14) = .46, p > .05).
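As a minimal sketch of how such a dependent variable could be computed, the following assumes each trial is available as a list of (AP, ML) torso positions, one pair per tracker sample. The function name, the data layout and the choice of the sample (n - 1) estimator are illustrative assumptions, not details reported here.

```python
import statistics


def sway_variability(samples):
    """Return (sd_ap, sd_ml): standard deviation of torso position per axis.

    samples: list of (ap, ml) positions, one pair per tracker sample.
    """
    ap = [s[0] for s in samples]
    ml = [s[1] for s in samples]
    # Sample standard deviation (n - 1 denominator). This is an assumption:
    # the text does not state which estimator was used.
    return statistics.stdev(ap), statistics.stdev(ml)


# Hypothetical trial with four samples; the values are illustrative only.
trial = [(0.0, 1.0), (2.0, 1.0), (0.0, 3.0), (2.0, 3.0)]
sd_ap, sd_ml = sway_variability(trial)
```

One such pair of values per trial and subject would then feed the comparisons between conditions.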
We believe that stance can be modulated in ways that facilitate perceptual contact with the environment (in this case, visual performance). In Experiment 1, the signal detection and mental arithmetic tasks did not differ in subjective mental workload, but sway was reduced during signal detection. In Experiment 2, significant differences in mental workload (between hard and easy mental arithmetic tasks) had no effect on sway. We interpret these results as indicating that changes in sway were functionally related to the degree of oculomotor stability required in the signal detection task. That is, we believe that postural movements were modulated enactively to optimize the acquisition of visual information needed to perform the signal detection task.
Postural coordination in goal-directed stance typically exhibits two coordinative states, i.e., in phase (ankle-hip relative phase Φrel ≈ 20°) and anti-phase (Φrel ≈ 180°), together with self-organized properties such as bifurcation, hysteresis, critical fluctuations, and critical slowing down [Bardy, 2004; Bardy et al., 2002]. Here we explored the complete range of postural coordination by asking standing participants to execute 16 different ankle-hip relative phase patterns (from 0° to 337.5°). Each coordination mode was tested with and without real time visual feedback. First, visual feedback was provided via a Lissajous figure in which the instantaneous discrepancy between the requested relative phase and the actual relative phase was plotted as a real time trajectory. Second, participants attempted to produce the requested coordination without visual feedback.
We found a general influence of the requested relative phase on performed coordination. Our measures capturing the coordination (Φrel, SDΦrel, CE, and AE) indicated that participants did not accurately produce relative phases between 270° and 360° (0°), with constant errors as high as -100°. There was a tendency to overestimate relative phase for requested relative phases below the range [157.5° - 202.5°], and to underestimate relative phase above that range, as expressed by the negative slope of constant error over the scan. Thus, in this study there was a unique attractor around anti-phase, that is, a unique coordinative state which attracted the other patterns toward its relative phase value. This finding is complemented by better performance (smaller CE and AE) when the requested relative phase involved the lead of hip movement (relative to conditions in which the ankle led), illustrating the asymmetrical nature of postural dynamics. Our results thus do not support a direct correspondence between constrained postural dynamics, that is, the time-related postural behavior that emerges out of a coalescence of constraints [Bardy et al., 2002], and imposed dynamics, that is, the postural behavior that is specified by instructions or environmental information [Faugloire et al., 2005]. The task-specific behavior of the postural system may be adequately exploited by the central nervous system by modulating appropriately, under specific task-related circumstances, the order parameter for the coordination. The consequences of these results for training and rehabilitation will be detailed in the presentation.
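The error measures can be illustrated with a short sketch. It assumes that CE is the mean signed error and AE the mean unsigned error between produced and requested relative phase, with each error wrapped to [-180°, 180°) so that 350° against a request of 0° counts as -10°. These definitions, the wrapping convention and the function names are assumptions made for illustration; the abstract does not spell them out.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def phase_errors(requested, produced):
    """Return (CE, AE) in degrees for one requested relative phase.

    requested: requested ankle-hip relative phase in degrees.
    produced: list of produced relative phases in degrees, one per trial.
    """
    errors = [wrap_deg(p - requested) for p in produced]
    ce = sum(errors) / len(errors)                  # constant (signed) error
    ae = sum(abs(e) for e in errors) / len(errors)  # absolute error
    return ce, ae


# 16 requested patterns from 0 to 337.5 degrees, assuming equal spacing,
# which gives steps of 22.5 degrees.
REQUESTED = [i * 22.5 for i in range(16)]

ce, ae = phase_errors(0.0, [350.0, 10.0])
```

In this illustrative call the two signed errors cancel, so CE is zero while AE is not, which is why the two measures are reported together.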
The haptic and visual channels are key sensory channels in most of our daily interactions. In a multimodal environment, the interplay of different modalities can have either positive or negative effects on perception and performance. Accurate knowledge of cross-modal interaction between the haptic and visual channels is therefore essential for designing efficient visuo-haptic applications. Among the numerous possible interactions between the two modalities, we are interested in spatial alignment, which characterizes the spatial properties of the mapping between the haptic and the visual workspaces.
The purpose of this paper is to study the effect on the user of misalignment between the haptic and the visual workspaces. By misalignment between the two workspaces, we mean both positional and directional misalignment. Positional misalignment is the case in which the two workspaces are displaced in parallel in physical space. Directional misalignment is the case in which one workspace is rotated relative to the other.
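The two kinds of misalignment can be sketched as a rigid transform from haptic to visual coordinates: a translation for the positional case and a rotation for the directional case. The sketch below is illustrative only. It restricts the rotation to the vertical (y) axis, and the function and parameter names are hypothetical, not part of the experimental software.

```python
import math


def misalign(point, offset=(0.0, 0.0, 0.0), yaw_deg=0.0):
    """Map a haptic-workspace point (x, y, z) into the visual workspace.

    offset: positional misalignment (parallel displacement of the workspace).
    yaw_deg: directional misalignment, modelled here as a rotation about the
             vertical (y) axis only; this is a simplifying assumption.
    """
    x, y, z = point
    a = math.radians(yaw_deg)
    # Rotate about the y axis, then translate.
    xr = x * math.cos(a) + z * math.sin(a)
    zr = -x * math.sin(a) + z * math.cos(a)
    return (xr + offset[0], y + offset[1], zr + offset[2])


shifted = misalign((1.0, 2.0, 3.0), offset=(0.5, 0.0, 0.0))  # positional only
rotated = misalign((1.0, 0.0, 0.0), yaw_deg=90.0)            # directional only
```

A nearly matched setup corresponds to a small offset and zero rotation, while a 90° rotation gives one possible reading of an "orthogonal" misalignment.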
We noticed that, depending on the misalignment, users easily get used to some misaligned setups but not to others. This is because the user builds a coherent representation of space from the different information he or she receives. The force from the haptic device gives one 3D representation through the sense of touch, while the displayed scene gives another 3D representation through the sense of vision. The user then merges these two kinds of information to build a 3D representation of the virtual world.
The purpose of the proposed experiments is to study the limitations and particularities of the reconstruction of a common representation of the 3D world. Different levels of alignment will be presented to the user: nearly matched, 'orthogonally' misaligned, and completely misaligned.
The performance of the users is measured by the time they take to complete a reconstruction task. This task uses an application developed to display and touch a piece of textile. A torn piece of textile is presented, and the user has to reconstruct it by selecting a point and then gluing it to the corresponding place, while the sensation is simulated and displayed through both the haptic and visual sensory channels.
The reconstruction task is performed by 20 to 30 users, and each user's performance, i.e. the time she or he takes to finish the task, is analyzed against the level of misalignment. This approach allows us first to compare performance across conditions, then to characterize perception as a function of the level of misalignment, and finally to identify a threshold in the representation of the 3D world, if any. In interpreting the results, special attention is paid to the users' improvement over the course of the experiment, in order to collect information on the learning curve for each misalignment level.
Networks of Excellence (NoEs) represent new actors in the European system of research, aimed at strengthening scientific and technological excellence by integrating the critical mass of resources and expertise at European level. NoEs are complex entities whose main goals are to overcome the fragmentation of European research, to set up durable and structured partnerships and to spread scientific excellence beyond the boundaries of their partnerships.
They have posed interesting questions regarding the exploitation of the achieved results, and they may represent interesting platforms of study from which relevant results can be derived and extended to other contexts.
NoEs cannot act as "private clubs" keeping the acquired knowledge within their boundaries, but on the contrary have to provide an improvement to the research community by disseminating and exploiting their own results.
The NoE has to protect the knowledge generated within itself, and at the same time it should evaluate the potential impact of such knowledge by developing a plan for its use and dissemination.
The ENACTIVE Network of Excellence represents an interesting case of study for the analysis of the relevant factors which are essential for the implementation of a proper exploitation strategy in large research communities.
The aim of this paper is to analyze the key factors needed to implement a successful exploitation and dissemination strategy in the context of a broad research network, and to identify how exploitation results can be determined by the actions undertaken by consortium partners, acting as a network.
The analysis of the current status of the ENACTIVE network policy can help to identify the future steps towards a successful continuation of the exploitation process, even after the conclusion of the Community's funding period.
Networked research, such as that addressed within the European Networks of Excellence, requires specific new models for exploitation that take into account the specific interests of distributed and administratively unrelated partners.
In order to maximize the impact of the research, dissemination and exploitation actions are usually planned by parallel actions opening several questions on the proper monitoring of these Networks. Among these the most important to be mentioned are: How to evaluate the efforts and related exploitation results? How to estimate the status of dissemination? How to plan new policies for sustaining exploitation after the end of the Networks? Which factors can determine success for exploitation and dissemination?
This paper focuses its analysis on the above topics, with a specific attention given to the ENACTIVE Network of Excellence. The results of this analysis will act as an internal pushing action focusing on specific aspects and factors which play a relevant role.
The paper presents the results of a survey based on specific questionnaires administered to several partners of the ENACTIVE Network. All questions are focused on the ENACTIVE research and customized for the partners' expertise in order to identify which factors may improve expectations for dissemination and exploitation.
The questionnaires highlight the actions performed by each partner for transferring and disseminating the knowledge acquired within the Network, trying to quantify the obtained results. A preliminary model for the transfer of achieved results is obtained from the best practices of the partners in dissemination and knowledge transfer. The identified key factors have been mapped at Network level to identify an exploitation policy to adopt in the long term.
In this paper we present an integrated system for multimodal 3D tracking of a performer during an artistic event. This system integrates two tracking algorithms: one audio and one optical. DEI-CSC, University of Padova, has developed the audio tracking algorithm. The optical tracking algorithm has been developed by DIST-InfoMus Lab, University of Genova. The two algorithms are integrated using the EyesWeb software platform (www.eyesweb.org), which is responsible for the synchronisation and combination of the data flows.
1. Audio Localization System
We developed a real-time additive synthesizer called Ssynth, with advanced and flexible control functionalities. This research is a further development of Escher, a system developed for studying gestural control in the interpolation of digital musical instrument playing. By considering synthesis from the control viewpoint, in terms of design and implementation, Ssynth makes it possible to generate good-quality sounds from an instrumental sound database, with coherent control, providing interpolation and extrapolation of the musical playing of digital instruments.
For modularity in the design of digital musical instruments, Ssynth is composed of two parts: a set of Pd patches that implement the different mapping strategies and layers, and a synthesizer implemented in C that can be compiled as a stand-alone program or as a Pd object, using the Pd scheduler to output audio. Two parameter conversion layers convert gesture data into higher-level parameters (fundamental frequency, intensity, and dynamics) and higher-level parameters into synthesis parameters. In this way, the large number of additive synthesis control parameters is reduced to a smaller set. Additive frames are organized as a 3-dimensional mesh according to pitch (7 values), dynamics (3 values; related to loudness and brightness) and instrument (4 so far). Morphing between N tones according to those parameters is performed in two steps: interpolated additive frames are weighted from pitch-shifted additive frames with different fundamental frequencies and dynamics of the same instrument; then a morpher weights the interpolated frames of several instruments. Morphing attacks and releases requires time-warping of the additive data to improve the timbre quality of the morphed sounds.
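The two-step weighting can be sketched as follows. This is a deliberately reduced illustration, not the Ssynth implementation: an additive frame is represented only by a list of partial amplitudes, pitch-shifting and time-warping are omitted, the same neighbour weights are used for every instrument, and all names are hypothetical.

```python
def mix_frames(frames, weights):
    """Weighted average of additive frames (lists of partial amplitudes)."""
    total = sum(weights)
    n_partials = len(frames[0])
    return [
        sum(w * frame[k] for frame, w in zip(frames, weights)) / total
        for k in range(n_partials)
    ]


def morph(per_instrument_frames, neighbour_weights, instrument_weights):
    """Two-step morph between neighbouring frames of several instruments.

    per_instrument_frames: for each instrument, the neighbouring frames in
        the pitch/dynamics mesh (assumed to be pitch-shifted already).
    neighbour_weights: weights of those neighbours (step 1).
    instrument_weights: weights of the instruments (step 2).
    """
    # Step 1: interpolate within each instrument.
    interpolated = [
        mix_frames(frames, neighbour_weights)
        for frames in per_instrument_frames
    ]
    # Step 2: weight the interpolated frames across instruments.
    return mix_frames(interpolated, instrument_weights)


# Illustrative two-partial frames for two instruments.
instrument_a = [[1.0, 0.0], [0.0, 1.0]]
instrument_b = [[2.0, 2.0], [2.0, 2.0]]
result = morph([instrument_a, instrument_b], [0.5, 0.5], [0.5, 0.5])
```

Normalising by the sum of the weights keeps the output level stable when the weights do not add up to one.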
The spectral envelope is a function of frequency that simplifies the control of partial amplitudes in Ssynth and is useful for morphing sounds. Under gestural control, it may be necessary to convert one spectral envelope model into another that is better suited to providing a stable spectral envelope for a given control. Ssynth uses various models and conversion methods between the following spectral envelope models: formants, cepstrum, and the LPC class (autoregressive filter, correlation function and reflection coefficients).
The sound database contains additive analysis and spectral envelope models of wind, wood and brass instrument tones (clarinet and oboe as in Escher, plus saxophone and trumpet) from the McGill master samples database. Based on all those components and techniques, Ssynth allows for interpolating and extrapolating the database, synthesizing polyphonic sound, and handling OSC control messages.
Multi-modal integration can greatly enhance the sense of presence and interaction with virtual environments. Multiple sensory feedback is typically used to complement and integrate information during the exploration of virtual environments.
The present work focuses on specific aspects of haptic-audio integration during manipulation, a topic addressed within the ENACTIVE NoE research. Haptic interfaces are useful for representing continuous contacts, gross textures and the stiffness of objects. Sound, in contrast, performs well in representing impact impulses and continuous contact during fine texture exploration. However, the impact of integrating feedback modalities is hard to quantify, since it depends on the subjective interpretation of sensation.
In the proposed work we have defined a multi-modal game to explore how well the modalities integrate with each other, and what improvements the combined feedback adds over the use of a single modality. The game is based on the concept of the "card memory game", where the content of each card is represented by audio-haptic textures. Players must identify identical features and remove the related cards from the board. Game parameters are tracked during play and later used to estimate quality factors of integration. In the present paper we will discuss the system setup, the experiment design and preliminary results.
Rather than quantifying the different kinds of movements and presenting such information using visual methods (e.g. graphs or tables), sonification of such gestures provides a complementary way of analysing movements. Sonification is "the transformation of data relations into perceived relations in an acoustic signal for the purposes of facilitating communication or interpretation" (1). It helps to reveal structures in data that are not at all obvious in traditional visual-only analysis (2), and is ideally suited to representing data sets with large numbers of changing variables and temporally complex information that may be blurred or missed by visual displays. Moreover, it reduces the cognitive load of the listener and enables him or her to focus on important aspects of the data (1).
We used musical performances recorded with video cameras and the high-accuracy Optotrak 3020 system (an optical movement tracker with active infra-red markers) to track musicians' movements, and visualized in detail various markers on the musicians' bodies using Matlab. Three performers played Stravinsky's Three Pieces for Solo Clarinet several times in three expressive manners: normal, immobile and exaggerated (3). Sonifications are semi real-time, i.e. offline pre-processing is done using Matlab, and sonifications are played in real time in MaxMSP.
We sonified the following 4 ancillary gestures: the circular movement of the clarinet bell, body weight transfer, body curvature, and the knee bend of a musician. For each gesture we chose a synthesis technique with distinctive features that make it identifiable from the rest, so that the 4 gestures can be heard simultaneously. Risset's infinite glissando sound (4) is used to sonify circular motions of the bell. Body weight transfer is sonified using a beat interference technique based on additive synthesis. Body curvature is mapped to the brightness of frequency modulation sounds (5). Knee bending is sonified using filtered white noise, with the bend mapped to the filter cut-off frequency (and thus to brightness). When building this sonification system, we designed suitable mappings between gesture data and synthesis techniques in order to build an efficient auditory scene. Preliminary observations indicate that up to 4 sonified ancillary gestures can be heard clearly. Sonification is a complementary tool for identifying and qualitatively analysing musicians' ancillary movements. The next steps are to conduct a formal experiment, in order to determine whether listeners can identify gestures, performers and performance manners, and to develop interactive sonifications, in order to provide more efficient mappings.
The present paper reports on two different applications developed at ULUND. Both applications are designed to allow the user to interact with a haptic-auditory virtual environment. Common problems in both applications concern the design of tools for creating an overview and tools for navigation.
The first application is a virtual haptic-audio drawing application for low-vision and non-vision users. It is being, and will continue to be, developed in close collaboration with a user reference group of 5 blind/low-vision school children. The objective of the prototype application is twofold. During the early development stages, it will be used as a research vehicle to investigate user interaction techniques and to do basic research on navigation strategies and helping tools. Later, the prototype will be tailored for use in schoolwork, and the final application should be usable in different school subjects. The application consists of a haptic "paper" which allows the creation and feeling of positive and negative relief. Sound feedback that depends on the cursor position (different in free space, on the paper and when drawing) is added to this. Up/down is mapped to pitch, while left/right corresponds to the stereo panning of the sound.
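A position-to-sound mapping of this kind can be sketched in a few lines. The sketch assumes cursor coordinates normalised to [0, 1], an exponential pitch scale between two arbitrary frequencies, and a pan value in [-1, 1]. None of these choices is taken from the application itself, and the function and parameter names are hypothetical.

```python
def cursor_to_sound(x, y, f_low=220.0, f_high=880.0):
    """Map a normalised cursor position to (frequency_hz, pan).

    x: horizontal position, 0.0 (left) to 1.0 (right).
    y: vertical position, 0.0 (bottom) to 1.0 (top).
    f_low, f_high: assumed pitch range; illustrative values only.
    """
    # Clamp to the drawing area so the sound stays within range.
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    # Exponential scale: equal vertical steps give equal musical intervals.
    frequency = f_low * (f_high / f_low) ** y
    # Linear pan: -1.0 is fully left, +1.0 is fully right.
    pan = 2.0 * x - 1.0
    return frequency, pan


centre = cursor_to_sound(0.5, 0.5)
```

An exponential rather than linear pitch scale is used here because pitch perception is roughly logarithmic in frequency; a linear scale would be an equally valid design choice to test with users.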
The second application is an audio-haptic Pacman game. Pacman was an arcade game of the early 1980s, and it is perhaps the best remembered game from that period. Players guide Pacman around a maze eating dots, while avoiding four ghosts. We are developing a haptic version of Pacman, in which the player relies on haptics and sound rather than visual input. Pacman can be guided around the maze using a Phantom device. With the help of 3D sound, the player can monitor the whereabouts of the ghosts. A previous version of a plain (no Pacman) haptic maze has been tested to obtain initial user feedback, and the game will be evaluated iteratively with our reference group of youths with various degrees of visual impairment.
The results from the above work indicate that a vertical work area may sometimes be preferred, although there are ergonomic issues to consider. Another common result is the importance of preventing the user from "falling off" or "getting lost". This indicates the need for some kind of constraining box or similar. Both applications also highlight the importance of active exploration (enactive interaction), since the haptic/auditory feedback will generally not result in an understanding of the environment if it is simply played back to the user. For the drawing application in particular we also have a set of preliminary results:
When developing applications, special attention must be paid to designing the user interface. This is especially true in the case of rich multimodal or enactive interfaces. Due to the inherently complex nature of these solutions, a lot of communication between designers, as well as several experimental prototypes, is necessary. This often results in a lot of writing and rewriting of program code.
In order to facilitate the design of such a rich user interface, we have developed a graphical notation which allows describing the user interaction by means of a high level diagram. The notation is high level, freeing the designer from most implementation issues, but on the other hand, it is detailed enough to allow an interpreter to automatically execute the diagram.
As we consider user interaction to be event driven and state driven, and we identify the need for data flow and hierarchical reuse of existing components, NiMMiT is built to support these features. Existing notations such as Coloured Petri-Nets [2], InTML [3], or UML [4] all have their strengths, but none of them offers an ideal solution to support the special needs of fully describing multimodal interaction.
In the presentation, we will show the basic primitives and illustrate our notation by means of a practical example. Voodoo Dolls [1] is a well-known two-handed interaction metaphor for 3D virtual environments, in which miniature representations of two virtual objects (one in each hand) are used to manipulate both objects with respect to each other. We will illustrate how this metaphor can be described and automatically executed by explaining the NiMMiT diagram while showing a video of the execution of the diagram.
As we have chosen to develop a new notation, based upon some principles of other existing notations, we will also elaborate in this presentation on the same practical example using the other related notations, such as Coloured Petri-Nets, InTML and UML2.
We end the presentation by formulating our conclusions and pointing out current and future benefits of the NiMMiT notation, such as simplifying automatic data capture while conducting a user experiment.
[1] Jeffrey S. Pierce, Brian C. Stearns, and Randy Pausch. Voodoo Dolls: Seamless interaction at multiple scales in virtual environments. In Proceedings of the Symposium on Interactive 3D Graphics, pages 141-145, Atlanta, GA, USA, April 26-28 1999.
The question of encoding movements such as those produced by human gestures may become central in the coming years, given the growing importance of movement data exchanges between heterogeneous systems and applications (musical applications, 3D motion control, virtual reality interaction, etc.). For the past 20 years, various formats have been proposed for encoding movement, especially gestures. However, these formats were designed, to different degrees, in the context of quite specific applications (character animation, motion capture, musical gesture, biomechanical concerns). The article introduces a new file format, called GMS (for 'Gesture and Motion Signal'), with the aim of being more low-level and generic, by defining the minimal features that a format carrying movement/gesture information needs, rather than by gathering all the information generally given by the existing formats. The article argues that, given its growing presence in virtual reality situations, the "gesture signal" itself must be encoded, and that a specific format is needed. The proposed format captures the inner properties of such signals: dimensionality, structural features, types of variables, and spatial and temporal properties. The article first reviews the various situations with multisensory virtual objects in which gesture controls intervene. The proposed format is then deduced, as a means to encode such a versatile and variable "gestural and animated scene".
A panorama of issues related to the role of sound source processing in various aspects of auditory perception and cognition will be presented in an integrated framework including results of past research and future directions of exploration. One of the primary benefits of audition for an organism sensitive to acoustic information is the detection, localization and recognition of mechanical events (actions) in the environment, as well as understanding what the significance of those events is in a given context (McAdams & Drake, 2002). It is therefore important to study aspects of sensory processing of the immediately available acoustic information, but also the processing and interpretation of information in the context of past experience, most often based on knowledge acquired implicitly about the nature of sound sources and their behavior through time. Initial processing must sort out from the acoustic pressure wave the information that originates from different sound sources. The nature and behavior of these sources provide acoustic cues that allow listeners to form perceptual representations of events (at times overlapping with other events) and of streams of events (at times interleaved with other streams), a set of processes referred to as auditory scene analysis. Once these event and stream representations are formed, other processes compute their auditory attributes such as pitch, spatial position, loudness, duration and timbre for events, or relations among these event attributes for streams. Many of these attributes are closely related to the mechanical properties of the sound sources and the way they were set into vibration, and thus serve as cues for source perception and recognition (McAdams, 1993). 
Quantitative relations can be established between relevant mechanical properties, the perceptual dimensions that represent them, and putative cues derivable from the acoustic signal that serve as a vehicle between the two (McAdams, Chaigne & Roussarie, 2004). Not all cues that listeners can perceive are used when the perceptual goal is to categorize or to identify aspects of the source such as its geometry or the materials from which it is made, suggesting that adult listeners have learned implicitly which cues are most reliable for particular perceptual tasks.
McAdams, S. (1993). Recognition of sound sources and events. In S. McAdams & E. Bigand (Eds.), Thinking in sound: The cognitive psychology of human audition (pp. 146-198). Oxford: Oxford University Press.
In everyday speech, auditory and visual information are tightly coupled. Consistent with this, previous research has shown that facial and head motion can improve the intelligibility of speech (Massaro et al., 1996; Munhall et al., 2004; Saldana & Pisoni 1996). The multimodal nature of speech is particularly noticeable for emphatic speech, where it can be exceedingly difficult to produce the proper vocal stress patterns without producing the accompanying facial motion. Using a detection task, Swerts and Krahmer (2004) demonstrated that information about which word is emphasized exists in both the visual and acoustic modalities. It remains unclear as to what the differential roles of visual and auditory information are for the perception of emphasis intensity. Here, we validate a new methodology for acquiring, presenting, and studying verbal emphasis. Subsequently, we can use the newly established methodology to explore the perception and production of believable accentuation.
Experiment: Participants were presented with a series of German sentences, in which a single word was emphasized. For each of the 10 base sentences, two factors were manipulated. First, the semantic category was varied: the accent-bearing word was either a verb, an adjective, or a noun. Second, the intensity of the emphasis was varied (no, low, and high). The participants' task was to rate the intensity of the emphasis using a 7-point Likert scale (with a value of 1 indicating weak and 7 strong). Each of the 70 sentences was recorded from 8 Germans (4 male and 4 female), yielding a total of 560 trials.
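The trial count can be checked with a short enumeration. The sketch assumes that the 70 sentences arise as 7 versions of each base sentence: one without emphasis, plus one for each combination of semantic category and emphasis level (low, high). This breakdown is an inference from the stated totals, not something the description states explicitly.

```python
from itertools import product

CATEGORIES = ("verb", "adjective", "noun")
INTENSITIES = ("low", "high")
N_BASE_SENTENCES = 10
N_SPEAKERS = 8


def sentence_versions():
    """Versions of one base sentence under the assumed design."""
    # Assumption: a single unemphasised version, then category x intensity.
    return [("none", "no")] + list(product(CATEGORIES, INTENSITIES))


n_sentences = N_BASE_SENTENCES * len(sentence_versions())
n_trials = n_sentences * N_SPEAKERS
```

Under this assumption the counts match the reported 70 sentences and 560 trials.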
Results and Conclusion: Overall, the results show that people can produce and recognize different levels of accentuation. All "high" emphasis sentences were rated as being more intense (5.2, on average) than the "low" emphasis sentences (4.1, on average). Both conditions were rated as more intense than the "no" emphasis sentences (1.9). Interestingly, "verb" sentences were rated as being more intense than either the "noun" or "adjective" sentences, which were remarkably similar. Critically, the pattern of intensity ratings was the same for each of the ten sentences, strongly suggesting that the effect was solely due to the semantic role of the emphasized word. We are currently employing this framework to more closely examine the multimodal production and perception of emphatic speech.
Coordination between two or more people, called interpersonal coordination, arises in every situation as soon as the people are in perceptual contact (tactile, visual, auditory, etc.). For example, when two people walk together in the street while holding hands (or even just talking), they immediately couple their gait by walking in-phase (Courtine et al., 2003). A similar phenomenon can be observed at the end of a performance or show, when the applause tends to become in-phase (Neda, 2000). To date, all interpersonal coordination studies have relied on specific time constraints, such as the end of the show in Neda's study, a specific stride cycle (Courtine et al.), or a metronome time scale (Schmidt et al., 1990). No studies have examined what happens when there are no time constraints. In our study we use an improvisational dance task between two dancers as the experimental situation because 1) movements are not planned, 2) moves are made from moment to moment, and 3) improvisation begins at the very beginning of the task. Two main questions arise: what kind of relationship emerges between two dancers involved in an improvisational task, and can we find a common motor signature within each pair of participants?
Method: Six pairs of expert dancers were seated facing each other and asked to move only their forearm in the sagittal plane. We asked them to move freely while abiding by the instruction to "be tuned in to each other". This instruction helped create an improvisational situation. They started and stopped moving when they wanted. Each subject's motor performance was recorded with electrogoniometers, and analyses were performed on the angle between the forearm and the table. We used the cross-wavelet transform method (Issartel et al., 2005) to measure the temporal evolution of the frequencies common to the two dancers as well as the temporal evolution of the continuous relative phase between the experts.
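As an illustration of the kind of analysis described, the following is a minimal sketch of a generic cross-wavelet transform built on a complex Morlet wavelet. It is not the implementation of Issartel et al. (2005). The wavelet choice, the parameter `w0`, the truncation at four standard deviations, and all function names are assumptions made for the example.

```python
import numpy as np


def morlet_cwt(x, fs, freqs, w0=6.0):
    """Continuous wavelet transform of x using a complex Morlet wavelet.

    x must be longer than the longest wavelet, i.e. about 8*w0/(2*pi*min(freqs)) seconds.
    Returns an array of shape (len(freqs), len(x)).
    """
    x = np.asarray(x, dtype=float)
    out = np.empty((len(freqs), x.size), dtype=complex)
    for i, f in enumerate(freqs):
        s = w0 / (2.0 * np.pi * f)              # scale whose centre frequency is f
        half = int(np.ceil(4.0 * s * fs))       # truncate the wavelet at +/- 4 s
        t = np.arange(-half, half + 1) / fs
        psi = np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2) / np.sqrt(s)
        # Correlating x with psi equals convolving x with the reversed conjugate.
        kernel = np.conj(psi[::-1])
        out[i] = np.convolve(x, kernel, mode="same") / fs
    return out


def cross_wavelet(x, y, fs, freqs, w0=6.0):
    """Cross-wavelet transform W_xy = W_x * conj(W_y)."""
    return morlet_cwt(x, fs, freqs, w0) * np.conj(morlet_cwt(y, fs, freqs, w0))


def relative_phase(wxy):
    """Phase of x relative to y, in radians, for every frequency and time sample."""
    return np.angle(wxy)
```

The magnitude of the cross-wavelet transform is large where both signals carry power at the same frequency at the same time, and its angle gives the relative phase there. Values near 0 correspond to in-phase behavior and values near plus or minus pi to anti-phase behavior.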
Results and Discussion: The results revealed a common motor signature within each pair of dancers in the improvisational dance task. Two principal results support this: 1) a preferential frequency emerged between the two dancers, and 2) phase and anti-phase behavior alternated regularly at this preferential frequency. These results show that improvisation, an artistic and symbolic task, does not produce random coordination or frequencies but spontaneous coordination and a preferential frequency (for a given pair), as also observed in the studies mentioned above (with specific time constraints). A third result was also observed: alongside the main shared frequency, each dancer performed secondary frequencies in a particular structure, in which they performed a coordination, stopped it, and started again. This kind of motor behavior may represent expert coordination because of the fine and complex motor control involved.
Conclusion: A situation with no specific constraints, such as a dance improvisation, is not a random or uncertain situation. There is a collective organization that structures improvisation and interpersonal coordination. This finding will help open the door to analyzing artistic tasks in the same way as traditional laboratory tasks.
Abstract: Real-time postural control of full-body 3D articulated figures is attracting renewed interest in the enactive interface community, as the user's body movement is felt to be the natural interaction channel for specifying virtual human postures. One of our goals is to reduce the time needed to evaluate the suitability of complex virtual prototypes for various interaction tasks involving human beings. This requires handling potential collisions between the virtual human body and elements of the 3D environment. We present a first evaluation of a real-time Inverse Kinematics motion capture algorithm that integrates automatic collision management. Our collision management anticipates collisions through the concept of a smooth collision zone surrounding the obstacles. As a consequence, the movement of body parts is damped in the direction of the obstacle while movement in other directions is not altered (Fig. 1).

Inverse Kinematics Motion Capture: Our prior contribution to real-time motion capture of the full user body posture [MBRT99] exploited a set of 14 sensors (one measuring position and orientation, and all others measuring orientation only). The objective was to reflect the performer's posture as closely as possible over time (e.g. for building libraries of recorded movements). The trend now is to reduce the number of sensors to improve user comfort while still recovering a believable posture with the help of IK constraints [PHWLBTM04]. Ultimately, a vision-based approach is expected to free the user from any invasive system [BVUPS05]. The present study exploits captured motion from the CMU database [CMU], on which we evaluate the computing cost of managing anticipated collisions within the Inverse Kinematics motion capture loop (Fig. 2).

Smoothing collisions with Inverse Kinematics: Our approach relies on the concept of observers introduced in [PBLM05]. An observer is a geometric primitive (point, sphere, segment, cylinder...) for which we detect, while the movement is captured, whether it enters a collision zone surrounding the obstacles (Fig. 2c). If so (Fig. 2e), the movement of the observer is altered by declaring it as a traditional IK effector and re-evaluating the solution for the current time step. By construction the computing cost is higher than that of regular IK, but the case study from Fig. 1 demonstrates its feasibility, with an average of 13 to 18 ms per time step including collision management.
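The damping behavior of a collision zone can be illustrated with a minimal sketch for a point observer and a spherical obstacle. This is not the authors' algorithm, which operates inside the IK loop. The linear damping profile, the spherical obstacle, and the function and parameter names are assumptions chosen for the example.

```python
import numpy as np


def damp_toward_obstacle(position, displacement, center, radius, zone):
    """Damp the component of `displacement` that heads toward a spherical obstacle.

    position, displacement, center: 3D vectors; radius: obstacle radius;
    zone: thickness of the collision zone around the obstacle surface.
    """
    position = np.asarray(position, dtype=float)
    displacement = np.asarray(displacement, dtype=float)
    offset = position - np.asarray(center, dtype=float)
    dist = np.linalg.norm(offset)
    gap = dist - radius                      # distance from observer to obstacle surface
    if gap >= zone or dist == 0.0:
        return displacement                  # outside the zone: motion is unchanged
    normal = offset / dist                   # unit vector pointing away from the obstacle
    approach = float(np.dot(displacement, normal))
    if approach >= 0.0:
        return displacement                  # moving away or tangentially: unchanged
    # Assumed linear profile: 1 at the zone boundary, 0 at the obstacle surface.
    factor = max(gap, 0.0) / zone
    return displacement - (1.0 - factor) * approach * normal
```

Only the component of the displacement along the obstacle normal is scaled, so tangential motion passes through unchanged, which corresponds to the behavior described for the smooth collision zone.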
Abstract: In this paper we present preliminary results from applying qualitative motion analysis to a 3D reconstruction of an expressive movement. This joint work qualitatively evaluates a 3D movement reconstructed with Inverse Kinematics (IK) from partial sensor data [PHWLBTM04]. The input data for the reconstruction algorithm were obtained using an Optotrak system with sensors placed on half of the body of a clarinet player. We used the EyesWeb Gesture Processing Library to analyse the video sequences of both the 3D subject, reconstructed either under real-time requirements or off-line, and the human player during the performance. The resulting qualitative features are used to compare the movement of the avatar with that of the real subject, and to assess the impact of the real-time requirement on the resulting movements. The results of the analysis can be used to improve the believability of 3D reconstructed humans, since they focus on the global quality of the produced motion and highlight the areas (or movements) where a qualitative difference between real and virtual players is observed.

Qualitative Analysis and Discussion: The EyesWeb open platform is software developed to support research on multimodal analysis, with a special focus on expressive gesture. As such, it is an appropriate complement to visual inspection for assessing the properties of the IK-reconstructed movements (Fig. 1). The first major consequence of the real-time requirement (only one IK convergence step per 10 Hz movement sample, versus 3 to 5 steps for the off-line result) is a low-pass filtering effect that can be observed both in the middle row of Fig. 1 and in the yellow curve of Fig. 2a. Although this amounts to a slight loss in expressivity, the yellow curve still reproduces the main inflexions of the original motion without introducing discontinuities (a frequent pitfall of inverse approaches). In addition, the fluidity, defined as the ratio of the Quantity of Motion (QoM) of the upper body part to that of the lower body part, shows that the motion of the lower part is not in agreement with that of the upper part, which indicates an artificial motion. Other elements of the analysis have shown that the equilibrium constraint is too strict compared to the musician's motion. These results suggest, first, that the number of IK convergence steps should not be too high, to avoid artificially introducing movement energy, and second, that the IK constraints should be adjusted to improve the movement fluidity.
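The fluidity ratio defined above can be sketched as follows. The Quantity of Motion used here is a simplified stand-in, namely the fraction of silhouette pixels that change between consecutive frames. It is not the EyesWeb implementation, and the function names and the `split_row` parameter separating the upper and lower body are hypothetical.

```python
import numpy as np


def quantity_of_motion(frames):
    """Mean fraction of silhouette pixels that change between consecutive frames.

    frames: boolean array of shape (n_frames, height, width), True inside the silhouette.
    """
    frames = np.asarray(frames, dtype=bool)
    changed = frames[1:] ^ frames[:-1]                 # pixels that switched state
    area = frames[1:].sum(axis=(1, 2))                 # silhouette area per frame
    per_frame = changed.sum(axis=(1, 2)) / np.maximum(area, 1)
    return float(per_frame.mean())


def fluidity(frames, split_row):
    """Ratio of upper-body QoM to lower-body QoM, split at image row `split_row`."""
    frames = np.asarray(frames, dtype=bool)
    upper = quantity_of_motion(frames[:, :split_row])
    lower = quantity_of_motion(frames[:, split_row:])
    return upper / lower if lower > 0 else float("inf")
```

Comparing this ratio between the video of the real player and the rendering of the avatar gives one way to quantify how well the upper and lower body motions agree.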

Abstract: Haptics is a fairly new technology that can improve the way a user interacts with a computer or a machine. The technology is progressing quickly. However, the number of applications that use commercial haptic devices is still very low. One possible reason is that the benefits of using haptic devices are often unclear. Another factor is the high price of haptic devices.
This presentation shows a real application in which a haptic device is not only recommended but required to fulfill the customer requirements at a reasonably low price. This application is the haptic lever used in train simulators.
In this context, haptic devices play an important role, since they can simulate not only the normal behavior of mechanisms that are not physically present in the simulator, but also system failures. This work presents a multipurpose low-cost haptic 1DoF lever developed by CEIT, successfully used in a train simulator commercialized by Lander Training Simulators. A brief review of tested contact models is also presented. Finally, some experimental results are depicted and compared to a cost-effective solution, using a dSPACE.

Abstract: This contribution presents an architecture for the synchronized rendering of auditory and haptic stimuli of impulsive and continuous contact. Haptic rendering is performed using a Phantom Omni and the OpenHaptics Toolkit, while sound rendering is performed using physically-based audio contact models that we have developed and implemented within the real-time platform Pure-Data. The two rendering pipelines exchange information through shared memory, thus ensuring low latency in the communication. The auditory and haptic modes are tightly coupled because they are controlled through the same physical parameters.
The proposed architecture has been used to experimentally assess relative contributions of haptic and auditory information to multisensory (i.e., bimodal) judgments of contact stiffness using a rigid probe.
The auditory stimuli were obtained using a physically-based audio model of impact, in which the colliding objects are described as modal resonators that interact through a non-linear impact force. The impact force can be controlled through a stiffness parameter, which influences the contact time of the impact. Previous studies have already indicated that this parameter has a major influence on the auditory perception of hardness/stiffness.
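The link between stiffness and contact time can be illustrated with a minimal sketch. It assumes a Hunt-Crossley style force, f = k x^alpha + lam x^alpha v, acting between a point mass and a rigid surface. The abstract only states that the force is non-linear, so the force law, the parameter values, and the function name are assumptions, and the modal resonators are omitted.

```python
def contact_time(k, mass=0.01, v_in=1.0, alpha=1.5, lam=0.0, dt=1e-6):
    """Simulate a point mass striking a rigid surface; return the contact duration in seconds.

    k: stiffness, alpha: non-linearity exponent, lam: dissipation weight,
    v_in: impact velocity, dt: integration step (semi-implicit Euler).
    """
    x, v, t = 0.0, v_in, 0.0            # x > 0 means the mass compresses the surface
    while True:
        force = k * x ** alpha + lam * x ** alpha * v if x > 0.0 else 0.0
        v -= force / mass * dt          # the contact force decelerates the mass
        x += v * dt
        t += dt
        if x <= 0.0:                    # the mass has left the surface again
            return t


soft = contact_time(1e6)
hard = contact_time(1e8)
```

With these assumed values the stiffer contact ends several times sooner than the softer one, consistent with the statement that the stiffness parameter influences the contact time.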
The experiment presented here used the following procedure: subjects had to tap on virtual surfaces, and were presented with audio-haptic stimuli (i.e., contact forces and impact sounds), one at a time. The stimuli were synthesized using different levels of haptic and auditory stiffness. Stiffness magnitude was estimated using an absolute magnitude-estimation procedure: subjects were asked to rate the surfaces on an arbitrary numerical scale, based on their perceived stiffness. The results indicate that when the haptic stiffnesses of the surfaces were the same, subjects consistently ranked the surfaces according to the auditory stimuli.
This paper reports on an explorative test session performed at INPG during December 2005. During this session two different implementations of a virtual PebbleBox were explored freely. The purpose of this test was to explore the importance of different parameters in the different setups, as well as to gain an insight into the strengths and weaknesses of the different approaches.
The virtual PebbleBox-ULUND implementation [1] is based on available software tools. The haptic and visual part is based on OpenHaptics from SensAble (http://www.sensable.com/) in combination with OpenDynamicsEngine (http://www.ode.org/). The audio part uses the playback of recorded sound files with Direct3DSound, and is thus able to render spatial (3D) sound feedback. The haptic hardware used was the PHANToM desktop. The virtual world is three dimensional and consists of a box with a number of moving spheres inside.
The Virtual PebbleBox-INPG implementation used the TELLURIS simulation platform. The model was implemented using CORDIS-ANIMA, a particle-based physical modeling system. The virtual world was two-dimensional and consisted of a circular container with a number of moving circular objects inside. The movements of the objects (that is, the non-sounding parts of the model) were computed at 3 kHz. Each of the moving objects could produce a sound, for example after a collision, which was produced by its acoustical deformations. The sounding parts of the model were computed at 30 kHz, and the sound output was monophonic. A 2D graphical representation of the model was rendered at 50 Hz on a 21" CRT display. The haptic device used was an ERGOS [3] 3D stick constrained to a horizontal plane. The simulator provided high-frequency (3 kHz) communication with the haptic device, which allows very precise haptic feedback.
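The relationship between the three update rates can be made concrete with a small sketch. It assumes a single loop driven at the audio rate, with the slower updates triggered every N audio ticks. The actual scheduling in TELLURIS is not described here, so the loop structure and names are purely illustrative.

```python
AUDIO_HZ = 30_000      # sounding parts of the model
PHYSICS_HZ = 3_000     # non-sounding parts and haptic communication
GRAPHICS_HZ = 50       # 2D graphical representation


def count_steps(seconds):
    """Count how many updates of each kind run in `seconds` of simulated time."""
    audio_per_physics = AUDIO_HZ // PHYSICS_HZ      # 10 audio ticks per physics step
    audio_per_frame = AUDIO_HZ // GRAPHICS_HZ       # 600 audio ticks per graphics frame
    counts = {"audio": 0, "physics": 0, "graphics": 0}
    for tick in range(round(seconds * AUDIO_HZ)):
        if tick % audio_per_physics == 0:
            counts["physics"] += 1                  # movement and haptic update
        counts["audio"] += 1                        # acoustical deformation update
        if tick % audio_per_frame == 0:
            counts["graphics"] += 1                 # redraw the 2D view
    return counts
```

With the rates given above, each physics and haptic step spans 10 audio samples, and each graphics frame spans 60 physics steps.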
The virtual worlds were selected to put focus on the dynamic - haptic - audio properties of the environment, since this type of environment presents a true challenge for anyone exploring it without visual feedback.
Thus we were able to explore:

XVR is a development framework for complex VR applications. It has been used at PERCRO for the past 8 years in a variety of projects dealing with real-time graphics and interaction, and it has been continuously updated to accommodate evolving programming needs. XVR was later adopted by other groups as well, and today it offers a wide range of useful and practical functionality for controlling the many aspects of VR programming, including real-time graphics, sound, interaction, and support for the most common VR devices (trackers, displays, haptics and interaction devices). XVR's fundamental design goal is simplicity of use: every new programming construct needs to be simple, flexible and effective. This strict design philosophy has made XVR a platform that accommodates both novices' needs and "professional" programming. This workshop will present the overall framework of XVR technology, showing how it can be used in a range of common situations, including high-quality graphical rendering, real-time physics and network programming.