Research Topics

 

 The project, which aims at establishing links between the physical behaviour of sound sources and the perceptions and evocations caused by the generated sounds, is interdisciplinary by nature and associates four main research areas: analysis, synthesis, perception and cognition. Analysis provides signal processing tools grounded in mathematical concepts and in knowledge of sound perception. Synthesis makes it possible to generate perfectly calibrated, realistic sounds based on physical and/or signal models, giving access to the evaluation of the perceptual importance of each synthesis parameter. The perception area uses classical methods to collect behavioural data from listening tests. Finally, the cognitive neuroscience area studies brain activity through brain imaging techniques such as Evoked Potentials (EP) or functional Magnetic Resonance Imaging (fMRI). These four research areas interact closely in the senSons project and enable the study of the relations between the physics and the perception of sounds.

 

 

Time-frequency masking

 

People: Thibaud Necciari (LMA), Sophie Savel (LMA), Sabine Meunier (LMA), Richard Kronland-Martinet (LMA), Bernhard Laback (ARI), Peter Balazs (ARI), Sølvi Ystad (LMA)

 

The coherency between a mathematical representation of a sound and its auditory perception is of great importance for several applications such as virtual reality, sound synthesis and audio coding. The time-frequency masking project aims at finding links between signal properties and perception in order to propose perceptually relevant time-frequency representations and synthesis models. This project is a collaboration between two research teams at the LMA, APIM (T. Necciari, S. Savel and S. Meunier) and S2M (R. Kronland-Martinet and S. Ystad), and an Austrian research team at the ARI (Acoustics Research Institute) in Vienna (P. Balazs and B. Laback). One PhD student, Thibaud Necciari, works full time on this project.

Psychoacoustic tests

 

As a first approach to time-frequency masking, Gaussian-windowed stimuli with minimal spread in the time-frequency domain were used both as targets and maskers. Their effective duration was 9.6 ms and their ERB (Equivalent Rectangular Bandwidth) was 600 Hz. Four experiments were carried out to characterize the masking caused by these stimuli. First, their absolute threshold was measured at 11 frequencies and compared with the absolute threshold of 300-ms sinusoids. Then, three masking experiments were carried out: frequency masking, temporal masking and time-frequency masking. In the frequency masking experiment, 11 frequency separations between simultaneously presented masker and target were tested, for two masker frequencies (750 Hz and 4000 Hz) and three masker levels (30, 45 and 60 dB SL). In the temporal masking experiment, 5 temporal separations between masker and target were tested, both signals having a frequency of 4 kHz. Finally, in the time-frequency masking experiment, combinations of 8 frequency separations and 5 temporal separations were tested. The results revealed that the masking patterns produced by Gaussian-windowed stimuli are similar to those produced by sinusoids of equal duration. The spectro-temporal properties of the Gaussian-windowed stimuli will make it possible to develop an auditory model coherent with the mathematical properties of time-frequency representations. In addition to the single-masker experiments, multiple-masker experiments with 2 to 4 maskers are currently being carried out at the ARI in Vienna (P. Balazs and B. Laback).
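
To make the stimulus construction concrete, here is a minimal Python sketch of a Gaussian-windowed tone of the kind described above. The envelope definition and the parameter mapping are illustrative assumptions, not the exact ones used in the experiments:

```python
import numpy as np

def gaussian_tone(f0, eff_dur, fs=44100, total_dur=0.05):
    """Sinusoid shaped by a Gaussian envelope.

    eff_dur: effective duration in seconds, taken here as the equivalent
    rectangular duration of the squared envelope (an assumption).
    """
    t = np.arange(int(total_dur * fs)) / fs - total_dur / 2
    # For w(t) = exp(-pi (t/d)^2), the equivalent rectangular duration
    # of w(t)^2 is d / sqrt(2), hence d = eff_dur * sqrt(2).
    d = eff_dur * np.sqrt(2)
    envelope = np.exp(-np.pi * (t / d) ** 2)
    return envelope * np.sin(2 * np.pi * f0 * t)

# A 4-kHz target with a 9.6-ms effective duration, as in the experiments
target = gaussian_tone(4000.0, 9.6e-3)
```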

 

Sound Modelling and Masking

 

This part of the masking project concerns the integration of the masking results in a sound modelling context. Several strategies are envisaged. The general idea is to convolve the so-called "reference atoms" obtained from the psychoacoustic experiments with a suitable signal representation in order to find the masked threshold of the signal in the time-frequency domain. As a first approach, Gabor and wavelet transforms have been chosen as the signal representations to be convolved with the "reference atoms". Once the masked threshold is found, the part of the signal that falls below the threshold will be removed, and a threshold shift will be applied to the masking function. The signal content above the masked threshold will finally be used in the sound model.
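
A minimal sketch of this idea, using an STFT as a stand-in for a general Gabor transform and a hypothetical 2-D masking kernel (in the project, the kernel would be built from the measured "reference atoms"):

```python
import numpy as np
from scipy.signal import stft, istft, fftconvolve

def apply_tf_masking(x, kernel, fs=44100, shift_db=10.0):
    """Keep only the signal content lying above an estimated masked threshold.

    kernel: non-negative 2-D masking pattern (frequency x time); here any
    normalized array, standing in for the psychoacoustic reference atoms.
    """
    _, _, X = stft(x, fs=fs, nperseg=512)
    power = np.abs(X) ** 2
    # Spread each component's power over its time-frequency neighbourhood
    spread = fftconvolve(power, kernel / kernel.sum(), mode="same")
    # Masked threshold = spread masking function lowered by a fixed shift
    threshold_db = 10 * np.log10(spread + 1e-20) - shift_db
    keep = 10 * np.log10(power + 1e-20) > threshold_db
    _, y = istft(X * keep, fs=fs, nperseg=512)
    return y
```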

Related publications:

 

T. Necciari, S. Savel, S. Meunier, S. Ystad, R. Kronland-Martinet, B. Laback & P. Balazs,
Auditory masking using Gaussian-windowed stimuli, Acoustics 2008, Paris, 29 June - 4 July 2008.

 

B. Laback, P. Balazs, G. Toupin, T. Necciari, S. Savel, S. Meunier, S. Ystad & R. Kronland-Martinet,
Additivity of auditory masking using Gaussian-shaped tones, Acoustics 2008, Paris, 29 June - 4 July 2008.

 

T. Necciari, Masquage sonore temps-fréquence. Application à l'analyse-synthèse des sons,
Journées fondatrices du groupe Perception Sonore de la SFA, 18-19 January 2007.

 

 

Signal invariant identification

 

People: Adrien Merer (LMA), Anaïk Olivero (LMA), Richard Kronland-Martinet (LMA), Sølvi Ystad (LMA), Mitsuko Aramaki (INCM), Bruno Torrésani (LATP), Philippe Depalle (McGill)

 

Semiotics of sound objects (Adrien Merer)

The aim of this study is to control sound synthesis algorithms with perceptually relevant parameters. As a first step, monophonic sounds were presented to listeners in a free categorization task where the participants were asked to group sounds as a function of the movements they evoked. Five main categories were identified by the listeners: approach, rise, fall down, pass by and turn. A large number of signal descriptors, several of them taken from the music information retrieval and data mining literature, was then used to analyze the sounds and identify common signal patterns in each category. This study led, for instance, to the conclusion that sounds belonging to the category turn are characterized by their amplitude modulation rate and their rate of frequency modulation. Unfortunately, the parameters highlighted by this study do not necessarily give sufficient clues to identify relevant control parameters for synthesis models. For this reason, a new approach based on Gabor masks is now being explored. Gabor masks are a tool that makes it possible to describe transformations between signals in the time-frequency domain. This approach is part of a newly started PhD (Anaïk Olivero) carried out in collaboration between the LATP and the LMA.

 

Gabor masks (Anaïk Olivero)

Time-frequency analysis produces signal representations which often display relevant information for non-stationary signals. A Gabor multiplier is a linear transformation acting on signals by pointwise multiplication with a fixed Gabor mask in the Gabor transform domain. The Gabor mask can be interpreted as a time-frequency transfer function between the Gabor transforms of a source and a target signal, and can be used to highlight differences between the two signals. We develop methods for estimating a Gabor mask between two signals and generalize this approach in various ways, in particular by estimating a Gabor mask between two banks of close signals, and by combining Gabor multipliers with displacements in the time-frequency domain.
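
As a rough illustration of the estimation idea, assuming an STFT in place of the Gabor transform and a simple regularized pointwise estimate (the actual estimation methods are more elaborate):

```python
import numpy as np
from scipy.signal import stft, istft

def estimate_gabor_mask(source, target, fs=44100, nperseg=512, eps=1e-8):
    """Estimate a time-frequency transfer function (mask) mapping the
    source's TF coefficients onto the target's (signals of equal length)."""
    _, _, S = stft(source, fs=fs, nperseg=nperseg)
    _, _, T = stft(target, fs=fs, nperseg=nperseg)
    # Pointwise least-squares estimate with a small regularization term
    return (T * np.conj(S)) / (np.abs(S) ** 2 + eps)

def apply_gabor_multiplier(x, mask, fs=44100, nperseg=512):
    """Apply the multiplier: pointwise multiplication in the TF domain."""
    _, _, X = stft(x, fs=fs, nperseg=nperseg)
    _, y = istft(mask * X, fs=fs, nperseg=nperseg)
    return y
```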

 

 

Related publications:

 

A. Merer, S. Ystad, R. Kronland-Martinet & M. Aramaki, Semiotics of Sounds Evoking
Motions: Categorization and Acoustic Features, Lecture Notes in Computer Science, Vol. 4969,
Springer-Verlag 2008, ISBN 978-3-540-85034-2, pp. 139-158.

 

A. Merer, M. Aramaki, R. Kronland-Martinet & S. Ystad, Toward synthesis tools using evocation as control parameters,
Acoustics 2008, Paris, 29 June - 4 July 2008.

 

A. Merer, S. Ystad, R. Kronland-Martinet, M. Aramaki, M. Besson & J.-L. Velay,
Perceptual Categorization of Moving Sounds for Synthesis Applications,
International Computer Music Conference (ICMC), Copenhagen, Denmark, 27-31 August 2007.

 

 

Timbre studies

 

People: Mathieu Barthet (LMA), Richard Kronland-Martinet (LMA), Thierry Voinier (LMA), Sølvi Ystad (LMA), Mitsuko Aramaki (INCM), Philippe Depalle (McGill), Kristoffer Jensen (Univ. Aalborg), Loïc Brancheriau (CIRAD-Forêt), Henri Baillères (CIRAD-Forêt)

From performer to listener: an acoustical and perceptual analysis of musical timbre (Mathieu Barthet)

 

Uncovering the acoustical parameters which account for musical expression is of fundamental interest for improving our understanding of musical perception, and finds numerous applications in sound synthesis. Musical interpretation is an act during which the performer translates the composer's score while expressing his/her own feelings. Many studies on musical interpretation deal with the role of timing, dynamics and pitch, but far fewer focus on that of timbre. Timbre nevertheless seems to be one of the cornerstones of musical tones, which is strengthened by the fact that some instruments allow performers to act on timbre during tone production. The goal of this work, whose approach relies on analysis-synthesis, is to better understand the role of timbre in musical interpretation. Dissimilarity judgments of synthetic clarinet tones allowed us to characterize the influence of the control parameters of the instrument (pressure and the player's lip force on the reed) on the generated timbres. Sonological analyses of a large number of clarinet interpretations played by the same expert clarinetist with the same musical intention showed a high consistency of the timbre variations. The nature of these variations is modified when the performer varies his/her interpretive intention. From the perceptual point of view, we showed that the time-varying brightness shaping of the tones of a musical sequence could induce significant increases in listeners' preferences. This work supports the claim that temporal morphological aspects of timbre (e.g. the temporal evolution of brightness) are one of the vectors of musical expressiveness.

 

Timbre and semiotics

 

This approach aims at identifying the signal structures responsible for various evocations caused by sounds and at identifying the neural networks involved in these processes. Several sound categories are investigated in different studies, and the details concerning the brain imaging setups are described in the next section. Although sound synthesis techniques make it possible to imitate almost any sound source, knowledge about the perceptual and cognitive relevance of signal parameters is often lacking. Such knowledge is important for several reasons. Firstly, it makes it possible to simplify synthesis models by simulating only perceptually relevant phenomena, and secondly it gives access to the control and construction of sounds from a semantic description without the need for reference sounds. This is important, for instance, in virtual reality applications where sounds coherent with a visual scene are to be constructed. In addition to the identification of perceptually relevant signal structures, it is also of interest for neuroscientists to analyse the brain processes activated when listening to a sound, in order to understand how the brain assigns a sense to sounds and whether the processes are similar for sounds and for words.

 

In a couple of studies, sounds for which the source cannot be easily identified (i.e. sounds favouring acousmatic listening) are used to link signal structures and evocations. Such a sound corpus draws the listeners' attention towards the intrinsic properties of the sounds and favours access to the non-sounding domain (e.g. motion, types of behaviour, spatial experience, energetic phenomena, psychological tensions, etc.). It should also reduce the role of linguistic mediation, avoiding that the words associated with a sound merely reflect the sound-producing source (the sound of a barking dog would, for instance, be associated with the word dog). In one case, a free categorization test was carried out on acousmatic sounds to identify categories of evoked movements. The categories were further analyzed and the similarities between the signal structures of sounds within the same category were identified. In another study, subjects were asked to write down words associated with acousmatic sounds. The most frequently evoked words were then used in a priming test where associated and non-associated word-sound pairs were presented to listeners, who were asked to decide whether or not the pairs were associated. In this case the brain activity was measured with the Evoked Potential method. The acoustical analysis showed that signals with a high spectral density that vary little with time tend to be associated with uncomfortable situations (shivering, cold, pain, etc.), while sustained signals with varying frequencies most generally evoked movements (bouncing, rising, rolling, rotating, etc.).

 

 

Timbre and material properties (coll. CIRAD)

 

Xylophone sounds produced by striking wooden bars with a mallet are strongly influenced by the mechanical properties of the wood species chosen by the xylophone maker. In this project, we addressed the relationship between the sound quality of impacted wooden bars, based on the timbre attribute, and the physical parameters characterizing wood species. For this, a methodology is proposed that associates an analysis-synthesis process and a perceptual classification test. Sounds generated by impacting 59 wooden bars of different species but with the same geometry were recorded and classified by a renowned instrument maker. The sounds were then digitally processed and adjusted to the same pitch before being classified once again. The processing is based on a physical model ensuring that the main characteristics of the wood are preserved during the sound transformation. Statistical analysis of both classifications showed the influence of pitch on the xylophone maker's judgement and pointed out the importance of two timbre descriptors: the frequency-dependent damping and the spectral bandwidth. These descriptors are linked with physical and anatomical characteristics of wood species, providing new clues for the choice of musically attractive wood species.
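
The frequency-dependent damping descriptor can be pictured with a small analysis sketch: band-pass the impact sound and fit an exponential decay to the band envelope. This is a generic estimate under assumptions of our own, not the exact procedure of the study:

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def band_damping(x, lo, hi, fs=44100):
    """Estimate the damping factor alpha (in 1/s) of an impact sound in the
    band [lo, hi] Hz by a linear fit to the log of its Hilbert envelope."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    env = np.abs(hilbert(sosfilt(sos, x)))
    t = np.arange(len(env)) / fs
    # Fit log(env) ~ log(A) - alpha * t, ignoring near-silent samples
    keep = env > env.max() * 1e-3
    slope, _ = np.polyfit(t[keep], np.log(env[keep]), 1)
    return -slope  # larger alpha = faster decay in this band
```

Evaluating such an estimate in several bands gives the damping-versus-frequency profile that distinguishes wood species.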

 

Timbre synthesizer

 

The timbre synthesizer is an ongoing project aiming at developing synthesizers that offer more intuitive control of sounds through perceptually relevant parameters. The timbre studies mentioned above aim at extracting such parameters for specific sound categories; so far, various material categories have been studied, as well as categories evoking movements. As a first approach, we develop the design and control of a real-time synthesis model dedicated to percussive sounds. We mainly address the simulation of "sound colorations" (i.e. metallic, wooden or crystal-clear sounds) independently of the mechanical properties of the sound source, and their control using a few perceptually relevant descriptors. The synthesis model takes into account both physical and perceptual considerations, leading to a combination of additive and subtractive synthesis processes. The manipulation of the synthesis parameters is not intuitive, since the relationship between the parameters and the resulting sound is intricate, especially when dealing with material control. Therefore, we propose a timbre control space based on the results of listening tests aiming at better understanding how the perception of material in terms of sound colorations can be linked to the synthesis parameters. This timbre control space is mainly based on damping and roughness and allows for an intuitive navigation across different material categories.
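
A toy version of the additive part of such a synthesizer, with a frequency-dependent damping law as the main perceptual control. The damping law and all values are illustrative assumptions, not the model's actual calibration:

```python
import numpy as np

def percussive_tone(freqs, amps, fs=44100, dur=2.0, a_g=1.0, a_r=1e-4):
    """Additive synthesis of an impact sound with frequency-dependent damping.

    Each partial at frequency f decays as exp(-alpha(f) * t) with
    alpha(f) = a_g + a_r * f: raising a_r damps high partials faster
    (wood-like colorations), lowering it gives metal-like ones.
    """
    t = np.arange(int(dur * fs)) / fs
    out = np.zeros_like(t)
    for f, a in zip(freqs, amps):
        out += a * np.exp(-(a_g + a_r * f) * t) * np.sin(2 * np.pi * f * t)
    return out / np.max(np.abs(out))

# Slightly inharmonic partials, as for a struck bar (hypothetical values)
sound = percussive_tone([440, 1205, 2370, 3900], [1.0, 0.5, 0.3, 0.2])
```

Roughness is classically linked to the beating of neighbouring partials; adding close partials to the list above is one simple way to increase it.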

 

Related publications:

 

M. Barthet, P. Depalle, R. Kronland-Martinet & S. Ystad, From performer to listener: an analysis
of timbre variations, submitted to Music Perception, September 2008.

 

R. Kronland-Martinet & T. Voinier, Real-Time Perceptual Simulation of Moving Sources:
Application to the Leslie Cabinet and 3D Sound Immersion, EURASIP Journal on Audio, Speech,
and Music Processing, Volume 2008, article ID 849696, 10 pages, doi:10.1155/2008/849696,
July 2008.

 

M. Barthet, R. Kronland-Martinet & S. Ystad, Improving Musical Expressiveness by Time-
Varying Brightness Shaping, Lecture Notes in Computer Science, Vol. 4969, Springer-Verlag
2008, ISBN 978-3-540-85034-2, pp. 313-336.

 

M. Aramaki, H. Baillères, L. Brancheriau, R. Kronland-Martinet & S. Ystad, Sound
Quality Assessment of Wood for Xylophone Bars, Journal of the Acoustical Society of
America, Vol. 121 No. 4, April 2007, pp. 2407-2420.

 

M. Aramaki, R. Kronland-Martinet, T. Voinier & S. Ystad,
A Percussive Sound Synthesizer Based on Physical and Perceptual Attributes,
Computer Music Journal (MIT Press), 30:2, pp. 34-43, Summer 2006.

 

M. Aramaki, L. Brancheriau, R. Kronland-Martinet & S. Ystad, Perception of impacted materials:
sound retrieval and synthesis control perspectives, Computer Music Modelling and Retrieval
(CMMR) 2008, Copenhagen, Denmark, 19-23 May 2008, pp. 1-8.

 

M. Barthet, P. Guillemain, R. Kronland-Martinet & S. Ystad, Exploration of timbre variations
in music performance, Acoustics 2008, Paris, 29 June - 4 July 2008.

 

M. Barthet, P. Depalle, R. Kronland-Martinet & S. Ystad, The Effect of Timbre in
Clarinet Interpretation, International Computer Music Conference (ICMC), Copenhagen,
Denmark, 27-31 August 2007.

 

M. Aramaki, R. Kronland-Martinet, T. Voinier & S. Ystad, Timbre Control of a Real-Time
Percussive Synthesizer, International Congress on Acoustics, Madrid, 2-7 September 2007.

 

M. Aramaki, L. Brancheriau, H. Baillères, R. Kronland-Martinet & S. Ystad,
Perceptual Classification of Wooden Bars, Thirteenth International Conference on
Sound and Vibration, Vienna, Austria, 2-6 July 2006.

 

M. Barthet, R. Kronland-Martinet & S. Ystad, Does Timbre Follow Systematic
Variations in Music Performance?, Conference on Digital Audio Effects
(DAFx), Montreal, Canada, 18-20 September 2006.

 

M. Barthet, S. Ystad & R. Kronland-Martinet, Evaluation perceptive d'une interprétation
musicale en fonction de trois paramètres d'expression : le Rythme, l'Intensité et le Timbre,
Journées fondatrices du groupe Perception Sonore de la SFA, 18-19 January 2007.

 

M. Aramaki & M. Besson, Approche électrophysiologique de la sémiotique des
sons, Troisièmes Rencontres Interartistiques de l'Observatoire Musical Français, March 2006.

 

Brain Imaging (Evoked Potential) Studies

 

People: Mitsuko Aramaki (INCM), Mireille Besson (INCM), Daniele Schön (INCM), Céline Marie (INCM), Richard Kronland-Martinet (LMA), Sølvi Ystad (LMA)

EEG study of conceptual priming between words and non-verbal sounds

 

The aim of this study is to investigate the way we derive meaning from sounds and to identify the neural networks involved in these processes. To this aim, an experimental protocol based on the priming paradigm and recordings of Event-Related Potentials (ERPs) has been applied, in which one stimulus (word or sound) creates a context that influences the processing of a following target stimulus (word or sound). Earlier studies have used this protocol to examine semantic processing in language, and it has been shown that the amplitude of a negative component of the ERPs peaking around 400 ms post-word onset, the N400 component, is larger for final words unrelated to the preceding word than for related words. Lately, researchers have become interested in whether an N400 component can be elicited and modulated by conceptual relations in a nonlinguistic context, and this is also one of the goals of this study. In the auditory domain, the classical priming paradigm has been used to study conceptual processing of environmental sounds. However, the results of these experiments may reflect linguistically mediated effects, since the identification of the sound-producing source might activate a verbal label (if, for instance, the listener hears the noise of a drill, the verbal label drill might be activated). The purpose of the present study was to reduce, as far as possible, the role of linguistic mediation. Hence, sounds were either recorded or synthesized so that it was impossible to identify the source that produced them. Related and unrelated sound-word pairs were presented in Experiment 1, and the order of presentation was reversed in Experiment 2 (word-sound). Results showed that, in both experiments, participants were sensitive to the conceptual relation between the two items: they were able to categorize items as related or unrelated with good accuracy. A relatedness effect developed in the ERPs between 250 and 600 ms, although with a slightly different scalp topography for word and sound targets. The results were used to propose a tentative model of the semiotics of sounds.

 

Categorization of impact sounds

 

In this study, we investigated the categorization of sounds from impacted materials. The aim was to examine the neural bases of the processes involved in the categorization of impact sounds by measuring the electrical brain activity (Event-Related Potentials), and to relate these results to the acoustical parameters of the sounds in order to point out perceptually relevant parameters for a realistic synthesis of these sounds. Three different materials (Wood, Glass and Steel) were studied. For that purpose, natural sounds were recorded, analyzed and resynthesized, and a sound morphing process was applied to construct hybrid sounds simulating progressive transitions between material categories. Participants were asked to categorize the synthesized sounds as Steel, Wood or Glass. Behavioral data allowed for a definition of typical and ambiguous sounds in each category. They revealed that subjects most often categorized sounds as Steel, and that reaction times were longer for ambiguous than for typical sounds. Concomitantly, electrophysiological data revealed that the processing of Steel sounds differed significantly from that of Glass and Wood sounds as early as 150 ms. These findings demonstrate that both temporal and spectral properties of the signal (related to damping and tonal consonance) were used to categorize sounds from different materials.
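
The morphing can be pictured as interpolation of the resynthesis parameters between two material exemplars; a minimal sketch with made-up values (the actual parameter set and interpolation scheme used in the study may differ):

```python
import numpy as np

def morph_params(params_a, params_b, position):
    """Interpolate synthesis parameters (e.g. per-partial damping factors)
    between two material settings; position=0 gives material A, 1 gives B,
    intermediate values yield the ambiguous hybrid sounds of the continuum."""
    a = np.asarray(params_a, dtype=float)
    b = np.asarray(params_b, dtype=float)
    return (1.0 - position) * a + position * b

# Hypothetical per-partial damping factors for Wood and Metal
wood = [60.0, 150.0, 300.0]
metal = [3.0, 5.0, 8.0]
hybrid = morph_params(wood, metal, 0.5)  # middle of the Wood-Metal continuum
```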

 

Language and sounds

 

Two experiments were designed to examine the neural bases of sound categorization and of conceptual priming. Percentages of correct responses, Reaction Times (RTs) and Event-Related brain Potentials (ERPs) were analyzed. In Experiment 1, impact sounds were generated using analysis-synthesis techniques to reproduce realistic synthetic sounds and to build sound continua between material categories (Wood, Metal and Glass). Results showed that ambiguous sounds (intermediate positions on the continua) were associated with longer RTs and a larger negativity in the ERPs than typical sounds (extreme positions on the continua). Priming effects for ambiguous and typical sounds and for linguistic stimuli were then compared in Experiment 2 using a same-different categorization task. Results showed similarities between conceptual and semantic priming effects but also revealed differences in time course and scalp distribution.

 

In addition to comparisons between sound categories and words, the perception of modifications in meter, rhythm, semantics and harmony in language and music has been investigated. A dedicated time-stretching algorithm was developed to work with natural speech. In the language part, French sentences ending with tri-syllabic congruous or incongruous words, metrically modified or not, were constructed. In the music part, short melodies made of triplets, rhythmically and/or harmonically modified, were built. These stimuli were presented to a group of listeners who were asked to focus their attention either on meter/rhythm or on semantics/harmony and to judge whether or not the sentences/melodies were acceptable. Language ERP analyses indicate that semantically incongruous words are processed independently of the subjects' attention, thus arguing for automatic semantic processing. In addition, metric incongruities seem to influence semantic processing. Music ERP analyses show that rhythmic incongruities are processed independently of attention, revealing automatic processing of rhythm in music.
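
For orientation, a generic phase-vocoder stretch, as provided by standard audio libraries, can stand in for the dedicated speech algorithm (which treats natural speech, e.g. its transients, with more care); a minimal sketch:

```python
import librosa

# Load a recorded sentence (the file path is illustrative)
y, sr = librosa.load("sentence.wav", sr=None)

# Phase-vocoder time stretch: rate < 1 lengthens the signal,
# e.g. rate = 0.8 makes it 25 % longer.
stretched = librosa.effects.time_stretch(y, rate=0.8)
```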

 

Related publications:

 

D. Schön, S. Ystad, R. Kronland-Martinet & M. Besson, The evocative power of sounds: EEG
study of conceptual priming between words and nonverbal sounds, submitted to the Journal of Cognitive
Neuroscience, September 2008.

 

M. Aramaki, C. Marie, R. Kronland-Martinet, S. Ystad & M. Besson, Sound categorization,
conceptual and semantic priming: behavioral and electrophysiological approaches, submitted to the
Journal of Cognitive Neuroscience, June 2008.

 

S. Ystad, C. Magne, S. Farner, G. Pallone, M. Aramaki, M. Besson & R. Kronland-Martinet,
Electrophysiological Study of Algorithmically Processed Metric/Rhythmic Variations in Language and Music,
EURASIP Journal on Audio, Speech, & Music Processing, vol. 2007,
article ID 30194, 13 pages, 2007, doi:10.1155/2007/30194.

 

S. Ystad, R. Kronland-Martinet, D. Schön & M. Besson, Vers une approche acoustique et cognitive
de la sémiotique des objets sonores, in Les Unités Sémiotiques Temporelles (UST),
Nouvel outil d'analyse musicale, Théories et Applications, Collection Musique/Sciences,
éditions Delatour 2007, pp. 73-83.

 

C. Magne, C. Astésano, M. Aramaki, S. Ystad, R. Kronland-Martinet & M. Besson,
Influence of Syllabic Lengthening on Semantic Processing in Spoken French: Behavioural and
Electrophysiological Evidence, Cerebral Cortex, doi:10.1093/cercor/bhl174, Oxford University
Press, January 2007.

 

D. Schön, S. Ystad, M. Besson & R. Kronland-Martinet, An acoustical and
cognitive approach to the semiotics of sound objects, Bologna, Italy, August 2006.

 

M. Aramaki, M. Besson, R. Kronland-Martinet & S. Ystad, Catégorisation sonore des matériaux frappés :
Approches perceptive et cognitive, Journées fondatrices du groupe
Perception Sonore de la SFA, 18-19 January 2007.

 

S. Ystad, R. Kronland-Martinet, D. Schön & M. Besson, Vers une approche acoustique et
cognitive de la sémiotique des objets sonores, Journées fondatrices du groupe Perception Sonore
de la SFA, 18-19 January 2007.

 

 

Industry Collaborations

 

Peugeot-Citroën: Perception of car door and engine noise

People: Vincent Roussarie (PSA), Marie-Céline Bezat (PSA), Jean-François Sciabica (PSA), Richard Kronland-Martinet (LMA), Sølvi Ystad (LMA)

 

Two PhD subjects are linked to our Peugeot-Citroën collaboration. Marie-Céline Bezat did her PhD on car-door noise; her thesis, defended in December 2007, is entitled Perception des bruits d'impact. Application au bruit de fermeture de porte automobile (Perception of impact noises. Application to car door closing noise), and her manuscript can be downloaded here.

 

Perception of transient noises applied to car-door closing sounds.

 

The industrial problem of door closing sounds consists in translating drivers' expectations into specifications for the mechanical components of the door. A client who manipulates a door in a showroom perceives from the closing sound that the door is indeed closed; the sound also induces complex evocations of quality and solidity to which the manufacturer is particularly attentive. The translation of these complex evocations into technical rules is not an immediate process. A first step consists in characterising what is perceived by means of criteria extracted from the acoustical signal; this is then completed by a characterisation of the mechanical sources and acoustic transfers in order to establish the desired technical specifications. This thesis studied the first step of the process, one that raises fundamental questions on impact sound perception: understanding what we perceive from an action-related impact sound, and extracting the underlying criteria from the acoustic signal. In situ experimentation was first carried out in order to observe the qualitative characteristics of the uses and the impressions derived from a natural situation, thus identifying the pertinent environmental factors. This phase was completed with a quantitative study of the influence of environmental factors, such as other door sounds, motion perception and the a priori image of the vehicle. After observing the links between in situ and laboratory perception, door closing sounds were finely decomposed into perceptual properties: analytical properties (obtained through sensory analysis), natural properties (linked to the perception of sources and events) and evocations. The acoustic characterisation of the sound is then handled by means of an analysis-synthesis model which aims not at reproducing an exact replica of the door closing sound, but at synthesizing sounds that preserve its perceptual properties with a reduced number of signal parameters. The model decomposes the sound into several independent impact sources, each impact being modelled by a set of gains and damping factors in frequency bands. The model is specifically calibrated to reproduce the analytical properties that are most directly linked to the acoustic signal. The controlled sounds are then used to observe the effects of the acoustic parameters on the perceptual properties, and to propose underlying acoustic criteria.
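
The band-wise gain/damping structure can be sketched as follows: a single idealized impact excitation is split into frequency bands, each given its own gain and exponential decay. Band edges, gains and damping values are invented for illustration; the real model works on several impact sources with calibrated parameters:

```python
import numpy as np
from scipy.signal import butter, sosfilt

def door_impact(fs=44100, dur=0.3,
                bands=((100, 300), (300, 1000), (1000, 4000)),
                gains=(1.0, 0.6, 0.3), dampings=(20.0, 40.0, 80.0)):
    """One impact modelled by per-band gains and damping factors."""
    n = int(dur * fs)
    t = np.arange(n) / fs
    excitation = np.zeros(n)
    excitation[0] = 1.0  # idealized impact (unit impulse)
    out = np.zeros(n)
    for (lo, hi), g, d in zip(bands, gains, dampings):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        out += g * np.exp(-d * t) * sosfilt(sos, excitation)
    return out

# A full closure sound would sum several such impacts with different onsets.
```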

 

France Télécom: Spatialized sound synthesis

People: Charles Verron (FT), Gregory Pallone (FT), Mitsuko Aramaki (INCM), Richard Kronland-Martinet (LMA)

 

In virtual auditory environments, sound generation is typically based on a two-stage approach: synthesizing a monophonic signal, implicitly equivalent to a point source, and simulating the acoustic space. The directivity, spatial distribution and position of the source can be simulated by signal processing applied to the monophonic sound. A one-stage synthesis/spatialization approach, taking into account both timbre and spatial attributes of the source as low-level parameters, would achieve better computational efficiency, essential for real-time audio synthesis in interactive environments. Such an approach involves a careful examination of sound synthesis and spatialization techniques to reveal how they can be connected. This work concentrates on the sinusoidal sound model and 3D positional audio rendering methods. We present a real-time algorithm that combines inverse Fast Fourier Transform (FFT-1) synthesis and directional encoding to generate sounds whose sinusoidal components can be independently positioned in space. In addition to the traditional frequency-amplitude-phase parameter set, the positions of the partials are used to drive the synthesis engine. Audio rendering can be achieved on a multispeaker setup or binaurally over headphones, depending on the available reproduction system.
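
A minimal sketch of per-partial positioning, using simple stereo constant-power panning as a stand-in for the directional encoding (the real algorithm uses FFT-1 synthesis and targets multispeaker or binaural rendering):

```python
import numpy as np

def spatialized_additive(freqs, amps, azimuths, fs=44100, dur=1.0):
    """Additive synthesis where each partial carries its own position.

    azimuths: per-partial angles in degrees, -45 (full left) to +45
    (full right); a hypothetical convention for this sketch.
    """
    t = np.arange(int(dur * fs)) / fs
    left = np.zeros_like(t)
    right = np.zeros_like(t)
    for f, a, az in zip(freqs, amps, azimuths):
        partial = a * np.sin(2 * np.pi * f * t)
        theta = (az + 45.0) * np.pi / 180.0  # map [-45, 45] deg to [0, 90]
        left += np.cos(theta) * partial      # constant-power pan law
        right += np.sin(theta) * partial
    return np.stack([left, right])

# Three partials spread across the stereo field (illustrative values)
stereo = spatialized_additive([400, 800, 1200], [1.0, 0.5, 0.3], [-30, 0, 30])
```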

 

Related publications:

 

J.-F. Sciabica, F. Richard & V. Roussarie, Towards a hearing threshold prediction
model in car noise, Acoustics 2008, Paris, 29 June - 4 July 2008.

 

M.-C. Bezat, V. Roussarie, R. Kronland-Martinet & S. Ystad, Relations between acoustic
parameters and perceptual properties: an approach by regression trees applied to car door
closure sounds, Acoustics 2008, Paris, 29 June - 4 July 2008.

 

M.-C. Bezat, R. Kronland-Martinet, V. Roussarie, T. Voinier & S. Ystad, Car Door
Closure Sounds: Characterization of Perceptual Properties Through Analysis-Synthesis
Approach, 19th International Congress on Acoustics, Madrid, 2-7 September 2007.

 

M.-C. Bezat, V. Roussarie, R. Kronland-Martinet, S. Ystad &
S. McAdams, Perceptual Analyses of Action-Related Impact Sounds, Euronoise,
Tampere, Finland, 30 May - 1 June 2006.

 

M.-C. Bezat, V. Roussarie, R. Kronland-Martinet, S. Ystad & S. McAdams, Qualification
perceptive des bruits d'impact. Application au bruit de fermeture de porte, Journées fondatrices
du groupe Perception Sonore de la SFA, 18-19 January 2007.

 

 

C. Verron, M. Aramaki, R. Kronland-Martinet & G. Pallone, Spatialized additive synthesis,
Acoustics 2008, Paris, 29 June - 4 July 2008.

C. Verron, M. Aramaki, R. Kronland-Martinet & G. Pallone, Spatialized additive synthesis
of environmental sounds, 125th Audio Engineering Society convention, San Francisco, USA,
2-5 October 2008.

 

Program and abstracts, senSons meetings  


First senSons meeting, May 5th 2006

Second senSons meeting, January 28-29 2008

 

International conferences co-organized by senSons and LMA

 

CMMR2007 - Sense of Sounds - jointly organized with ICMC’07

 

CMMR2008 - Genesis of Meaning - jointly organized with NTSMB

 

CMMR2009 - Auditory Display - jointly organized with ICAD’09