ICSA 2017 | 4th International Conference on Spatial Audio
September 7th to 10th, 2017, Graz, Austria

Fuoco Spiral / Finalist in Category 3: Music Recording

Fuoco Spiral is a contribution by Colin Lardier, Noé Faure, and Samuel Débias (France, CNSMDP).


Original Documentation:

Contribution to Europe’s First Student 3D Audio Production Competition in Ambisonics (ICSA 2017)

The present work was realized by three students (Noé Faure, Samuel Débias and Colin Lardier) of the « Advanced music production » program of the Paris Music Conservatoire [1]. As part of our spatial audio lessons, we worked on a recording/mixing project in order to assess different kinds of spatial audio paradigms. Four student groups were established, each one focusing on a given technology; our group was in charge of the 3D Ambisonics mix, with the aim of getting familiar with this technique in a practical production context.

Video is becoming increasingly important in music production and broadcasting; we therefore decided to include live video in this project, even though our training program is devoted to audio only. Moreover, live-video recording and mixing raises a number of artistic concerns and technical challenges, especially when associated with spatial audio. In this regard, we chose to record a jazz quartet (saxophone, piano, double bass and drums), playing different short improvised sequences, in collaboration with a dancer. The piece presented here is a single sequence shot which shows the dancer from a first-person perspective, moving between the musicians. The overall piece is eight minutes long, of which the last four minutes are submitted to this competition.


Recording session

Being in charge of the 3D Ambisonics recording, we decided to use the mh acoustics Eigenmike [2] as the main microphone system. One objective was to investigate the possibilities (and limitations) of the only commercially available microphone that allows recording Higher-Order Ambisonics (up to 4th order). Also, Colin (one student of our group) was working on the design of an encoder for the Eigenmike as part of his recent internship at Ircam. It was thus a good opportunity to test this encoder in real recording conditions and to finalize its last design details.

The recording session took place in the Maurice Fleuret hall at the Paris Music Conservatoire. The acoustics of this hall can be varied thanks to movable curtains. We used the least reverberant configuration and further installed some acoustic damping panels, because the reverberation was too long, especially for the drums. In the end, the reverberation time was between 1 and 1.5 seconds.

Two computers were used to record all the microphones: one was used only for the Eigenmike (with decoding modules for monitoring in head-tracked binaural), and the second was dedicated to the other systems and the spot microphones (monitored in 5.1).

We positioned the quartet in a square configuration, the musicians facing each other. This was the position the musicians were used to rehearsing in, and we believed it was the optimal configuration for understanding their music. Moreover, we had to make a compromise between the optimal musicians’ placement for the video shooting and for the audio recording. For the audio recording, we wanted to have the double bass and the piano closer to the Eigenmike. However, this was not possible, as the video shooting required leaving some space in the middle of the quartet and keeping a fixed distance between the musicians. As we believed that the (spatial) listening experience could benefit from a neat video realisation, we agreed to place the musicians in a configuration that is non-ideal for audio, knowing that we could compensate for it during the audio mixing process. This resulted in a suboptimal balance between musicians in the Eigenmike recording: the double bass was not very readable, and the direct-to-reverberant ratio differed between the musicians.

We used 18 spot mikes on the different instruments. As this was our first experience in Ambisonics recording, we focused on the Eigenmike positioning and decided to place the spot mikes in our « usual » way for stereo production.


Ambisonics mix

The mixing session took place in Ircam’s Studio 1, equipped with a 4th-order Ambisonics loudspeaker setup (a hemisphere of 24 loudspeakers). A first version of the piece was completed there. This version was later adapted to the Ligeti Hall Ambisonics system (29 loudspeakers, up to 5th-order Ambisonics).


Mixing environment

Two main software tools were used during the mix. The first was Reaper, which we used as a « tape machine » to play back the recorded audio files and the associated automation tracks, and to bounce the final Ambisonics mix. The Reaper DAW delivered audio signals (via JackPilot) and automation (via OSC) to the second tool, Panoramix, a spatialization engine developed by Ircam [3]. It is designed like a mixing console and makes it possible to mix, reverberate and spatialize heterogeneous sound materials (spot microphones, Ambisonics recordings, etc.) in one or multiple formats (in this case, HOA), thanks to a flexible multichannel bus architecture. The OSC communication between the two applications was achieved via the ToscA plugin [4].
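OSC messages of this kind are simple to construct. As a minimal stdlib sketch of what travels between the two applications (the address pattern and port below are purely illustrative, not ToscA’s actual naming scheme):

```python
import socket
import struct

def osc_pad(b: bytes) -> bytes:
    """Pad a byte string with NULs to a multiple of 4, as OSC requires
    (a string already on a 4-byte boundary still gets four NULs)."""
    return b + b"\x00" * (4 - len(b) % 4)

def osc_message(address: str, value: float) -> bytes:
    """Encode a single-float OSC message: padded address, ',f' type-tag
    string, then the value as a big-endian 32-bit float."""
    return (osc_pad(address.encode("ascii"))
            + osc_pad(b",f")
            + struct.pack(">f", value))

# Hypothetical automation parameter: azimuth of track 1, in degrees.
msg = osc_message("/track/1/azimuth", 45.0)

# Fire-and-forget UDP send to the spatialization engine.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(msg, ("127.0.0.1", 4002))
sock.close()
```

A DAW automation lane driving such messages once per block is all that is needed to remote-control a spatializer parameter.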

The picture below shows the mixing interface of Panoramix. The mono tracks are on the left side, with the usual mixing effects (EQ, dynamics, reverb send…). The Eigenmike track and the Ambisonics bus are visible in the middle (large tracks). At the bottom right, there is a simple geometrical interface that allows positioning the desired elements in space.

The choice of Jack to route the audio signals from Reaper to Panoramix can be criticized, because Jack is not the most stable and professional solution. It would have been better to use an extra audio interface (such as an RME MADIface) or a second computer, but Jack has the benefit of not requiring any extra device. Thus, after the mixing session, we could still work on the mix on any computer (with enough CPU).


Mixing process

The artistic motivation of this mix was to create an immersive soundscape, and to emphasize the fusion of visual and sonic layers. We first made a « static » mix (without sound source movements), by combining the spot microphones layer with the Eigenmike layer, and adding artificial reverberation in order to have a consistent acoustic space (although most of the reverberation is provided by the Eigenmike recording).

Due to the previously mentioned recording constraints, we had to boost the spot mikes of the double bass; this tends to generate a very point-like impression compared to the other instruments. To mitigate this, we applied a so-called « blur » effect to these spot microphones. This Ambisonics blur effect was achieved by reducing the order used, resulting in a wider perceived sound source [5-6].
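One simple way to picture this order reduction (a sketch of the general idea, not Panoramix’s exact implementation) is a per-channel gain on the HOA stream that fades out the higher orders; lowering the effective order widens the perceived source:

```python
import math

def acn_order(ch: int) -> int:
    """Ambisonics order l of channel ch in ACN ordering (ch = l*(l+1) + m)."""
    return math.isqrt(ch)

def blur_gains(num_channels: int, effective_order: float) -> list[float]:
    """Per-channel gains that mute orders above `effective_order`.
    Orders at or below floor(effective_order) pass unchanged, the next
    order is cross-faded, and higher orders are silenced."""
    gains = []
    for ch in range(num_channels):
        l = acn_order(ch)
        g = max(0.0, min(1.0, effective_order - l + 1.0))
        gains.append(g)
    return gains

# A 4th-order HOA stream has (4+1)^2 = 25 channels. With an effective
# order of 1.5, orders 0-1 pass, order 2 is halved, orders 3-4 are muted.
print(blur_gains(25, 1.5)[:9])  # → [1.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.5]
```

Multiplying each HOA channel by its gain before decoding yields the blur; at full effective order the stream is untouched.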

The second stage of the mixing process consisted of linking the sound sources’ positions to the movements of the video camera. We applied rotations to the sound field according to the camera’s viewpoint. This is one of the great advantages of mixing in Ambisonics, as such rotations can be easily and efficiently applied to the encoded stream, affecting the sound scene as a whole [6]. In the process of creating a fusion between the Ambisonics mix and the video, we wanted the auditory elements to be located where the instruments are seen on the screen. However, the perceived visual location greatly depends on the screen width (and on the viewer’s distance from the screen). In order to create a “portable” mix, robust to different playback conditions, we chose to place an instrument closer to the centre (of the image) whenever it is visible, even though this may not match its visual location exactly. We also wanted to enhance the sources’ presence in relation to the visual cues; for that, we used another Ambisonics effect, referred to as “focus” (in Panoramix) or directional loudness [6]. This effect makes it possible to emphasize certain areas of the sound field by simulating virtual microphones with an adjustable directivity pattern. The pictures below show the “focus” user interface with varying selectivity settings.
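The rotation itself is a small linear operation on the encoded channels. For first order (W, Y, Z, X in ACN ordering) a yaw rotation is just a 2D rotation mixing X and Y, as in this minimal sketch (higher orders need larger rotation matrices, but the principle is the same):

```python
import math

def rotate_foa_yaw(w, y, z, x, theta):
    """Rotate a first-order Ambisonics frame by `theta` radians about the
    vertical axis (ACN channel order W, Y, Z, X). W and Z are unchanged;
    X and Y mix as a plain 2D rotation, turning the whole scene at once."""
    c, s = math.cos(theta), math.sin(theta)
    return w, x * s + y * c, z, x * c - y * s

# A plane wave from the front (azimuth 0) encodes as W=1, Y=0, Z=0, X=1.
# Rotating the scene by +90 degrees moves it to azimuth 90: X→0, Y→1.
w, y, z, x = rotate_foa_yaw(1.0, 0.0, 0.0, 1.0, math.pi / 2)
print(round(x, 6), round(y, 6))  # → 0.0 1.0
```

Because the same matrix applies to every sample, following the camera only requires automating a single angle parameter rather than re-panning each source.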

We used automation on this effect to steer the focusing direction towards the instrument displayed on the video. This audio FX not only affects the perceptible level of the instrument, but also increases its direct-to-reverberant ratio. Depending on the simulated directivity pattern, the instrument in front has more level and direct sound than those on the sides, and those on the sides more than those in the back. We used a rather low selectivity index (similar to the leftmost picture) in order to keep a certain amount of direct sound for the sources in the back. Interestingly, this FX creates a sort of “dynamic mix”, as if there were automation on the tracks’ levels and reverb sends. This vividly animates the musical and visual narration and helps the listener follow the camera’s movements.
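The role of the selectivity index can be pictured with a simple directional gain curve (an illustrative cardioid-like pattern, not Panoramix’s actual weighting function): low selectivity leaves rear sources audible, high selectivity attenuates them steeply.

```python
import math

def focus_gain(source_az, focus_az, selectivity):
    """Directional gain toward `focus_az` (angles in radians): a
    cardioid-like pattern raised to a selectivity exponent.
    selectivity=0 is omnidirectional (gain 1 everywhere); larger values
    attenuate off-axis sources more and more steeply."""
    delta = source_az - focus_az
    return ((1.0 + math.cos(delta)) / 2.0) ** selectivity

# Compare a low and a high selectivity index for sources at the front,
# at the side (90 deg) and towards the rear (135 deg) of the focus axis.
for p in (0.5, 2.0):
    front = focus_gain(0.0, 0.0, p)
    side = focus_gain(math.pi / 2, 0.0, p)
    rear = focus_gain(3 * math.pi / 4, 0.0, p)
    print(p, round(front, 3), round(side, 3), round(rear, 3))
```

With the low exponent the rear source keeps roughly a third of its level, which matches the mix’s goal of preserving some direct sound for sources behind the camera.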


Conclusion and future perspectives

This project was our first experience in Ambisonics recording (using the Eigenmike) and mixing (using Panoramix). Positioning an Ambisonics microphone in relation to the musicians, just like in a « traditional » stereo recording, requires practice and experience. Moreover, 3D listening is a new way to approach the sound image that needs to be learnt. Despite the induced technical difficulties (number of channels, encoding and decoding parameters…), we find this format has great potential in terms of creative recording and mixing possibilities. The piece we chose allowed us to take advantage of specific Ambisonics effects (rotation, focus, blur) in order to make an immersive POV production.


Listening instructions

We are sending you the video and the audio file separately. The video is cut to start synchronously with the audio. Import the two files into your favourite DAW, both starting at the same point.

Keep in mind that, depending on your playback software and the plugins you use to decode the Ambisonics mix, there may be a delay between audio and video.

In order to compensate for this delay, we added a synchronisation signal at the beginning (a white screen on video and a « beep » on audio), followed by speech announcements (« front », « left », « right », « rear » and « top »).
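A listener wanting to measure the offset numerically could locate the beep onset in the decoded audio with a simple threshold detector, a crude sketch of which (sample rate and threshold are illustrative) is:

```python
def first_onset(samples, threshold=0.1):
    """Index of the first sample whose magnitude exceeds `threshold` -
    a crude way to locate the sync beep in an otherwise silent lead-in."""
    for i, s in enumerate(samples):
        if abs(s) > threshold:
            return i
    return None

# Synthetic example: silence, then a beep starting at sample 480.
audio = [0.0] * 480 + [0.5, -0.5] * 100
onset = first_onset(audio)
print(onset)  # → 480

# At a 48 kHz sample rate, that onset time in milliseconds:
print(onset / 48000 * 1000)  # → 10.0
```

Comparing this time against the frame at which the white screen appears gives the delay to compensate in the DAW.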



[1] http://www.conservatoiredeparis.fr/en/accueil/

[2] https://mhacoustics.com/

[3] T. Carpentier. Panoramix: 3D mixing and post-production workstation. In Proc. of the 42nd International Computer Music Conference (ICMC), pages 122–127, Utrecht, Netherlands, Sept. 2016.

[4] T. Carpentier. ToscA: An OSC Communication Plugin for Object-Oriented Spatialization Authoring. In Proc. of the 41st International Computer Music Conference (ICMC), pages 368–371, Denton, TX, USA, Sept. 2015.

[5] T. Carpentier. Ambisonic spatial blur. In Proc. of the 142nd AES Convention, Berlin, Germany, May 2017.

[6] M. Kronlachner and F. Zotter. Spatial transformations for the enhancement of Ambisonic recordings. In Proc. of the 2nd International Conference on Spatial Audio (ICSA), Erlangen, Germany, Feb. 2014.