|
|
|
Angelo Farina (1), Ralph Glasgal (2), Enrico
Armelloni (1), Anders Torger (1) |
|
|
|
(1) Industrial Engineering Dept., University of
Parma, Via delle Scienze 181/A |
|
Parma, 43100 ITALY – HTTP://pcfarina.eng.unipr.it |
|
|
|
(2) Ambiophonics Institute, 4 Piermont Road,
Rockleigh, New Jersey 07647, USA |
|
HTTP://www.ambiophonics.org |
|
|
|
|
|
|
Ambiophonics is a hybrid method for creating a
realistic spatial reproduction of staged music, starting from two-channel
recordings but extensible to
various kinds of microphone arrangements, up to discrete multichannel |
|
The system is based on two independently
designed groups of loudspeakers: a Stereo Dipole, responsible for the
reproduction of the direct sound and early reflections coming from the
stage, and a surround periphonic array, driven by real-time convolution
with room impulse responses |
|
|
|
|
Cross-talk cancellation allows the recorded
signals to be replicated at the ears of the listener |
|
|
|
|
First, a binaural measurement is made in front
of the Stereo Dipole loudspeakers |
|
|
|
|
|
|
The regularization parameter, ε, has to be
adjusted by trial and error |
|
|
|
|
Measured impulse responses h |
|
|
|
|
Computed long-FIR inverse filters f |
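One common way to obtain inverse filters f from measured responses h is a regularized inversion of the 2x2 response matrix at each frequency. The sketch below assumes that formulation, which is not spelled out above. The function name `invert_2x2`, the indexing convention `h[i, j]` (loudspeaker j to ear i), the FFT length and the half-length modelling delay are all illustrative assumptions.

```python
import numpy as np

def invert_2x2(h, eps=1e-3, n_fft=4096):
    """Regularized inverse of a 2x2 matrix of impulse responses.

    h has shape (2, 2, n_taps); h[i, j] is assumed to be the response
    from loudspeaker j to ear i. Returns f with shape (2, 2, n_fft).
    """
    H = np.fft.rfft(h, n=n_fft, axis=-1)        # (2, 2, n_bins)
    H = np.moveaxis(H, -1, 0)                   # (n_bins, 2, 2)
    Hh = np.conj(np.swapaxes(H, 1, 2))          # Hermitian transpose per bin
    # Regularized least-squares inverse: (H^H H + eps I)^-1 H^H
    F = np.linalg.solve(Hh @ H + eps * np.eye(2), Hh)
    # Modelling delay of n_fft/2 samples so the filters come out causal
    k = np.arange(F.shape[0])
    delay = np.exp(-2j * np.pi * k * (n_fft // 2) / n_fft)
    F = F * delay[:, None, None]
    return np.fft.irfft(np.moveaxis(F, 0, -1), n=n_fft, axis=-1)
```

A larger `eps` gives gentler filters at the cost of less exact cancellation, which is why the parameter is tuned by listening or by inspecting the result.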
|
|
|
|
Today's DSP boards are not powerful enough to
convolve long inverse FIR filters |
|
Warping can be used to concentrate the
computing power in the frequency range where it is most needed |
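The warping idea can be illustrated with a warped FIR (WFIR) structure, in which each unit delay of an ordinary FIR filter is replaced by a first-order all-pass section. The following is a minimal Python sketch of that textbook structure, not the DSP code used by the authors. The names `allpass`, `wfir` and the warping coefficient `lam` are assumptions made for the example.

```python
import numpy as np

def allpass(x, lam):
    """First-order all-pass D(z) = (z^-1 - lam) / (1 - lam z^-1)."""
    y = np.zeros(len(x))
    x_prev = 0.0
    y_prev = 0.0
    for n, xn in enumerate(x):
        y[n] = -lam * xn + x_prev + lam * y_prev
        x_prev, y_prev = xn, y[n]
    return y

def wfir(x, b, lam):
    """Warped FIR: tap k sees the input passed through k all-pass stages."""
    x = np.asarray(x, dtype=float)
    y = np.zeros(len(x))
    tap = x
    for k, bk in enumerate(b):
        if k > 0:
            tap = allpass(tap, lam)   # replaces the unit delay z^-1
        y += bk * tap
    return y
```

With `lam = 0` the structure reduces to an ordinary FIR filter. With this sign convention, positive values of `lam` stretch the low-frequency region, so a short filter spends more of its taps there.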
|
|
|
|
The WFIR structure was coded in assembly on the
AD21061 and AD21065L processors; the assembly code of the main
loop is shown here: |
|
|
|
|
14 normal-hearing subjects (6 females, 8 males) |
|
Two sound samples: binaural recording of natural
sounds and a piece of pop music (Elton John) |
|
5-level scale (insufficient, mediocre,
sufficient, fair, good) |
|
The listener was free to switch at will between
the two processing algorithms, denoted simply as A and B |
|
Classic
ANOVA of the subjective responses |
|
|
|
|
Measurement of 3D (B-format) impulse responses
in theatres, with two source positions on the stage |
|
The IRs are processed to derive the responses of
several directional microphones |
|
Each soundtrack of the original stereo recording
is convolved with the corresponding IR |
|
For each loudspeaker, the results of the two
convolutions are mixed |
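The convolve-and-mix step for the surround loudspeakers can be sketched as follows. This is an illustration only: it assumes that the left track is paired with the IRs measured from one stage source position and the right track with those from the other, and the function name `surround_feeds` and its argument names are hypothetical.

```python
import numpy as np

def surround_feeds(left, right, irs_left, irs_right):
    """Return one feed per loudspeaker.

    irs_left[k] / irs_right[k]: IR for loudspeaker k derived from the
    measurement with the (assumed) left / right source position.
    """
    feeds = []
    for h_l, h_r in zip(irs_left, irs_right):
        y_l = np.convolve(left, h_l)     # left track through its IR
        y_r = np.convolve(right, h_r)    # right track through its IR
        y = np.zeros(max(len(y_l), len(y_r)))
        y[:len(y_l)] += y_l              # mix the two convolutions
        y[:len(y_r)] += y_r
        feeds.append(y)
    return feeds
```

In a real-time system the direct `np.convolve` calls would be replaced by a fast block convolver; only the signal routing is shown here.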
|
|
|
|
|
The WXYZ channels of a B-format IR can be
processed to extract a single (mono) response of a virtual microphone
pointing along a given unit vector r = (rx, ry, rz): |
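A commonly used first-order formulation of such a virtual microphone is sketched below. It assumes the conventional B-format scaling, in which W carries a gain of 1/sqrt(2) relative to X, Y and Z, and it introduces a directivity parameter `d` (0 = omnidirectional, 1 = cardioid, 2 = figure-of-eight) that is not named in the text above. The function name `virtual_mic` is hypothetical.

```python
import numpy as np

def virtual_mic(w, x, y, z, r, d=1.0):
    """First-order virtual microphone aimed along direction r.

    w, x, y, z: B-format channels (arrays of equal length).
    r: aiming direction (rx, ry, rz); it is normalized here.
    d: assumed directivity parameter, 0 (omni) to 2 (figure-of-eight).
    """
    r = np.asarray(r, dtype=float)
    r = r / np.linalg.norm(r)
    w, x, y, z = (np.asarray(c, dtype=float) for c in (w, x, y, z))
    # sqrt(2) restores the assumed -3 dB gain of the W channel
    return 0.5 * ((2.0 - d) * np.sqrt(2.0) * w
                  + d * (r[0] * x + r[1] * y + r[2] * z))
```

Under these assumptions a cardioid (`d = 1`) aimed at a plane-wave source returns the wave at unit gain, and the same cardioid aimed in the opposite direction returns zero.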
|
|
|
|
When an impulse response is reproduced in
another reverberant space, the resulting reverberant tail is the
convolution of the two reverberant tails |
|
|
|
|
A complete Ambiophonics system can nowadays be
implemented by coupling a general-purpose DSP unit (for cross-talk
cancellation) with convolution-based reverberators |
|
|
|
|
The preferred implementation is by means of a
simple software convolver and a cheap, modern PC. Two solutions are
currently available: |
|
|
|
|
The software implementation is based on
frequency-domain convolution (overlap-and-save), which inherently
introduces some latency. |
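A minimal NumPy sketch of plain (unpartitioned) overlap-and-save is given here to show where the latency comes from: a whole block of input must be collected before one FFT-multiply-IFFT pass can produce the matching block of output. The function name `overlap_save` and the default block size are assumptions for the example.

```python
import numpy as np

def overlap_save(x, h, block=1024):
    """Convolve x with h block by block; returns the first len(x) samples."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    m = len(h)
    n_fft = block + m - 1
    H = np.fft.rfft(h, n_fft)                    # filter spectrum, computed once
    n_blocks = -(-len(x) // block)               # ceiling division
    # m-1 zeros of initial history, then the input padded to whole blocks
    buf = np.zeros(m - 1 + n_blocks * block)
    buf[m - 1:m - 1 + len(x)] = x
    y = np.empty(n_blocks * block)
    for i in range(n_blocks):
        seg = buf[i * block:i * block + n_fft]   # new block plus m-1 old samples
        out = np.fft.irfft(np.fft.rfft(seg) * H, n_fft)
        y[i * block:(i + 1) * block] = out[m - 1:]   # drop the aliased part
    return y[:len(x)]
```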
|
Furthermore, the audio stream I/O on a PC is
always buffered, so an intrinsic latency is caused by the buffer size |
|
BruteFIR distinguishes itself from other
convolvers by the fact that it implements partitioned convolution: the
impulse response is subdivided into many segments of equal length, and this
reduces the latency to twice the length of a segment, instead of twice the
length of the whole IR. |
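The partitioning scheme can be sketched as a uniformly partitioned overlap-and-save convolver with a frequency-domain delay line. This is a generic NumPy illustration of the technique, not BruteFIR's actual code. The function name `partitioned_convolve` and the segment length `seg` are assumptions for the example.

```python
import numpy as np

def partitioned_convolve(x, h, seg=256):
    """Uniformly partitioned overlap-save; returns the first len(x) samples."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    n_parts = -(-len(h) // seg)                  # number of IR segments
    h_pad = np.zeros(n_parts * seg)
    h_pad[:len(h)] = h
    # Spectrum of each IR segment, zero-padded to 2*seg points
    H = np.fft.rfft(h_pad.reshape(n_parts, seg), 2 * seg, axis=1)
    n_blocks = -(-len(x) // seg)
    x_pad = np.zeros(n_blocks * seg)
    x_pad[:len(x)] = x
    fdl = np.zeros_like(H)                       # spectra of past input blocks
    prev = np.zeros(seg)
    y = np.empty(n_blocks * seg)
    for i in range(n_blocks):
        cur = x_pad[i * seg:(i + 1) * seg]
        fdl = np.roll(fdl, 1, axis=0)            # age the stored spectra
        fdl[0] = np.fft.rfft(np.concatenate([prev, cur]))
        # Segment p of the IR meets the input spectrum from p blocks ago
        out = np.fft.irfft(np.sum(fdl * H, axis=0), 2 * seg)
        y[i * seg:(i + 1) * seg] = out[seg:]     # keep the valid half
        prev = cur
    return y[:len(x)]
```

Each pass needs only one forward and one inverse FFT of 2*seg points, whatever the IR length, and output is available as soon as one segment of input has arrived.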
|
On modern CPUs, partitioned convolution is
more efficient than traditional unpartitioned overlap-and-save, reducing
the CPU load by 20-50%, and it can bring the overall latency below
100 ms. |
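The latency rule stated above (twice the segment length) can be checked with a little arithmetic. The sketch uses the 44.1 kHz sampling rate and 65,536-point IR length quoted for this system; the 2048-sample segment size is an assumed, illustrative value, and I/O buffering is ignored.

```python
def latency_ms(segment_len, fs=44100):
    """Algorithmic latency of block convolution: two segments, in ms."""
    return 2 * segment_len / fs * 1000.0

unpartitioned = latency_ms(65536)   # whole IR treated as one segment
partitioned = latency_ms(2048)      # assumed segment length
print(f"unpartitioned: {unpartitioned:.0f} ms, partitioned: {partitioned:.1f} ms")
```

Under these assumptions the unpartitioned convolver would lag by about 3 s, while 2048-sample segments give roughly 93 ms.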
|
Very efficient FFT implementations are freely
available (Intel NSP, FFTW), and thus the computing power of a PC is enough
for real-time convolution of 20 IRs, at 44.1 kHz, 32 bits, each being
65,536 points long. The demonstration machine, installed in room 22, is an
old Pentium-II 400 MHz. |
|
|
|
|
|
|
9 normal-hearing subjects (males) |
|
Three sound samples: |
|
Simple ranking test between three systems:
Stereo-Dipole, Virtual Ambisonics, complete Ambiophonics |
|
Each listener can switch freely among the three
systems during the playback |
|
|
|
|
Ambiophonics proved to give significant
advantages over the two systems that constitute it. |
|
It recreates a realistic virtual acoustic space
by means of convolution with proper digital filters |
|
The computational power required can be obtained
cheaply by means of a modern PC |
|
The system can be configured for different
numbers and positions of loudspeakers |
|
The “sweet spot” can easily accommodate three
persons, and even far from this area the overall acoustic impression
remains that of being in a concert hall. |
|
|