There are a variety of approaches to synthesizing the effect of a reverberant space. Approaches based on direct measurement of a particular room response (convolution techniques) tend to be computationally expensive and less extensible, though they are feasible with special-purpose hardware. The use of three-dimensional physical modeling techniques is also limited by computational requirements. Most current work in simulating reverberation is based on ``physically- and perceptually-informed'' techniques that seek to create parametrically controllable systems. These models can produce very good reverberant responses, though they generally cannot be made to correlate with actual room measurements.
Two excellent overviews of artificial reverberation developments are given by:
- Gardner, W. G. ``Reverberation Algorithms,'' in Applications of Signal Processing to Audio and Acoustics, M. Kahrs and K. Brandenburg, Eds., Kluwer Academic, Norwell, MA, 1997.
- Välimäki, V., Parker, J., Savioja, L., Smith, J. O., Abel, J. ``Fifty Years of Artificial Reverberation,'' IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, No. 5, July 2012.
Figure 2: A listener-source setup in a room.
- The simulation of room reverberation ideally involves two transfer functions per sound source per listener (one for each ear). The transfer functions or filter representations will change if anything in the room changes.
Figure 3: Transfer function approach to reverberation simulation for three sources and one listener.
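- For the three-source, one-listener setup depicted in Fig. 3, the output signals would be computed via six convolutions:
$$ y_i[n] = \sum_{j=1}^{3} \sum_{m=0}^{M_{ij}-1} h_{ij}[m] \, x_j[n-m], \qquad i = 1, 2, $$
where $x_j[n]$ is the signal from source $j$, $y_i[n]$ is the signal at ear $i$, $h_{ij}[n]$ is an FIR filter representation of the impulse response from source $j$ to ear $i$, and $M_{ij}$ is the length of the filter.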
- For impulse responses of one second (appropriate when $t_{60}$ = 1 second) and a sample rate $f_s$ = 50 kHz, each filter would require 50,000 multiplies and additions per sample, or 2.5 billion multiply-adds per second. For three sources and two listening points (ears), the six filters together require 15 billion multiply-adds per second, which corresponds to 30 billion operations per second when multiplies and additions are counted separately.
- In addition to being very computationally demanding, this approach requires new filter representations whenever the room setup changes. In general, it is difficult to implement a flexible reverberation control scheme using convolution-based approaches.
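The convolution approach and its cost can be sketched as follows. This is a minimal illustration, not an efficient implementation: the function names `binaural_convolution` and `ops_per_second` are hypothetical, the filters are random placeholders rather than measured responses, and the operation count assumes that multiplies and additions are counted separately.

```python
import numpy as np

def binaural_convolution(sources, filters):
    """Sum the convolutions of every source with its per-ear FIR filter.

    sources: list of 1-D arrays x_j.
    filters: filters[i][j] is the FIR response h_ij from source j to ear i.
    """
    n_out = max(len(x) + len(filters[i][j]) - 1
                for i in range(len(filters))
                for j, x in enumerate(sources))
    outputs = np.zeros((len(filters), n_out))
    for i, ear_filters in enumerate(filters):
        for j, x in enumerate(sources):
            y = np.convolve(x, ear_filters[j])  # direct-form convolution
            outputs[i, :len(y)] += y
    return outputs

def ops_per_second(fs, t60, n_sources, n_ears):
    """Multiplies plus additions per second for direct FIR filtering."""
    taps = int(round(fs * t60))        # filter length in samples
    macs_per_filter = taps * fs        # multiply-adds per second, one filter
    return 2 * macs_per_filter * n_sources * n_ears

# Placeholder data: three short sources and six short random filters.
rng = np.random.default_rng(0)
sources = [rng.standard_normal(64) for _ in range(3)]
filters = [[rng.standard_normal(16) for _ in range(3)] for _ in range(2)]
y = binaural_convolution(sources, filters)

total_ops = ops_per_second(fs=50_000, t60=1.0, n_sources=3, n_ears=2)
print(y.shape, total_ops)
```

With the figures quoted above (one-second responses at 50 kHz, three sources, two ears), `total_ops` evaluates to 30 billion operations per second.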
- A ``distributed'' physical modeling approach to a reverberant space would allow for dynamic modifications of listener and source positions during an acoustic simulation.
- However, a brute force acoustic simulation of a room response using three-dimensional physical modeling techniques would require nearly 150 million ``mesh'' grid points to simulate a room of only 4 x 4 x 3 meters at a sample rate of 50 kHz.
- In addition, current three-dimensional modeling techniques are plagued by dispersion errors that would limit the quality of this approach. It is possible to use warping techniques to minimize the dispersion errors but this would significantly increase the already prohibitive computational burden.
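The grid-point estimate above can be reproduced approximately under two assumptions that are not stated in the text: a speed of sound of 343 m/s and a grid spacing equal to the distance sound travels in one sample period. The helper name `mesh_grid_points` is hypothetical.

```python
def mesh_grid_points(dims_m, fs, c=343.0):
    """Approximate number of grid points in a rectilinear 3D mesh.

    Assumes (not from the text) a grid spacing of c / fs metres.
    """
    spacing = c / fs
    volume = 1.0
    for d in dims_m:
        volume *= d
    return volume / spacing ** 3

n_points = mesh_grid_points((4.0, 4.0, 3.0), fs=50_000)
print(f"{n_points / 1e6:.1f} million grid points")
```

Under these assumptions the count comes out a little under 150 million, in line with the figure quoted above. Other mesh topologies impose different spacing constraints and would give different counts.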
- Based on perceptual limits, the impulse response of a reverberant space can be divided into two segments:
- The beginning of the impulse response consists of distinct, relatively sparse, early reflections.
- The remainder of the impulse response, called the late reverberation, consists of densely-packed echoes that become impossible to distinguish in time.
- This region of high echo density (which increases as $t^2$) can be approximated by a random time distribution.
- The frequency response of a reverberant space can likewise be divided into two segments:
- The low-frequency region consists of a relatively sparse distribution of resonant modes.
- Higher-frequency modes are packed so densely that they are best characterized by a random frequency distribution with certain statistical properties.
- Parametric controls for an artificial reverberator should include:
- $t_{60}(f)$ in at least three frequency bands
- $G^2(f)$ = signal power gain
- C(f) = "clarity" (ratio of impulse response energy in early reflections to that in the late reverb section)
- inter-aural correlation coefficient at the left and right ears
- Early reflections, within the first 100 milliseconds or so, are typically implemented using tapped delay lines (suggested by Schroeder (1970) and implemented by Moorer (1979)).
Figure 4: Early reverberation implemented with a tapped delay line, followed by a late reverberation processing block.
- Early reflections should be calculated for a given geometry and spatialized.
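- The delay-line tap positions should be set according to propagation distance, and the tap outputs should be scaled in inverse proportion to that distance (to account for spherical spreading).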
- Most room surfaces are not perfectly flat, resulting in diffuse scattering. Thus, attempts to exactly reproduce the response of a given room via techniques such as ray tracing are generally unsuccessful.
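A tapped delay line for early reflections might be sketched as below. The propagation distances are made-up values for illustration, the speed of sound (343 m/s) and the 1/r gain law are assumptions, and wall absorption and spatialization are omitted.

```python
import numpy as np

def early_reflections(x, distances_m, fs, c=343.0):
    """Tapped delay line: one tap per propagation path.

    Tap delay is distance / c (in samples); tap gain is an assumed 1/r.
    """
    delays = [int(round(d / c * fs)) for d in distances_m]
    gains = [1.0 / d for d in distances_m]
    y = np.zeros(len(x) + max(delays))
    for delay, gain in zip(delays, gains):
        y[delay:delay + len(x)] += gain * x
    return y, delays, gains

fs = 50_000
impulse = np.zeros(8)
impulse[0] = 1.0
# Hypothetical path lengths in metres (not taken from any real room).
y, delays, gains = early_reflections(impulse, [3.43, 6.86, 10.29], fs)
print(delays)
```

In a fuller model, each tap would also carry the reflection losses of the surfaces along its path and would be panned or filtered according to its direction of arrival.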
- A good late reverberation should have a smooth decay and a smooth frequency response.
- Some fluctuation in the short-term energy is needed to achieve a natural sound (Blesser, 2001; Dattorro, 1997).
- Moorer's ideal late reverb: exponentially decaying white noise. But it would be better to say exponentially decaying ``colored'' noise, since the high-frequency energy should decay faster than the low-frequency energy.
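The idea of exponentially decaying noise can be illustrated with a short sketch. It generates white noise under an envelope that falls by 60 dB at $t_{60}$; the frequency-dependent (``colored'') decay mentioned above is not modeled, and the function name `decaying_noise` is hypothetical.

```python
import numpy as np

def decaying_noise(t60, fs, duration, seed=0):
    """White Gaussian noise with an exponential envelope (-60 dB at t60)."""
    rng = np.random.default_rng(seed)
    n = np.arange(int(duration * fs))
    envelope = 10.0 ** (-3.0 * n / (t60 * fs))
    return envelope * rng.standard_normal(len(n)), envelope

late, envelope = decaying_noise(t60=1.0, fs=50_000, duration=1.5)
level_at_t60_db = 20.0 * np.log10(envelope[50_000])
print(level_at_t60_db)
```

One way to approximate colored decay would be to split the noise into bands and apply a shorter `t60` to the higher bands before summing.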
- Schroeder (1962) suggested the use of parallel comb filters and cascaded allpass filters to synthesize reverberation.
Figure 5: Cascaded Schroeder allpass sections.
- Allpass filters produce frequency-dependent time shifts, which help diffuse the sound. For this reason, Schroeder allpass sections are sometimes referred to as impulse expanders or impulse diffusers.
- The gain values are typically set around $g = 0.7$. The delay-line lengths $M_i$ should be mutually prime and span successive orders of magnitude.
Figure 6: Impulse response of three cascaded Schroeder allpass sections ($g = 0.7$ and $M_i = [113, 337, 1051]$).
- The impulse response, calculated with the Matlab script allpass.m, of three cascaded Schroeder allpass sections is shown in Fig. 6.
- The feedback comb filters provide coloration and the delay-line lengths are set to mutually prime values.
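The cascaded allpass impulse response can be approximated with the following sketch. It is not the allpass.m script mentioned above; it assumes one common form of the Schroeder allpass section, $y[n] = -g\,x[n] + x[n-M] + g\,y[n-M]$, and uses the gain and delay lengths quoted in the figure caption.

```python
import numpy as np

def schroeder_allpass(x, g, M):
    """One allpass section: y[n] = -g*x[n] + x[n-M] + g*y[n-M]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        x_del = x[n - M] if n >= M else 0.0
        y_del = y[n - M] if n >= M else 0.0
        y[n] = -g * x[n] + x_del + g * y_del
    return y

def allpass_cascade_impulse_response(g, lengths, n_samples):
    """Impulse response of allpass sections connected in series."""
    h = np.zeros(n_samples)
    h[0] = 1.0
    for M in lengths:
        h = schroeder_allpass(h, g, M)
    return h

h = allpass_cascade_impulse_response(0.7, [113, 337, 1051], 20_000)
energy = float(np.sum(h ** 2))
print(h[0], energy)
```

Because each section has unit magnitude response, the total energy of the (truncated) impulse response stays close to 1 while the impulse is spread out in time.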
Figure 7: The JCRev reverberator from CCRMA (based on Schroeder/Moorer).
- The STK classes PRCRev, JCRev, and NRev implement Schroeder reverberators of various complexities. In particular:
- PRCRev implements two series allpass units and two parallel comb filters.
- JCRev implements three series allpass units, four parallel comb filters, and two decorrelation delay lines in parallel at the output.
- NRev implements six parallel comb filters, three series allpass units, a lowpass filter, another allpass filter in series, followed by two allpass filters in parallel at the output.
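A generic Schroeder-style structure (parallel feedback combs followed by series allpass sections) might be sketched as follows. This is not the STK code: the delay lengths are illustrative primes chosen here, the comb gains are derived from an assumed $t_{60}$, and the decorrelation delays and lowpass filter of the STK classes are omitted.

```python
import numpy as np

def feedback_comb(x, g, M):
    """Feedback comb filter: y[n] = x[n] + g*y[n-M]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = x[n] + (g * y[n - M] if n >= M else 0.0)
    return y

def allpass(x, g, M):
    """Schroeder allpass: y[n] = -g*x[n] + x[n-M] + g*y[n-M]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        if n >= M:
            y[n] = -g * x[n] + x[n - M] + g * y[n - M]
        else:
            y[n] = -g * x[n]
    return y

def schroeder_reverb(x, fs, t60, comb_lengths, allpass_lengths, ap_gain=0.7):
    """Parallel combs (averaged) followed by series allpass sections."""
    wet = np.zeros(len(x))
    for M in comb_lengths:
        g = 10.0 ** (-3.0 * M / (fs * t60))  # -60 dB after t60 seconds
        wet += feedback_comb(x, g, M)
    wet /= len(comb_lengths)
    for M in allpass_lengths:
        wet = allpass(wet, ap_gain, M)
    return wet

fs = 50_000
impulse = np.zeros(25_000)
impulse[0] = 1.0
# Illustrative, mutually prime delay lengths (assumed, not from STK).
h = schroeder_reverb(impulse, fs, t60=0.5,
                     comb_lengths=[1009, 1201, 1399, 1601],
                     allpass_lengths=[113, 337])
print(h[0])
```

Since every section is linear and time-invariant, placing the allpass sections before or after the combs gives the same mono response.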
Figure 8: A feedback delay network structure proposed for artificial reverberation by Jot (1992).
- Figure 8 illustrates an example FDN reverberator using three delay lines proposed by Jot (1992).
- An FDN can be seen as a vector feedback comb filter, with N feedback ``channels'' (N=3 in Fig. 8).
- The ``mixing matrix'' provides diffusion by ``scattering'' energy amongst the $N$ channels. Assuming decay control is handled by the $g_i$ coefficients, this matrix should be ``lossless''.
- To achieve frequency-dependent decay control, the $g_i$ coefficients can be replaced by low-order digital filters.
- The ``tonal correction'' filter E(z) is a low-order filter that serves to equalize modal energy amongst the three bands.
- The delay-line lengths are generally chosen to be mutually prime. System ``tuning'' remains a manual, trial and error process.
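- A ``3-channel'' FDN feedback matrix can be represented as:
$$ \mathbf{M}_3 = \begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix} $$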
- The inner loop calculations of the FDN shown in Fig. 8 can then be expressed as:
and the loop output given by
- These expressions can also be written in frequency-domain vector notation as
- The matrix $\mathbf{A} = \mathbf{G}\mathbf{M}$ is called the state transition matrix. $\mathbf{G}$ is typically a diagonal matrix of lowpass filters, each having gain no greater than 1.
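- Stability of the FDN is assured when the norm of the state vector $\mathbf{x}[n]$ decreases over time when the input signal is zero:
$$ \| \mathbf{x}[n+1] \| < \| \mathbf{x}[n] \| $$
for all $n \geq 0$, where
$$ \mathbf{x}[n+1] = \mathbf{A} \begin{bmatrix} x_1[n-M_1] \\ x_2[n-M_2] \\ x_3[n-M_3] \end{bmatrix}. $$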
- Stable feedback matrices can thus be parameterized in terms of $\mathbf{A} = \mathbf{G}\mathbf{M}$, where $\mathbf{M}$ is any orthogonal matrix and $\mathbf{G}$ is a diagonal matrix having entries less than 1 in magnitude.
- A feedback matrix $\mathbf{M}_N$ is lossless if and only if its eigenvalues have modulus 1 and its $N$ eigenvectors are linearly independent.
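The lossless and stability conditions above can be checked numerically. The sketch below builds an arbitrary orthogonal matrix from a QR decomposition and uses made-up scalar gains on the diagonal of G; both are illustrative choices rather than values from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Any orthogonal matrix serves as a lossless feedback matrix M.
M, _ = np.linalg.qr(rng.standard_normal((3, 3)))

# Diagonal gain matrix G with entries less than 1 in magnitude (assumed).
G = np.diag([0.9, 0.8, 0.7])
A = G @ M  # state transition matrix

eig_moduli = np.abs(np.linalg.eigvals(M))

v = rng.standard_normal(3)
norm_before = np.linalg.norm(v)
norm_lossless = np.linalg.norm(M @ v)  # preserved by the orthogonal matrix
norm_after = np.linalg.norm(A @ v)     # reduced once the gains are applied

print(eig_moduli, norm_before, norm_lossless, norm_after)
```

The eigenvalue moduli of the orthogonal matrix are all 1, the vector norm is unchanged by `M`, and it shrinks under `A`.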
- One choice of feedback matrix $\mathbf{M}_N$ for FDNs is a specific Householder reflection proposed by Jot (1992):
$$ \mathbf{M}_N = \mathbf{I}_N - \frac{2}{N} \mathbf{u}_N \mathbf{u}_N^T, $$
where $\mathbf{u}_N^T = [1, 1, \ldots, 1]$ is the specific vector about which the input vector is reflected in $N$-dimensional space.
- In addition to being lossless and not requiring any multiplies when $N$ is a power of 2 (for fixed-point implementations), the Householder matrix is attractive because the feedback matrix-times-channel-vector operation can be computed with only $2N-1$ additions (by first forming $\mathbf{u}_N^T$ times the input vector, applying the scale factor $2/N$, and subtracting the result from the input vector).
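The Householder feedback operation and a bare FDN loop built around it can be sketched as follows. The matrix is taken to be the identity minus 2/N times the all-ones matrix, which matches the sum, scale, and subtract procedure described above. The loop is a simplification: scalar gains stand in for the lowpass filters, the input is fed equally into every delay line, the output is the plain sum of the delay-line outputs, and there is no tonal correction filter. The delay lengths and gains are illustrative.

```python
import numpy as np

def householder_matrix(N):
    """Explicit N x N Householder matrix I - (2/N) * ones * ones^T."""
    u = np.ones((N, 1))
    return np.eye(N) - (2.0 / N) * (u @ u.T)

def householder_apply(x):
    """Same product without forming the matrix: sum, scale, subtract."""
    return x - (2.0 / len(x)) * np.sum(x)

def fdn_impulse_response(delays, gains, n_samples):
    """Impulse response of a simplified FDN with Householder feedback."""
    N = len(delays)
    lines = [np.zeros(M) for M in delays]  # circular delay-line buffers
    idx = [0] * N
    g = np.asarray(gains, dtype=float)
    h = np.zeros(n_samples)
    for n in range(n_samples):
        outs = np.array([lines[i][idx[i]] for i in range(N)])
        h[n] = outs.sum()
        feedback = g * householder_apply(outs)
        x_in = 1.0 if n == 0 else 0.0
        for i in range(N):
            lines[i][idx[i]] = feedback[i] + x_in
            idx[i] = (idx[i] + 1) % delays[i]
    return h

M3 = householder_matrix(3)
v = np.array([0.5, -1.0, 2.0])
h = fdn_impulse_response([113, 337, 1051], [0.97, 0.95, 0.93], 5000)
print(M3 @ v, householder_apply(v))
```

The two ways of applying the reflection give the same vector, and the first nonzero output sample appears at the shortest delay length.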
- FDN delay-line lengths are generally chosen to be mutually prime, which maximizes the pseudo-random behavior of the system.
- A rough guide to the average delay-line length is the ``mean free path'' of the desired reverberant environment, which is defined as the average distance a ray of sound travels before it encounters a reflecting obstacle.
- The mean free path can be approximated as $\bar{d} = 4V/S$, where $V$ is the total volume and $S$ is the total surface area enclosing the space.
- The desired modal density can guide the determination of the total sum of the delay line lengths. Schroeder suggests a modal density of 0.15 modes per Hz for a 1 second $t_{60}$. This can be generalized to
$$ \sum_{i} M_i \geq 0.15 \, t_{60} \, f_s. $$
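These two guidelines can be turned into rough numbers. The sketch assumes the common approximation of the mean free path as 4V/S, a speed of sound of 343 m/s, and a reading of the modal-density guideline as a minimum total delay of 0.15 times $t_{60}$ times the sample rate. The room is the 4 x 4 x 3 meter example used earlier, and the function names are hypothetical.

```python
def mean_free_path(volume_m3, surface_m2):
    """Mean free path, assuming the 4V/S approximation."""
    return 4.0 * volume_m3 / surface_m2

def min_total_delay_samples(t60, fs, modes_per_hz_per_second=0.15):
    """Minimum sum of delay-line lengths (samples) for the target density."""
    return modes_per_hz_per_second * t60 * fs

lx, ly, lz = 4.0, 4.0, 3.0
volume = lx * ly * lz
surface = 2.0 * (lx * ly + lx * lz + ly * lz)

fs = 50_000
d_bar = mean_free_path(volume, surface)
avg_delay_samples = d_bar / 343.0 * fs   # assumed speed of sound
total_min = min_total_delay_samples(t60=1.0, fs=fs)

print(d_bar, avg_delay_samples, total_min)
```

For this room the mean free path is 2.4 m, or about 350 samples at 50 kHz, while the guideline asks for roughly 7500 samples of total delay. The ratio of the two gives a first estimate of how many delay lines are needed.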
- Reverberation time is controlled by lowpass filters implemented within each feedback channel (the G matrix discussed above).
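- A lowpass filter in series with a length $M_i$ delay line should approximate $H_i(z) = G^{M_i}(z)$, where $G(z)$ is the ideal per-sample decay filter. In terms of a desired $t_{60}(\omega)$, this implies
$$ 20 \log_{10} \left| H_i(e^{j\omega T}) \right| = -60 \, \frac{M_i T}{t_{60}(\omega)}, $$
where $T$ is the sampling period.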
- Jot proposes first-order filters of the form:
$$ H_i(z) = g_i \, \frac{1 - a_i}{1 - a_i z^{-1}}, $$
where $g_i$ is set to give a desired reverberation time at dc and $a_i$ determines the reverberation time at high frequencies.
- From the expression above, we find
$$ g_i = 10^{-3 M_i T / t_{60}(0)} $$
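and from (Jot and Chaigne, 1991)
$$ a_i = \frac{\ln(10)}{4} \log_{10}(g_i) \left( 1 - \frac{1}{\alpha^2} \right), \qquad \alpha = \frac{t_{60}(\pi/T)}{t_{60}(0)}. $$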
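A sketch of this filter design is given below. It assumes the first-order form $H_i(z) = g_i (1 - a_i)/(1 - a_i z^{-1})$ and the approximate pole formula usually attributed to Jot and Chaigne (1991), $a_i = \frac{\ln 10}{4} \log_{10}(g_i)(1 - 1/\alpha^2)$, with $\alpha$ the ratio of the reverberation time at the Nyquist frequency to that at dc. The delay length and reverberation times are example values only.

```python
import cmath
import math

def jot_first_order(M, fs, t60_dc, t60_nyquist):
    """Gain g and pole a for an assumed first-order absorption filter."""
    T = 1.0 / fs
    g = 10.0 ** (-3.0 * M * T / t60_dc)
    alpha = t60_nyquist / t60_dc
    a = (math.log(10.0) / 4.0) * math.log10(g) * (1.0 - 1.0 / alpha ** 2)
    return g, a

def magnitude(g, a, omega_T):
    """|H(e^{j omega T})| for H(z) = g (1 - a) / (1 - a z^-1)."""
    z_inv = cmath.exp(-1j * omega_T)
    return abs(g * (1.0 - a) / (1.0 - a * z_inv))

# Example values (assumed): 1051-sample delay, 2.0 s at dc, 1.6 s at Nyquist.
g, a = jot_first_order(M=1051, fs=50_000, t60_dc=2.0, t60_nyquist=1.6)
dc_gain = magnitude(g, a, 0.0)
nyquist_gain = magnitude(g, a, math.pi)
print(g, a, dc_gain, nyquist_gain)
```

The dc gain equals `g` exactly, so the low-frequency decay time is met. The high-frequency decay is only approximated, and the approximation is best when `alpha` is close to 1.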
- The image method is based on a ray tracing model for room reflections.
- This technique is used to determine ``virtual sources'' at mirror image locations with respect to a reflecting surface.
- Once the virtual sources are determined, propagation distances can be easily calculated from two- or three-dimensional Euclidean geometry.
- This method assumes specular reflections from large, smooth surfaces. It may be useful for determining a set of early reflections in spaces with large, flat walls.
- However, this technique does not account for diffuse scattering.
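A first-order version of the image method for a rectangular room can be sketched as follows. The room is assumed to have walls at 0 and at its full extent along each axis, the source and listener positions are made up for the example, and higher-order images and wall absorption are left out.

```python
import math

def first_order_images(source, room):
    """Mirror the source across each of the six walls of a shoebox room.

    room = (Lx, Ly, Lz); walls are assumed at 0 and L along each axis.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images

def path_lengths(points, listener):
    """Euclidean distance from each (virtual) source to the listener."""
    return [math.dist(p, listener) for p in points]

room = (4.0, 4.0, 3.0)
source = (1.0, 2.0, 1.5)     # hypothetical source position
listener = (3.0, 2.0, 1.5)   # hypothetical listener position

images = first_order_images(source, room)
direct = math.dist(source, listener)
reflected = path_lengths(images, listener)
print(direct, reflected)
```

The resulting path lengths can be converted to tap delays and gains for a tapped delay line.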
- Because an acoustic space is by and large a linear, time-invariant system, one can ``simply'' measure its impulse response and use convolution to reproduce the effect of playing a given audio signal in that space.
- Direct convolution is computationally expensive, so efficient techniques (such as FFT-based fast convolution) are needed.
- It is not a ``simple'' process to measure the impulse response of a space.
- A measured impulse response corresponds to a single source-listener configuration.
- A measured impulse response is inflexible to modifications such as shortening the reverberation time.
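One widely used way to reduce the cost of convolution is to perform it in the frequency domain. The sketch below shows single-block FFT convolution with NumPy and compares it with direct convolution. A real-time system would more likely use a partitioned (block-based) scheme, which is not shown, and the impulse response here is synthetic noise rather than a measurement.

```python
import numpy as np

def fft_convolve(x, h):
    """Linear convolution via zero-padded FFTs."""
    n = len(x) + len(h) - 1
    nfft = 1 << (n - 1).bit_length()   # next power of two >= n
    Y = np.fft.rfft(x, nfft) * np.fft.rfft(h, nfft)
    return np.fft.irfft(Y, nfft)[:n]

rng = np.random.default_rng(2)
x = rng.standard_normal(1000)                        # dry signal
decay = 10.0 ** (-3.0 * np.arange(4000) / 4000)      # synthetic envelope
h = rng.standard_normal(4000) * decay                # stand-in response

y_fast = fft_convolve(x, h)
y_direct = np.convolve(x, h)
print(np.max(np.abs(y_fast - y_direct)))
```

The two results agree to within numerical precision, while the FFT version scales as roughly N log N rather than N squared.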
©2004-2016 McGill University. All Rights Reserved.
Maintained by Gary P. Scavone.