Increasing the bit rates
The simplest way of improving DAB sound quality is to raise the bit rates to those used in other countries. To
guarantee high fidelity stereo music a bit rate of 192 kbit/s or more is needed. However, with optimised coders, audio
processing and playout systems, a near hi-fi service can be provided at 160 kbit/s (for example BBC 6 music on digital TV).
The DAB spectrum in much of the UK is already full, though, so a station's bit rate can only be increased at the
expense of another station, by reducing the protection ratio, or by providing more spectrum for DAB.
The major broadcasters are not going to cut the number of stations they broadcast. However, some of the small
independent stations on DAB may close (as did The Rhythm in 2002). Also, if any of the major radio companies merge, they
are likely to merge stations with similar formats, which may also free some space. Unless the regulator intervenes, any
space freed up is likely to go to new stations from the major radio companies.
Within the BBC multiplex, there is some scope to reshuffle the bit allocations. With improved coders and playout systems,
it might be argued that Radio 3 could be reduced from 192 to 160 kbit/s, enabling Radio 1 or 2 to go up to 160 kbit/s.
Another possibility is to broadcast 5 Live and BBC 7 at the lower (24 kHz) sampling frequency, enabling their bit rates to
be cut from 80 to 64 kbit/s, again enabling Radio 1 or 2 to go up to 160 kbit/s.
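The arithmetic behind these two reshuffles can be checked with a short sketch. It treats multiplex capacity as a simple sum of bit rates, which is an assumption made here for illustration (error-protection overhead is ignored). The figure it prints for the station being upgraded is inferred from the numbers above, not stated in the text.

```python
# Bit-rate bookkeeping for the two reshuffles described above (values in kbit/s).
# Assumption: capacity is a plain sum of bit rates; protection overhead is ignored.

def freed_capacity(changes):
    """Sum of (old - new) over the stations whose bit rate is cut."""
    return sum(old - new for old, new in changes.values())

option_a = {"Radio 3": (192, 160)}
option_b = {"5 Live": (80, 64), "BBC 7": (80, 64)}

TARGET = 160  # the bit rate proposed for Radio 1 or 2

for name, option in (("A", option_a), ("B", option_b)):
    freed = freed_capacity(option)
    print(f"Option {name}: frees {freed} kbit/s, "
          f"enough to lift a {TARGET - freed} kbit/s station to {TARGET} kbit/s")
```

Either option frees the same amount, so the two are interchangeable as far as the bit budget is concerned; the choice between them is a question of which stations can best tolerate the cut.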
Reducing the protection ratio from level 3 to level 4 allows more programme data to be transmitted. However, it reduces
the quality of reception, as has been demonstrated in London, where the third local multiplex uses the lower protection
ratio for most of its stations. The two national multiplexes could compensate by increasing their transmitter powers, though
this may be limited by adjacent channel interference and interference to other countries for coastal transmitters. This
would significantly increase the transmission cost. For the local multiplexes, boosting the transmitter powers would cause
interference in other parts of the country, so extra transmitters would be needed to fill the gaps, which would be even
more costly.
More spectrum in the current DAB band is likely to be introduced over the next few years. Although some of it will be
needed to plug the holes in the network of local multiplexes, there should be at least one new national multiplex. A new
national multiplex could be used to help 'spread out' the current DAB stations, enabling many to increase their bit rates.
This would obviously increase transmission costs and limit the capacity for more new stations. Therefore, it is up to those
who are dissatisfied with the current DAB sound to lobby Ofcom and the BBC to use new spectrum in this way.
Residual feedback with timeshifting
Residual feedback with timeshifting is my own idea for improving the sound quality at current bit rates without
modifying the DAB transmission standard. The proposal was submitted to the BBC in August 2003.
The basic principle takes advantage of the fact that the human brain averages sound over intervals of around 30 ms and,
in general, the source material will largely repeat itself over this and longer periods (except for the very low frequency
components). Thus, there should be some scope to timeshift parts of the source material within such a window to make it
code better without affecting what the listener hears. However, timeshifting will introduce harmonics. At lower audio
frequencies, the resulting artefacts would be hugely disruptive. However, as the audio frequency increases, the harmonic
becomes a smaller and smaller fraction of the octave until a listener can no longer distinguish it from the original. Thus
for frequencies above a threshold in the 3-6 kHz range, there should be some scope to reprocess the source material to
make it code better (this is based on a source stating that humans can resolve frequencies down to 1/12 semitone). The
timeshifting may also disrupt the stereo image, so it may only be suitable for the panned mono sub-bands of
joint stereo coding and, of course, for mono audio.
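To put a number on the 1/12 semitone figure, the sketch below converts it into a frequency step in Hz at a few audio frequencies. It assumes equal temperament (a semitone is a ratio of 2^(1/12), so 1/12 semitone is 2^(1/144)). How that step relates to the harmonics introduced by timeshifting is not derived here; the point is only that a fixed offset in Hz becomes a smaller fraction of an octave as the frequency rises.

```python
# Width of a 1/12-semitone step in Hz, assuming equal temperament.
# One semitone is a ratio of 2**(1/12), so 1/12 semitone is 2**(1/144).

RESOLUTION_RATIO = 2 ** (1 / 144)

def resolvable_step_hz(frequency_hz):
    """Width in Hz of a 1/12-semitone step at the given frequency."""
    return frequency_hz * (RESOLUTION_RATIO - 1)

for f in (100, 1000, 3000, 6000):
    print(f"{f:>5} Hz: step of about {resolvable_step_hz(f):.1f} Hz")
```

At the two ends of the proposed threshold range the step comes out at roughly 14.5 Hz (3 kHz) and 29 Hz (6 kHz).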
I propose the following process:
1) MPEG code the original source material and then decode it.
2) Subtract the MPEG coded and decoded sound from the original source material and call this the residual.
3) Pass the residual through a high pass filter (i.e. get rid of the low frequencies) with a threshold in the 3-6 kHz
range, to be determined by trial and error (lower thresholds may be suited to lower bit rates as the artefacts introduced
would be less noticeable where the overall quality is poorer).
4) Take the filtered residual and scale it by a factor between 0.5 and 1, to be determined by trial and error.
5) Divide the source material into 12 ms half-epochs (this may be varied) and add the processed residual to all the odd
half-epochs and subtract it from the even half-epochs (or vice-versa).
6) MPEG code and broadcast the modified source material.
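The six steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: toy_codec is a hypothetical stand-in (coarse uniform quantisation) for a real MPEG coder and decoder, the high-pass filter is a simple first-order design, and the 4 kHz threshold, 0.75 scale factor and 48 kHz sampling rate are arbitrary picks, not tuned values.

```python
import math

def toy_codec(samples, step=0.05):
    """Hypothetical stand-in for 'MPEG code then decode': coarse uniform quantisation."""
    return [round(s / step) * step for s in samples]

def high_pass(samples, cutoff_hz, sample_rate):
    """First-order RC high-pass filter (gentle slope; a real design would be steeper)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / sample_rate)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in samples:
        prev_y = alpha * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out

def residual_feedback(source, codec=toy_codec, sample_rate=48000,
                      cutoff_hz=4000.0, scale=0.75, half_epoch_ms=12.0):
    """Return (coded, modified) following steps 1 to 6; parameter values are assumptions."""
    decoded = codec(source)                                  # step 1: code and decode
    residual = [s - d for s, d in zip(source, decoded)]      # step 2: what the coder lost
    filtered = high_pass(residual, cutoff_hz, sample_rate)   # step 3: keep high frequencies
    scaled = [scale * r for r in filtered]                   # step 4: scale by 0.5 to 1
    half = int(round(sample_rate * half_epoch_ms / 1000.0))  # 576 samples at 48 kHz
    modified = []
    for n, (s, r) in enumerate(zip(source, scaled)):         # step 5: alternate add/subtract
        sign = 1.0 if (n // half) % 2 == 0 else -1.0
        modified.append(s + sign * r)
    return codec(modified), modified                         # step 6: code the modified source

# One 24 ms epoch of a 5 kHz tone as a test signal.
tone = [0.3 * math.sin(2 * math.pi * 5000 * n / 48000) for n in range(1152)]
coded, modified = residual_feedback(tone)
```

The codec is passed in as a parameter so that a real coder and decoder pair could be substituted without touching the rest of the process. Whether the result actually sounds better can only be judged with a real coder and listening tests.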
In step 5, it may work better if the 12 ms residual half-epochs are paired up and averaged i.e. alternately
new_residual(t) = 0.5 (old_residual(t) + old_residual(t + 12 ms)) and
new_residual(t) = 0.5 (old_residual(t) + old_residual(t - 12 ms))
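A minimal sketch of this pairing, assuming the residual is held as a list of samples and the half-epoch length is given in samples (576 for 12 ms at an assumed 48 kHz sampling rate). The function name is made up for illustration.

```python
def average_paired_half_epochs(residual, half):
    """Replace each pair of half-epochs by their sample-wise average.

    `half` is the half-epoch length in samples. A trailing partial pair
    is left unchanged.
    """
    out = list(residual)
    for start in range(0, len(residual) - 2 * half + 1, 2 * half):
        for i in range(start, start + half):
            mean = 0.5 * (residual[i] + residual[i + half])
            out[i] = mean          # first half:  0.5 (old(t) + old(t + 12 ms))
            out[i + half] = mean   # second half: 0.5 (old(t) + old(t - 12 ms))
    return out

# Tiny example with a half-epoch of 2 samples; the fifth sample has no partner.
print(average_paired_half_epochs([1.0, 3.0, 5.0, 7.0, 9.0], half=2))
# prints [3.0, 5.0, 3.0, 5.0, 9.0]
```

After averaging, both half-epochs of a pair carry the same waveform, so adding it in one and subtracting it in the other cancels exactly over the 24 ms pair.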
What this process does is effectively take the sound rejected by the MPEG coder and nearly double it in alternate
half-epochs so that it will then exceed the noise floor and thus get through the coding process. By removing the same
sound in the other half-epochs, the overall balance of frequencies within a 24 ms MPEG coding epoch should be roughly the
same as in the original source material, so the MPEG coder is likely to select the same scale factors and quantisation
levels during the second coding as in the first coding.
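The "nearly double" figure follows from the scale factor in step 4. Relative to the first decoded signal, the high-frequency residual is multiplied by 1 + scale in the boosted half-epochs and by 1 - scale in the others. The sketch below tabulates this for a few scale factors. It assumes the high-pass filter passes the residual unchanged above the threshold, which is a simplification.

```python
import math

def half_epoch_gains(scale):
    """Return (boosted, cut) multipliers applied to the high-frequency residual."""
    return 1.0 + scale, 1.0 - scale

for scale in (0.5, 0.75, 1.0):
    boosted, cut = half_epoch_gains(scale)
    print(f"scale {scale}: boosted x{boosted:.2f} "
          f"({20 * math.log10(boosted):+.1f} dB), cut x{cut:.2f}")
```

With a scale factor of 1 the residual is doubled (about 6 dB) in one half-epoch and removed entirely in the other; with 0.5 the boost is about 3.5 dB.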
Alternatively, dividing the source material into segments whose length is not a factor of the MPEG coding epoch may lead to a
more efficient allocation of scale factors and quantisation levels during the second coding - I don't know; trial and
error would show this. Longer segments are more likely to lead to audible echo artefacts, whereas shorter segments are
more likely to lead to audible harmonics.
Residual feedback is best implemented within the MPEG coder itself as the quantisation levels can then be distributed
amongst the frequency bands in full knowledge that the residual feedback is taking place. However, it should also work
outside the coder module if this is easier to implement. It may also be possible to implement a double or even triple
residual feedback process; however, this would be complex to design and a law of diminishing returns would apply.
The residual feedback technique proposed above should improve the coding resolution of the higher audio frequencies at
a given bit rate by up to a factor of 2. However, how well it will work will depend on how well the original coder
performs at the higher audio frequencies - for example, it won't help those frequency bands allocated zero bits per audio
sample.
Spectral band replication
Spectral band replication (SBR) is a technique developed by Coding
Technologies. Using this technique, the lower part of the audio spectrum is transmitted by conventional means, whilst
the SBR decoder reconstructs the higher frequencies based on an analysis of the lower frequencies, guided by a very low
data rate bit stream. This significantly increases the efficiency of audio coding. Employing SBR in DAB would require a modification to the
receivers. However, unmodified receivers would continue to receive the lower part of the audio spectrum, transmitted using
conventional mp2. Because old receivers would receive a degraded service, broadcasters would have to wait until most
receivers are SBR-equipped before switching transmission mode.