Interactive Audio Codecs | 17 | Beep to Boom | Simon N Goodwin

ABSTRACT

Codecs compress and expand data, thereby fitting more sounds into a limited memory space, with varying hardware costs. There’s no ideal codec; this chapter explains why, and how to pick the best one for your interactive applications.

It explores codec use in popular games, providing C source for a simple cross-platform decoder. It also discusses the related technique of downsampling, μ-law (mu-law) and similar ITU G.711 telephony codecs, byte-ordering considerations and the panoply of fast ADPCM codecs including Apple, Intel, Nvidia and Microsoft IMA, Nintendo, Sony and CD-i variations, with implementation references.
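As an illustration of how small such a decoder can be, here is a minimal sketch of G.711 μ-law expansion in C. It is not the chapter's own source; it follows the widely published reference arithmetic for G.711, and the names `ulaw_decode` and `ulaw_decode_buffer` are hypothetical. It works on values rather than raw memory, so the result does not depend on host byte order; byte ordering only matters once the 16-bit output is written to a file or stream.

```c
#include <stddef.h>
#include <stdint.h>

/* Bias added by the mu-law encoder before segment selection. */
#define ULAW_BIAS 0x84

/* Expand one 8-bit G.711 mu-law code to a 16-bit linear PCM sample.
   (Hypothetical helper name, standard G.711 arithmetic.) */
int16_t ulaw_decode(uint8_t code)
{
    /* Mu-law codes are stored with all bits inverted. */
    unsigned u = (unsigned)(~code) & 0xFFu;

    /* Low 4 bits are the mantissa, next 3 bits the segment (exponent). */
    int magnitude = (int)((u & 0x0Fu) << 3) + ULAW_BIAS;
    magnitude <<= (u & 0x70u) >> 4;

    /* Top bit is the sign; remove the encoder bias on the way out. */
    return (int16_t)((u & 0x80u) ? (ULAW_BIAS - magnitude)
                                 : (magnitude - ULAW_BIAS));
}

/* Expand a buffer of mu-law codes: 1 byte in, one 16-bit sample out,
   which is the 2:1 ratio relative to 16-bit PCM. */
void ulaw_decode_buffer(const uint8_t *in, int16_t *out, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        out[i] = ulaw_decode(in[i]);
}
```

A production decoder would more likely use a precomputed 256-entry lookup table, trading 512 bytes of memory for the shifts and branch in the inner loop.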

Expert tips include the quick way to locate ADPCM samples in binary data, how to further boost the effectiveness of ATRAC and XWMA compression, how to “preserve the punch” of compressed samples, and the “glitchfinder general” technique to locate and work around codec distortion, especially for unusual sounds. Constant-bitrate codecs capable of compression ratios from 2:1 to 21.4:1 without downsampling are compared.

The chapter notes quirks of the Xbox, Switch, PS3 and PS4 console hardware, along with ARM NEON and SSE optimisations. It then compares advanced psychoacoustic-masking codecs and evaluates their suitability for interactive use, covering the strengths of AAC and Opus and the limitations of MP3, NADPCM streaming, Ogg Vorbis, WMA and XWMA loops.