Opus 1.6: A Major Update with Enhanced Audio Coding Features


Opus 1.6 delivers significant advancements including experimental ML-based Speech Bandwidth Extension, improved Deep Redundancy (DRED) for better intelligibility and smaller models, experimental 96 kHz Opus HD support, and a new 24-bit integer audio API. This update boosts audio quality, robustness, and extends capabilities for high-resolution applications.

Opus 1.6 introduces significant enhancements and new features while maintaining full compatibility with RFC 6716. This major update focuses on improving speech quality, robustness, and extending the codec's capabilities for high-resolution audio.

ML-Based Speech Bandwidth Extension (BWE)

Opus 1.6 introduces an experimental wideband-to-fullband speech enhancer, first presented at WASPAA 2025. This addition to the Opus speech coding enhancement algorithms (covered by a related IETF draft) uses a neural network to generate high-frequency speech content (8-20 kHz) from wideband speech (0-8 kHz) without side information. Because nothing extra is sent in the bitstream, it can enhance speech produced by any previous Opus version without breaking compatibility. The approach is feasible because the phonetic information is already present in the lower frequency range, so the network only has to synthesize plausible high-band content. Narrowband-to-wideband extension is a more challenging problem, and Opus does not attempt it.

The model can optionally decode wideband speech into fullband speech sampled at 48 kHz and can be combined with wideband enhancement methods introduced in Opus 1.5. However, it does not replace highband content encoded in hybrid mode and will not activate for super-wideband or fullband audio.

Wideband speech decoded with (right) and without (left) the BWE option.

A secondary application of the BWE model is to extend the FARGAN wideband speech vocoder, which is used for deep packet loss concealment (PLC) and Deep Redundancy (DRED) decoding. Combined with BWE, DRED can now reach fullband quality, so the output of a fullband transmission stays more consistent when lost frames are recovered.

Fullband speech (hybrid mode) with significant DRED-decoded speech, without (right) and with (left) the BWE option.

The combination of NoLACE and BWE significantly improves speech quality, enabling good fullband quality speech at bitrates as low as 9 kb/s.

Subjective MOS evaluation of the new bandwidth extension. For uncoded speech, BWE closes about half the quality gap between wideband and fullband. For coded speech combined with NoLACE at 9 kb/s, it achieves quality similar to Opus 1.4 fullband at double the bitrate (18 kb/s) and closes the gap with the EVS codec. At 9 kb/s and above, Opus now exceeds the quality of purely neural codecs like EnCodec.

To use BWE, build Opus with the --enable-osce configure option and explicitly enable it at run time via -enable_osce_bwe. The decoder complexity setting (introduced in Opus 1.5) must also be 4 or higher. BWE is then applied to speech coded in SILK wideband mode, provided the decoder is configured for a 48 kHz sampling rate.
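The activation rules above can be collected into a single check. The following C sketch only models the conditions stated in this section and is not libopus code: the types decoder_config_t, coding_mode_t and bandwidth_t and the function bwe_would_apply() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types: NOT part of the libopus API. They only model the
   conditions listed in the text. */
typedef enum { MODE_SILK, MODE_HYBRID, MODE_CELT } coding_mode_t;
typedef enum { BW_NARROWBAND, BW_WIDEBAND, BW_SUPERWIDEBAND, BW_FULLBAND } bandwidth_t;

typedef struct {
    bool built_with_osce;    /* library configured with --enable-osce */
    bool bwe_requested;      /* BWE explicitly enabled at run time */
    int  decoder_complexity; /* decoder complexity setting */
    int  output_rate_hz;     /* sampling rate the decoder is configured for */
} decoder_config_t;

/* Returns true when, according to the rules described above, BWE would be
   applied to a frame coded with the given mode and bandwidth. */
static bool bwe_would_apply(const decoder_config_t *cfg,
                            coding_mode_t mode, bandwidth_t bw)
{
    assert(cfg != NULL);
    if (!cfg->built_with_osce || !cfg->bwe_requested) return false;
    if (cfg->decoder_complexity < 4) return false;   /* needs complexity >= 4 */
    if (cfg->output_rate_hz != 48000) return false;  /* needs a 48 kHz decoder */
    /* Only SILK wideband speech is extended; hybrid, super-wideband and
       fullband frames are left untouched. */
    return mode == MODE_SILK && bw == BW_WIDEBAND;
}
```

With all four settings satisfied, a SILK wideband frame passes the check, while a hybrid fullband frame or a decoder running at complexity 3 does not.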

DRED Improvements

Deep REDundancy (DRED), first introduced experimentally in Opus 1.5, has received substantial improvements in Opus 1.6. Because the DRED data carried in the bitstream depends on the model, the new model is incompatible with the Opus 1.5 version. The model is versioned, however, so pairing a 1.5 encoder with a 1.6 decoder (or vice versa) will not cause problems: the regular audio still decodes, and only the DRED information is unavailable to the incompatible decoder. The development team hopes this new model will be stable for the final DRED standard. More details on DRED are available in a dedicated paper and an IETF draft.

Better Intelligibility

The original 1.5 model sometimes over-smoothed the way the spectrum varies over time, especially at very low bitrates, which listeners perceived as slurred speech with lower intelligibility. The new 1.6 model is trained with an improved loss function that is more sensitive to large errors (it uses a fourth-power error term). This improves speech intelligibility, even if it does not always improve perceptual quality.
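To see why a fourth-power term is more sensitive to large errors, it helps to compare it with an ordinary squared error on two error patterns. The C sketch below is a toy comparison only: the source does not give the actual training loss, and the function names mean_squared_error() and mean_fourth_power_error() are chosen here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of err[i]^2 over n values. */
static double mean_squared_error(const double *err, size_t n)
{
    assert(err != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) sum += err[i] * err[i];
    return sum / (double)n;
}

/* Mean of err[i]^4 over n values: large errors dominate much more strongly. */
static double mean_fourth_power_error(const double *err, size_t n)
{
    assert(err != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double sq = err[i] * err[i];
        sum += sq * sq;
    }
    return sum / (double)n;
}
```

Take four errors of 0.5 each and, separately, three zero errors plus a single error of 1.0. Both patterns have the same mean squared error (0.25), so a squared-error loss cannot tell them apart. The mean fourth-power error is 0.0625 for the first pattern and 0.25 for the second, so the single large deviation is penalized four times as heavily.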

Intelligibility of the updated DRED model compared to the original Opus 1.5 model, measured at low bitrates. Blue regions indicate improved intelligibility with the new model, while red regions indicate a decrease. Absolute confusion matrices (compared to uncompressed speech) for the new model at low, medium, and high bitrates are also available. Results are taken from draft-lechler-mlcodec-test-battery.

Increased Robustness

Although DNN-based speech enhancement can often deal with noisy or reverberant speech, it is still valuable for the DRED model itself to be robust to such input. The 1.6 model was therefore trained on a mix of clean and noisy/reverberant speech. This augmented training significantly improves quality and intelligibility in challenging conditions without negatively impacting clean speech performance.

Smaller Model

Despite the improvements, the new DRED encoder and decoder models are approximately 3x smaller than their Opus 1.5 counterparts (600 kB, down from 1800 kB). This size reduction was achieved through architecture fine-tuning, increased sparsity, and the use of bottleneck layers for convolutional layers.

Experimental Opus HD Support

Opus 1.6 introduces experimental support for coding 96 kHz audio with bandwidth beyond the standard 20 kHz range and increased bitrates up to 2 Mb/s. Although 48 kHz audio suffices for most applications, use cases involving sensors or ultrasonics benefit from higher sampling rates and bitrates. Opus HD is implemented as an extension layer, which ensures backward compatibility: a standard Opus decoder can still decode an Opus HD stream; it simply does not use the extended bitrate and bandwidth.

Beyond extended bandwidth, Opus HD also provides increased resolution within the audible 0-20 kHz band. Standard Opus quantizers (RFC 6716) can reach up to 8 bits per coefficient at 510 kb/s. With Opus HD, quantizers can achieve up to 20 bits of depth. This layered implementation also positions Opus as a scalable codec. Note that quantizer resolution is distinct from PCM sample bit resolution: Opus can code the full dynamic range of 24-bit audio even with quantizer depths below 8 bits.
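One way to see why quantizer depth and dynamic range are separate things is a toy gain-shape coder, in which the signal level is coded separately from the normalized value. The C sketch below is an illustration under simplifying assumptions, not the actual Opus quantizer: the gain is kept exact, inputs to quantize_unit() are assumed to lie in [-1, 1], and the names quantize_unit() and gain_shape_code() are hypothetical.

```c
#include <assert.h>

/* Uniformly quantize x (assumed to lie in [-1, 1]) to `bits` bits,
   rounding to the nearest step. */
static double quantize_unit(double x, int bits)
{
    assert(bits >= 2 && bits <= 30);
    double steps = (double)(1L << (bits - 1));
    long q = (long)(x * steps + (x >= 0.0 ? 0.5 : -0.5));
    return (double)q / steps;
}

/* Toy gain-shape coding: normalize by the gain, quantize the normalized
   value coarsely, then scale back. The gain is passed through unquantized
   here to keep the example short. */
static double gain_shape_code(double x, double gain, int bits)
{
    assert(gain > 0.0);
    return gain * quantize_unit(x / gain, bits);
}
```

With a 4-bit quantizer, a very quiet sample such as 0.75 scaled down by 2^20 rounds to zero when quantized directly, but is reproduced at the correct level when the gain is carried separately. The coarse quantizer then limits only the relative precision, not the range of levels that can be represented.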

To enable Opus HD, use the --enable-qext configure option during build. For encoding, use the -qext option in opus_demo or OPUS_SET_QEXT(1) with the encoder API. If built with Opus HD support, the decoder automatically utilizes any Opus HD layer found, unless OPUS_SET_IGNORE_EXTENSIONS(1) is used.

New 24-bit Audio API

Opus 1.6 introduces a new 24-bit integer audio API for encoding and decoding. While the existing 16-bit integer and 32-bit float APIs remain available, this new option caters to high-resolution audio pipelines that prefer to avoid floating-point arithmetic. It is particularly useful on platforms where floating-point operations are expensive or for applications maintaining a pure integer pipeline.

Since the C standard lacks a native 24-bit integer type, the API uses opus_int32, storing audio data in the lower 24 bits, resulting in a nominal range of [-2^23, +2^23-1]. Crucially, like the floating-point API, the 24-bit API supports values slightly beyond this nominal range without hard clipping, preserving dynamic range peaks that would be lost with the 16-bit API.
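Applications typically have to convert their own sample formats into this 32-bit container before calling the 24-bit API. The C sketch below shows two such conversions under stated assumptions: int32_t stands in for opus_int32, the input is either packed little-endian 24-bit PCM or 16-bit PCM, and the helper names unpack_s24le() and pcm16_to_pcm24() are hypothetical and not part of libopus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpack n packed little-endian 24-bit samples (3 bytes each) into 32-bit
   integers, sign-extending so values fall in [-2^23, 2^23 - 1]. */
static void unpack_s24le(const uint8_t *in, int32_t *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t u = (uint32_t)in[3 * i]
                   | ((uint32_t)in[3 * i + 1] << 8)
                   | ((uint32_t)in[3 * i + 2] << 16);
        /* Flip bit 23, then subtract 2^23: a portable sign extension. */
        out[i] = (int32_t)(u ^ 0x800000u) - 0x800000;
    }
}

/* Scale a 16-bit sample to the 24-bit nominal range (multiply by 2^8). */
static int32_t pcm16_to_pcm24(int16_t s)
{
    return (int32_t)s * 256;
}
```

The byte triple FF FF FF unpacks to -1, 00 00 80 to -8388608 (that is, -2^23), and FF FF 7F to 8388607 (2^23 - 1), matching the nominal range given above.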

New calls use a "24" suffix (e.g., opus_encode24() and opus_decode24()). Comprehensive support covers the standard API, Multistream API (opus_multistream_encode24() / opus_multistream_decode24()), Projection (Ambisonics) API (opus_projection_encode24() / opus_projection_decode24()), Custom Modes (opus_custom_encode24() / opus_custom_decode24()), and DRED (opus_decoder_dred_decode24()).

Miscellaneous Improvements

Header files previously used internal macros with an __opus prefix. Identifiers beginning with a double underscore are reserved for the implementation by the C standard, so this usage was non-conforming and has been fixed. As a consequence, libopusenc (which wrongly relied on these macros) must be updated to libopusenc 0.3 by anyone compiling it against the new headers; existing binaries are unaffected.

A side benefit of the Opus HD development is significantly improved accuracy in the fixed-point implementation, now more closely matching the floating-point implementation.

Architecture-specific optimizations for MIPS have been updated and enhanced.

Run-time CPU detection for x86 SIMD instructions now supports OpenBSD.

Numerous minor issues found in previous versions have also been fixed.

This release builds upon Opus 1.5, continuously evolving the codec's capabilities. Users are encouraged to try Opus 1.6 and provide feedback to help address any emerging issues.