Codec 2
Developer(s)	David Grant Rowe
Initial release	August 25, 2010
Stable release	1.2.0 / June 24, 2023
Repository	github.com/drowe67/codec2
Written in	C99
Platform	Cross-platform
Type	Audio codec
License	GNU LGPL, v2.1
Website	www.rowetel.com?page_id=452

Codec 2 is a low-bitrate speech audio codec (speech coding) that is patent-free and open source.^[1] Codec 2 compresses speech using sinusoidal coding, a method specialized for human speech. Modes with bit rates from 3200 down to 450 bit/s have been implemented. Codec 2 was designed for amateur radio and other high-compression voice applications.

Overview

The codec was developed by David Grant Rowe, with the support and cooperation of other researchers (e.g., Jean-Marc Valin from Opus).^[2]

Codec 2 offers 3200, 2400, 1600, 1400, 1300, 1200, 700 and 450 bit/s codec modes. It is reported to outperform most other low-bitrate speech codecs; for example, it uses half the bandwidth of Advanced Multi-Band Excitation to encode speech with similar quality.^[citation needed] The encoder takes 16-bit PCM audio samples as input and outputs packed bytes; the decoder takes packed bytes as input and outputs PCM audio samples. The audio sample rate is fixed at 8 kHz.
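To put the mode bit rates in context, the following minimal sketch compares them with the bit rate of the uncompressed input, which follows directly from the 16-bit sample size and the 8 kHz sample rate stated above. The names used are illustrative only and are not part of the Codec 2 software.

```python
# Illustrative arithmetic only; none of these names come from the Codec 2 API.
SAMPLE_RATE_HZ = 8000      # fixed sample rate stated in the article
BITS_PER_SAMPLE = 16       # 16-bit PCM input
MODES_BIT_S = [3200, 2400, 1600, 1400, 1300, 1200, 700, 450]

# Bit rate of the uncompressed PCM stream fed to the encoder.
raw_bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE


def compression_ratio(mode_bit_s):
    """Ratio of the raw PCM bit rate to the bit rate of a codec mode."""
    return raw_bit_rate / mode_bit_s


for mode in MODES_BIT_S:
    print(f"{mode:>4} bit/s -> {compression_ratio(mode):.1f}:1")
```

The ratio only describes how much smaller the encoded stream is; it says nothing about the resulting speech quality.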

The reference implementation is open source and is freely available in a GitHub repository.^[3] The source code is released under the terms of version 2.1 of the GNU Lesser General Public License (LGPL).^[4] It is written in C, and the current source code requires floating-point arithmetic, although the algorithm itself does not. The reference software package also includes a frequency-division multiplex digital voice software modem and a graphical user interface based on wxWidgets. The software is developed on Linux; a port for Microsoft Windows built with Cygwin and a version for Apple macOS are also offered.

The codec has been presented at various conferences and has received the 2012 ARRL Technical Innovation Award^[5] and the Linux Australia Conference's Best Presentation Award.^[6]

Technology

Internally, parametric audio coding algorithms operate on 10 ms PCM frames using a model of the human voice. Each of these audio segments is declared voiced (vowel) or unvoiced (consonant).

Codec 2 uses sinusoidal coding to model speech, an approach closely related to that of multi-band excitation codecs. Sinusoidal coding exploits regularities (periodicity) in the pattern of overtone frequencies and layers harmonic sinusoids. Spoken audio is recreated by modelling speech as a sum of harmonically related sine waves with independent amplitudes, on top of a determined fundamental frequency of the speaker's voice (pitch). The spectral envelope that shapes these amplitudes is described by line spectral pairs (LSPs). The (quantised) pitch and the amplitude (energy) of the harmonics are encoded and, together with the LSPs, exchanged across a channel in a digital format. The LSP coefficients represent the Linear Predictive Coding (LPC) model in the frequency domain, and lend themselves to a robust and efficient quantisation of the LPC parameters.^[7]
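The idea of rebuilding speech as a sum of harmonically related sine waves can be illustrated with a minimal sketch. This is not the Codec 2 decoder: the function name, the example pitch and the amplitude values are hypothetical, the amplitudes are supplied directly instead of being derived from decoded LSPs, and no phase modelling or frame overlap is attempted.

```python
import math

SAMPLE_RATE = 8000   # Hz, the fixed sample rate stated in the article
FRAME_SAMPLES = 80   # one 10 ms frame at 8 kHz


def synthesise_frame(pitch_hz, amplitudes, phases=None):
    """Sum harmonics of pitch_hz; amplitudes[m - 1] scales harmonic m.

    Hypothetical helper for illustration, not part of the Codec 2 API.
    """
    if phases is None:
        phases = [0.0] * len(amplitudes)
    # Fundamental frequency expressed in radians per sample.
    w0 = 2.0 * math.pi * pitch_hz / SAMPLE_RATE
    frame = []
    for n in range(FRAME_SAMPLES):
        sample = 0.0
        for m, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
            sample += amp * math.cos(m * w0 * n + phase)
        frame.append(sample)
    return frame


# A 100 Hz "voice" with three harmonics of decreasing amplitude (made-up values).
frame = synthesise_frame(100.0, [1.0, 0.5, 0.25])
```

Only the pitch and a handful of amplitude parameters are needed to describe the frame, which is why a model of this kind can reach very low bit rates.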

The encoded output consists of bit fields that are packed together into bytes. These bit fields can optionally be Gray coded before being grouped together. Gray coding may be useful when the bits are sent over a raw channel, but normally an application will simply send the bit fields as they are. The bit fields make up the various parameters that are stored or exchanged (pitch, energy, voicing booleans, LSPs, etc.).
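As a rough illustration of Gray coding and bit-field packing, the sketch below converts values to and from Gray code and packs a list of fields into bytes. The field widths, the MSB-first bit order and the function names are assumptions made for this example; they do not reflect the actual Codec 2 bit allocation or API.

```python
def binary_to_gray(value):
    """Convert a non-negative integer to its reflected binary Gray code."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Invert binary_to_gray by XOR-ing together all right shifts of gray."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def pack_fields(fields):
    """Pack (value, width) pairs MSB-first into bytes.

    The last byte is zero-padded. Bit order and padding are assumptions
    for this sketch, not the Codec 2 format.
    """
    bits = []
    for value, width in fields:
        for i in range(width - 1, -1, -1):
            bits.append((value >> i) & 1)
    while len(bits) % 8:
        bits.append(0)
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


# Hypothetical frame: a 1-bit voicing flag and a 7-bit Gray-coded pitch index.
packed = pack_fields([(1, 1), (binary_to_gray(5), 7)])
```

With Gray coding, neighbouring index values differ in exactly one bit, so a single bit error tends to move a decoded parameter to an adjacent value.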

For example, mode 3200 converts 20 ms of audio into 64 bits. So 64 bits are output every 20 ms (50 times a second), for a minimum data rate of 3200 bit/s. These 64 bits are passed as 8 bytes to the application, which either unpacks the bit fields or sends the bytes over a data channel.

Another example is mode 1300, which takes 40 ms of audio and outputs 52 bits every 40 ms (25 times a second), for a minimum rate of 1300 bit/s. These 52 bits are passed as 7 bytes to the application or data channel.
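The arithmetic behind these two examples can be checked with a short sketch. The helper name is hypothetical; the frame sizes and durations are the ones given above.

```python
import math


def frame_stats(bits_per_frame, frame_ms):
    """Return (bit rate in bit/s, bytes per frame, padding bits per frame)."""
    bit_rate = bits_per_frame * 1000 // frame_ms
    bytes_per_frame = math.ceil(bits_per_frame / 8)
    padding_bits = bytes_per_frame * 8 - bits_per_frame
    return bit_rate, bytes_per_frame, padding_bits


for name, bits, ms in [("3200", 64, 20), ("1300", 52, 40)]:
    rate, nbytes, pad = frame_stats(bits, ms)
    print(f"mode {name}: {rate} bit/s, {nbytes} bytes/frame, {pad} padding bits")
```

In mode 1300 the 52 bits do not fill 7 bytes exactly, so 4 bits per frame are padding. If whole bytes are transmitted, the channel carries 7 × 8 × 25 = 1400 bit/s, which is why the figures above are minimum rates.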

Adoption

Codec 2 is currently used in several radios and software-defined radio systems:

FreeDV^[8]
FlexRadio 6000 series^[9]
SM1000^[10]
Quisk^[11]
M17 Project^[12]

Codec 2 has also been integrated into FreeSWITCH, and a patch is available for support in Asterisk.

There was an FM-to-Codec 2 digital voice repeater in Earth orbit on the amateur radio CubeSat LilacSat-1 (call sign ON02CN, QB50 constellation), which was launched and subsequently deployed from the International Space Station in 2017.^[13]

History

The prominent free software advocate and radio amateur Bruce Perens lobbied for the creation of a free speech codec for operation at less than 5 kbit/s. Since he did not have the background himself, he approached Jean-Marc Valin in 2008, who introduced him to lead developer David Grant Rowe, who had worked with Valin on Speex on several occasions. Rowe is also a radio amateur (amateur radio call sign VK5DGR) and had experience in creating and using voice codecs and other signal processing algorithms for speech signals. He obtained a PhD in speech coding in the 1990s and was involved in the development of one of the first satellite telephony systems (Mobilesat).

He agreed to the task and announced his decision to work on a format on August 21, 2009. He built on the research and findings from his doctoral thesis.^[14]^[15] The underlying sinusoidal modelling goes back to developments by Robert J. McAulay and Thomas F. Quatieri (MIT Lincoln Laboratory) from the mid-1980s.

In August 2010, David Rowe published version 0.1 alpha.^[16] Version 0.2 was released towards the end of 2011, introducing a 1,400 bit/s mode and significant improvements in quantization.

In January 2012, at linux.conf.au, Jean-Marc Valin helped improve the quantization of line spectral pairs, with which Rowe was less familiar.^[17] After several changes to the available bit rate modes in the winter and spring of 2011/2012, modes at 2,400, 1,400 and 1,200 bit/s were available from May 2012.

Codec 2 700C, a new mode with a bit rate of 700 bit/s, was finished in early 2017.^[18]

In July 2018, an experimental 450 bit/s mode was demonstrated, developed as part of a master's thesis at the University of Erlangen-Nuremberg. Building on the principle of the 700C mode, improved training of the vector quantization allowed the data rate to be reduced further.^[19]
