In [[computer science]] and [[information theory]], '''Tunstall coding''' is a form of [[entropy coding]] used for [[lossless data compression]].

== History ==

Tunstall coding was the subject of Brian Parker Tunstall's 1967 PhD thesis at the Georgia Institute of Technology, titled "Synthesis of noiseless compression codes".<ref>((cite book|last=Tunstall, Brian Parker|title=Synthesis of noiseless compression codes|date=September 1967|publisher=[[Georgia Institute of Technology]]))</ref>

Its design is a precursor to [[Lempel–Ziv]].

== Properties ==

Unlike [[variable-length code]]s such as [[Huffman coding|Huffman codes]], which map source symbols to codewords of varying length,
Tunstall coding is a [[code]] which maps variable-length source words to codewords with a fixed number of bits.<ref>http://www.rle.mit.edu/rgallager/documents/notes1.pdf, Study of Tunstall's algorithm at [[MIT]]</ref>

Both Tunstall codes and Lempel–Ziv codes represent variable-length words by fixed-length codes.<ref>
"Variable to fixed length adaptive source coding - Lempel-Ziv coding".
[http://www.rle.mit.edu/rgallager/documents/notes2a.pdf]
[http://web.mit.edu/6.441/spring06/handout/sup/Lempel-Ziv_78_-_Notes_2a.pdf]
</ref>

Unlike [[Typical set|typical set encoding]], Tunstall coding parses a stochastic source into source words of variable length.

It can be shown<ref>[http://ipg.epfl.ch/lib/exe/fetch.php?media=en:courses:2013-2014:itc:tunstall.pdf], Study of Tunstall's algorithm from [[EPFL]]'s Information Theory department</ref>
that, for a large enough dictionary, the number of bits per source letter can be arbitrarily close to <math>H(U)</math>, the [[Entropy (information theory)|entropy]] of the source.

== Algorithm ==

The algorithm takes as input an alphabet <math>\mathcal{U}</math>, along with the probability of each letter.
It also takes a constant <math>C</math>, which is an upper bound on the size of the dictionary that it will compute.
The dictionary in question, <math>D</math>, is constructed as a tree of probabilities in which each edge is associated with a letter of the input alphabet, so that each leaf corresponds to one dictionary word.
The algorithm proceeds as follows:

 D := tree of <math>|\mathcal{U}|</math> leaves, one for each letter in <math>\mathcal{U}</math>.
 While <math>|D| + |\mathcal{U}| - 1 \le C</math>:
     Convert the most probable leaf to a tree with <math>|\mathcal{U}|</math> leaves.
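
The construction above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation. It assumes a memoryless source whose letters are single-character strings, and the names <code>tunstall_dictionary</code> and <code>assign_codewords</code> are hypothetical. The loop expands a leaf only while the result still fits within the bound <math>C</math>, since each expansion adds <math>|\mathcal{U}|-1</math> words.

```python
import heapq


def tunstall_dictionary(probs, max_size):
    """Return {source word: probability} for a memoryless source.

    probs    -- dict mapping each letter to its probability (assumed input format)
    max_size -- the upper bound C on the number of dictionary words
    """
    n = len(probs)
    if n < 2 or max_size < n:
        raise ValueError("need at least 2 letters and max_size >= alphabet size")
    # heapq is a min-heap, so probabilities are negated to pop the largest first.
    leaves = [(-p, letter) for letter, p in probs.items()]
    heapq.heapify(leaves)
    # Expanding a leaf removes 1 word and adds n, a net gain of n - 1 words.
    while len(leaves) + n - 1 <= max_size:
        neg_p, word = heapq.heappop(leaves)
        for letter, p in probs.items():
            heapq.heappush(leaves, (neg_p * p, word + letter))
    return {word: -neg_p for neg_p, word in leaves}


def assign_codewords(dictionary):
    """Give every word a fixed-length binary codeword (the order is arbitrary)."""
    bits = max(1, (len(dictionary) - 1).bit_length())
    return {w: format(i, f"0{bits}b") for i, w in enumerate(sorted(dictionary))}


# Illustrative three-letter source; these probabilities are made up for the demo.
words = tunstall_dictionary({"a": 0.7, "b": 0.2, "c": 0.1}, max_size=8)
codes = assign_codewords(words)
print(sorted(words))
print(codes)
```

With these illustrative inputs, the leaves "a" and then "aa" are expanded, giving seven words that each receive a 3-bit codeword. A heap is used here only as a convenient way to find the most probable leaf.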

== Example ==
((cleanup|reason=wrong probabilities|date=August 2014))
Suppose that we wish to encode the string "hello, world".
Assume further (somewhat unrealistically) that the input alphabet <math>\mathcal{U}</math>
contains only the characters of the string "hello, world", that is, 'h', 'e', 'l', ',', ' ', 'w', 'o', 'r', 'd'.
We can therefore estimate the probability of each character from its frequency in the input string.
For instance, the letter 'l' appears three times in a string of 12 characters, so its probability is <math>3 \over 12</math>.
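
As a small sketch of this step, the letter probabilities can be estimated by counting characters. The variable names are illustrative, and exact fractions are used only to avoid rounding.

```python
from collections import Counter
from fractions import Fraction

text = "hello, world"
counts = Counter(text)

# Empirical probability of each character: occurrences / string length.
probs = {ch: Fraction(n, len(text)) for ch, n in counts.items()}

for ch, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(repr(ch), p)
```

Note that <code>Fraction</code> reduces <math>3 \over 12</math> to <math>1 \over 4</math>.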

We initialize the tree, starting with a tree of <math>|\mathcal{U}|=9</math> leaves. Each word is therefore directly associated with a letter of the alphabet.
The 9 words that we thus obtain can be encoded into a fixed-size output of <math>\lceil \log_2(9) \rceil = 4</math> bits.

[[File:Tunstall-1.png|Tunstall "hello, world" example — one iteration]]

We then take the leaf of highest probability (here, <math>w_1</math>), and convert it to yet another tree of <math>|\mathcal{U}|=9</math> leaves, one for each character.
We then compute the probabilities of those new leaves. Treating the source as memoryless, the probability of a two-letter word is the product of the probabilities of its letters.
For instance, the sequence of two letters 'l' has probability <math>{3 \over 12} \cdot {3 \over 12} = {1 \over 16}</math>.

We obtain 17 words, which can each be encoded into a fixed-sized output of <math>\lceil \log_2(17) \rceil = 5</math> bits.

[[File:Tunstall-2.png|Tunstall "hello, world" example — two iterations]]

Note that we could iterate further, increasing the number of words by <math>|\mathcal{U}|-1=8</math> every time.
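
The growth of the dictionary and the resulting codeword length can be checked with a short calculation. This is a sketch only, and the helper names are hypothetical.

```python
def dictionary_size(alphabet_size, expansions):
    """Number of words after a given number of leaf expansions."""
    # Each expansion replaces one leaf by alphabet_size new leaves.
    return alphabet_size + expansions * (alphabet_size - 1)


def codeword_bits(num_words):
    """Bits needed for a fixed-length code; equals ceil(log2(num_words))."""
    return (num_words - 1).bit_length()


for k in range(4):
    size = dictionary_size(9, k)
    print(k, size, codeword_bits(size))
```

For the 9-letter alphabet of the example, two expansions give 25 words, which still fit in 5 bits, while a third expansion gives 33 words and requires 6 bits.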

== Limitations ==

Tunstall coding requires the probability of each letter of the alphabet to be known before parsing begins.
This limitation is shared with [[Huffman coding]].

Its requiring a fixed-length block output makes it lesser than [[Lempel–Ziv]], which has a similar dictionary-based design, but with a variable-sized block output.((clarify |date=February 2017 |reason= both Lempel–Ziv and Tunstall represent variable-length text blocks by fixed-length codewords. ))

== Implied Read for base modification ==
[[File:Ternary Tunstall Tree.png|thumb|Ternary Tunstall Tree]]
This is an example of a Tunstall code being used to read, for transmission, any data that is scrambled, e.g. by polynomial scrambling. This particular example converts the data from base 2 to base 3 as the stream is read, thereby avoiding expensive base modification routines. Base modification is bound by the efficiency of the reads: ideally, an average of <math display="inline">\log_{2} n</math> bits is read for each base-<math display="inline">n</math> digit produced. Since the new base carries at best <math display="inline">\log_{2} n</math> bits per digit, reading at close to this rate ensures that the reads do not reduce the efficiency of transmission for which the base modification is employed in the first place. The read-to-modify-base mechanism can therefore be used to transmit data efficiently across channels that have a different base, e.g. to transmit binary data across MLT-3 channels with higher efficiency than mapping codes, which leave a large number of codes unused.
{| class="wikitable" style="text-align: center;"
!Symbol
!Code
|-
|AA
|010
|-
|AB
|011
|-
|AC
|100
|-
|B
|00
|-
|CA
|101
|-
|CB
|110
|-
|CC
|111
|}
We are essentially reading perfectly scrambled binary data, or 'implied data', for the purpose of transmitting it over base-3 channels; see the leaf nodes in the ternary Tunstall tree. The first digit read is 'B' 25% of the time, because its code has length 2 and therefore an implied probability of 25% in the implied data. A read of 'B' does not read any further, but with 75% probability the first digit is 'A' or 'C', which requires a second digit. Thus the efficiency of the read is 2.75 (the average length of the size-7 Huffman code) / 1.75 (the average length of the 1- or 2-digit base-3 Tunstall code) = <math display="inline">1.57142857</math> bits per digit, which is, as required, very close to <math display="inline">\log_{2}3=1.5849625</math>; this amounts to an efficiency of <math display="inline">99.15\%</math>. We can then transmit the symbols over base-3 channels efficiently.
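
The figures above can be reproduced from the table with a short calculation. This is a sketch that assumes the input bits are independent and uniformly distributed, so that a codeword of length <math display="inline">k</math> is read with probability <math display="inline">2^{-k}</math>. The variable names are illustrative.

```python
from fractions import Fraction
from math import log2

# Ternary symbols and their binary codes, copied from the table above.
code = {"AA": "010", "AB": "011", "AC": "100", "B": "00",
        "CA": "101", "CB": "110", "CC": "111"}

# Assumption: uniformly random input bits, so P(codeword) = 2 ** -len(codeword).
prob = {sym: Fraction(1, 2 ** len(bits)) for sym, bits in code.items()}

avg_bits = sum(p * len(code[sym]) for sym, p in prob.items())  # bits read
avg_digits = sum(p * len(sym) for sym, p in prob.items())      # ternary digits
bits_per_digit = avg_bits / avg_digits
efficiency = float(bits_per_digit) / log2(3)

print(avg_bits, avg_digits, float(bits_per_digit), efficiency)
```

The averages come out to 11/4 = 2.75 bits and 7/4 = 1.75 ternary digits per read, matching the values used in the text.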

== References ==
((reflist))

((commons category))

((Compression methods))

[[Category:Lossless compression algorithms]]
