
LZWL is a syllable-based variant of the character-based LZW compression algorithm[1][2]. It can work with syllables produced by any syllable-decomposition algorithm, and it can also be applied to words.

Algorithm


The character-based LZW algorithm works as follows. In the initialization step, the dictionary is filled with all characters of the alphabet. In each subsequent step, the encoder searches for the longest string S that is in the dictionary and matches a prefix of the not-yet-coded part of the input. The number of phrase S is sent to the output. A new phrase, formed by concatenating S with the character that follows S in the input, is then added to the dictionary, and the input position is moved forward by the length of S. Decoding has only one special case to handle: the decoder can receive the number of a phrase that is not yet in its dictionary. In this case, the phrase can be reconstructed by concatenating the previously decoded phrase with its own first character.
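The character-based steps above can be sketched in Python roughly as follows. This is a minimal illustration rather than a reference implementation: the function names `lzw_encode` and `lzw_decode` are chosen here for the example, phrase numbers are kept as plain integers instead of being packed into bits, and the input is assumed to contain only characters from the given alphabet.

```python
def lzw_encode(text, alphabet):
    """Return the list of phrase numbers for text (illustrative sketch)."""
    # Initialization: one phrase per character of the alphabet.
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    codes = []
    pos = 0
    while pos < len(text):
        # Extend the match while the longer string is still a known phrase.
        end = pos + 1
        while end < len(text) and text[pos:end + 1] in dictionary:
            end += 1
        codes.append(dictionary[text[pos:end]])
        # New phrase: S plus the character that follows it in the input.
        if end < len(text):
            dictionary[text[pos:end + 1]] = len(dictionary)
        pos = end  # move forward by the length of S
    return codes


def lzw_decode(codes, alphabet):
    """Rebuild the text from phrase numbers (illustrative sketch)."""
    if not codes:
        return ""
    phrases = list(alphabet)
    previous = phrases[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code < len(phrases):
            current = phrases[code]
        elif code == len(phrases):
            # Special case: the phrase is not in the dictionary yet.
            current = previous + previous[0]
        else:
            raise ValueError("invalid phrase number")
        phrases.append(previous + current[0])
        out.append(current)
        previous = current
    return "".join(out)
```

With the input "abababa" and the alphabet "ab", the last emitted number refers to a phrase the decoder has not yet stored, so decoding it exercises the special case.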

The syllable-based version uses a list of syllables as its alphabet. In the initialization step, the empty syllable and small syllables from a database of frequent syllables are added to the dictionary. Finding string S and coding its number works as in the character-based version, except that S is a string of syllables. The number of phrase S is encoded to the output. S can be the empty syllable.
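The initialization and the search for S might look like the following Python sketch. Several details are assumptions made for illustration only: phrases are represented as tuples of syllables, the empty syllable is the empty tuple with number 0, and `frequent_syllables` stands in for the database of frequent syllables (no filtering by syllable size is attempted). The function names are hypothetical.

```python
def init_dictionary(frequent_syllables):
    """Build the initial phrase dictionary (illustrative sketch)."""
    # Phrase 0 is the empty syllable (assumed numbering).
    dictionary = {(): 0}
    for syllable in frequent_syllables:
        # Each frequent syllable becomes a one-syllable phrase.
        dictionary.setdefault((syllable,), len(dictionary))
    return dictionary


def longest_match(dictionary, syllables, pos):
    """Return the longest known phrase starting at syllables[pos].

    Returns the empty tuple (the empty syllable) when even the first
    syllable is unknown. Greedy extension is sufficient as long as every
    prefix of a stored phrase is itself stored.
    """
    end = pos
    while end < len(syllables) and tuple(syllables[pos:end + 1]) in dictionary:
        end += 1
    return tuple(syllables[pos:end])
```

For example, with the frequent syllables "the", "re" and "in", matching against the input syllables "the", "re", "xyz" yields the one-syllable phrase ("the",), while matching at the unknown syllable "xyz" yields the empty syllable.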

If S is the empty syllable, the encoder reads one syllable K from the input and encodes it using the methods for coding new syllables. Syllable K is then added to the dictionary, and the input position is moved forward by the length of K. Otherwise, the input position is moved forward by the length of S.

Adding a phrase to the dictionary differs from the character-based version. Let S1 denote the phrase found in the next step. If S and S1 are both non-empty, a new phrase is added to the dictionary, formed by concatenating S with the first syllable of S1. This approach has two advantages. First, phrases are not built from syllables that appear only once. Second, the decoder can never receive the number of a phrase that is not in its dictionary.
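A complete round trip can be sketched in Python as follows. The sketch rests on several assumptions that are not fixed by the description: phrases are tuples of syllables, the empty syllable is phrase 0, a new syllable K is written to the output verbatim after phrase number 0 (a placeholder for the actual methods of coding new syllables), and the new phrase is taken to be S followed by the first syllable of S1, the direct analogue of the character-based rule. The names `lzwl_encode` and `lzwl_decode` are hypothetical.

```python
def lzwl_encode(syllables, frequent_syllables=()):
    """Encode a list of syllables (illustrative sketch)."""
    dictionary = {(): 0}  # phrase 0: the empty syllable
    for syl in frequent_syllables:
        dictionary.setdefault((syl,), len(dictionary))
    output = []
    previous = ()  # phrase found in the previous step
    pos = 0
    while pos < len(syllables):
        end = pos
        while end < len(syllables) and tuple(syllables[pos:end + 1]) in dictionary:
            end += 1
        s = tuple(syllables[pos:end])
        output.append(dictionary[s])
        if s:
            # Delayed addition: only when both phrases are non-empty.
            if previous:
                dictionary.setdefault(previous + s[:1], len(dictionary))
            pos = end
        else:
            k = syllables[pos]
            output.append(k)  # placeholder for new-syllable coding
            dictionary[(k,)] = len(dictionary)
            pos += 1
        previous = s
    return output


def lzwl_decode(codes, frequent_syllables=()):
    """Decode the output of lzwl_encode (illustrative sketch)."""
    phrases = [()]
    known = {()}

    def add(phrase):
        if phrase not in known:
            known.add(phrase)
            phrases.append(phrase)

    for syl in frequent_syllables:
        add((syl,))
    result = []
    previous = ()
    stream = iter(codes)
    for number in stream:
        s = phrases[number]  # always present: no special case needed
        if s:
            if previous:
                add(previous + s[:1])
            result.extend(s)
        else:
            k = next(stream)  # the new syllable follows phrase 0
            add((k,))
            result.append(k)
        previous = s
    return result
```

Because the encoder adds a phrase only after S1 has been found and emitted, the decoder can perform the same addition at the same point, so every number it receives already refers to a stored phrase.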

References

  1. ^ http://www.cs.vsb.cz/dateso/2005/slides/slides6.pps
  2. ^ Salomon, David; Motta, Giovanni (2010-01-18). Handbook of Data Compression. ISBN 9781848829039. Retrieved 2014-07-11.