《(6.7.1)--ch6-13HuffmanCoding0821.pdf》由会员分享,可在线阅读,更多相关《(6.7.1)--ch6-13HuffmanCoding0821.pdf(16页珍藏版)》请在taowenge.com淘文阁网|工程机械CAD图纸|机械工程制图|CAD装配图下载|SolidWorks_CaTia_CAD_UG_PROE_设计图分享下载上搜索。
1、Data Structures and Algorithms 26.7 Huffman Coding The Huffman tree can be used to obtain the code with the shortest average length.Therefore,the Huffman tree has a wide range of applications in information transmission and data compression.In practical applications such as information transmission,
2、the characters appearing in the text need to be binary coded.After transmission,the binary code must be translated into the original characters.This is a typical coding and decoding problem.(1)The code can be decoded uniquely.(2)The code length should be as short as possible.Data Structures and Algo
3、rithmsData Structures and AlgorithmsChapter 6 Trees and Binary TreesIn the design of coding,two principles are usually followed:3Data Structures and AlgorithmsData Structures and AlgorithmsChapter 6 Trees and Binary Trees6.7 Huffman Coding1.Related conceptsThe codes with the shortest average length
4、are generally unequal length codes.Equal length coding:The code length of each character is the same.Unequal length coding:Characters with high frequency of use have short codes,and characters with low frequency of use have relatively long codes.4Data Structures and AlgorithmsData Structures and Alg
5、orithmsChapter 6 Trees and Binary Trees6.7 Huffman CodingExample:The character set to be encoded is:A、B、CEqual length coding(two bits per character)A:00,B:01,C:10Unequal length coding can be designed as:A:0,B:11,C:10If the text to be encoded is:AAABAC,the coding results are:Equal length coding:00000
6、0010010,total length is 12 bits.Unequal length coding:00011010,total length is 8 bits.5Data Structures and AlgorithmsData Structures and AlgorithmsChapter 6 Trees and Binary Trees6.7 Huffman CodingUnequal length coding has the problem that it cannot be effectively recognized when decoding!Example:Th
7、e character set to be encoded is:A、B、C、DUnequal length coding is:A:0 B:1 C:00 D:01If the text to be encoded is:ABACAD,the encoding result is:01000001The issue is:01000001 can be decoded as ABACAD01000001 can also be decoded as DCCD01000001 can also be decoded as ABAAAAAB6Data Structures and Algorith
8、msData Structures and AlgorithmsChapter 6 Trees and Binary Trees6.7 Huffman CodingFor unequal length coding,in For unequal length coding,in order to effectively identify order to effectively identify the codes without adding the codes without adding delimiters,the codes should be delimiters,the code
9、s should be prefix codesprefix codes.7Data Structures and AlgorithmsData Structures and AlgorithmsChapter 6 Trees and Binary Trees6.7 Huffman CodingPrefix encoding:The encoding of any character is not a prefix of another character encoding in the same character set.In the previous example,the code i
10、s designed as A:0,B:1,C:00,D:01,which does not comform to the rules of prefix encoding and cannot be effectively identified!How to design?The The prefix code prefix code that has the shortest average length that has the shortest average length and can be effectively identified.and can be effectively
11、 identified.Use the Huffman tree!Use the Huffman tree!8Data Structures and AlgorithmsData Structures and AlgorithmsChapter 6 Trees and Binary Trees6.7 Huffman Coding The Huffman tree is a binary tree with the smallest weighted path length value.Its characteristics are:The idea of designing coding wi
12、th Huffman tree is:Each character to be encoded corresponds to a leaf node;The frequency of utilization of each character corresponds to the weight of the leaf;The encoding of each character corresponds to the path from root to leaf.The greater the weight of the leaf node,the closer to the root!The
13、principle of constructing unequal length coding is:The higher the frequency of character usage,the shorter the code!9Data Structures and AlgorithmsData Structures and AlgorithmsChapter 6 Trees and Binary Trees6.7 Huffman Coding01000101101110000111Example:The character set to be encoded is:A、B、C、D、ET
14、he frequency of each character is:5、3、2、7、8The Huffman tree constructed with frequency as the weight is shown in the right figure:ABECDConstructing Huffman codes:Mark each branch of the tree:The left branch is marked with 0,the right branch is marked with 1,the sequence of 0 and 1 corresponding to t
15、he path from the root to the leaf node constitutes the code of the character corresponding to the leaf node.78523510152510Data Structures and AlgorithmsData Structures and AlgorithmsChapter 6 Trees and Binary Trees6.7 Huffman CodingConclusion one:Huffman coding is prefix coding.Different characters
16、correspond to different leaves.The path from the root to the different leaves must eventually diverge,so the path from the root to one leaf cannot be the previous part of the path to another leaf,that is,the code of one character cannot be the code of another character prefix,so Huffman coding is pr
17、efix coding.11Data Structures and AlgorithmsData Structures and AlgorithmsChapter 6 Trees and Binary Trees6.7 Huffman Coding Because the WPL of the Huffman tree is the smallestBecause the WPL of the Huffman tree is the smallest:Therefore,the Huffman tree is constructed Therefore,the Huffman tree is
18、constructed according to the use frequency(according to the use frequency(P Pi)of the character as)of the character as the weight of the leaf(corresponding to the weight of the leaf(corresponding to W Wi i),and the),and the code length of the character(corresponding to the code length of the charact
19、er(corresponding to the number of code bits Li)is constructed according to the number of code bits Li)is constructed according to the path length(path length(L Li)from the root to the leaf.The average)from the root to the leaf.The average length of the code:must be the smallest,that length of the co
20、de:must be the smallest,that is,Huffman coding is the optimal code.is,Huffman coding is the optimal code.WPL=WPL=W Wi iL Li ii=1n P Pi iL Lii=1nConclusion 2:Huffman coding is the optimal prefix coding.122.Construction of Huffman codeData Structures and AlgorithmsData Structures and AlgorithmsChapter
21、 6 Trees and Binary Trees6.7 Huffman CodingThe node is the left child of the parents,and the current code bit is 0,otherwise it is 1.From the leaf to the root,the construction of the leaf node encoding is completed in reverse.The Huffman tree is stored in a static trifurcate linked list.Under this p
22、remise,the Huffman code is constructed from the Huffman tree.It needs to start from the leaves and walk the path from the leaf to the root from the bottom up.For each branch along a path,a Huffman code value is generated,and the code of each leaf should be generated bit by bit from back to front.13D
23、ata Structures and AlgorithmsData Structures and AlgorithmsChapter 6 Trees and Binary Trees6.7 Huffman Coding987654321RChildLChildparentweight000000000000000000080002000300070005872552915619104375866873257815510250000010101010111111000101001101114Data Structures and AlgorithmsData Structures and Alg
24、orithmsChapter 6 Trees and Binary Trees Contrary to the process of constructing the code,when decoding according to the Huffman tree,a path from the root node to the leaf node is required.When the current code bit is 0,go to the left subtree,and when the current code bit is 1,go to the right subtree
25、.When the leaf node is reached,the decoding of a character is completed.3.Decoding of Huffman codes6.7 Huffman Coding15Data Structures and AlgorithmsData Structures and AlgorithmsChapter 6 Trees and Binary Trees6.7 Huffman CodingBCDEA00000101010101111110The information to be decoded is:00100100111011A DCBD ESimple and efficient decoding process Thanks