Saturday, February 15, 2014

week3

This week we continued to focus on building the dictionary. Last week the function passed a test with a very small amount of data. However, when we tested it with a much larger amount of data (a 2-second recording at a 44100 Hz sampling rate, i.e. 88200 samples), it took more than 5 minutes to generate the dictionary, which would leave the program with no practical value. We therefore tried to improve the algorithm.

The algorithm we used last week was a direct application of the basic idea of Huffman coding, implemented with nested cells. A cell is a data type in Matlab that can contain many other types of data. In the nested cell used in last week's function, the top-level cell may have a length of 88200, each of its elements contains another cell of length 88200, and each element of that second-level cell contains a vector. This means the program may need to process 88200^2 vectors (about 7.8 × 10^9), which costs a huge amount of computing resources.

However, after further study of Huffman coding, we found that the symbols themselves are not very important during coding; only the indices of the symbols need to be considered. Hence, the problem with the nested cell was solved.
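To make the index-only idea concrete, here is a minimal sketch of how a Huffman dictionary can be built from flat numeric arrays instead of nested cells. It is not our actual project code: the function name `huffman_by_index` and all variable names are hypothetical, and the two lightest nodes are found with a plain sort on every merge, which is simple rather than fast. Each node is just an integer index; the tree is stored as a `parent` array and a `bit` array.

```matlab
% Script form: local functions in scripts need MATLAB R2016b or later.
counts = [4 3 2 1];                 % example symbol counts (made-up data)
codes  = huffman_by_index(counts);  % codes{k} is the bit vector of symbol k
lens   = cellfun(@numel, codes);    % code lengths here are 1, 2, 3, 3

function codes = huffman_by_index(p)
% Sketch of index-based Huffman dictionary construction (hypothetical helper).
%   p     : vector of symbol counts or probabilities, one per symbol
%   codes : n-by-1 cell array of bit vectors
n = numel(p);
if n == 1
    codes = {0};                    % a single symbol still gets one bit
    return
end
weight = [p(:); zeros(n - 1, 1)];   % leaves are 1..n, internal nodes n+1..2n-1
parent = zeros(2*n - 1, 1);         % parent index of each node (0 = root)
bit    = zeros(2*n - 1, 1);         % label of the edge from node to its parent
active = [true(n, 1); false(n - 1, 1)];

for node = n + 1 : 2*n - 1
    idx = find(active);
    [~, order] = sort(weight(idx)); % pick the two lightest active nodes
    a = idx(order(1));
    b = idx(order(2));
    parent([a b]) = node;           % merge them under a new internal node
    bit(a) = 0;
    bit(b) = 1;
    weight(node) = weight(a) + weight(b);
    active([a b]) = false;
    active(node)  = true;
end

codes = cell(n, 1);
for k = 1:n
    c = [];
    node = k;
    while parent(node) ~= 0         % walk from the leaf up to the root
        c = [bit(node), c];         % prepend, since we read bits bottom-up
        node = parent(node);
    end
    codes{k} = c;
end
end
```

The symbols never appear inside the loop; only their positions in `p` do, so the original sample values can be looked up by index once the codes are ready. For a large number of distinct symbols, a priority queue would be a better choice than sorting on every merge.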

After editing and improving the code, processing 2 seconds of voice data takes only about 0.5 seconds. Combined with high-performance hardware, this coding method could have considerable practical value.

Besides building the dictionary, we also continued to design the encoding and decoding methods, but this work is not finished yet, so there are no results to show here.
