Data representation
In this blog post I am going to cover the following topics.
- Bit, Byte, Binary and Hexadecimal
- Image and Sound Representation
- Data Compression and Encoding
- Huffman Encoding
First, Bit, Byte, Binary and Hexadecimal
Positional notation. The base of a number determines how many different digit symbols (numerals) it uses and what each digit position is worth.

This formula shows the value of a number in decimal.
Binary numbers and computers. Computers have storage units called binary digits, or bits.
A bit is the smallest unit of data in computing. Bits can be grouped together to make them easier to work with, and one byte equals eight bits.
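To check my understanding of positional notation, here is a minimal Python sketch. The function name `positional_value` is my own and is only for illustration. It assumes the digits are given as integers, most significant digit first, and multiplies each digit by the base raised to the power of its position.

```python
def positional_value(digits, base):
    """Return the value of a digit sequence (most significant digit first)."""
    for d in digits:
        if not 0 <= d < base:
            raise ValueError(f"digit {d} is not valid in base {base}")
    # The rightmost digit has position 0, the next one position 1, and so on.
    return sum(d * base ** position
               for position, d in enumerate(reversed(digits)))


print(positional_value([6, 4, 2], 10))    # 6*10^2 + 4*10^1 + 2*10^0 = 642
print(positional_value([1, 0, 1, 1], 2))  # 1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = 11
```

The same function works for any base, which is why the rule is the same for decimal, binary and hexadecimal.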

This picture shows the different units of data.
Hexadecimal. Hexadecimal is base 16 and has 16 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. Each hexadecimal digit stands for four bits, so it is a compact way to write large binary values, such as those used in color representation.
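As a small illustration of converting to hexadecimal, the sketch below repeatedly divides by 16 and collects the remainders. The helper name `to_hex` is hypothetical. Python's built-in `int(text, 16)` is used to go back the other way.

```python
HEX_DIGITS = "0123456789ABCDEF"


def to_hex(n):
    """Convert a non-negative integer to a hexadecimal string."""
    if n < 0:
        raise ValueError("only non-negative integers are handled here")
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, 16)   # remainder is the next digit, right to left
        out.append(HEX_DIGITS[remainder])
    return "".join(reversed(out))


print(to_hex(255))     # FF
print(to_hex(2748))    # ABC
print(int("FF", 16))   # 255, using the built-in conversion
```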


Class activities.

We use two’s complement to represent negative values in computers.
Second, Data representation
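Here is a rough Python sketch of two's complement, assuming a fixed width of 8 bits by default. The function names are my own. Negative numbers are stored by keeping only the lowest `bits` bits, and a leading 1 means the value is negative.

```python
def to_twos_complement(value, bits=8):
    """Return the two's complement bit string of value using the given width."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking keeps only the lowest `bits` bits, which wraps negatives around.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(bit_string):
    """Interpret a bit string as a signed two's complement number."""
    raw = int(bit_string, 2)
    if bit_string[0] == "1":           # leading 1 means the value is negative
        raw -= 1 << len(bit_string)
    return raw


print(to_twos_complement(5))             # 00000101
print(to_twos_complement(-5))            # 11111011
print(from_twos_complement("11111011"))  # -5
```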
Analog and digital information. Information can be represented in two ways: analog data, a continuous representation that is analogous to the actual information it represents, and digital data, a discrete representation that breaks the information up into separate elements. Because computers cannot work well with analog data, we digitize the data, which means breaking it into pieces and representing those pieces separately.
Electronic signals. An analog signal continually fluctuates in voltage up and down. A digital signal has only a high or low state, corresponding to the two binary digits. All electronic signals (both analog and digital) degrade as they move down a line, because the voltage of the signal fluctuates due to environmental effects. Periodically, a digital signal is reclocked to regain its original shape.
Representing text. There are a finite number of characters to represent, so we list them all and assign each a binary string. A character set is a list of characters and the codes used to represent each one. Computer manufacturers agreed to standardize on common character sets.
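For example, ASCII is one such character set. The sketch below assumes ASCII with 8 bits per character (an assumption on my part, the idea is the same for other character sets) and shows the binary string assigned to each character. `text_to_bits` is a name I made up.

```python
def text_to_bits(text):
    """Return the 8-bit binary code of every character, separated by spaces."""
    # encode("ascii") raises an error for characters outside the ASCII set.
    return " ".join(format(code, "08b") for code in text.encode("ascii"))


print(ord("A"))            # 65, the code assigned to "A"
print(text_to_bits("Hi"))  # 01001000 01101001
```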


Representing images. There are two kinds of image, bitmap and vector. A bitmap has a resolution, which measures how sharp the picture is. A vector image does not, because the computer generates it from mathematical equations. We usually use RGB to represent pictures, because combinations of red, green and blue can represent many colors. The color depth is the number of bits used to store how much red, green and blue a dot (pixel) of the picture contains.
Representing audio. Pulse Code Modulation (PCM) is a method used to digitally represent sampled analog signals. It has three steps: sampling, quantizing and encoding. Sampling measures the signal at regular intervals; the sampling rate is the number of samples per second used to digitize a particular sound. Quantization is the process of converting a continuous range of values into a finite range of values. Encoding is the process of converting each quantized value into a particular form, here a binary code.
The picture shows how PCM works.
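To make RGB and color depth more concrete, here is a small sketch assuming 8 bits for each of red, green and blue (24-bit color). The function names are hypothetical, and `bitmap_bytes` only counts raw pixel data without any file header or compression.

```python
def rgb_to_hex(red, green, blue):
    """Pack three 8-bit channel values into a hex color string."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be between 0 and 255")
    return f"#{red:02X}{green:02X}{blue:02X}"


def color_count(bits_per_pixel):
    """Number of different colors a given color depth can represent."""
    return 2 ** bits_per_pixel


def bitmap_bytes(width, height, bits_per_pixel):
    """Raw size in bytes of an uncompressed bitmap (pixel data only)."""
    return width * height * bits_per_pixel // 8


print(rgb_to_hex(255, 165, 0))       # #FFA500
print(color_count(24))               # 16777216
print(bitmap_bytes(1920, 1080, 24))  # 6220800
```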
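The three PCM steps can be sketched in Python as follows. This is only a toy version under my own assumptions: the signal is a function of time with values between -1 and 1, the quantizer splits that range into equal steps, and the names `pcm_encode` and `tone` are made up for the example.

```python
import math


def pcm_encode(signal, duration_s, sample_rate, bits):
    """Sample, quantize and encode a signal whose values lie in [-1, 1]."""
    levels = 2 ** bits
    codes = []
    for n in range(int(duration_s * sample_rate)):
        t = n / sample_rate                       # sampling: pick a moment in time
        amplitude = signal(t)
        # Quantizing: map [-1, 1] onto the whole numbers 0 .. levels - 1.
        level = min(levels - 1, int((amplitude + 1) / 2 * levels))
        codes.append(format(level, f"0{bits}b"))  # encoding: write the level in binary
    return codes


def tone(t):
    """A 1 Hz sine wave used as the analog signal."""
    return math.sin(2 * math.pi * 1 * t)


# One second of the tone, 8 samples per second, 3 bits per sample.
print(pcm_encode(tone, 1.0, 8, 3))
```

A higher sampling rate or more bits per sample follows the original wave more closely, but also produces more data.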

Third, Compression and encoding.
Activities: in class we did an interesting activity in which we compressed a file using the knowledge we had just learned.

Some characters are repeated in the file. To minimize the size of the file, we can delete the repeated characters and record only what is repeated and how many times. I had similar homework that asked me to compress an article, so I compressed an article from a TOEFL reading. It turned out that I had mastered the compression skills: the compression rate of the article was a surprising 10 percent.
Last, Huffman coding. Given N weights as leaf nodes, construct a binary tree. If the weighted path length of the tree is the smallest possible, the binary tree is called the optimal binary tree, also known as the Huffman tree. The Huffman tree is the tree with the shortest weighted path length, and nodes with larger weights are closer to the root.
In conclusion, this week I learned about bits, bytes, binary and hexadecimal, image and sound representation, and data compression and encoding. This knowledge is very important because the ideas behind encoding and compression are very profound.
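One simple way to read this idea is run-length encoding, which may not be exactly the method we used in class. The sketch below stores each run of repeated characters as a (character, count) pair. The function names are my own.

```python
from itertools import groupby


def rle_encode(text):
    """Replace each run of equal characters with a (character, count) pair."""
    return [(char, len(list(run))) for char, run in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original text from (character, count) pairs."""
    return "".join(char * count for char, count in pairs)


encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

This only saves space when the text has long runs, so for ordinary English text other methods usually work better.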
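Here is a minimal sketch of building a Huffman tree, assuming the weights are character frequencies in a piece of text. It repeatedly merges the two lightest nodes using Python's `heapq`, then reads the codes off the tree. The name `huffman_codes` and the tie-breaking counter are my own choices, so the exact bit patterns may differ from another implementation, but the code lengths are still optimal.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each character to its Huffman code."""
    frequencies = Counter(text)
    if not frequencies:
        return {}
    if len(frequencies) == 1:              # a single symbol still needs one bit
        return {next(iter(frequencies)): "0"}

    # Heap entries are (weight, tie_breaker, tree). A tree is either a
    # character (leaf) or a (left, right) tuple (internal node).
    heap = [(weight, i, char)
            for i, (char, weight) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    tie_breaker = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)    # the two lightest nodes
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, tie_breaker, (t1, t2)))
        tie_breaker += 1

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")    # left branch adds a 0
            walk(tree[1], prefix + "1")    # right branch adds a 1
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


codes = huffman_codes("aaaabbc")
print(codes)  # "a" is the most frequent character, so it gets the shortest code
```

In this example "a" appears four times and ends up one step from the root, while "b" and "c" are two steps away, which matches the rule that heavier nodes sit closer to the root.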