Data representation

In this blog post, I am going to cover the following topics.

Bit, Byte, Binary and Hexadecimal
Image and Sounds Representation
Data Compression and Encoding

Huffman Encoding

First, Bit, Byte, Binary and Hexadecimal

Positional notation. The base of a number determines how many different digit symbols (numerals) are available and what value each digit position carries.

This formula shows how the value of a decimal number is computed.
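To make the idea concrete, here is a small Python sketch of positional notation. The function name `positional_value` is my own choice for illustration. It multiplies each digit by the base raised to the power of its position, which is the usual way to read a number in any base.

```python
def positional_value(digits, base):
    """Value of a list of digits (most significant first) in the given base."""
    n = len(digits)
    total = 0
    for i, d in enumerate(digits):
        # The leftmost digit has the highest power: base ** (n - 1)
        total += d * base ** (n - 1 - i)
    return total


print(positional_value([9, 4, 3], 10))   # 9*100 + 4*10 + 3*1
print(positional_value([1, 0, 1, 1], 2)) # 1*8 + 0*4 + 1*2 + 1*1
```

The same function works for base 2, base 10, or base 16, because only the base changes, not the rule.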

Binary numbers and computers. Computers have storage units called binary digits, or bits.

A bit is the smallest unit of data in computing. Bits can be grouped together to make them easier to work with, and one byte equals eight bits.

This picture shows the different units of data.

Hexadecimal. Hexadecimal is base 16 and has 16 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. It is used to write large binary values compactly, such as those used in color representation, because one hexadecimal digit stands for exactly four bits.

The picture shows how to convert binary into hexadecimal.

The picture simplifies the conversion between binary and hexadecimal.
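The conversion can also be sketched in Python. This is only a minimal example with function names I made up (`binary_to_hex`, `hex_to_binary`). It relies on the fact that every group of four bits maps to one hexadecimal digit.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits):
    """Convert a string of 0s and 1s to hexadecimal, four bits at a time."""
    # Pad with leading zeros so the length is a multiple of four
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_binary(hex_string):
    """Convert a hexadecimal string to binary, one digit into four bits."""
    return "".join(format(HEX_DIGITS.index(ch), "04b") for ch in hex_string.upper())


print(binary_to_hex("11010110"))  # 1101 -> D, 0110 -> 6
print(hex_to_binary("2F"))        # 2 -> 0010, F -> 1111
```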

Class activities.

This activity trained my skill of converting hexadecimal into binary.

Computers use two's complement to represent negative values. To negate a number, invert all of its bits and add one.
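Here is a short Python sketch of two's complement, assuming an 8-bit width by default. The function names and the `bits` parameter are my own choices for illustration.

```python
def to_twos_complement(value, bits=8):
    """Return the two's complement bit string of a signed integer."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking keeps only the lowest `bits` bits, which wraps negatives around
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(bit_string):
    """Return the signed integer stored in a two's complement bit string."""
    value = int(bit_string, 2)
    if bit_string[0] == "1":
        # A leading 1 means the number is negative
        value -= 1 << len(bit_string)
    return value


print(to_twos_complement(5))             # 00000101
print(to_twos_complement(-5))            # 11111011
print(from_twos_complement("11111011"))  # -5
```

With 8 bits, the values run from -128 to 127, and the leftmost bit tells us the sign.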

Second, Data representation

Analog and digital information. Information can be represented in two ways: analog data, a continuous representation that is analogous to the actual information it represents, and digital data, a discrete representation that breaks the information up into separate elements. Because computers cannot work well with analog data, we digitize it, which means breaking the data into pieces and representing those pieces separately.

Electronic signals. An analog signal continually fluctuates in voltage up and down. A digital signal has only a high or low state, corresponding to the two binary digits. All electronic signals (both analog and digital) degrade as they move down a line, because the voltage of the signal fluctuates due to environmental effects. Periodically, a digital signal is reclocked to regain its original shape.

Representing text. There is a finite number of characters to represent, so we can list them all and assign each one a binary string. A character set is a list of characters and the codes used to represent each one. Computer manufacturers agreed to standardize on common character sets.
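As a small illustration, the sketch below uses ASCII, one widely used character set, to turn text into 8-bit binary strings. The function name `text_to_binary` is hypothetical, and picking ASCII is my assumption for the example.

```python
def text_to_binary(text):
    """Return the 8-bit binary code of each character, using ASCII."""
    # encode("ascii") raises an error for characters outside the ASCII set
    return [format(code, "08b") for code in text.encode("ascii")]


def binary_to_text(codes):
    """Turn a list of 8-bit binary codes back into text."""
    return bytes(int(code, 2) for code in codes).decode("ascii")


codes = text_to_binary("Hi")
print(codes)                  # 'H' is 72, 'i' is 105
print(binary_to_text(codes))  # Hi
```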

Representing images. There are two kinds of image, bitmap and vector. A bitmap has a resolution, which determines how clear the picture is. A vector image does not, because it is generated by the computer from mathematical equations. Usually we use RGB to represent pictures, because combinations of red, green and blue can represent many colors. The color depth is the number of bits used to store the color of one dot (pixel) of the picture, so it determines how many levels of red, green and blue each pixel can have.
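The relationship between color depth, number of colors and bitmap size can be sketched in Python. The function names are hypothetical, and the size calculation assumes an uncompressed bitmap with no header.

```python
def rgb_to_hex(red, green, blue):
    """Write a 24-bit RGB color as a hexadecimal string, two digits per channel."""
    return f"#{red:02X}{green:02X}{blue:02X}"


def color_count(bit_depth):
    """Number of different colors a pixel can take at the given color depth."""
    return 2 ** bit_depth


def bitmap_size_bytes(width, height, bit_depth):
    """Size of an uncompressed bitmap in bytes (assumes no header or padding)."""
    return width * height * bit_depth // 8


print(rgb_to_hex(255, 165, 0))              # orange
print(color_count(24))                      # colors available with 8 bits per channel
print(bitmap_size_bytes(1920, 1080, 24))    # bytes for one full HD picture
```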

Representing audio. Pulse Code Modulation (PCM) is a method used to digitally represent sampled analog signals. It has three steps: sampling, quantizing and encoding. Sampling is the process of measuring the signal at regular intervals; the sampling rate is the number of samples per second used to digitize a particular sound. Quantization is the process of converting a continuous range of values into a finite range of values. Encoding is the process of converting each quantized value into a particular form, here a binary code.

The picture shows how PCM works.
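The three PCM steps can be sketched in a few lines of Python. This is a simplified example, not a real audio encoder: the function name `pcm_encode` and its parameters are my own, and it assumes the signal stays between -1 and 1.

```python
import math


def pcm_encode(signal, duration, sample_rate, bits):
    """Sample, quantize and encode a signal whose values lie in [-1, 1]."""
    levels = 2 ** bits
    codes = []
    for i in range(int(duration * sample_rate)):
        # Sampling: measure the signal at regular time steps
        amplitude = signal(i / sample_rate)
        # Quantizing: map the continuous value to one of `levels` integer levels
        level = int((amplitude + 1) / 2 * levels)
        level = max(0, min(levels - 1, level))
        # Encoding: write the level as a fixed-width binary code
        codes.append(format(level, f"0{bits}b"))
    return codes


def sine_wave(t):
    """A 1 Hz sine wave used as the example analog signal."""
    return math.sin(2 * math.pi * t)


print(pcm_encode(sine_wave, duration=1, sample_rate=8, bits=3))
```

A higher sampling rate or more bits per sample gives a closer copy of the original sound, but also more data to store.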

Third, Compression and encoding.

Activities: in class we did an interesting activity in which we compressed a file using the knowledge we had just learned.

Some characters are repeated in the file. To reduce the size of the file, we can delete the repeated characters and record only what is repeated and how many times. I had similar homework that asked me to compress an article, so I compressed an article from a TOEFL reading. It turned out that I had mastered the compression skills: the compression rate of the article was, surprisingly, 10 percent.
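One simple way to record "what is repeated and how many times" is run-length encoding. The sketch below is my own illustration of that idea and may not be exactly the method we used in class.

```python
from itertools import groupby


def rle_encode(text):
    """Replace each run of equal characters with a (character, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original text from (character, count) pairs."""
    return "".join(ch * count for ch, count in pairs)


pairs = rle_encode("aaabccdddd")
print(pairs)              # [('a', 3), ('b', 1), ('c', 2), ('d', 4)]
print(rle_decode(pairs))  # aaabccdddd
```

Run-length encoding only helps when the data has long runs of the same character. Ordinary English text has few such runs, so other methods usually work better for articles.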

Last, Huffman coding. Given N weights as N leaf nodes, construct a binary tree. If the weighted path length of the tree is the smallest possible, the binary tree is called the optimal binary tree, also known as the Huffman tree. The Huffman tree is the tree with the shortest weighted path length, and nodes with larger weights are closer to the root.
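Here is a minimal Python sketch of building a Huffman tree and reading the codes from it. It assumes the weights are character frequencies in a text, and the function name `huffman_codes` is hypothetical. The tree is built by repeatedly merging the two nodes with the smallest weights.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each character to its Huffman code."""
    if not text:
        return {}
    freq = Counter(text)
    # Heap entries are (weight, tie_breaker, tree); a tree is a character
    # (leaf) or a (left, right) pair. The tie breaker keeps entries comparable.
    heap = [(w, i, ch) for i, (ch, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest trees into one new tree
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")  # left branch adds a 0
            walk(tree[1], prefix + "1")  # right branch adds a 1
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


print(huffman_codes("aaaabbc"))
```

In this example "a" appears most often, so it ends up closest to the root and gets the shortest code.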

In conclusion, this week I learned about bits, bytes, binary and hexadecimal, image and sound representation, and data compression and encoding. This knowledge is very important because the ideas behind encoding and compression are very profound.


Published by tigerhua
