
LOSSLESS DATA COMPRESSION TECHNIQUES WITH BLOCK CODING: AN EXPLANATION

Authors: Mathur D., Mathur P.
Source: International Journal of Engineering Trends and Technology (IJETT) – Volume 36 Number 2- June 2016

Abstract

Data compression is the process of encoding information using fewer bits (or other information-bearing units) than an unencoded representation would use, through the use of specific encoding schemes. With compression we can achieve reduced resource consumption and a degree of data security as well. Data compression may be divided into the following two categories.
(A) Lossless compression algorithms usually exploit statistical redundancy in such a way as to represent the sender's data more concisely without error.
(B) Lossy compression is possible if some loss of fidelity is acceptable. Lossy schemes accept some loss of data in order to achieve higher compression. Lossy data compression is better known as rate-distortion theory. Shannon formulated the theory of data compression. Lossless data compression theory and rate-distortion theory are known collectively as source coding theory.

Keywords: Lossy Compression, Lossless Compression, encoding, block coding.

I. INTRODUCTION

Data is defined as raw facts, figures, or anything that is unprocessed. Data compression is a technique in which data/information is encoded using fewer bits through the use of specific encoding schemes. Data compression [2] can be seen as a technique with a twofold objective. The prime objective is to minimize the amount of data to be stored or transmitted. Another objective is the security of data, because the data is sent in encoded form. Data compression is useful because it helps in reducing the consumption of resources, such as hard disk space or transmission bandwidth [5]. It also increases the speed of data transfer from disk to memory. This paper discusses [7] data compression with emphasis on lossless techniques.

II. DATA COMPRESSION TECHNIQUES

Data compression [1] may be divided into the following two categories:
A. Lossless compression
Lossless compression exploits the statistical redundancy that occurs in the data so as to represent the sender's data more concisely without error. Lossless compression schemes are reversible, so the original data can be reconstructed exactly [3].
B. Lossy compression
Lossy compression is possible only where some loss of fidelity is acceptable. Lossy data compression is better known as rate-distortion theory. Lossy schemes [11] accept some loss of data in order to achieve higher compression.
Claude E. Shannon formulated the theory of data compression [6]. Lossless data compression theory and rate-distortion theory are known collectively as source coding theory. We now discuss lossless source coding and rate-distortion theory for lossy data compression.

III. LOSSLESS SOURCE CODING

Lossless data compression is used in many applications. For example, it is used in the zip file format. It is also often used as a component within lossy data compression technologies.
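As a brief illustration of this reversibility, the following minimal Python sketch uses the standard-library zlib module (an implementation of the DEFLATE algorithm also used by the zip format) to compress a sample string of our own choosing and recover it exactly.

    import zlib

    # Repetitive text compresses well because of its statistical redundancy.
    original = b"abababababababab" * 64

    compressed = zlib.compress(original)
    restored = zlib.decompress(compressed)

    print(len(original), "->", len(compressed), "bytes")
    assert restored == original  # lossless: the round trip is exact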

It has its basis in block coding [6]. To illustrate this concept, we consider a source of strings whose alphabet consists of only two letters, a and b:
A = {a, b} (1)

Suppose further that the source has memory: if 'a' occurred as the previous character, the probability that 'a' occurs again as the present character is 0.9. Similarly, if 'b' occurred as the previous character, the probability that 'b' occurs again as the present character is 0.9.
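To make this source model concrete, the following minimal Python sketch generates such a string; the length of 20 characters and the fixed seed are arbitrary choices for illustration.

    import random

    def generate(n, p_repeat=0.9, seed=1):
        """Generate n characters over {a, b}, where each character
        repeats the previous one with probability p_repeat."""
        rng = random.Random(seed)
        out = [rng.choice("ab")]  # first character equally likely to be a or b
        for _ in range(n - 1):
            if rng.random() < p_repeat:
                out.append(out[-1])                         # repeat previous character
            else:
                out.append("b" if out[-1] == "a" else "a")  # switch character
        return "".join(out)

    print(generate(20))  # typically long runs of a's and b's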

An n-th order block code is a mapping which assigns to each block of n consecutive characters a bit sequence of varying length. This is shown by the following two examples.

A. First-Order Block Code
Here each character is mapped to a single bit, as shown in Table I.
Using Table I and referring to Fig. 1, note that 14 bits are used to represent 14 characters, an average of 1 bit/character.
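Since Table I is not reproduced here, the following sketch assumes the natural single-bit mapping a -> 0, b -> 1, together with an illustrative 14-character string of our own; any 14-character string costs exactly 14 bits under a first-order code.

    FIRST_ORDER = {"a": "0", "b": "1"}  # assumed Table I mapping: one bit per character

    text = "aaaaabbbbbaaaa"  # illustrative 14-character string
    bits = "".join(FIRST_ORDER[c] for c in text)

    print(len(bits), "bits for", len(text), "characters")  # 14 bits for 14 characters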
B. Second-Order Block Code
Here pairs of characters are mapped to one, two, or three bits, as shown in Table II. Using Table II and referring to Fig. 2, note that 16 bits are used to represent 18 characters, an average of roughly 0.89 bits/character, a substantial reduction in bits per character; a sketch of this encoding follows.
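Table II is likewise not reproduced here, so the following sketch assumes a plausible prefix-free pair code with codeword lengths of one, two, and three bits; both the mapping and the 18-character sample string are our own assumptions, chosen so that the bit count matches the figure quoted above.

    SECOND_ORDER = {"aa": "0", "bb": "11", "ab": "100", "ba": "101"}  # assumed Table II mapping

    text = "aaaaaabbbbbbbbbbab"  # illustrative 18-character string
    pairs = [text[i:i + 2] for i in range(0, len(text), 2)]
    bits = "".join(SECOND_ORDER[p] for p in pairs)

    print(len(bits), "bits for", len(text), "characters")     # 16 bits for 18 characters
    print(round(len(bits) / len(text), 2), "bits/character")  # 0.89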
Similarly, we can move to third-, fourth-, and higher-order codes, up to order n.
C. Formula
The rate of an n-th order block code is

R = E[l(Bn)] / n bits/character, (2)

where l(Bn) is the length of the codeword assigned to block Bn and E[l(Bn)] is its expected value. The higher the order, the lower the rate, that is, the better the compression achieved.
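As a worked check of this formula, the following sketch computes E[l(B2)]/2 for the assumed pair code above, using the stationary pair probabilities implied by the 0.9 repeat probability (P(aa) = P(bb) = 0.45, P(ab) = P(ba) = 0.05).

    # Stationary pair probabilities for the source of Section III:
    # each letter is equally likely overall, and repeats with probability 0.9.
    pair_prob = {"aa": 0.45, "bb": 0.45, "ab": 0.05, "ba": 0.05}
    code_len = {"aa": 1, "bb": 2, "ab": 3, "ba": 3}  # lengths of the assumed pair code

    expected_len = sum(pair_prob[p] * code_len[p] for p in pair_prob)  # E[l(B2)] = 1.65
    rate = expected_len / 2                                            # R = E[l(Bn)] / n

    print(rate, "bits/character")  # 0.825, below the 1.0 rate of the first-order code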
As we move to an n-th order block code, the consumption of bits per character decreases significantly.
We are interested only in lossless data compression codes, that is, codes from which the original data can be rederived exactly. All of the examples given above are lossless.

IV. RATE-DISTORTION THEORY

In lossy data compression [11], the decompressed data need not be exactly the same as the original data, so some amount of distortion is acceptable. A distortion measure specifies exactly how close the approximation is.
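A common distortion measure for symbol sequences is the Hamming distortion, the fraction of positions at which the reconstruction differs from the original; the following minimal sketch (with sample strings of our own choosing) computes it.

    def hamming_distortion(original, reconstructed):
        """Fraction of symbol positions at which the two sequences differ."""
        assert len(original) == len(reconstructed)
        mismatches = sum(1 for o, r in zip(original, reconstructed) if o != r)
        return mismatches / len(original)

    # A lossy reconstruction may differ from the original in a few positions.
    print(hamming_distortion("aaabbbaabb", "aaabbbaaaa"))  # 0.2 (2 of 10 symbols differ)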

V. CONCLUSION

We surveyed data compression techniques that use the block coding concept. The paper presents basic information about data compression, with the techniques applied to a text file. From the above discussion we note that as we move gradually from a first-order to an n-th order block code, the consumption of bits per character decreases significantly, as shown by the examples.
Both lossy and lossless compression have their advantages in different situations. They not only minimize the amount of data to be stored or transmitted but also strengthen the security of the data.

ACKNOWLEDGMENT

Deepak Mathur is Assistant Professor in the Computer Science Department of Lachoo Memorial College of Science & Technology (Autonomous), Jodhpur, Rajasthan, India. He holds an MCA, has more than 10 years of teaching experience, and is pursuing a Ph.D. in Computer Applications from JNU, Jodhpur. Dr. Prabhat Mathur is Associate Professor in the Computer Science Department of Lachoo Memorial College.

REFERENCES

[1] Mrs. Durga Devi and Shobharani D. A., "A Novel Third Party Cryptosystem for Secure and Scalable Data Sharing in Cloud", International Journal of Engineering Trends and Technology (IJETT), Vol. 35, pp. 402-404, May 2016.
[2] Priyanka Chouksey and Dr. Prabhat Patel, "Secret Key Steganography Technique Based on Three-Layered DWT and SVD Algorithm", International Journal of Engineering Trends and Technology (IJETT), Vol. 35, pp. 440-445, May 2016.
[4] C. H. Kuo, C. F. Chen, and W. Hsia, "A compression algorithm based on classified interpolative block truncation coding and vector quantization", Journal of Information Science and Engineering, Vol. 15, pp. 1-9, 1999.
[5] T. SubhamastanRao, M. Soujanya, T. Hemalath, and T. Revathi, "Simultaneous data compression and encryption", International Journal of Computer Science and Information Technologies (IJCSIT), ISSN 0975-9646, Vol. 2(5), 2011.
[6] http://www.data-compression.com/theory.html
[7] Haroon Altarawneh and Mohammad Altarawneh, "Data Compression Techniques on Text Files: A Comparison Study", International Journal of Computer Applications (0975-8887), Vol. 26, No. 5, July 2011.
[8] AL. Jeeva, Dr. V. Palanisamy, and K. Kanagaram, "Comparative Analysis of Performance Efficiency and Security Measures of Some Encryption Algorithms", International Journal of Engineering Research and Applications (IJERA), Vol. 3, Issue 6, June 2013.
[9] Mr. Vinod Saroha, Suman Mor, and Anurag Dagar, "Enhancing Security of Caesar Cipher by Double Columnar Transposition Method", International Journal of Advanced Research in Computer Science and Software Engineering, Vol. 2, Issue 10, 2012.
[10] Sombir Singh, Sunil K. Maakar, and Dr. Sudesh Kumar, "Enhancing the Security of DES Algorithm Using Transposition Cryptography Techniques", International Journal of Advanced Research in Computer Science and Software Engineering, ISSN 2249-6343, Vol. 2, Issue 1, Jan 2012.
[11] Nutan Shep and Mrs. P. H. Bhagat, "Implementation of Hamming Code Using VLSI", International Journal of Engineering Trends and Technology (IJETT), Vol. 4, pp. 186-190, 2013.