DATA COMPRESSION METHODS



Data compression is the process of converting an input data stream (the source stream, or the original raw data) into another data stream that has a smaller size. Data compression is popular for two reasons:

1)      People like to accumulate data and hate to throw anything away. No matter how large a storage device may be, sooner or later it is going to overflow. Data compression seems useful because it delays this inevitability.
2)   People hate to wait a long time for data transfer.

There are many known methods of data compression. They are based on different ideas and are suitable for different types of data. They produce different results, but they are all based on the same basic principle: they compress data by removing redundancy from the original data in the source file. The idea of compression by reducing redundancy suggests the general law of data compression, which is to "assign short codes to common events and long codes to rare events". Data compression is done by changing the representation of the data from an inefficient form to an efficient one.
The main aim of the field of data compression is, of course, to develop methods for better and better compression. Experience shows that fine-tuning an algorithm to squeeze out the last remaining bits of redundancy from the data gives diminishing returns. Data compression has become so important that some researchers have proposed a "simplicity and power" theory. Specifically, it says that data compression may be interpreted as a process of removing unnecessary complexity from information, thus maximizing simplicity while preserving as much as possible of its non-redundant descriptive power.
                     
BASIC TYPES OF DATA COMPRESSION
There are two basic types  of data compression.
1.      Lossy compression
2.      Lossless compression

LOSSY COMPRESSION
In lossy compression some information is lost during processing: the image data is sorted into important and unimportant data, and the system then discards the unimportant data.
It provides much higher compression ratios, but there is some loss of information compared to the original source file. The main advantage is that the loss may not be visible to the eye, i.e., the result is visually lossless. Visually lossless compression is based on knowledge about colour images and human perception.


LOSSLESS COMPRESSION
In this type of compression no information is lost during the compression and decompression processes. The reconstructed image is mathematically and visually identical to the original one. Lossless compression typically achieves only about a 2:1 compression ratio. This type of technique looks for patterns in strings of bits and then expresses them more concisely.

TECHNIQUES OF DATA COMPRESSION
There are three important techniques of data compression.
1)      Basic techniques
2)      Statistical techniques
3)      Dictionary methods

BASIC TECHNIQUES
These are techniques that were used mainly in the past. The important basic techniques are run length encoding and move-to-front encoding.

STATISTICAL TECHNIQUES
            They are based on a statistical model of the data. Three important statistical techniques are
·         Shannon Fano coding
·         Huffman coding
·         Arithmetic coding

DICTIONARY METHODS
            These methods select strings of symbols and encode each string as a token using a dictionary. The important dictionary methods are
·         LZ77 (sliding window)
·         LZRW1
BASIC TECHNIQUES

1.     RUN LENGTH ENCODING

            The basic idea behind this approach to data compression is this: if a data item d occurs n consecutive times in the input stream, replace the n occurrences with a single pair <n d>. The n consecutive occurrences of a data item are called a run length of n, and this approach is called run length encoding, or RLE.

RLE IMAGE COMPRESSION
            RLE is a natural candidate for compressing graphical data. A digital image consists of small dots called pixels. Each pixel can be either one bit, indicating a black or white dot, or several bits, indicating one of several colours or shades of gray. We assume that the pixels are stored in an array called a bitmap in memory. Pixels are normally arranged in the bitmap in scan lines, so the first bitmap pixel is the dot at the top left corner of the image and the last pixel is the one at the bottom right corner. Compressing an image using RLE is based on the observation that if we select a pixel in the image at random, there is a good chance that its neighbours will have the same colour. The compressor thus scans the bitmap row by row, looking for runs of pixels of the same colour.

Consider the grayscale bitmap row
12, 12, 12, 12, 12, 12, 12, 12, 12, 35, 76, 112, 67, 87, 87, 87, 5, 5, 5, 5, 5, 5, 1, ...

Compressed Form --

9, 12, 35, 76, 112, 67, 3, 87, 6, 5, 1, ...
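
The following Python sketch (an illustration, not part of any standard) shows the <n d> pair idea on the row above. Unlike the compressed form shown, which mixes run pairs and literal values, this version emits a (count, value) pair for every run, so the decoder is unambiguous.

    def rle_encode(data):
        """Encode a sequence as (count, value) pairs, one pair per run."""
        if not data:
            return []
        pairs = []
        count, current = 1, data[0]
        for item in data[1:]:
            if item == current:
                count += 1
            else:
                pairs.append((count, current))
                count, current = 1, item
        pairs.append((count, current))
        return pairs

    def rle_decode(pairs):
        """Expand (count, value) pairs back into the original sequence."""
        out = []
        for count, value in pairs:
            out.extend([value] * count)
        return out

    row = [12] * 9 + [35, 76, 112, 67] + [87] * 3 + [5] * 6 + [1]
    encoded = rle_encode(row)
    print(encoded)                      # (9, 12), (1, 35), ..., (3, 87), (6, 5), (1, 1)
    assert rle_decode(encoded) == row   # lossless round trip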

2.     MOVE TO FRONT CODING
            The basic idea of this method is to maintain the alphabet A of symbols as a list in which frequently occurring symbols are located near the front. A symbol 'a' is encoded as the number of symbols that precede it in this list. Thus if A = ('t','h','e','s') and the next symbol in the input stream to be encoded is 'e', it will be encoded as 2, since it is preceded by two symbols. After encoding 'e', the alphabet is modified to A = ('e','t','h','s'). This move-to-front step reflects the hope that once 'e' has been read from the input stream, it will be read many more times and will, at least for a while, be a common symbol.
Let A = (“t”, “h”, “e”, “s” )
After encoding the symbol “e”, A is modified.

Modified Form:-

A = (“e”, “t”, “h”, “s” )
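
A minimal Python sketch of move-to-front encoding, reproducing the example above (the function name and the choice to work on a copy of the alphabet are illustrative, not part of any standard):

    def mtf_encode(stream, alphabet):
        """Emit, for each symbol, the number of symbols preceding it in the list,
        then move that symbol to the front of the list."""
        alphabet = list(alphabet)        # work on a copy
        output = []
        for symbol in stream:
            index = alphabet.index(symbol)
            output.append(index)
            alphabet.insert(0, alphabet.pop(index))   # the move-to-front step
        return output

    # The example from the text: A = ('t','h','e','s'), next input symbol 'e'
    print(mtf_encode("e", ['t', 'h', 'e', 's']))      # [2]; A becomes ('e','t','h','s')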

ADVANTAGE
            This method is locally adaptive, since it adapts itself to the frequencies of symbols in local areas of the input stream. It produces good results if the input stream satisfies this hope, that is, if the local frequency of symbols changes significantly from area to area in the input stream.
  
STATISTICAL TECHNIQUES
1.     SHANNON FANO CODING
            Shannon Fano coding was the first method developed for finding good variable-size codes. We start with a set of n symbols with known probabilities of occurrence. The symbols are first arranged in descending order of probability. The set is then divided into two subsets that have approximately the same total probability. All symbols of one subset are assigned codes that start with a zero, while the codes of the symbols in the other subset start with a one. Each subset is then recursively divided into two, and the second bit of all codes is determined in a similar way. When a subset contains just two symbols, their codes are distinguished by adding one more bit to each. The process continues until no subset can be divided further.
            Consider a set of seven symbols whose probabilities are given in the table below. They are arranged in descending order of probability. The two symbols in the first subset are assigned codes that start with 1, so their final codes are 11 and 10. The second subset is divided, in the second step, into two symbols and three symbols. Step 3 divides the last three symbols into one and two.

Shannon-Fano Example

            Symbol   Prob.    Step 1   Step 2   Step 3   Step 4   Final
            ------------------------------------------------------------
            1        0.25     1        1                          11
            2        0.20     1        0                          10
            3        0.15     0        1        1                 011
            4        0.15     0        1        0                 010
            5        0.10     0        0        1                 001
            6        0.10     0        0        0        1        0001
            7        0.05     0        0        0        0        0000

The average size of this code is
            = 0.25 x 2 + 0.20 x 2 + 0.15 x 3 + 0.15 x 3 + 0.10 x 3 + 0.10 x 4 + 0.05 x 4
            = 2.7 bits/symbol.
This is a good result because the entropy is ≈ 2.67.

ADVANTAGE
The advantage of this method is that it is very easy to implement.
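
A small Python sketch of the recursive splitting described above. The tie-breaking rule (choose the split point that best balances the two total probabilities) is an assumption, but on the seven-symbol example it reproduces the codes in the table:

    def shannon_fano(symbols):
        """symbols: list of (symbol, probability) in descending order of probability.
        Returns a dict mapping each symbol to its code string."""
        codes = {sym: "" for sym, _ in symbols}

        def split(group):
            if len(group) < 2:
                return
            total = sum(p for _, p in group)
            best_cut, best_diff = 1, float("inf")
            for i in range(1, len(group)):          # find the most balanced split
                top_sum = sum(p for _, p in group[:i])
                diff = abs(2 * top_sum - total)
                if diff < best_diff:
                    best_cut, best_diff = i, diff
            top, bottom = group[:best_cut], group[best_cut:]
            for sym, _ in top:
                codes[sym] += "1"                   # first subset: codes start with 1
            for sym, _ in bottom:
                codes[sym] += "0"                   # second subset: codes start with 0
            split(top)
            split(bottom)

        split(symbols)
        return codes

    probs = [("s1", 0.25), ("s2", 0.20), ("s3", 0.15), ("s4", 0.15),
             ("s5", 0.10), ("s6", 0.10), ("s7", 0.05)]
    codes = shannon_fano(probs)
    print(codes)                                         # 11, 10, 011, 010, 001, 0001, 0000
    print(sum(p * len(codes[s]) for s, p in probs))      # ≈ 2.7 bits/symbol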

2.     HUFFMAN CODING
            A commonly used method for data compression is Huffman coding. The method starts by building a list of all the alphabet symbols in descending order of their probabilities. It then constructs a tree, with a symbol at every leaf, from the bottom up. This is done in steps, where at each step the two symbols with the smallest probabilities are selected, added to the top of the partial tree, deleted from the list, and replaced with an auxiliary symbol representing both of them. When the list is reduced to just one auxiliary symbol, the tree is complete. The tree is then traversed to determine the codes of the symbols.
            The Huffman method is somewhat similar to the Shannon Fano method. The main difference between the two is that Shannon Fano constructs its codes from the top down, while Huffman constructs a code tree from the bottom up.
            This is best illustrated by an example. Given five symbols a1 through a5 with probabilities 0.4, 0.2, 0.2, 0.1, and 0.1, they are paired in the following order:

1.      a4 is combined with a5 and both are replaced by the combined symbol a45, whose probability is 0.2.
2.      There are now four symbols left, a1, with probability 0.4, and a2, a3, and a45, with probabilities 0.2 each. We arbitrarily select a3 and a45, combine them, and replace them with the auxiliary symbol a345, whose probability is 0.4.
3.      Three symbols are now left, a1, a2, and a345, with probabilities 0.4, 0.2, and 0.4 respectively. We arbitrarily select a2 and a345, combine them and replace them with the auxiliary symbol a2345, whose probability is 0.6.
4.      Finally, we combine the two remaining symbols a1, and a2345, and replace them with a12345 with probability 1.
            The tree is now complete, “lying on its side” with the root on the right and the five leaves on the left. To assign the codes, we arbitrarily assign a bit of 1 to the top edge and a bit of 0 to the bottom edge of every pair of edges. This results in the codes 0, 10, 111, 1101, and 1100. The assignment of bits to the edges is arbitrary.

The average size of this code is 0.4 x 1 + 0.2 x 2 + 0.2 x 3 + 0.1 x 4 + 0.1 x 4 = 2.2 bits/symbol. Even more importantly, the Huffman code is not unique, since the pairing choices and the bit assignment are arbitrary.
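
A compact Python sketch of the bottom-up construction, using a heap to repeatedly pick the two smallest probabilities. Because ties are broken arbitrarily, the exact codes may differ from 0, 10, 111, 1101, 1100, but the average length comes out the same:

    import heapq
    import itertools

    def huffman_codes(probs):
        """probs: dict symbol -> probability. Returns dict symbol -> code string."""
        tiebreak = itertools.count()    # prevents heapq from ever comparing the dicts
        heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, _, codes1 = heapq.heappop(heap)     # the two smallest probabilities
            p2, _, codes2 = heapq.heappop(heap)
            merged = {s: "1" + c for s, c in codes1.items()}   # bit choice is arbitrary
            merged.update({s: "0" + c for s, c in codes2.items()})
            heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
        return heap[0][2]

    probs = {"a1": 0.4, "a2": 0.2, "a3": 0.2, "a4": 0.1, "a5": 0.1}
    codes = huffman_codes(probs)
    print(codes)
    print(sum(p * len(codes[s]) for s, p in probs.items()))   # ≈ 2.2 bits/symbol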

3.     ARITHMETIC CODING
                        In this method the input stream is read symbol by symbol, and more bits are appended to the code each time a symbol is input and processed. To understand the method, it is useful to imagine the resulting code as a number in the range [0, 1), that is, the range of real numbers from 0 to 1, not including 1. The first step is to calculate, or at least to estimate, the frequency of occurrence of each symbol.
            The aforesaid techniques, Huffman and Shannon Fano coding, rarely produce the best possible variable-size code. Arithmetic coding overcomes this problem by assigning one code to the entire input stream, instead of assigning codes to the individual symbols.

The main steps of arithmetic coding are
1.       Start by defining the current interval as [0, 1).
2.       Repeat the following two steps for each symbol 's' in the input stream:
2.1) Divide the current interval into subintervals whose sizes are proportional to the probabilities of the symbols.
2.2) Select the subinterval for 's' and define it as the new current interval.
            When the entire input stream has been processed in this way, the output is any number that uniquely identifies the final current interval.

Consider the symbols a1, a2, a3
                                    Probabilities – P1 = 0.4, P2 = 0.5, P3 = 0.1
                                    Subintervals – [0, 0.4)   [0.4, 0.9)   [0.9, 1)
To encode – "a2 a2 a2 a3"
Current interval – [0, 1)
a2         [0.4, 0.9)
a2         [0.6, 0.85)               {0.4 + (0.9 - 0.4) x 0.4 = 0.6}
                                     {0.4 + (0.9 - 0.4) x 0.9 = 0.85}
a2         [0.7, 0.825)              {0.6 + (0.85 - 0.6) x 0.4 = 0.7}
                                     {0.6 + (0.85 - 0.6) x 0.9 = 0.825}
a3         [0.8125, 0.8250)          {0.7 + (0.825 - 0.7) x 0.9 = 0.8125}
                                     {0.7 + (0.825 - 0.7) x 1.0 = 0.8250}
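
The interval narrowing can be written directly from steps 2.1 and 2.2. The following Python sketch reproduces the numbers above (a real coder would also output bits incrementally and deal with finite precision, which is omitted here):

    def arithmetic_interval(message, intervals):
        """intervals maps each symbol to its fixed subinterval [lo, hi) of [0, 1).
        Returns the final interval that identifies the whole message."""
        low, high = 0.0, 1.0
        for symbol in message:
            width = high - low
            s_lo, s_hi = intervals[symbol]
            low, high = low + width * s_lo, low + width * s_hi
            print(symbol, (low, high))
        return low, high

    intervals = {"a1": (0.0, 0.4), "a2": (0.4, 0.9), "a3": (0.9, 1.0)}
    low, high = arithmetic_interval(["a2", "a2", "a2", "a3"], intervals)
    # Final interval ≈ [0.8125, 0.825); any number inside it identifies the message.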

DICTIONARY METHODS
            Dictionary methods select strings of symbols and encode each  string as a token using a dictionary. The dictionary holds all the strings of symbols.

1) LZ77  (SLIDING WINDOW)
            The main idea of this method is to use part of the previously seen input stream as the dictionary. The encoder maintains a window to the input stream and shifts the input in that window from right to left as strings of symbols are being encoded; the method is thus based on a "sliding window". The window is divided into two parts: the search buffer, which is the current dictionary, and the lookahead buffer, containing text yet to be encoded.
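
A toy Python sketch of LZ77-style encoding. The buffer sizes, the (offset, length, next character) token format, and the greedy search below are illustrative simplifications of the method described above:

    def lz77_encode(data, search_size=255, lookahead_size=15):
        """Emit (offset, length, next_char) tokens; (0, 0, c) is a literal."""
        i, tokens = 0, []
        while i < len(data):
            best_len, best_off = 0, 0
            for j in range(max(0, i - search_size), i):    # positions in the search buffer
                length = 0
                while (length < lookahead_size and i + length < len(data)
                       and data[j + length] == data[i + length]):
                    length += 1
                if length > best_len:
                    best_len, best_off = length, i - j
            next_char = data[i + best_len] if i + best_len < len(data) else ""
            tokens.append((best_off, best_len, next_char))
            i += best_len + 1
        return tokens

    def lz77_decode(tokens):
        out = []
        for offset, length, next_char in tokens:
            start = len(out) - offset
            for k in range(length):                        # copy from already-decoded text
                out.append(out[start + k])
            if next_char:
                out.append(next_char)
        return "".join(out)

    text = "sir sid eastman easily teases sea sick seals"
    tokens = lz77_encode(text)
    assert lz77_decode(tokens) == text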

2)  LZRW1 TECHNIQUE

The main idea is to find a match in one step, using a hash table. The method uses the entire available memory as a buffer and encodes the input string in blocks: a block is read into the buffer and is completely encoded, then the next block is read and encoded, and so on. The search and lookahead buffers slide along the input block in memory from left to right, so it is necessary to maintain only one pointer, pointing to the start of the lookahead buffer. This pointer is initialized to 1 and is incremented after each phrase is encoded. The leftmost three characters of the lookahead buffer are hashed into a 12-bit number I, which is used to index an array of 2^12 = 4096 pointers. A pointer p is retrieved and is immediately replaced in the array by a pointer to the current position (the start of the lookahead buffer). If p points outside the search buffer, there is no match; the first character in the lookahead buffer is output as a literal and the pointer is advanced by one. The same thing is done if p points inside the search buffer but to a string that does not match the one in the lookahead buffer. If p points to a match of at least three characters, the encoder finds the longest match, outputs a match item, and advances the pointer by the length of the match.
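
The one-probe hash idea can be sketched as follows in Python. The hash function, the buffer handling, and the output format below are simplified stand-ins, not the real LZRW1 format:

    def lzrw1_like_encode(data, min_match=3, max_match=16):
        """Hash the next three characters into a 12-bit index, probe one table
        entry, and emit either a (offset, length) copy item or a literal."""
        table = [-1] * 4096                      # 2**12 = 4096 pointers
        i, items = 0, []
        while i < len(data):
            length, p = 0, -1
            if i + min_match <= len(data):
                a, b, c = (ord(ch) for ch in data[i:i + min_match])
                h = (((a << 8) ^ (b << 4) ^ c) * 40543 >> 4) & 0xFFF   # toy 12-bit hash
                p = table[h]
                table[h] = i                     # replace the pointer with the current position
            if p >= 0:                           # a candidate match, found in one step
                while (length < max_match and i + length < len(data)
                       and data[p + length] == data[i + length]):
                    length += 1
            if length >= min_match:
                items.append(("copy", i - p, length))
                i += length
            else:
                items.append(("literal", data[i]))
                i += 1
        return items

    print(lzrw1_like_encode("ab ab ab ab abcabcabc"))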

                               JPEG    
(JOINT PHOTOGRAPHIC EXPERTS GROUP)
            The JPEG method is an important and sophisticated lossy/lossless compression method for colour or grayscale still images. The main JPEG compression steps are as follows.
1)      Colour images are transformed from RGB into a luminance/chrominance colour space.
2)      Colour images are downsampled by creating low-resolution pixels from the original ones. The downsampling is not done for the luminance component, and is done at a ratio of 2:1 both horizontally and vertically.
3)      The pixels of each colour component are organized in groups of 8 x 8 pixels called data units.
4)      The DCT (discrete cosine transform) is applied to each data unit to create an 8 x 8 map of frequency components. The DCT involves the transcendental cosine function and introduces some loss of information due to the limited precision of computer arithmetic. The frequency components of a data unit represent the average pixel value and successively higher-frequency changes within the group.
5)      Each of the 64 frequency components in a data unit is divided by a separate number called its quantisation coefficient and rounded to an integer (a sketch of this step follows the list).
6)      The 64 quantized frequency coefficients of each data unit are encoded using a combination of RLE and Huffman coding.
7)      The last step adds headers, together with all the JPEG parameters used, and outputs the result.
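
Steps 5 and 6 are where most of the loss and most of the compression come from. Here is a minimal Python sketch of the quantisation step; the uniform table below is a placeholder, since real JPEG tables use larger coefficients for higher frequencies:

    def quantize(dct_block, qtable):
        """Step 5: divide each of the 64 coefficients by its quantisation
        coefficient and round to an integer; the rounding is the lossy part."""
        return [[round(dct_block[i][j] / qtable[i][j]) for j in range(8)]
                for i in range(8)]

    def dequantize(qblock, qtable):
        """Decoder side: multiply back; the rounding error cannot be recovered."""
        return [[qblock[i][j] * qtable[i][j] for j in range(8)]
                for i in range(8)]

    qtable = [[16] * 8 for _ in range(8)]      # placeholder quantisation table
    dct_block = [[0.0] * 8 for _ in range(8)]
    dct_block[0][0] = 800.0                    # e.g. the DC coefficient of a uniform block
    print(quantize(dct_block, qtable)[0][0])   # 50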

Often, the eye cannot see any image degradation at compression ratios of 10:1 or 20:1. There are two main modes: lossy (also called baseline) and lossless. Most implementations support just the lossy mode, which includes progressive and hierarchical coding.
            JPEG is a compression method, not a complete standard for image representation. This is why it does not specify image features such as pixel aspect ratio, color space, or interleaving of bitmap rows.
            JPEG has been designed as a compression method for continuous-tone images. The main goals of JPEG compression are the following:
·         High compression ratios, especially in cases where image quality is judged as very good to excellent.
·         The use of many parameters, allowing sophisticated users to experiment and achieve the desired compression/quality tradeoff.
·         Obtaining good results with any kind of continuous-tone image, regardless of image dimensions, color spaces, pixel aspect ratios, or other image features.
·         A sophisticated, but not too complex compression method, allowing software and hardware implementations on many platforms.
            Several modes of operation: (a) Sequential mode: each image component is compressed in a single left-to-right, top-to-bottom scan; (b) Progressive mode: the image is compressed in multiple passes (known as “scans”), so it can be viewed from coarse to fine detail; (c) Lossless mode: important for cases where the user decides that no pixel information should be lost; and (d) Hierarchical mode: the image is compressed at multiple resolutions, allowing lower-resolution versions to be viewed without first having to decompress higher-resolution ones.
            The progressive mode is a JPEG option. In this mode, higher-frequency DCT coefficients are written on the compressed stream in blocks called “scans”. Each scan read and processed by the decoder results in a sharper image. The idea is to use the first few scans to quickly create a low-quality, blurred preview of the image, then either input the remaining scans or stop the process and reject the image. The tradeoff is that the encoder has to save all the coefficients of all the data units in a memory buffer before they are sent in scans, and also go through all the steps for each scan, slowing down the progressive mode.
            In the hierarchical mode, the encoder stores the image several times in the output stream, at several resolutions. However, each high-resolution part uses information from the low-resolution parts of the output stream, so the total amount of information is less than that required to store the different resolutions separately. Each hierarchical part may use the progressive mode.
            The hierarchical mode is useful in cases where a high-resolution image needs to be output in low resolution. Older dot-matrix printers may be a good example of a low-resolution output device still in use.
            The lossless mode of JPEG calculates a “predicted” value for each pixel, generates the difference between the pixel and its predicted value, and encodes the difference using Huffman or arithmetic coding. The predicted value is calculated using the values of pixels above and to the left of the current pixel.

APPLICATION IN IMAGE COMPRESSION
            The following sections illustrate how the aforesaid techniques are applied to image compression. Photographic digital images generate a lot of data, taking up large amounts of storage space, and this is one of the main problems encountered in digital imaging. To rectify this problem image compression is used, with the choice of method depending on the type of data: text, graphics, photographic, or video. Image compression reduces the image data by identifying patterns in the bit strings that describe pixel values and replacing them with short codes.

BASIC PRINCIPLE OF IMAGE COMPRESSION
            The idea of losing image information becomes more palatable when we consider how digital images are created. Here are three examples: (1) A real-life image may be scanned from a photograph or a painting and digitized (converted to pixels). (2) An image may be recorded by a video camera that creates pixels and stores them directly in memory. (3) An image may be painted on the screen by means of a paint program. In all these cases, some information is lost when the image is digitized. The fact that the viewer is willing to accept this loss suggests that further loss of information might be tolerable if done properly.
            Digitizing an image involves two steps: sampling and quantization. Sampling an image is the process of dividing the two-dimensional original image into small regions: pixels. Quantization is the process of assigning an integer value to each pixel. Notice that digitizing sound involves the same two steps, with the difference that sound is one-dimensional.

            Here is a simple process to determine qualitatively the amount of data loss in a compressed image. Given an image A, (1) compress it to B, (2) decompress B to C, and (3) compute the difference D = C - A. If A was compressed without any loss and decompressed properly, then C should be identical to A and the image D should be uniformly white. The more data was lost in the compression, the farther D will be from uniformly white.
            The main principles discussed so far were RLE, scalar quantization, statistical methods, and dictionary-based methods. By itself, none is very satisfactory for color or grayscale images.
            RLE can be used for (lossless or lossy) compression of an image. This is simple, and it is used by certain parts of JPEG, especially by its lossless mode. In general, however, the other principles used by JPEG produce much better compression than does RLE alone. Facsimile compression uses RLE combined with Huffman coding and gets good results, but only for bi-level images.
            Scalar quantization can be used to compress images, but its performance is mediocre. Imagine an image with 8-bit pixels. It can be compressed with scalar quantization by cutting off the four least-significant bits of each pixel. This yields a compression ratio of 0.5, not very impressive, and at the same time reduces the number of colors (or grayscales) from 256 to just 16. Such a reduction not only degrades the overall quality of the reconstructed image but may also create bands of different colors, a noticeable and annoying effect.
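
As a concrete illustration of the scalar quantization just described (a toy sketch, not a recommended method), cutting off the four least-significant bits of each 8-bit pixel keeps only 16 levels:

    def scalar_quantize(pixels, dropped_bits=4):
        """Keep only the most-significant bits of each 8-bit pixel value."""
        return [p >> dropped_bits for p in pixels]

    def reconstruct(quantized, dropped_bits=4):
        """Shift back; the 256 grayscales have collapsed to 16."""
        return [q << dropped_bits for q in quantized]

    row = [200, 201, 203, 190, 17, 16]
    print(reconstruct(scalar_quantize(row)))   # [192, 192, 192, 176, 16, 16]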
            Statistical methods work best when the symbols being compressed have different probabilities. An input stream in which all symbols have the same probability will not compress, even though it may not necessarily be random. It turns out that in a continuous-tone color or grayscale image the different colors or shades often have roughly the same probabilities. This is why statistical methods are not a good choice for compressing such images, and why new approaches are needed. Images with color discontinuities, where adjacent pixels have widely different colors, compress better with statistical methods, but it is not easy to predict, just by looking at an image, whether it has enough color discontinuities.
            Dictionary-based compression methods also tend to be unsuccessful in dealing with continuous-tone images. Such an image typically contains adjacent pixels with similar colors, but does not contain repeating patterns. Even an image that contains repeated patterns such as vertical lines may lose them when digitized. A vertical line in the original image may become slightly slanted when the image is digitized, so the pixels in a scan row may end up having slightly different colors from those in adjacent rows, resulting in a dictionary with short strings.
            Another problem with dictionary compression of images is that such methods scan the image row by row and may therefore miss vertical correlations between pixels. Traditional methods are thus unsatisfactory for image compression, so we turn to novel approaches. They are all different, but they remove redundancy from an image by using the following principle.
            Image compression is based on the fact that neighbouring pixels are highly correlated.

APPROACH 1
            This approach is used for bi-level images. A pixel in such an image is represented by one bit. Applying the principle of image compression therefore means that the immediate neighbours of a pixel 'p' tend to be similar to 'p', so it makes sense to use run length encoding to compress the image. A compression method for such an image may scan it in raster order, i.e., row by row, and compute the lengths of the runs of black and white pixels. The run lengths are encoded by variable-size codes and written on the compressed stream. An example of such a method is facsimile compression.
            Data compression is especially important when images are transmitted over a communication line, because the user is typically waiting at the receiver, eager to see something quickly. Documents transferred between fax machines are sent as bitmaps, so a standard data compression method was needed when these machines were developed; the standards were proposed by the International Telecommunications Union (ITU). Although the ITU has no power of enforcement, the standards it recommends are generally accepted and adopted by industry.
            The first data compression standards developed by the ITU were T2 and T3. These are now obsolete and have been replaced by T4 and T6, which operate at typical speeds of 64 kbaud. Both methods can produce compression ratios of 10:1 or better, reducing the transmission time of a typical page to about a minute with the former and a few seconds with the latter.

APPROACH 2
            This approach is also for bi-level images. The principle of image compression says that the neighbours of a pixel tend to be similar to the pixel. We can extend this and conclude that if the current pixel has colour C, then pixels of the same colour seen in the past tend to have the same immediate neighbours. This approach looks at n of the near neighbours of the current pixel and regards them as an n-bit number; this number is the context of the pixel. The encoder counts how many times each context has already been seen for a pixel of colour C and assigns probabilities to the contexts accordingly. If the current pixel has colour C and its context has probability P, the encoder can apply arithmetic coding to encode the pixel with that probability.
            Next we turn to grayscale images. A pixel in such an image is represented by n bits and can have one of 2^n values. Applying the principle of image compression to a grayscale image implies that the immediate neighbours of a pixel P are similar to P, but not necessarily identical, so RLE alone does not compress such an image well. Instead, two other approaches are described below.
Gray codes:-
            An image compression method that has been developed specifically for a certain type of image can sometimes be used for other types. Any method for compressing bi-level images, for example, can be used to compress grayscale images by separating the bitplanes and compressing each individually, as if it were a bi-level image. Imagine, for example, an image with 16 grayscale values. Each pixel is defined by four bits, so the image can be separated into four bi-level images. The trouble with this approach is that it violates the general principle of image compression. Imagine two adjacent 4-bit pixels with values 7 = 0111 and 8 = 1000 (in binary). These pixels have close values, but when separated into four bitplanes, the resulting 1-bit pixels are different in every bitplane! This is because the binary representations of the consecutive integers 7 and 8 differ in all four bit positions. In order to apply any bi-level compression method to grayscale images, a binary representation of the integers is needed where consecutive integers have codes differing by one bit only. Such a representation exists and is called the reflected Gray code (RGC). This code is easy to generate with the following recursive construction:

            Start with the two 1-bit codes (0, 1). Construct two sets of 2-bit codes by duplicating (0, 1) and appending, either on the left or on the right, first a zero, then a one, to the original set. The result is the two sets (00, 01) and (10, 11). We now reverse (reflect) the second set and concatenate the two. The result is the 2-bit RGC (00, 01, 11, 10): a binary code of the integers 0 through 3 where consecutive codes differ by exactly one bit. Applying the rule again produces the two sets (000, 001, 011, 010) and (110, 111, 101, 100), which are concatenated to form the 3-bit RGC. Note that the first and last codes of any RGC also differ by one bit.
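
The reflect-and-prefix construction above translates directly into a few lines of Python (a sketch of the recursive construction, not a library routine):

    def reflected_gray_code(nbits):
        """Build the n-bit RGC: prefix the current codes with 0, then append the
        reversed (reflected) codes prefixed with 1."""
        codes = ["0", "1"]
        for _ in range(nbits - 1):
            codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
        return codes

    print(reflected_gray_code(2))   # ['00', '01', '11', '10']
    print(reflected_gray_code(3))   # ['000', '001', '011', '010', '110', '111', '101', '100']
    # Consecutive codes (and also the first and last) differ in exactly one bit.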

APPROACH 3
            Imagine a grayscale image with 4-bit pixels, or 16 shades of gray. If two adjacent pixels have values 0000 and 0001, they are similar. They are also identical in three of the four bi-level images. However, two adjacent pixels with values 0111 and 1000 are also similar in the grayscale image but differ in all four bi-level images. This problem occurs because the binary codes of adjacent integers may differ by several bits: the binary codes of 0 and 1 differ by one bit, those of 1 and 2 differ by two bits, and those of 7 and 8 differ by four bits. The solution is to design special binary codes such that the codes of any two consecutive integers differ by one bit only.
            The most-significant bitplanes of an image obey the principle of image compression more than the least-significant ones. When adjacent pixels have values that differ by one unit (such as p and p+1), chances are that the least-significant bits are different and the most-significant ones are identical. Any compression method that compresses bitplanes individually should therefore treat the least-significant bitplanes differently from the most-significant ones, or should use RGC instead of the binary code to represent pixels.
              In this approach we separate the grayscale image into n bi-level images and compress each with RLE and prefix codes. The principle of image compression implies that two adjacent pixels that are similar in the grayscale image will be identical in most of the n bi-level images, so any method for compressing bi-level images can be used to compress grayscale images as well. Imagine, for example, an image with 16 grayscale values: each pixel is defined by 4 bits, so the image can be separated into four bi-level images. What is needed is a binary representation of the integers in which consecutive integers have codes differing by one bit only; such a representation is the reflected Gray code (RGC) described above.

APPROACH 4
            In this approach we use the context of a pixel to predict its value. The context of a pixel is the values of some of its neighbours. We can examine some neighbours of a pixel P, compute an average value A, and predict that P will have the value A. The principle of image compression tells us that our prediction will be correct in most cases, almost correct in many cases, and completely wrong in a few cases. We can say that the predicted value of pixel P represents the redundant information in P. We then calculate the difference D = P - A and assign variable-size codes to the different values of D, such that small values are assigned short codes and large values are assigned long codes. The values of D tend to follow a Laplace distribution, and we can use this distribution to assign a probability to each value of D, so that arithmetic coding can be applied efficiently to encode the D values.
                                                                    
            If P can have the values 0 through m-1, then the values of D are in the range [-(m-1), +(m-1)], and the number of codes needed is 2(m-1) + 1, or 2m-1. The context of a pixel may consist of just one or two of its immediate neighbours; however, better results may be obtained when several neighbouring pixels are included in the context. The average A in such a case should be weighted, with near neighbours assigned higher weights. Another important consideration is the decoder. In order for it to decode the image, it should be able to calculate the context of every pixel. This means that the context should contain only pixels that have already been encoded. If the image is scanned in raster order, the context should include only pixels located above the current pixel, or on the same row and to its left.
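
A toy Python sketch of this prediction idea, using the average of the left and upper neighbours as the context (the choice of neighbours and the integer averaging are assumptions for illustration; the differences D would then be fed to a variable-size or arithmetic coder):

    def predict_differences(image):
        """For each pixel P (raster order) predict A from the already-coded left
        and upper neighbours and output the difference D = P - A."""
        differences = []
        for y in range(len(image)):
            for x in range(len(image[0])):
                left = image[y][x - 1] if x > 0 else None
                above = image[y - 1][x] if y > 0 else None
                if left is not None and above is not None:
                    predicted = (left + above) // 2
                elif left is not None:
                    predicted = left
                elif above is not None:
                    predicted = above
                else:
                    predicted = 0                 # no context for the very first pixel
                differences.append(image[y][x] - predicted)
        return differences

    img = [[10, 11, 12],
           [11, 12, 13],
           [12, 13, 14]]
    print(predict_differences(img))   # [10, 1, 1, 1, 1, 1, 1, 1, 1] -- mostly small values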

APPROACH 5
            In this approach we transform the values of the pixels and encode the transformed values. An image can be compressed by transforming its pixels to a representation where they are decorrelated. Compression is achieved if the new values are smaller than the original ones. The decoder inputs the transformed values from the compressed stream and reconstructs the original data by applying the opposite transform. There are several methods for transforming pixels; of these the most important is the discrete cosine transform, or DCT.
            The mathematical concept of a transform is a powerful tool in many areas and can also serve as an approach to image compression. An image can be compressed by transforming its pixels (which are correlated) to a representation where they are decorrelated. Compression is achieved if the new values are smaller, on average, than the original ones. Lossy compression can be achieved by quantizing the transformed values. The decoder inputs the transformed values from the compressed stream and reconstructs the (precise or approximate) original data by applying the opposite transform.
            The term decorrelated means that the transformed values are independent of one another. As a result, they can be encoded independently, which makes it simpler to construct a statistical model. An image can be compressed if its representation has redundancy. The redundancy in images stems from pixel correlation. If we transform the image to a representation where the pixels are decorrelated, we have eliminated the redundancy and the image has been fully compressed.

Orthogonal Transform
            Image transforms used in practice should be fast and preferably also simple to implement. This suggests the use of linear transforms. In such a transform, each transformed value ci is a weighted sum of the data items (the pixels) dj, where each item is multiplied by a weight (or transform coefficient) wij. Thus, ci = ∑j dj wij, for i, j = 1, 2, ..., n. For the general case we can write C = W·D. Each row of W is called a "basis vector".
            The important issue is the determination of the values of the weights wij. The guiding principle is that we want the first transformed value c1 to be large and the remaining values c2, c3, ... to be small. The basic relation ci = ∑j dj wij suggests that ci will be large when each weight wij reinforces the corresponding data item dj. This happens, for example, when the weights and the data items have similar values and signs. Conversely, ci will be small if the weights wij are small and half of them have the opposite sign of the corresponding dj. Thus, when we get a large ci, we know that the ith basis vector resembles the data vector; a small ci, on the other hand, means that they have different shapes. We can therefore interpret the basis vectors as tools to extract features from the data vector.
            In practice, the weights should be independent of the data items. Otherwise, the weights would have to be included in the compressed stream, for the use of the decoder. This, combined with the fact that our data items are pixel values, which are normally nonnegative, suggests a way to choose the basis vectors. The first vector, the one that produces c1, should consist of positive, perhaps even identical, values. This will reinforce the nonnegative values of the pixels. Each of the other vectors should have half its elements positive and the other half, negative. When multiplied by nonnegative data items, such a vector tends to produce a small value. Recall that we can interpret the basis vectors as tools for extracting features from the data vector. A good choice would therefore be to have basis vectors that are very different from each other, and so can extract different features. This leads to the idea of basis vectors that are mutually orthogonal. If the transform matrix W is orthogonal, the transform itself is called orthogonal. Another observation that helps to select the basis vectors is that they should feature higher and higher frequencies, thus extracting higher-frequency features from the data as we go along, computing more transformed values.

Discrete Cosine Transform
            One of the most popular transform methods is the discrete cosine transform, or simply the DCT. The two-dimensional DCT is the version used in practice, because the pixels of an image are correlated in two dimensions, not just in one. Image compression methods therefore use the two-dimensional DCT, given by

G_ij = (1/√(2n)) C_i C_j ∑_{x=0}^{n-1} ∑_{y=0}^{n-1} P_xy cos((2y+1)jπ/2n) cos((2x+1)iπ/2n),

for 0 <= i, j <= n-1. The image is broken up into blocks of n x n pixels P_xy (we use n = 8 as an example), and the equation is used to produce a block of 8 x 8 DCT coefficients G_ij for each block of pixels. If lossy compression is required, the coefficients are quantized. The decoder reconstructs a block of (approximate or precise) data values by computing the inverse DCT (IDCT):

P_xy = (1/4) ∑_{i=0}^{7} ∑_{j=0}^{7} C_i C_j G_ij cos((2x+1)iπ/16) cos((2y+1)jπ/16),

where
        C_f = 1/√2   for f = 0
        C_f = 1      for f > 0
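
The forward formula can be evaluated directly; the short Python sketch below does exactly that for an n x n block (O(n^4) and purely illustrative; practical codecs use fast factorizations). For n = 8 the factor 1/√(2n) equals 1/4, matching the IDCT above.

    import math

    def dct2(block):
        """Direct evaluation of the 2D DCT formula above for an n x n block."""
        n = len(block)
        c = lambda f: 1 / math.sqrt(2) if f == 0 else 1.0
        G = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                total = 0.0
                for x in range(n):
                    for y in range(n):
                        total += (block[x][y]
                                  * math.cos((2 * y + 1) * j * math.pi / (2 * n))
                                  * math.cos((2 * x + 1) * i * math.pi / (2 * n)))
                G[i][j] = (1 / math.sqrt(2 * n)) * c(i) * c(j) * total
        return G

    flat = [[100] * 8 for _ in range(8)]          # a uniform 8 x 8 block
    G = dct2(flat)
    print(round(G[0][0], 3), round(G[3][5], 3))   # DC coefficient ≈ 800, the rest ≈ 0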

APPROACH 6
            The principle of this approach is to separate a continuous-tone colour image into three grayscale images and compress each of the three separately, using approaches 2, 3, or 4. An important feature of this approach is the use of a luminance/chrominance colour representation. The advantage of this representation is that the eye is sensitive to small changes in luminance but not in chrominance, which allows considerable loss of data in the chrominance components while making it possible to decode the image without a significant loss of quality.

APPROACH 7
            This is for discrete-tone images. This type of image contains uniform regions, and a region may appear several times in the image. A possible way to compress such an image is to scan it, identify the regions, and find the repeating ones. If a region B is identical to an already found region A, then B can be compressed by writing a pointer to A on the compressed stream.

APPROACH 8
            In this approach we partition the image into parts and compress it by processing the parts one by one. Suppose that the next unprocessed image part is part 15. Try to match it with parts 1 to 14, which have already been processed. If part 15 can be expressed as a combination of those parts, only the few numbers that specify the combination need be saved, and part 15 can be discarded. If part 15 cannot be expressed as such a combination, it is saved in raw format.

CONCLUSION
            Data compression is still a developing field, and research continues to improve compression ratios and speed. Experience shows, however, that modifying an algorithm to improve compression by 1% may increase its running time by 10%. The aim of the data compression field remains, of course, to develop better and better compression techniques.
