library will have two programs, compress and decompress: compress accepts a text
file and produces a compressed representation of that text file; decompress accepts a file that was
compressed with the compress program and recovers the original file.
Implementation
The input to compress is a text file of arbitrary size. The output of the program is a
compressed representation of the original file. You will have to save the codetable in the header of
the compressed file, so that decompress can use it to decode the compressed file. The input
to decompress is a compressed file, from which the program recovers the original file.
As a sanity check, you should have a specific magic word at some position in the header of the compressed file,
so that decompress can identify whether the given file is a valid compressed file.
You should pay attention to the following issues:
The file that I will use for testing can be very large (gigabytes in size), so make sure that your program is bug-free and works for large input files.
You must make sure that your program runs on a Linux machine.
You must provide a Makefile to compile your programs. Also, a [login to view URL] file should be
provided with instructions for compiling and running the programs.
Command-Line options
Compression:
C++: ./compress -f [login to view URL] [-o [login to view URL] -s]
Java: sh [login to view URL] -f [login to view URL] [-o [login to view URL] -s]
Decompression:
C++: ./decompress -f [login to view URL] [-o [login to view URL] -s]
Java: sh [login to view URL] -f [login to view URL] [-o [login to view URL] -s]
The command-line options within square brackets are optional. The option "-f" precedes
the input file name, which always has a .txt extension. The "-s" option prints statistics. For
compression, it prints the number of distinct characters, the compression ratio, and
the wall-clock time taken by the compression task. For decompression, it prints
the number of characters written and the wall-clock time taken by the decompression
task. The "-o" option precedes the name of an output file. If the output file name is not given,
then we will append .hzip to the end of the input file name to create the output file name.
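The option handling described above could be sketched in C++ as follows. The names Options, parseOptions, and WallTimer are hypothetical, and the error handling (throwing std::invalid_argument) is an assumption, since the text does not say how invalid command lines should be reported.

```cpp
#include <chrono>
#include <stdexcept>
#include <string>

struct Options {
    std::string input;   // required, follows -f
    std::string output;  // optional, follows -o
    bool stats = false;  // set by -s
};

// Parses "-f <input> [-o <output>] [-s]" in any order. Throws
// std::invalid_argument on an unknown option, a missing value, or a missing -f.
Options parseOptions(int argc, const char* const argv[]) {
    Options opts;
    for (int i = 1; i < argc; ++i) {
        const std::string arg = argv[i];
        if (arg == "-s") {
            opts.stats = true;
        } else if (arg == "-f" || arg == "-o") {
            if (i + 1 >= argc) throw std::invalid_argument("missing value after " + arg);
            (arg == "-f" ? opts.input : opts.output) = argv[++i];
        } else {
            throw std::invalid_argument("unknown option: " + arg);
        }
    }
    if (opts.input.empty()) throw std::invalid_argument("-f <input file> is required");
    // Default from the text: append .hzip to the input file name.
    if (opts.output.empty()) opts.output = opts.input + ".hzip";
    return opts;
}

// Wall-clock timer for the -s statistics. steady_clock is used because it is
// not affected by adjustments to the system clock.
class WallTimer {
public:
    WallTimer() : start_(std::chrono::steady_clock::now()) {}
    double seconds() const {
        return std::chrono::duration<double>(std::chrono::steady_clock::now() - start_).count();
    }
private:
    std::chrono::steady_clock::time_point start_;
};
```

A main function would call parseOptions(argc, argv), create a WallTimer before starting the work, and print timer.seconds() afterwards only when opts.stats is set.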