GitHub - ziad40/Compress-file: it is an command user interface used to compress files using huffman algorithm(Greedy algorithm) with variable length of chunks for compressing

Analysis and Design of Algorithms Programming Assignment 2

Huffman's Algorithm

In this part, it is required to implement Huffman's algorithm that we discussed. Your implementation should allow compressing and decompressing arbitrary files. As discussed in class, the implementation should collect statistics from the input file first, then apply the compression algorithm. Note that you will need to store a representation of the codewords in the compressed file, so that you can decompress the file back. Your program should have the capability of considering more than one byte. For example, instead of just collecting the frequencies and finding codewords for single bytes. The same can be done assuming the basic unit is n bytes, where n is an integer.

Specifications

You will submit a single runnable jar that will be used for both compression and decompression. Your jar should be named as huffman.jar. The jar must include the source code files.
To use it for compressing an input file, the following will be called: java -jar huffman.jar c absolute_path_to_input_file n
c means compressing the file.
n is the number of bytes that will be considered together.
To use it for decompressing an input file, the following be called: java -jar huffman.jar d absolute_path_to_input_file
If the user chooses to compress a file with the name abc.exe, the compressed file should have the name compressed..abc.exe.hc , and should be replaced by n (the number of bytes per group). The compressed file should appear in the same directory of the input file. The program should print the compression ratio and the compression time.
If the user chooses to decompress a file with name abc.exe.hc, the output file should be named extracted.abc.exe. This should appear in the same directory of the input file. The program should print the decompression time in this case.

Analysis Requirements

Submit a report that shows the compression ratio when running your implementation on the following files for different values of n = 1, 2, 3, 4, 5.
File 1: This file is from the NIH genetic sequence database. Note that the file is already compressed. Extract the file first, then run the experiments on the uncompressed file (gbbct10.seq). Note that the experiments here could take time, as the file is large. You should test your program and make sure of correctness on small inputs first. Also, try to implement your code efficiently to save time.
Compare the compression ratio you got in both cases with the compression ratio of 7-zip: https://www.7-zip.org/. Try to explain the observations. Note: The compression ratio can be defined using multiple ways. You can calculate it in the same way as 7-zip does, which is the size of the compressed file divided by the size of the original file.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.idea		.idea
src		src
.gitignore		.gitignore
README.md		README.md
compressFile.iml		compressFile.iml
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

About

Uh oh!

Releases

Packages

Uh oh!

Languages

ziad40/Compress-file

Folders and files

Latest commit

History

Repository files navigation

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages