In which I take you down the lossless compression rabbit hole 🐰
I'll give an overview of one approach to compression based on information theory and explain a few compression algorithms, including Lempel-Ziv and PAQ. I'll also go over some recent compression algorithms based on deep neural networks.
Along the way I'll attempt to answer curious compression questions such as:
- What is the smallest number of bytes you can compress a file to?
- What does compression have to do with Machine Learning?
- Why is there a 500,000€ prize to compress Wikipedia?
- Does compression = AI?
You MLH fellow, you have quite the brain
You fixed all the bugs, and merged into main
You coded some tests (which took quite a while)
Now can you compress this 100kB file?
Submit the number of bytes and how you did it in the comments (smallest wins 🥇).
```sh
# example
$ curl https://raw.githubusercontent.com/meiji163/compress-demo/main/compressme.txt \
    | gzip > compressme.gz \
    && wc -c compressme.gz
```
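For comparison, gzip is far from the strongest general-purpose compressor. As a rough baseline (exact byte counts will vary with the file and your tool versions), swapping in xz at its maximum preset usually produces a noticeably smaller result:

```sh
# same pipeline, but with xz at its highest preset (-9) plus "extreme" mode (-e)
$ curl https://raw.githubusercontent.com/meiji163/compress-demo/main/compressme.txt \
    | xz -9e > compressme.xz \
    && wc -c compressme.xz
```

xz's LZMA is still in the Lempel-Ziv family; the PAQ-style context-mixing compressors covered later usually do better still, at a steep cost in speed.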