serberus
01-20-2004, 09:31 AM
Firstly, i'm sorry for posting this here - I don't go to any other "coder" type forums because I don't code and I couldn't find an off topic/general forum on the SEQ forums so i'm afraid this has ended up here.
I've been thinking about compression and i've been wondering why nobody compresses things at the bit level, seeing as how every file essentially boils down to a very long string of 1's and 0's when you break it down, would this not be a very easy thing to compress?
I imagine a run length compression type scheme would work and could be backed up by something else.
The problem that makes me think if it would work or not is that your storing the 8 bits of data from a byte as 8 individual bytes (if you store the whole file as a binary string) meaning you've just multiplied the size of your file by 8, so you'd need atleast a compression factor of 8 minimum to just keep your file the same size.
I then thought, well, you could use a byte to store 8 bits, but this is what happens anyway so nothing's changing.
I've just hit my confusion point, i'm trying to follow my train of thought but now i'm stuck so I don't know what else to say.
I'll try to re-emphasise my point a bit clearer.
"Text" = "74h 65h 78h 74h" = "01110100011001010111100001110100"
so, because binary is so simple, is it easy to compress? or, because binary is so simple, it's inherently large (8 times larger than hex) and it's not possible to compress it that well because it's simple and limited?
Thank you to anyone who provides some information, i'm sure this has been discussed/attempted before but i'm not sure where to start looking (a search on google turned up information about "unconditional bases and bit-level compression" which didn't help me much).
Thanks
Serberus
I've been thinking about compression and i've been wondering why nobody compresses things at the bit level, seeing as how every file essentially boils down to a very long string of 1's and 0's when you break it down, would this not be a very easy thing to compress?
I imagine a run length compression type scheme would work and could be backed up by something else.
The problem that makes me think if it would work or not is that your storing the 8 bits of data from a byte as 8 individual bytes (if you store the whole file as a binary string) meaning you've just multiplied the size of your file by 8, so you'd need atleast a compression factor of 8 minimum to just keep your file the same size.
I then thought, well, you could use a byte to store 8 bits, but this is what happens anyway so nothing's changing.
I've just hit my confusion point, i'm trying to follow my train of thought but now i'm stuck so I don't know what else to say.
I'll try to re-emphasise my point a bit clearer.
"Text" = "74h 65h 78h 74h" = "01110100011001010111100001110100"
so, because binary is so simple, is it easy to compress? or, because binary is so simple, it's inherently large (8 times larger than hex) and it's not possible to compress it that well because it's simple and limited?
Thank you to anyone who provides some information, i'm sure this has been discussed/attempted before but i'm not sure where to start looking (a search on google turned up information about "unconditional bases and bit-level compression" which didn't help me much).
Thanks
Serberus