Questions tagged [compression]
The compression tag has no summary.
68 questions
5
votes
5
answers
771
views
How does data store compression speed up data warehouses?
I often see the claim that various data warehouse/analytical database systems derive significant performance benefits from compressing their data stores. On the face of it, though, this seems to be ...
0
votes
5
answers
293
views
Load and process (compressed) data from filesystem in the blink of an eye
We have a huge number of queries hitting our API that request a minor or major extract of some huge files lying around on our mounted hard drives. The data needs to be extracted from the files and ...
1
vote
1
answer
2k
views
How to remove unused code from a jar file? [closed]
I have a JAR file, for example foo.jar. My code contains a lot of libraries (almost 75 JAR dependencies). I am not using anything like Maven or Gradle; I'm just using pure Java with pure JAR files as ...
0
votes
1
answer
388
views
Short and compact barcode
I am writing a C# program where I need to print a lot of small barcodes in a 100x100 grid on a piece of paper. I then scan/photograph the paper and read the barcodes again. Each barcode only needs to ...
7
votes
2
answers
645
views
Some misunderstanding of concepts in the Huffman algorithm
What is the difference between the average length of codes and the average length of codewords in the Huffman algorithm? Do both mean the same thing? I am stuck on some facts:
I see a fact that is marked as False:
for a ...
7
votes
2
answers
887
views
How does conditional compilation impact product quality, security and code complexity? [closed]
Software libraries targeting resource-constrained environments like embedded systems use conditional compilation to allow consumers to shave space by removing unused features from the final binaries ...
2
votes
2
answers
571
views
Compressing EBCDIC file vs UTF8
Today I came across a weird case for which I have no explanation, so here I am.
I have two files with identical content, but one is encoded in UTF-8 and the other one is in IBM EBCDIC. Both of them ...
0
votes
2
answers
139
views
Is it possible to transfer data with a really unique seed of a pseudo-random number generator?
I have been thinking about this idea for over 5 years and I don't have the complete technical knowledge to fully grasp the idea I'm having.
The premise of the idea is to have an extremely high base number ...
30
votes
5
answers
8k
views
What is the most efficient way to store a numeric range?
This question is about how many bits are required to store a range. Or put another way, for a given number of bits, what is the maximum range that can be stored and how?
Imagine we want to store a ...
3
votes
2
answers
135
views
Non-precise Input/Using Probability in File Compression
I'm a high school student interested in topics of computer programming.
Recently I became interested in file compression, and in my head I tried to combine this with a completely different part of ...
2
votes
3
answers
2k
views
Algorithm for optimizing text compression
I am looking for text compression algorithms (natural language compression, rather than compression of arbitrary binary data).
I have seen for example An Efficient Compression Code for Text ...
-2
votes
1
answer
197
views
Compression techniques for a truly random permutation of a given integer N
Is it possible to compress a truly random permutation using low-order polynomial interpolation? If so, how can it be achieved?
0
votes
1
answer
492
views
Find Randomized Sequence Seed To Compress files statistically
I was wondering if what I have in mind already exists in any known compression programs/algorithms or not. We know that a seed gives us a constant sequence of random numbers, so if we are able to find a seed ...
4
votes
1
answer
371
views
Advantages of application-level data compression?
This question was inspired by MessagePack, but I'm looking for a general answer about the advantages of in-app vs. external compression.
For network I/O, doesn't the transport protocol (at least ...
2
votes
1
answer
105
views
I need to find a set of hierarchical symbols that can represent input binary data in near optimal space. What algorithms can I look into? [closed]
I have a stream of binary data. Assume no prior knowledge about the expected pattern in input data.
The symbols can represent binary data or other symbols, hence hierarchical.
The output should ...
4
votes
5
answers
640
views
Is saving disk space a valid reason to forgo migrating to a standard text format (e.g. JSON)?
A while ago I asked a question about custom text data formats, instead of using existing tools such as XML, JSON, YAML, etc. Now, in favor of converting our custom format to a relational database and ...
3
votes
1
answer
521
views
PDF content: text or graphics?
Is there a test to check whether a PDF file contains text or was created by scanning paper sheets?
text: plain text that, for example, I can copy & paste while I am reading the PDF. Not ...
5
votes
2
answers
585
views
Finding optimal token definitions for compression
I have a collection of strings which have a lot of common substrings,
and I'm trying to find a good way to define tokens to compress them.
For instance, if my strings are:
s1 = "String"
s2 = "Bool"
...
9
votes
2
answers
1k
views
Best compression algorithm for timelapse photos
I have a folder containing about 9,000 JPEG photos (about 30 GB), which I want to archive with some sort of compression. I understand that compressing JPEGs is not normally very effective, but these ...
1
vote
1
answer
1k
views
Best two-way compression algorithm for 32-bit numbers
I need to compress an ID for marketing campaigns. The current campaign ID is a 32-bit integer, but obviously this is too long for a customer to type by hand. I would like to compress this to minimum ...
3
votes
2
answers
277
views
Sparse set lossy compression algorithm
I am looking for an algorithm or idea for the following problem.
Suppose we have a data type, say a 64-bit integer. Now we have a relatively small set of such items, say a few hundred at most. The simplest ...
1
vote
0
answers
2k
views
Calculating uncompressed file size without uncompressing file in zlib
I am writing a Python program which parses zip (currently only zlib, using DEFLATE compression) files and verifies the correctness of their headers and data. One of the things I'm trying to achieve is ...
2
votes
2
answers
2k
views
What should I do when using Golomb/Rice code for large values?
When using Golomb/Rice codes in image compression, it is inevitable that we encounter large values. Golomb coding uses a tunable parameter M to divide an input value N into two parts: q, the result of a ...
2
votes
2
answers
2k
views
How does Yahoo's Smush.It work and why doesn't everyone use it?
I've recently come across an application by Yahoo called SmushIt. Apparently it does lossless compression on images. Sometimes the image size is reduced by as much as 90%. This of course has major ...
1
vote
2
answers
2k
views
Is the hash calculated before or after compression?
I had a question regarding compression and calculation of checksum/hash of data.
I would like to know whether the checksum has to be calculated before or after the data is compressed for transmission. ...