Submitted by samyall t3_122a177 in askscience
kompootor t1_jdpvjj4 wrote
In short, based on what you are describing, an LLM is a terrible tool for compressing its training data compared with virtually any standard compression technique, by any metric.
When you talk about compression, you're generally talking about some raw data that you run through an algorithm which compresses it into a more manageable form, and then you run through another algorithm to recover the raw data again, either with some amount of lossiness or losslessly. AI models can do that, sure, but they are not designed to be data structures for storage and retrieval. In a simplified ANN model, the network takes new training data and adjusts its weights so that it can interpolate between this new data and previous training data. That can, however, make asking the model to recall a specific piece of old training data produce an even fuzzier, less-faithful output; the tradeoff is that the model can now be asked about hypothetical data between what it's been trained on. (I'll have to find a good intro guide for a simple ANN model that illustrates this with diagrams.) None of this gets into space, time, or resource efficiency, but those are all guaranteed to be worse than a dedicated compression algorithm in any practical application as well.
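To make the contrast concrete, here's a minimal sketch of what a dedicated (lossless) compressor guarantees and an LLM doesn't: an exact round trip. This uses Python's standard zlib; the sample string is just an illustration.

```python
import zlib

# Some raw data with plenty of redundancy for the compressor to exploit.
raw = b"the quick brown fox jumps over the lazy dog " * 100

compressed = zlib.compress(raw)
restored = zlib.decompress(compressed)

# Lossless: the round trip recovers the input byte-for-byte,
# and the compressed form is much smaller than the original.
assert restored == raw
print(len(raw), "->", len(compressed))
```

An LLM queried for a passage of its training data offers neither guarantee: no exact recovery, and the "storage" (its weights) is vastly larger per recoverable byte.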
I suppose you can look at a broad overview of how data compression works in general. There are ANN/AI algorithms for compression -- they use the predictive network to essentially tune an existing deterministic compression algorithm, optimizing it for the data being compressed. That is very different from taking an ANN like a large language model and storing the compressed data entirely in the ANN's weights.
I don't know if this helps -- I can try to clarify stuff or provide some better articles if you like.