unkz t1_je9wuzm wrote on March 30, 2023 at 2:06 PM

Reply to comment by saintshing in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama

Practically speaking, it does have a context limit; that RNN issue has not really been solved. It is a lot of fun to play with, though.

unkz t1_j2x7ggd wrote on January 4, 2023 at 4:09 PM

Reply to comment by matth0x01 in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon

That’s basically it. Cross entropy (the average negative log-likelihood per token) and perplexity are related by

Perplexity = 2^(cross entropy), with cross entropy measured in bits

So the two main things are interpretability (perplexity is roughly the number of equally likely words the model is effectively choosing from at any point) and scale (because of the exponential, small changes in cross entropy result in large changes in perplexity).
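To make the relationship concrete, here is a minimal sketch in Python. The function names and the toy probabilities are made up for illustration; the only assumption is that you have the probability the model assigned to each token that actually occurred, and that logs are taken base 2 so the cross entropy is in bits.

```python
import math

def cross_entropy_bits(token_probs):
    """Average negative log2-probability assigned to the observed tokens."""
    return -sum(math.log2(p) for p in token_probs) / len(token_probs)

def perplexity(token_probs):
    """Perplexity = 2 ** (cross entropy in bits)."""
    return 2 ** cross_entropy_bits(token_probs)

# Toy case: the model gives every observed token probability 1/4,
# i.e. it is effectively choosing among 4 equally likely words.
uniform = [0.25] * 8
print(cross_entropy_bits(uniform))  # 2.0 bits per token
print(perplexity(uniform))          # 4.0

# Scale: each extra bit of cross entropy doubles the perplexity.
for h in (10.0, 11.0, 12.0):
    print(h, 2 ** h)
```

If the cross entropy comes from a framework that uses natural logs (nats), the same relationship holds with e as the base instead of 2.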

unkz t1_j2wzgf3 wrote on January 4, 2023 at 3:16 PM

Reply to comment by matth0x01 in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon

Perplexity is one of the key evaluation metrics for how well a language model predicts text; lower is better. Pruning one model decreases perplexity (makes the model better), which is interesting.

unkz t1_j2v9edv wrote on January 4, 2023 at 4:26 AM

Reply to comment by matth0x01 in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon

https://en.wikipedia.org/wiki/Perplexity

unkz t1_j2ujn5z wrote on January 4, 2023 at 1:13 AM

Reply to comment by SoulCantBeCut in [R] Do we really need 300 floats to represent the meaning of a word? Representing words with words - a logical approach to word embedding using a self-supervised Tsetlin Machine Autoencoder. by olegranmo

Please don’t, I think we have all heard enough from him.

unkz t1_iudsx4n wrote on October 30, 2022 at 3:57 PM

Reply to Believe it or not, Ember is more than half husky and malamute (the rest is blue tick coon hound and pit bull) by overcomebyfumes

I guess this is being posted because of the pit bull mauling video that’s going around, eh?