hosjiu
hosjiu t1_jcjey3z wrote
Reply to comment by learn-deeply in [P] nanoT5 - Inspired by Jonas Geiping's Cramming and Andrej Karpathy's nanoGPT, we fill the gap of a repository for pre-training T5-style "LLMs" under a limited budget in PyTorch by korec1234
sure, but its main focus is to help many people in the academic community run the pretraining phase themselves, for fast, cheap, and reproducible research experiments.
hosjiu t1_isi4pyg wrote
Reply to comment by rmsisme in [R] UL2: Unifying Language Learning Paradigms - Google Research 2022 - 20B parameters outperforming 175B GTP-3 and tripling the performance of T5-XXl on one-shot summarization. Public checkpoints! by Singularian2501
I have the same point of view as you.
hosjiu t1_jd1a6az wrote
Reply to comment by Civil_Collection7267 in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
"They also have the tendency to hallucinate frequently unless parameters are made more restrictive."
I don't really understand this point from a technical standpoint.
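
If "more restrictive parameters" refers to the sampling settings at generation time, here is a minimal sketch of what I guess that could mean, assuming the Hugging Face transformers `generate()` API and a placeholder checkpoint path (both are my assumptions, not from the original comment): lowering temperature and top_p makes decoding more conservative, which is often claimed to reduce hallucination.

```python
# Minimal sketch (my guess): "more restrictive" = more conservative decoding settings.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "path/to/alpaca-30b"  # hypothetical local checkpoint, not a real repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "List three facts about the Apollo 11 mission."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.2,         # low temperature -> less random token choices
    top_p=0.75,              # tighter nucleus sampling
    repetition_penalty=1.2,  # discourages degenerate repetition
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Is that the kind of restriction they meant, or something else?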