Submitted by __Maximum__ t3_11l3as6 in MachineLearning
CKtalon t1_jbdjaxa wrote
Reply to comment by Taenk in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Instead of choosing a huge model and leaving it undertrained because of a limited compute budget, use their estimates to pick the largest model your compute budget can actually train properly, which will usually be a smaller one. It doesn’t necessarily mean that a small model trained on a larger dataset will naturally beat a bigger model.
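To put rough numbers on that, here is a minimal sketch of sizing a model for a fixed compute budget. It assumes two commonly quoted rules of thumb that are not stated in this thread: training compute of about 6 FLOPs per parameter per token (C ≈ 6·N·D), and roughly 20 training tokens per parameter. The function name and both default constants are illustrative assumptions, not the paper's fitted values.

```python
import math


def compute_optimal_size(flops_budget, tokens_per_param=20.0, flops_per_param_token=6.0):
    """Return (n_params, n_tokens) that spend the whole budget.

    Assumes C = k * N * D with D = r * N, so N = sqrt(C / (k * r)).
    k (flops_per_param_token) and r (tokens_per_param) are rule-of-thumb
    assumptions, not exact fitted coefficients.
    """
    n_params = math.sqrt(flops_budget / (flops_per_param_token * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


# Example budget (hypothetical): 5.76e23 FLOPs
n, d = compute_optimal_size(5.76e23)
print(f"params: {n:.3g}, tokens: {d:.3g}")
```

With that example budget, the sketch gives roughly 7e10 parameters and 1.4e12 tokens. A model much larger than that would run out of compute before seeing enough tokens, which is the undertrained case described above. It says nothing about what happens when you keep training a small model well past this point.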