[D] Why are so many tokens needed to train large language models?

Submitted by blacklemon67 on March 9, 2023 at 4:35 AM in MachineLearning