Submitted by Business-Lead2679 t3_12618zu in MachineLearning
itsyourboiirow t1_jecqjqd wrote
Reply to comment by Evening_Ad6637 in [D] Training a 65b LLaMA model by Business-Lead2679
Training requires significantly more memory, as it has to keep track of the gradient for every parameter. I would check to see how much memory it takes up on your computer.
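To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes 16-bit weights (2 bytes per parameter) and gradients stored at the same precision, and it ignores activations, optimizer state, and framework overhead, so treat the results as lower bounds rather than measurements. The function name and its parameters are made up for illustration.

```python
def estimate_memory_gb(n_params, bytes_per_param=2, training=False):
    """Rough lower bound in GB (1 GB = 1e9 bytes).

    Assumption: weights and gradients use the same precision.
    Activations, optimizer state, and overhead are NOT counted.
    """
    total_bytes = n_params * bytes_per_param  # the weights themselves
    if training:
        # one gradient value per parameter, same size as the weights
        total_bytes += n_params * bytes_per_param
    return total_bytes / 1e9


n_params = 65_000_000_000  # 65b model, as in the thread title

inference_gb = estimate_memory_gb(n_params)
training_gb = estimate_memory_gb(n_params, training=True)

print(f"weights only:        {inference_gb:.0f} GB")
print(f"weights + gradients: {training_gb:.0f} GB")
```

Under these assumptions the gradients alone double the footprint. Optimizers that keep extra per-parameter state (Adam, for example, keeps two values per parameter) push it higher still.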