Submitted by ahiddenmessi2 t3_11dzfvf in MachineLearning
KingsmanVince t1_jaboa8m wrote
Knowing the architecture isn't enough. How large is your training dataset? Do you use gradient accumulation?
ahiddenmessi2 OP t1_jacin2n wrote
My dataset size can vary because the data can be generated. I will also consider using gradient accumulation to improve performance. Thanks.
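For anyone reading along, here is a rough sketch of what gradient accumulation does, in plain Python rather than any particular framework. The toy 1-D linear model, the data, and the names `grad_mse` and `accumulated_grad` are all made up for illustration and are not from the thread. It assumes equal-sized micro-batches and a mean-reduced loss, in which case the accumulated gradient should match the full-batch gradient.

```python
# Minimal sketch: gradient accumulation for a toy model y = w * x
# trained with mean squared error. All names and data are illustrative.

def grad_mse(w, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, data, micro_batch_size):
    """Process the data one micro-batch at a time and accumulate gradients."""
    # Assumption of this sketch: the data splits into equal-sized micro-batches.
    assert len(data) % micro_batch_size == 0
    num_steps = len(data) // micro_batch_size
    total = 0.0
    for i in range(num_steps):
        micro = data[i * micro_batch_size:(i + 1) * micro_batch_size]
        # Scale each micro-batch gradient by 1/num_steps so the running sum
        # equals the gradient of the mean loss over the whole batch.
        total += grad_mse(w, micro) / num_steps
    return total

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

full = grad_mse(w, data)                               # one large batch
accum = accumulated_grad(w, data, micro_batch_size=2)  # two micro-batches

# A single optimizer step would then use the accumulated gradient.
learning_rate = 0.01
w_next = w - learning_rate * accum
```

The point is that only one micro-batch has to be held in memory at a time, while the weight update behaves like one taken on the larger batch. In a real framework the same idea is usually written as several backward passes followed by a single optimizer step.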