Submitted by SimonJDPrince t3_10jlq1q in MachineLearning
NeoKov t1_j5wmjkr wrote
As a novice, I’m not understanding why the test loss continues to increase— in general, but also in Fig. 8.2b, if anyone can explain… The model continues to update and (over)fit throughout testing? I thought it was static after training. And the testing batch is always the same size as the training batch? And they don’t occur simultaneously, right? So the test plot is only generated after the training plot.
SimonJDPrince OP t1_j5yc4n2 wrote
You are correct -- they don't usually occur simultaneously. Usually, you would train and then test afterwards, but I've shown the test performance as a function of the number of training iterations, just so you can see what happens with generalization.
(Sometimes people do examine curves like this using validation data, though, so they can see when the best time to stop training is.)
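That kind of early stopping might look something like the sketch below (illustrative only, not from the book; the simulated validation losses stand in for a real validation set):

```python
# Early-stopping sketch: track validation loss and stop once it
# hasn't improved for `patience` consecutive epochs.
# Simulated losses: improve at first, then drift up (overfitting).
val_losses = [1.0 / (t + 1) + 0.01 * max(0, t - 10) for t in range(50)]

best_val, best_epoch, patience, bad = float("inf"), 0, 5, 0
for epoch, val in enumerate(val_losses):
    if val < best_val:
        best_val, best_epoch, bad = val, epoch, 0  # new best: reset counter
    else:
        bad += 1
        if bad >= patience:  # no improvement for `patience` epochs
            break

print(f"Stopped at epoch {epoch}; best model was at epoch {best_epoch}")
```

In practice you would also checkpoint the model at each new best epoch, so you can restore the weights from before the validation loss started climbing.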
The test loss goes back up because the model classifies some of the test examples wrong. With more training iterations, it becomes more certain about its answers (e.g., it pushes the likelihood of its chosen class from 0.9 to 0.99 to 0.999, etc.). For the training data, where everything is classified correctly, that makes the true class more likely and decreases the loss. For the cases in the test data that are classified wrong, it makes the true class less likely, and so the loss starts to go back up.
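A minimal numeric sketch of that effect (not from the book; just cross-entropy on a single example as confidence grows):

```python
import numpy as np

# Cross-entropy loss for one example:
# -log(probability the model assigns to the true class).
def cross_entropy(p_true):
    return -np.log(p_true)

# As training pushes the chosen class's probability toward 1...
for p in [0.9, 0.99, 0.999]:
    # Correct prediction: the true class gets probability p, so loss -> 0.
    # Wrong prediction: the true class gets the leftover 1 - p, so loss grows.
    print(f"p(chosen)={p}:  correct loss={cross_entropy(p):.4f},  "
          f"wrong loss={cross_entropy(1 - p):.4f}")
```

Averaged over a test set containing a few such wrong examples, the growing losses on those examples eventually dominate, so the mean test loss rises even if the accuracy doesn't change.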
Hope this helps. I will try to clarify in the book. It's always helpful to learn where people are getting confused.
NeoKov t1_j5zvrkz wrote
I see, thanks! This seems like a great resource. Thank you for making it available. I’ll post any further questions here, unless GitHub is the preference.
SimonJDPrince OP t1_j648ce9 wrote
GitHub or e-mail are better. I'm only occasionally on Reddit.