[D] Classification with final layer having no activation? Submitted by AbIgnorantesBurros t3_y0y3q6 on October 11, 2022 at 3:11 AM in MachineLearning
Seankala t1_iruw37k wrote on October 11, 2022 at 5:40 AM I can't speak for Keras, but PyTorch's cross-entropy loss (nn.CrossEntropyLoss) computes the softmax inside the loss function itself. You therefore feed raw, unnormalized logits into the loss, and the final layer needs no activation.
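To make the point concrete, here is a rough sketch in plain Python (standard library only, not PyTorch itself) of a cross-entropy loss with the softmax folded in. The function names and the example logits are made up for illustration. It is meant to mirror the log-softmax plus negative log-likelihood combination that nn.CrossEntropyLoss is documented to perform, not to reproduce its actual implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]


def softmax(logits):
    """Softmax probabilities, derived from the stable log-softmax."""
    return [math.exp(v) for v in log_softmax(logits)]


def cross_entropy_from_logits(logits, target):
    """Negative log-likelihood of the target class; softmax happens in here."""
    return -log_softmax(logits)[target]


# Hypothetical raw outputs of a final layer with no activation.
logits = [2.0, 0.5, -1.0]
target = 0

# Intended usage: pass the logits straight to the loss.
loss = cross_entropy_from_logits(logits, target)

# Mistake to avoid: softmax in the model AND again inside the loss.
double_softmax_loss = cross_entropy_from_logits(softmax(logits), target)

print(loss, double_softmax_loss)
```

If the model already ends in a softmax, the loss applies it a second time. The probabilities get squashed toward uniform, so the loss comes out larger and the gradients weaker than intended. That is why the final layer is left without an activation when the loss expects logits.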