
Seankala t1_iruw37k wrote

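Can't speak for Keras, but in PyTorch's implementation of cross entropy loss the softmax is computed inside the loss function itself (`nn.CrossEntropyLoss` combines log-softmax and negative log-likelihood). So you'd feed the raw, unnormalized logits into the loss function rather than applying softmax yourself first.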
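To make that concrete, here's a rough pure-Python sketch of the computation a loss like this does for a single example. It's not PyTorch's actual code; the function names (`log_softmax`, `cross_entropy_from_logits`, `softmax`) and the example logits are made up for illustration. It also shows what goes wrong if you apply softmax yourself before the loss: it effectively gets applied twice.

```python
import math

def log_softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]

def cross_entropy_from_logits(logits, target):
    # The softmax (in log space) happens here, inside the loss,
    # followed by the negative log-likelihood of the target class.
    return -log_softmax(logits)[target]

def softmax(logits):
    return [math.exp(v) for v in log_softmax(logits)]

logits = [2.0, 1.0, 0.1]  # raw, unscaled model outputs (illustrative values)
target = 0                # index of the true class

# Intended usage: pass the logits straight to the loss.
correct = cross_entropy_from_logits(logits, target)

# Mistake: normalizing first means softmax is applied twice.
double = cross_entropy_from_logits(softmax(logits), target)

print(f"logits in:        {correct:.4f}")
print(f"probabilities in: {double:.4f}")
```

The second number comes out larger because the already-normalized probabilities get squashed again, which flattens the distribution and tends to weaken the gradient signal during training.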
