[D] Classification with final layer having no activation?
Submitted by AbIgnorantesBurros t3_y0y3q6 on October 11, 2022 at 3:11 AM in MachineLearning 7 comments 6
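rx303 t1_irvia6u wrote on October 11, 2022 at 10:57 AM
Summing log-probs is numerically more stable than multiplying probs: a product of many small probabilities underflows to zero in floating point, while the sum of their logs stays in a representable range.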
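A rough sketch of the point above, not anything from the thread itself: the probability values and the helper names `product_of_probs` and `sum_of_log_probs` are made up for illustration, and only the Python standard library is assumed.

```python
import math


def product_of_probs(probs):
    """Multiply probabilities directly (prone to underflow)."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def sum_of_log_probs(probs):
    """Add log-probabilities instead of multiplying probabilities."""
    return sum(math.log(p) for p in probs)


# Hypothetical data: 200 independent events, each with probability 0.01.
probs = [0.01] * 200

# The true product is 1e-400, which is smaller than the smallest positive
# float64 value, so the running product collapses to exactly 0.0.
direct = product_of_probs(probs)

# The same quantity in log space is roughly 200 * log(0.01), about -921,
# which a float64 represents without any trouble.
log_total = sum_of_log_probs(probs)

print(direct)
print(log_total)
```

Once the direct product hits 0.0 the information is gone (taking its log afterwards fails), whereas the log-space total can still be compared against other totals.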