Comments


elbiot t1_ivdko3y wrote

Maybe your learning rate is way too high. Is this TensorFlow or something? Or are you writing this from scratch?
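As a rough illustration of what an oversized learning rate does (a minimal NumPy sketch on a toy quadratic loss, not the OP's actual setup):

```python
import numpy as np

# Toy 1-D quadratic loss: L(w) = w**2, gradient = 2*w.
w0 = 5.0
for lr in (0.1, 1.5):  # 1.5 is "too high" for this loss
    w = w0
    for _ in range(10):
        grad = 2 * w
        w -= lr * grad  # gradient descent step
    print(f"lr={lr}: w after 10 steps = {w:.3g}")
# lr=0.1 converges toward 0; lr=1.5 diverges because |1 - 2*lr| > 1.
```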

2

JabbaTheWhat01 t1_iveidng wrote

Vanishing gradient problem?

1

Emotional-Fox-4285 OP t1_ivepyvf wrote

I thought vanishing gradients happen when you use sigmoid in every layer... Here I use ReLU in both layers
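For what it's worth, here is a minimal sketch (plain NumPy, made-up input values) of why stacked sigmoid layers shrink gradients while ReLU mostly passes them through:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-4, 4, 9)

# Sigmoid derivative s*(1-s) is at most 0.25, so each sigmoid layer
# multiplies the backpropagated gradient by <= 0.25 and it shrinks fast.
s = sigmoid(x)
print("sigmoid grad:", np.round(s * (1 - s), 3))

# ReLU derivative is 1 for positive inputs and 0 otherwise -- it does not
# shrink gradients, though "dead" units (all-negative inputs) pass 0.
print("relu grad:   ", (x > 0).astype(float))
```

With only two ReLU layers, vanishing gradients are indeed unlikely to be the culprit.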

1

HowdThatGoIn t1_ivi1zzn wrote

I can’t say for certain without the code, but it looks like the loss is being applied to every hidden unit (as a scalar) rather than being distributed based on each unit's contribution to the loss (as a vector). Check the shape of your loss as it moves through each layer? See the sketch after the edit below.

Edit: also, are you applying the total loss or the mean loss? It should be the latter.
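A rough way to sanity-check both points (a sketch assuming a NumPy-style backward pass; `y_pred`, `y_true`, and the batch size are placeholders, not the OP's code):

```python
import numpy as np

batch_size, n_out = 32, 10
y_pred = np.random.rand(batch_size, n_out)                        # placeholder predictions
y_true = np.eye(n_out)[np.random.randint(0, n_out, batch_size)]   # placeholder one-hot targets

# Mean squared error, averaged over the batch (use the mean, not the sum,
# or the effective learning rate scales with batch size).
loss = np.mean((y_pred - y_true) ** 2)

# The gradient w.r.t. y_pred should keep the per-example, per-unit shape,
# not collapse to a scalar that gets broadcast to every hidden unit.
dloss = 2 * (y_pred - y_true) / y_pred.size
print(loss, dloss.shape)  # dloss.shape should be (32, 10), not ()
```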

1