Submitted by Thijs-vW t3_yta05n in deeplearning
jobeta t1_iw701lx wrote
Reply to comment by RichardBJ1 in Update an already trained neural network on new data by Thijs-vW
Why freeze bottom and top layers?
RichardBJ1 t1_iw71rpv wrote
Good question; I do not have a source for that, I have just heard colleagues say it. Obviously the reason for freezing layers is that we are trying to avoid losing all the information we have already gained, and it should speed up further training by reducing the number of trainable parameters. As to WHICH layers are best preserved, I don't know; when I have read on it, people typically say "it depends". But actually my point was that I have never found transfer learning to be terribly effective (apart from years ago when I ran a specific transfer-learning tutorial!). My models only take a few days to train from scratch, so that is what I do. Transfer learning obviously makes enormous sense if you are working with someone else's extravagantly trained model and maybe don't even have the data. But in my case I always do have all the data…
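For what it's worth, the freezing itself is just a per-layer flag in Keras. A minimal sketch of the idea, assuming a saved Keras model (the file name, optimizer settings, and the choice of which layers to freeze are all placeholders, not anything from the thread):

```python
# Sketch only: freeze most layers of a previously trained Keras model so their
# weights are preserved and the trainable-parameter count drops.
import tensorflow as tf

model = tf.keras.models.load_model("trained_model.keras")  # assumed path

# Freeze everything except the last two layers (an arbitrary example split).
for layer in model.layers[:-2]:
    layer.trainable = False

# Recompile after changing trainable flags, then continue training on new data.
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()  # shows trainable vs non-trainable parameter counts
# model.fit(X_new, y_new, epochs=5)  # X_new / y_new are your new data
```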
jobeta t1_iw7228e wrote
It seems intuitive that, where possible, fully retraining will yield the best results, but it can be costly. I just find it surprising to arbitrarily freeze two layers. What if your model only has two layers anyway? Again, I don't have experience here, so I'm just guessing.
RichardBJ1 t1_iw733qt wrote
Yes… obviously freezing the only two layers would be asinine! There is a Keras blog post on it, but it doesn't explain why particular layers are chosen (TL;DR), and it certainly doesn't say top and bottom. I agree it would be nice to have a method for choosing which layers to freeze rather than picking them arbitrarily. I guess visualising layer outputs might help with the choice in a small model, but I've never tried that. So I do have experience of trying transfer learning, but (apart from tutorials) no experience of success with transfer learning!
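If anyone wants to try the "visualise layer outputs" route, here is a rough sketch of one way to peek at intermediate activations in Keras. It assumes a built Functional-style model; the saved-model path and the random stand-in batch are placeholders, not anything from this thread:

```python
# Hypothetical sketch: inspect each layer's activations on a small batch to
# get a feel for where representations change, before deciding what to freeze.
import tensorflow as tf

model = tf.keras.models.load_model("trained_model.keras")  # assumed path

# Build a probe model that exposes every layer's output.
probe = tf.keras.Model(inputs=model.input,
                       outputs=[layer.output for layer in model.layers])

# Random placeholder inputs; substitute a few real samples in practice.
X_sample = tf.random.normal((4,) + model.input_shape[1:])
activations = probe(X_sample)

for layer, act in zip(model.layers, activations):
    print(layer.name, act.shape)
```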
ContributionWild5778 t1_iw97xid wrote
I believe it is an iterative process when doing transfer learning. First you will usually freeze the early layers, because that is where low-level feature extraction happens (extracting lines and contours), then unfreeze the last layers, where high-level features are extracted, and train only those. At the same time, it also depends on how different the new dataset you are training on is: if it shares similar characteristics/features with the original data, freezing the early layers would be my choice. A sketch of that pattern is below.
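A rough sketch of that iterative pattern in Keras; the split point, learning rate, and model path are assumptions for illustration, not a recipe from this thread:

```python
# Sketch only: keep the early (low-level) layers frozen, fine-tune the later
# (high-level) layers, and unfreeze more if the new dataset is very different.
import tensorflow as tf

model = tf.keras.models.load_model("trained_model.keras")  # assumed path

n_frozen = len(model.layers) // 2              # arbitrary split for the example
for layer in model.layers[:n_frozen]:
    layer.trainable = False                    # preserve low-level features
for layer in model.layers[n_frozen:]:
    layer.trainable = True                     # retrain high-level features

# A small learning rate helps avoid wiping out what was already learned.
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_new, y_new, epochs=10)
# If results are poor and the new data is very different, repeat with a
# smaller n_frozen (i.e. unfreeze more of the network).
```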