
RichardBJ1 t1_iw71rpv wrote

Good question; I do not have a source for that, I have just heard colleagues saying it. Obviously the reason for freezing layers is that we are trying to avoid losing all the information we have already gained. It should also speed up further training by reducing the number of trainable parameters. As to WHICH layers are best preserved, I don’t know; when I have read on it, people typically say “it depends”. But actually my point was that I have never found transfer learning to be terribly effective (apart from years ago when I ran a specific transfer learning tutorial!). My models only take a few days to train from scratch, so that is what I do. Transfer learning obviously makes enormous sense if you are working with someone else’s extravagantly trained model and maybe don’t even have the data. But in my case I always do have all the data…
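
Just to illustrate what I mean by freezing (a minimal Keras sketch; the model file name and the choice to leave only the last two layers trainable are made up for the example):

```python
import tensorflow as tf

# hypothetical pre-trained model loaded from disk (file name is illustrative)
base_model = tf.keras.models.load_model("my_pretrained_model.h5")

# freeze every layer except the last two, so the weights (and the
# information already learned) in the earlier layers are kept as-is
for layer in base_model.layers[:-2]:
    layer.trainable = False

# recompile so the change in trainable parameters takes effect
base_model.compile(optimizer="adam", loss="mse")
base_model.summary()  # trainable-parameter count is now much smaller
```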

1

jobeta t1_iw7228e wrote

It seems intuitive that, if possible, fully retraining will yield the best results, but it can be costly. I just find it surprising to freeze two layers arbitrarily. What if your model only has two layers anyway? Again, I don’t have experience, so I’m just guessing.

2

RichardBJ1 t1_iw733qt wrote

Yes… obviously freezing the only two layers would be asinine! There is a Keras blog post on it; I don’t recall it explaining why particular layers are chosen (TL;DR), but it doesn’t say top and bottom, that’s for sure. …I agree it would be nice to have a method for choosing which layers to freeze rather than doing it arbitrarily. I guess visualising layer outputs might help with the choice in a small model, but I’ve never tried that. So I do have experience of trying transfer learning, but (apart from tutorials) no experience of success with it!
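
For what it’s worth, the kind of layer-output inspection I mean would look roughly like this (toy model, made-up layer names, not something I’ve actually used to pick layers):

```python
import tensorflow as tf

# toy stand-in for a trained model (architecture and layer names are made up)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu", name="conv_a"),
    tf.keras.layers.Conv2D(16, 3, activation="relu", name="conv_b"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, name="head"),
])

# sub-model exposing one intermediate layer's output, so its activations
# can be plotted/inspected when deciding which layers look worth preserving
probe = tf.keras.Model(inputs=model.input,
                       outputs=model.get_layer("conv_a").output)

activations = probe.predict(tf.random.normal((4, 64, 64, 1)))
print(activations.shape)  # expected: (4, 62, 62, 8)
```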

1

ContributionWild5778 t1_iw97xid wrote

I believe it is an iterative process when doing transfer learning. First you would usually freeze the early layers, because that is where low-level feature extraction happens (extracting lines and contours), then unfreeze the last layers and train only those, where the high-level features are extracted. At the same time, it also depends on how different the new dataset is from the one the model was originally trained on. If it has similar characteristics/features, freezing the early layers would be my choice.
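
Something like this two-phase pattern is what I have in mind (just a sketch; the backbone, head size, learning rates and the number of layers unfrozen in phase 2 are arbitrary examples):

```python
import tensorflow as tf

# hypothetical pre-trained backbone plus a new head for the new dataset;
# MobileNetV2 here is just an example of "someone else's trained model"
backbone = tf.keras.applications.MobileNetV2(include_top=False,
                                             input_shape=(160, 160, 3),
                                             pooling="avg",
                                             weights="imagenet")

# phase 1: freeze the whole backbone (early, low-level layers included)
# and train only a freshly added classification head
backbone.trainable = False
model = tf.keras.Sequential([backbone,
                             tf.keras.layers.Dense(5, activation="softmax")])  # 5 = example class count
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy")
# model.fit(new_train_ds, epochs=5)  # new_train_ds: your new dataset (not defined here)

# phase 2: unfreeze only the last chunk of the backbone and fine-tune it
# with a lower learning rate, keeping the early layers frozen
backbone.trainable = True
for layer in backbone.layers[:-20]:
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy")
# model.fit(new_train_ds, epochs=5)
```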

1