Submitted by MazenAmria t3_zhvwvl in deeplearning
sqweeeeeeeeeeeeeeeps t1_izspv5o wrote
Reply to comment by MazenAmria in Advices for Deep Learning Research on SWIN Transformer and Knowledge Distillation by MazenAmria
Showing that you can create a smaller model with the same performance means SWIN is overparameterized for that particular task. Evaluate it on several datasets of varying complexity, not just a single one.
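A rough sketch of how that comparison could be tabulated, assuming you already have top-1 accuracy for the teacher (SWIN) and the distilled student on each dataset. The function names, the dict layout, and the 0.01 tolerance are all made up for illustration; nothing here comes from a specific library.

```python
from typing import Dict, List


def accuracy_gaps(teacher_acc: Dict[str, float],
                  student_acc: Dict[str, float]) -> Dict[str, float]:
    """Return teacher-minus-student accuracy for each dataset both models were scored on."""
    # Only compare datasets that appear in both result sets.
    shared = teacher_acc.keys() & student_acc.keys()
    return {name: teacher_acc[name] - student_acc[name] for name in sorted(shared)}


def datasets_where_student_matches(gaps: Dict[str, float],
                                   tolerance: float = 0.01) -> List[str]:
    """List datasets where the student is within `tolerance` of the teacher.

    The default tolerance is an arbitrary placeholder, not a recommended value.
    """
    return [name for name, gap in gaps.items() if gap <= tolerance]
```

If the student only matches on the simpler datasets and the gap widens on harder ones, that would suggest the extra capacity is being used there, rather than SWIN being overparameterized in general.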