Submitted by SAbdusSamad t3_10siibd in MachineLearning
JustOneAvailableName t1_j71yj42 wrote
Reply to comment by new_name_who_dis_ in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
Understanding the "what" is extremely easy and rather useless; to understand a paper, you need to understand some level of the "why". If you have time to go in depth, aim to understand the "what not" and the "why not".
So I would argue at least some basic knowledge of CNNs is required.
SAbdusSamad OP t1_j71z0zp wrote
Well, I do have an idea about CNNs, and I have limited knowledge of RNNs. But I'm not familiar with the "Attention Is All You Need" paper.
Erosis t1_j72rzdl wrote
You'll probably be fine learning transformers directly, but a better understanding of RNNs might make some of the NLP tutorials and papers that use transformers easier to follow.
Attention is a very important component of transformers, but attention can be applied to RNNs, too.
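For concreteness, here is a minimal sketch of the scaled dot-product attention that the "Attention Is All You Need" paper builds on. This is my own illustrative PyTorch snippet, not code from the paper; the shapes and names are assumptions:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.

    q, k, v: tensors of shape (batch, seq_len, d_k).
    """
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # (batch, seq_q, seq_k)
    weights = F.softmax(scores, dim=-1)            # each query attends over all keys
    return weights @ v                             # weighted sum of values

# Toy self-attention: queries, keys, and values come from the same sequence.
x = torch.randn(2, 5, 64)
out = scaled_dot_product_attention(x, x, x)  # shape: (2, 5, 64)
```

The same weighted-sum idea is what earlier attention-augmented RNNs (e.g., Bahdanau-style attention) used; transformers just drop the recurrence and keep the attention.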
SAbdusSamad OP t1_j759v4v wrote
I agree that having a background in RNNs and attention with RNNs can make the learning process for transformers, and by extension ViT, much easier.
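As a rough illustration of "by extension ViT": the only ViT-specific step is turning an image into a sequence of patch tokens, after which the standard transformer encoder applies unchanged. A hedged sketch follows; the hyperparameters mirror the common ViT-Base setup, and the class name and defaults are my own assumptions:

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """Split an image into fixed-size patches and project each patch to an
    embedding, producing the token sequence a transformer consumes."""

    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided conv is equivalent to slicing out non-overlapping
        # patches and applying a shared linear projection to each.
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                    # x: (batch, 3, 224, 224)
        x = self.proj(x)                     # (batch, 768, 14, 14)
        return x.flatten(2).transpose(1, 2)  # (batch, 196, 768) token sequence

tokens = PatchEmbedding()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 196, 768])
```

Once an image is a sequence of 196 tokens, the attention machinery above applies exactly as it does for words, which is why the transformer background transfers so directly to ViT.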