Submitted by AutoModerator t3_110j0cp in MachineLearning
activatedgeek t1_j9lobdv wrote
Reply to comment by theidiotrocketeer in [D] Simple Questions Thread by AutoModerator
It is no longer uncommon to model an image as a sequence of patch tokens and feed that sequence to a transformer-based model. So not psychotic at all.
See "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" (Dosovitskiy et al.), the Vision Transformer (ViT) paper.
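To make the "image as a sequence of tokens" idea concrete, here is a rough sketch of the patching step in plain Python. The function name `patchify`, the nested-list image layout (height x width x channels), and the 224x224 RGB input are my own assumptions for illustration, not code from the paper.

```python
def patchify(image, patch_size=16):
    """Split an H x W x C image (nested lists) into a list of flat patch vectors.

    Hypothetical helper for illustration; patches are read left to right,
    top to bottom, and each patch is flattened into one vector.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")

    tokens = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    patch.extend(pixel)  # append all channel values of this pixel
            tokens.append(patch)
    return tokens


# Assumed example input: a blank 224x224 RGB image.
image = [[[0.0, 0.0, 0.0] for _ in range(224)] for _ in range(224)]
tokens = patchify(image)
print(len(tokens), len(tokens[0]))  # 196 768
```

So a 224x224 RGB image becomes 196 "tokens" of 768 numbers each (14x14 patches, 16*16*3 values per patch). In ViT, each flat patch is then passed through a learned linear projection and combined with a position embedding before going into the transformer; that part is left out of this sketch.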