Raphaelll_ t1_j45u38j wrote
Reply to comment by PassingTumbleweed in [R] Is there any research on allowing Transformers to spend more compute on more difficult to predict tokens? by Chemont
Did this ever get any traction?
PassingTumbleweed t1_j46sco1 wrote
That depends on what you mean. I don't think any of the LLMs use it, but it has some citations and follow-up literature.