
bartvanh t1_jdyd6om wrote

Ugh, yes it's so frustrating to see people not realizing this bit all the time. And also kind of painful to imagine that (presumably - correct me if I'm wrong) all those internal "thoughts" are probably discarded after each word, only to be painstakingly reconstructed almost identically for predicting the next word.

3
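For anyone following along, here is a rough sketch of what the parent comment describes (the helpers `model_forward` and `sample` are hypothetical placeholders, not any real library's API): in plain autoregressive decoding, the only state carried from one step to the next is the token sequence itself, so whatever internal representations produced a token have to be rebuilt from the tokens on the next pass (KV caching reuses per-token activations, but the model's only persistent record of its own "reasoning" is still the tokens it emitted).

```python
# Minimal conceptual sketch of autoregressive decoding without caching.
# `model_forward(tokens)` is assumed to return next-token logits for the
# whole sequence; `sample(logits)` is assumed to pick a token id.

def generate(model_forward, sample, prompt_tokens, max_new_tokens):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # All internal activations are recomputed from `tokens` alone;
        # the only thing carried between iterations is the token list.
        logits = model_forward(tokens)
        tokens.append(sample(logits))
    return tokens
```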

was_der_Fall_ist t1_je3ng6m wrote

Maybe that’s part of the benefit of looped internal monologue/action systems. By having the model iteratively store its thoughts and actions in the context window, it no longer has to use the weights of the neural network to “re-think” every prior thought each time it predicts a token. It could think more effectively by spending that computation on other operations that take the stored thoughts and actions as their starting point.

1
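A sketch of the kind of looped monologue setup described above, assuming a hypothetical `llm_complete(prompt) -> str` call rather than any particular library's API: each round's thought is written back into the context, so later steps can read it as text instead of re-deriving it inside the network.

```python
# Hypothetical scratchpad loop: intermediate thoughts are appended to the
# context window so subsequent steps condition on them directly.

def answer_with_scratchpad(llm_complete, question, max_steps=5):
    context = f"Question: {question}\n"
    for step in range(max_steps):
        thought = llm_complete(context + f"Thought {step + 1}:")
        context += f"Thought {step + 1}: {thought}\n"
        if "FINAL ANSWER:" in thought:
            return thought.split("FINAL ANSWER:", 1)[1].strip()
    # Fall back to asking for an answer conditioned on the accumulated thoughts.
    return llm_complete(context + "FINAL ANSWER:")
```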