Submitted by Cool_Abbreviations_9 t3_123b66w in MachineLearning
ntaylor- t1_je11vt1 wrote
Reply to comment by was_der_Fall_ist in [D] GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
Fairly sure the "final" GPT-4 model is still using a generation function that predicts one token at a time. It's just that the training was good and complicated, via RLHF. After training it isn't doing any "complicated operations".
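Roughly, "one token at a time" means an autoregressive loop like the sketch below. This is a minimal illustration, not OpenAI's actual code; `model` is a hypothetical stand-in for the network's forward pass, returning one score (logit) per vocabulary token.

```python
import numpy as np

def generate(model, prompt_tokens, max_new_tokens=20):
    """Greedy autoregressive decoding: predict one token, append, repeat."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = model(tokens)               # hypothetical forward pass: scores over the vocabulary
        next_token = int(np.argmax(logits))  # greedy choice; real systems usually sample instead
        tokens.append(next_token)            # the new token becomes part of the next input
    return tokens
```

RLHF changes which continuations the model scores highly, but not this loop.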
was_der_Fall_ist t1_je15397 wrote
You don’t think the neural network, going through hundreds of billions of parameters each time it calculates the next token, is doing anything complicated?
ntaylor- t1_je5qtl2 wrote
Nope. It's the same as all neural networks that use the transformer architecture: at the end of the day, just a big old series of matrix multiplications with some non-linear transformations.
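To make "matrix multiplications with some non-linear transformations" concrete, here's one simplified transformer block in numpy. It's a sketch under assumptions: a single attention head, no layer norms or biases, and the weight matrices (`W_q`, `W_k`, etc.) are illustrative placeholders, not anyone's trained parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for numerical stability
    return e / e.sum(axis=axis, keepdims=True)

def transformer_block(x, W_q, W_k, W_v, W_o, W_1, W_2):
    """One simplified block: matmuls plus two nonlinearities (softmax, ReLU)."""
    # Self-attention: matmuls form queries/keys/values, then a softmax, then a matmul with values.
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1])) @ v
    x = x + attn @ W_o                      # residual connection
    # Feed-forward: two matmuls with a ReLU in between.
    x = x + np.maximum(0.0, x @ W_1) @ W_2  # residual connection
    return x
```

Stacking many of these blocks gives the bulk of a GPT-style network.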
was_der_Fall_ist t1_je6lfl9 wrote
Why are matrix multiplications mutually exclusive with complicated operations?
A computer just runs through a big series of 0s and 1s, yet through layers of abstraction those bits accomplish amazing things, far more complicated than one would naively think 0s and 1s could represent and do. Why not the same for a massive neural network trained via gradient descent, by means of matrix multiplication, to optimize a goal?
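In the same spirit, the training procedure itself is just repeated multiply-and-add. A toy gradient-descent loop (made-up data, linear model; nothing GPT-specific) shows simple arithmetic steadily optimizing an objective:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # toy inputs
true_w = np.array([2.0, -1.0, 0.5])  # the parameters we hope to recover
y = X @ true_w                       # toy targets

w = np.zeros(3)                      # start from an uninformed guess
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(X)  # gradient of mean squared error
    w -= 0.1 * grad                        # one small step downhill
print(w)  # ends up close to true_w
```

No single step is clever; the complexity emerges from composing many simple steps.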