Not-Banksy OP t1_je8i7uw wrote

Gotcha, so training is still by and large a human-driven process?

2

Mortal-Region t1_je8jw22 wrote

Typically, humans provide the training data, then a program performs the actual training by looping through the data.
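To make "looping through the data" concrete, here is a minimal sketch, assuming a toy one-parameter-pair linear model and made-up data (the names `training_data` and `train` are hypothetical, not from any particular library). Real systems use far larger models and datasets, but the shape of the loop is similar.

```python
# Hypothetical toy data supplied by "humans": (input, target) pairs following y = 2x + 1.
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

def train(data, epochs=2000, lr=0.05):
    """Fit y = w*x + b by repeatedly looping over the examples (plain SGD)."""
    w, b = 0.0, 0.0                  # parameters start untrained
    for _ in range(epochs):          # pass over the whole dataset many times
        for x, y in data:            # visit each human-provided example
            error = (w * x + b) - y  # prediction minus target
            w -= lr * error * x      # gradient step on 0.5 * error**2
            b -= lr * error
    return w, b

w, b = train(training_data)
print(f"learned w={w:.3f}, b={b:.3f}")
```

The humans only supplied the pairs; the loop did the rest.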

EDIT: One exception would be a game-playing AI that learns via self-play. Rather than humans supplying it training data in the form of games played by experts, the training data consists of the games the AI has played against itself.
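As a rough illustration of the self-play idea, the sketch below has one policy play both sides of a toy Nim-style game (take 1 to 3 stones, whoever takes the last stone wins) and records the games as training data. The game, the random policy, and all names here are assumptions for illustration; an actual game-playing AI would also use these records to update its policy, which is omitted.

```python
import random

def random_policy(pile, rng):
    """Placeholder policy: take 1-3 stones at random (never more than remain)."""
    return rng.randint(1, min(3, pile))

def self_play_game(policy, pile=10, rng=random):
    """Play one game with the same policy on both sides and label each move."""
    history = []                      # (player, pile_before_move, stones_taken)
    player = 0
    while pile > 0:
        move = policy(pile, rng)
        history.append((player, pile, move))
        pile -= move
        if pile == 0:
            winner = player           # taking the last stone wins
        player = 1 - player
    # Each example: (pile_before_move, stones_taken, did_the_mover_win)
    return [(p, m, who == winner) for who, p, m in history]

rng = random.Random(0)
dataset = [ex for _ in range(100) for ex in self_play_game(random_policy, rng=rng)]
print(len(dataset), "training examples generated without any human games")
```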

9

CollapseKitty t1_je8wa3w wrote

Modern LLMs (large language models), like ChatGPT, use what's called reinforcement learning from human feedback (RLHF): human feedback is used to train a reward model, which is then used to fine-tune the language model.

Basically, we train the reward model on human preference judgments (which looks more like a cat? which sentence is more polite?) so that it learns to score outputs the way people would. This automates the judging and scales it well beyond what human raters could label by hand, which makes it practical for massive models like ChatGPT, hopefully with results close to what the humans initially intended.
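A minimal sketch of the reward-model step only, assuming a made-up one-number "politeness" feature per reply and a pairwise logistic (Bradley-Terry style) objective. The data and the names `preferences`, `reward`, and `train_reward_model` are hypothetical; this is not how any specific production system is implemented, and the later step of training the language model against the reward is left out.

```python
import math

# Hypothetical human comparisons: (features of preferred reply, features of rejected reply).
preferences = [([0.9], [0.2]), ([0.7], [0.1]), ([0.8], [0.4])]

def reward(weights, features):
    """Linear reward: higher means 'humans would probably prefer this'."""
    return sum(w * f for w, f in zip(weights, features))

def train_reward_model(pairs, epochs=200, lr=0.5):
    weights = [0.0] * len(pairs[0][0])
    for _ in range(epochs):
        for chosen, rejected in pairs:
            margin = reward(weights, chosen) - reward(weights, rejected)
            p = 1.0 / (1.0 + math.exp(-margin))  # modelled P(chosen is preferred)
            # Gradient ascent on log p: push the preferred reply's reward up.
            for i in range(len(weights)):
                weights[i] += lr * (1.0 - p) * (chosen[i] - rejected[i])
    return weights

weights = train_reward_model(preferences)
# The trained scorer can now rank replies no human ever labeled.
print(reward(weights, [0.6]) > reward(weights, [0.3]))
```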

2

Thedarkmaster12 t1_je8jaco wrote

Yes, but I believe a company recently trained a model in part on another model. I'm not sure of any statistics, but the gist of it is that training can be done by models. And ideally, that’s how we get ASI and the singularity. Only a super powerful AGI could improve on itself in such a way that would create something better than us.

1

scooby1st t1_jeav18q wrote

Not a chance. ASI would be when a system can conceptualize better ideas and theories, build, train, and test entirely new models, from scratch, better than teams of PhDs. It's not going to happen by brute-forcing the same ideas.

1

ShowerGrapes t1_je9b6sw wrote

for now, but that's likely to change. my guess is ai will eventually be better than humans at figuring out what data is relevant and up-to-date. we'll reach a point where it's not just one neural network, but a bunch running in tandem, with bits of it being re-trained and replaced on the fly without missing much of a beat.

1