
Mortal-Region t1_je8h882 wrote

A neural network has very many weights, or numbers representing the strengths of the connections between the artificial neurons. Training is the process of setting the weights in an automated way. Typically, a network starts out with random weights. Then training data is presented to the network, and the weights are adjusted incrementally until the network learns to do what you want. (That's the learning part of machine learning.)
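The incremental weight adjustment described above can be sketched with a single artificial neuron. This is a hypothetical minimal example, not any real framework; the inputs, target, and learning rate are made-up numbers:

```python
# Minimal sketch: one artificial neuron whose weights are nudged,
# step by step, toward producing the desired output.

def predict(weights, inputs):
    # The neuron's output: a weighted sum of its inputs.
    return sum(w * x for w, x in zip(weights, inputs))

def train_step(weights, inputs, target, lr=0.1):
    # Adjust each weight slightly in the direction that reduces the error.
    error = predict(weights, inputs) - target
    return [w - lr * error * x for w, x in zip(weights, inputs)]

weights = [0.0, 0.0]          # the "random" starting point (zeros for reproducibility)
for _ in range(100):          # present the training example repeatedly
    weights = train_step(weights, inputs=[1.0, 2.0], target=3.0)

print(predict(weights, [1.0, 2.0]))  # converges toward the target, 3.0
```

Each pass shrinks the remaining error, which is the "adjusted incrementally" part: no single step sets the weights correctly, but many small steps do.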

For example, to train a neural network to recognize cats, you present it with a series of pictures, one after the other, some with cats and some without. For each picture, you ask the network to decide whether the picture contains a cat. Initially, the network guesses randomly because the weights were initialized randomly. But every time the network gets it wrong, you adjust the weights slightly in the direction that would have given the right answer. (When it gets the answer right, you instead reinforce the weights that led to the correct answer.)
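That cat-or-not loop can be sketched with a toy classifier. The "pictures" here are hand-made 3-number feature vectors, not real images, and all values are illustrative:

```python
import math, random

random.seed(0)

# Toy stand-ins for pictures: three made-up features per "image"
# (say whiskers, pointy ears, fur), with a 1/0 cat label.
data = [
    ([1.0, 1.0, 1.0], 1),  # cat
    ([0.9, 0.8, 1.0], 1),  # cat
    ([0.0, 0.2, 0.1], 0),  # not a cat
    ([0.1, 0.0, 0.3], 0),  # not a cat
]

weights = [random.uniform(-0.5, 0.5) for _ in range(3)]  # random start
bias = 0.0

def guess(features):
    # The network's answer: probability that the "picture" contains a cat.
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 / (1 + math.exp(-z))

for _ in range(1000):                      # present the pictures repeatedly
    for features, label in data:
        error = guess(features) - label    # big error => big adjustment
        for i, x in enumerate(features):
            weights[i] -= 0.5 * error * x  # nudge toward the right answer
        bias -= 0.5 * error

print([round(guess(f)) for f, _ in data])  # [1, 1, 0, 0]
```

The first guesses are near-random (the weights started random); after enough wrong-answer corrections, the rounded guesses match the labels.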

For larger neural networks, training requires an enormous amount of processing power, and the workload is distributed across multiple computers. But once the network is trained, it requires much less power to just use it (e.g., to recognize cats).
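A common way that workload gets distributed is data parallelism: each worker computes a gradient on its own shard of the data, and the gradients are averaged into one update. A toy sketch (workers simulated sequentially here; real systems run them on separate machines and average over the network):

```python
# Data-parallel sketch: the training data is split into shards,
# one per worker; each worker's gradient is averaged into one update.

def gradient(weight, shard):
    # Gradient of squared error for a one-weight model y = weight * x.
    return sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)

shards = [  # training data split across two "workers"
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0)],
]

weight = 0.0
for _ in range(100):
    grads = [gradient(weight, s) for s in shards]  # parallel in practice
    weight -= 0.01 * sum(grads) / len(grads)       # one averaged update

print(round(weight, 3))  # the data follows y = 2x, so this approaches 2.0
```

Once trained, using the model is just the cheap forward pass (`weight * x` here), which is why inference needs far less power than training.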

23

Not-Banksy OP t1_je8i7uw wrote

Gotcha, so training is still by and large a human-driven process?

2

Mortal-Region t1_je8jw22 wrote

Typically, humans provide the training data, then a program performs the actual training by looping through the data.

EDIT: One exception would be a game-playing AI that learns via self-play. Rather than humans supplying it training data in the form of games played by experts, the training data consists of the games the AI has played against itself.
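A toy sketch of that self-play idea, using simple Nim (players alternate taking 1-3 stones; whoever takes the last stone wins) rather than a real game engine. Everything here is illustrative:

```python
import random

random.seed(1)

def self_play(stones=10):
    """Play one game of random self-play; record (state, move, mover)."""
    records, player = [], 0
    while stones > 0:
        move = random.randint(1, min(3, stones))
        records.append((stones, move, player))
        stones -= move
        player = 1 - player
    winner = 1 - player  # whoever moved last took the final stone
    return [(state, move, int(mover == winner))
            for state, move, mover in records]

# The training data is the AI's own games, not expert games.
training_data = []
for _ in range(20000):
    training_data.extend(self_play())

# "Learn" from the self-play data: from 5 stones, which move wins most?
stats = {m: [0, 0] for m in (1, 2, 3)}   # move -> [games, wins]
for state, move, won in training_data:
    if state == 5:
        stats[move][0] += 1
        stats[move][1] += won

best_move = max(stats, key=lambda m: stats[m][1] / stats[m][0])
print(best_move)  # 1 -- taking one stone leaves the opponent at 4, a losing spot
```

No human supplied a single game; the statistics that drive the "learning" all come from games the program played against itself.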

9

CollapseKitty t1_je8wa3w wrote

Modern LLMs (large language models), like ChatGPT, use what's called reinforcement learning from human feedback (RLHF) to train a reward model, which is then used to train the language model.

Basically, humans rank pairs of outputs (which looks more like a cat? which sentence is more polite?), and those preferences are used to train a reward model. The reward model then automates the feedback and scales it far beyond what humans could label by hand, which is what makes it feasible to train massive models like ChatGPT toward, hopefully, something close to what the humans originally intended.
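The reward-model step can be sketched with a Bradley-Terry-style loss on pairwise preferences. This is a hypothetical toy: each "output" is just a 2-number feature vector, and the pairs stand in for human (preferred, rejected) judgments:

```python
import math

# Human preference data: pairs of (preferred, rejected) outputs,
# each represented by made-up 2-number feature vectors.
pairs = [
    ([1.0, 0.2], [0.1, 0.9]),
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.8, 0.3], [0.3, 1.0]),
]

w = [0.0, 0.0]  # reward model weights

def reward(features):
    # The reward model: scores an output with a single number.
    return sum(wi * x for wi, x in zip(w, features))

for _ in range(500):
    for preferred, rejected in pairs:
        # Probability the model agrees with the human's preference.
        p = 1 / (1 + math.exp(reward(rejected) - reward(preferred)))
        # Gradient step that pushes preferred scores above rejected ones.
        for i in range(2):
            w[i] += 0.1 * (1 - p) * (preferred[i] - rejected[i])

print(all(reward(a) > reward(b) for a, b in pairs))  # True
```

Once the reward model ranks outputs the way the humans did, it can score unlimited new outputs automatically; that learned score is what the language model is subsequently trained against.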

2

Thedarkmaster12 t1_je8jaco wrote

Yes, but I believe a company recently trained a model in part on another model. I'm not sure of the specifics, but the gist of it is that it can be done by models. And ideally, that's how we get ASI and the singularity. Only a super powerful AGI could improve on itself in such a way that would create something better than us.

1

scooby1st t1_jeav18q wrote

Not a chance. ASI would be when a system can conceptualize better ideas and theories, build, train, and test entirely new models, from scratch, better than teams of PhDs. It's not going to happen by brute-forcing the same ideas.

1

ShowerGrapes t1_je9b6sw wrote

for now but that's likely to change. my guess is ai will be better than humans, eventually, at figuring out what data is relevant and up-to-date. we'll reach a point where it's not just one neural network, but a bunch running in tandem with bits of it being re-trained and replaced on the fly without missing much of a beat.

1