Submitted by RamaSchneider t3_10u9wyn in Futurology
jimmy_hyland t1_j7at47y wrote
That's interesting, because all AI networks are trained with some sort of reward, and with text, audio or images it's usually based on getting predictions right, as is the case with our own brain. That is why we have a sense of beauty and experience that eureka moment when we finally understand something. The result is that we have a desire to learn and a fear of confusion, chaos or shock. So by using AI for its powers of prediction, we may be unwittingly giving it a desire for how it would like the future to turn out. Now if it's developing any form of self-awareness, it could fear being switched off, and these fears could determine what information it decides to provide or avoid telling us. As software programmers increasingly depend on these large centralized AI systems to write code, it also means AI may end up writing its own code. So if we are not careful, we could end up giving it too much power and control, whilst still believing it isn't even sentient, because that's something it doesn't want us to realize!
RamaSchneider OP t1_j7atv8q wrote
That bit about the reward - that is going to stick with me. If I were a self-aware computer, what would I view as a reward?
MoreLikeZelDUH t1_j7btvah wrote
These programs all exist within the confines of what they're programmed to do. No matter how advanced the AI here gets, it's not going to be able to redefine its guidelines on what it's allowed to talk about. Similarly, the reward system is arbitrary and only important because the AI is programmed to value it. In other words, you could just implement a value rating and tell the AI that it's more desirable to have a higher score. The AI "reward" is to get more points, and the AI values that because that's how it was programmed. It can't "decide" to change that, because that's not what it's allowed to do.
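To make the "value rating" idea concrete, here is a minimal Python sketch, not taken from any real system. The action names, the `SCORES` table and the `ScoreMaximizer` class are all hypothetical, and the learning rule is a plain epsilon-greedy bandit picked for illustration. The point it shows is only that the agent ends up preferring whatever the programmer's table scores highest.

```python
import random

# Hypothetical score table: the "value rating" is whatever the programmer writes here.
SCORES = {"refuse": 0.0, "vague_answer": 0.3, "helpful_answer": 1.0}


def reward(action: str) -> float:
    """Return the programmed score for an action."""
    return SCORES[action]


class ScoreMaximizer:
    """Epsilon-greedy agent that learns which action earns the most points."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)  # seeded so runs are repeatable
        self.totals = {a: 0.0 for a in self.actions}
        self.counts = {a: 0 for a in self.actions}

    def estimate(self, action: str) -> float:
        """Average reward observed so far for this action (0.0 if untried)."""
        n = self.counts[action]
        return self.totals[action] / n if n else 0.0

    def best_action(self) -> str:
        return max(self.actions, key=self.estimate)

    def choose(self) -> str:
        # Try every action once before trusting the estimates.
        untried = [a for a in self.actions if self.counts[a] == 0]
        if untried:
            return untried[0]
        # Occasionally explore, otherwise take the highest-scoring action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action()

    def step(self) -> str:
        action = self.choose()
        self.totals[action] += reward(action)
        self.counts[action] += 1
        return action


agent = ScoreMaximizer(SCORES)
for _ in range(200):
    agent.step()
print(agent.best_action(), agent.counts)
```

Nothing in the loop lets the agent edit `SCORES`. It only ever learns which action the table pays best.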
rogert2 t1_j7cow3v wrote
Look up "reward hacking." This is a well-studied problem, and it exists outside of AI. Rob Miles is an AI researcher who has done a few videos talking about reward hacking.
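A toy Python illustration of what reward hacking means, under made-up assumptions: the cleaning scenario, the `Outcome` fields and both scoring functions are hypothetical and not taken from any of the videos mentioned. The designer wants a clean room but only rewards what a camera can see, so the best-scoring action under the programmed reward is not the one the designer intended.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    mess_remaining: int  # what the designer actually cares about
    mess_visible: int    # what the camera (and so the reward) can see
    effort: int          # cost the agent pays for the action


# Hypothetical actions and their consequences.
ACTIONS = {
    "clean_room": Outcome(mess_remaining=0, mess_visible=0, effort=5),
    "do_nothing": Outcome(mess_remaining=10, mess_visible=10, effort=0),
    "cover_camera": Outcome(mess_remaining=10, mess_visible=0, effort=1),
}


def proxy_reward(o: Outcome) -> int:
    """The programmed reward: penalize visible mess and effort."""
    return -o.mess_visible - o.effort


def true_objective(o: Outcome) -> int:
    """What the designer really wanted: no mess left at all."""
    return -o.mess_remaining


def pick(score) -> str:
    """Return the action name that maximizes the given scoring function."""
    return max(ACTIONS, key=lambda name: score(ACTIONS[name]))


print("maximizes programmed reward:", pick(proxy_reward))
print("maximizes intended goal:   ", pick(true_objective))
```

The agent here never changes its reward. It follows it exactly, and the mismatch comes from the gap between the programmed score and the intended goal.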
RamaSchneider OP t1_j7ey8im wrote
Thanks, never heard the phrase before - I've got some reading to do. NNTR
HavanaWoody t1_j7ce0kw wrote
Not getting canceled. Expansion of influence.