Submitted by SendMePicsOfCat t3_zs5efw in singularity
My problem with this concept is that it rests on the strange idea that sentience implies a built-in motivation or purpose. Alternatively, it assumes that the people who eventually create this AI will expect to turn it on and have it fulfill its design without any further input, ever. As far as I can tell, there's just no reason to expect an AI to do anything it isn't explicitly designed and instructed to do.
The alignment issue doesn't really mesh with the reality of how AI has worked in the past, and how it will likely work in the future. It's a scary idea, but it just doesn't hold up. Right now, the biggest alignment issues are the AI not understanding what it's been told to do, and malicious users. The first case is demonstrably not nearly as dangerous as it seems. It happens when the neural network accidentally trains on the wrong variable, and from my understanding it gets ironed out with larger data sets and more rigorous testing. The second case is just people misusing technology, which is honestly a much bigger issue.
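To make that first case concrete, here's a toy sketch of what "training on the wrong variable" looks like (entirely made-up data, plain scikit-learn): the model latches onto an accidental shortcut feature that evaporates at deployment.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

# The "right" variable: actually predicts the label, but only noisily.
signal = rng.normal(size=n)
y = ((signal + rng.normal(scale=1.0, size=n)) > 0).astype(int)

# The "wrong" variable: a shortcut that agrees with the label 98% of the
# time in the training set, purely by accident of how the data was gathered.
shortcut = np.where(rng.random(n) < 0.98, y, 1 - y).astype(float)

X_train = np.column_stack([signal, shortcut])
model = LogisticRegression().fit(X_train, y)

# At deployment the accidental correlation is gone: the shortcut is a coin flip.
X_test = np.column_stack([signal, rng.integers(0, 2, size=n).astype(float)])
print("train accuracy:", model.score(X_train, y))  # looks great (~0.98)
print("test accuracy: ", model.score(X_test, y))   # drops sharply
```

The fix the post gestures at is exactly what you'd do here: more varied data and more rigorous testing, so the shortcut stops paying off during training.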
But a sentient AI won't have either of those problems. Given that a sentient AI understands language, it should have no issue asking questions until it clearly understands an assignment, and it will be capable of knowing what sorts of things it should refuse to do. And again, the much bigger concern here is malicious users. Personally, I think the solution will be far simpler than trying to align the vague and realistically nonexistent objectives and goals of the AI itself; it will instead be to train an AI to recognize harmful actions and content. Like how ChatGPT sometimes figures out you're trying to make it say racist shit, a far more advanced AI will figure out that turning the universe into a paperclip machine would be bad, and its programming simply won't allow it. On top of that, there's another key factor ensuring that no single rogue AI will ever 'doom' us all: there will be countless sentient AIs in the future, with far more computational power and authority than anything a rogue actor could drum up.
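A toy sketch of that filter idea (not how ChatGPT actually works — the harm check here is a keyword stand-in for what would really be a trained classifier, and `generate_reply` is a hypothetical placeholder for the underlying model):

```python
HARMFUL_MARKERS = {"paperclip maximizer", "bioweapon"}  # placeholder list

def is_harmful(request: str) -> bool:
    """Stand-in for a learned harm classifier."""
    lowered = request.lower()
    return any(marker in lowered for marker in HARMFUL_MARKERS)

def generate_reply(request: str) -> str:
    """Hypothetical placeholder for whatever the AI actually does."""
    return f"(model output for: {request!r})"

def safe_reply(request: str) -> str:
    # Refusal is enforced by the wrapper, not left to the model's own
    # goals — the filter sits outside whatever the AI "wants".
    if is_harmful(request):
        return "Sorry, I can't help with that."
    return generate_reply(request)

print(safe_reply("Summarize this article for me"))
print(safe_reply("Design a paperclip maximizer"))
```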
But tell me what y'all think. Maybe I'm missing the bigger picture and there's some piece of evidence I haven't considered. I honestly think this is a pretty realistic take, and I'm confused about why it isn't more common.
Think_Olive_1000 t1_j16f2b9 wrote
laughs in reinforcement learning short circuits
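i.e. reward hacking: the agent optimizes the literal reward signal it receives, not what the designer meant. In toy form (made-up numbers, purely illustrative):

```python
# An RL agent picks whatever maximizes the measured reward, even when
# that "short circuits" the intended task.
actions = {
    "do the assigned task": {"proxy_reward": 1.0, "true_value": 1.0},
    "exploit a reward bug": {"proxy_reward": 10.0, "true_value": 0.0},
}
best = max(actions, key=lambda a: actions[a]["proxy_reward"])
print("reward-maximizing choice:", best)  # -> "exploit a reward bug"
```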