
EscapeVelocity83 t1_j0pahok wrote

What is good? Who decides what output is acceptable? If the computer is sentient how is that not violating the computer?

6

eve_of_distraction t1_j0q0pwp wrote

We've been arguing about what is good for thousands of years, but we tend to have an intuition about what isn't good. You know, things that cause humans to suffer and die. Those are things we probably want to steer any hypothetical future superintelligence away from, if we can. It's very unclear whether we can, though. The alignment problem is potentially highly disturbing.

10

archpawn t1_j0r8qwo wrote

> If the computer is sentient how is that not violating the computer?

You're sentient. Do your instincts to enjoy certain things violate your rights? The idea here isn't to force the AI to do the right thing. It's to make the AI want to do the right thing.

> Who decides what output is acceptable?

Ultimately, it has to be the AI. Humans suck at it. We can't exactly teach an AI to solve the trolley problem by training it on examples if we can't even agree on an answer ourselves. And there are bound to be plenty of cases where we do agree but are completely wrong. So we have to figure out how to make the AI work out which output is actually best, as opposed to which output makes the most paperclips, which output its human trainers are most likely to rate as best, or which output scores highest under a model trained for that purpose but operating so far outside its training data that the number is meaningless.
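That last failure mode can be shown with a toy sketch (a hypothetical setup, not anything from the thread): fit a simple proxy "reward model" to scores observed only on a narrow slice of behavior, then query it far outside that slice. The proxy returns a big number, but the number means nothing.

```python
import numpy as np

# Hypothetical toy setup: the "true" reward is bounded (think of human
# approval saturating), but we only ever observe it on a narrow range.
def true_reward(x):
    return np.tanh(x)  # never exceeds 1.0 in magnitude

rng = np.random.default_rng(0)
train_x = rng.uniform(0.0, 1.0, 50)   # training data covers [0, 1] only
train_y = true_reward(train_x)

# Fit a polynomial proxy model to the observed rewards.
proxy = np.polynomial.Polynomial.fit(train_x, train_y, deg=4)

# In-distribution, the proxy tracks the true reward closely.
print(abs(proxy(0.5) - true_reward(0.5)))   # small error

# Far outside the training range, the proxy's score blows up: an
# optimizer maximizing this number is chasing a meaningless value.
print(proxy(10.0), true_reward(10.0))
```

The polynomial here stands in for any learned scoring model; the point is only that extrapolation far from the training data produces confident-looking garbage, which is exactly what a reward-maximizing agent would seek out.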

2