DigThatData t1_j9rzrzd wrote
Reply to comment by royalemate357 in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
if a "sufficiently advanced AI" could achieve "its own goals" that included "humanity going extinct" (at least as a side effect) in such a fashion that humanity did the work of putting itself out of extinction on its own needing only the AGIs encouragement, it would. In other words, the issues I described are indistinguishable from the kinds of bedlam we could reasonably expect an "x-risk AGI" to impose upon us. ipso facto, if part of the alignment discussion is avoiding defining precisely what "AGI" even means and focusing only on potential risk scenarios, the situation we are currently in is one in which it is unclear that a hazardous-to-human-existence AGI doesn't already exist and is already driving us towards our own extinction.
instead of "maximizing paperclips," "it" is just trying to maximize engagement and click-through rate. and just like the paperclips thing, "it" is burning the world down trying to maximize the only metrics it cares about. "it" just isn't a specific agent, it's a broader system that includes a variety of interacting algorithms and platforms forming a kind of ecosystem of meta-organisms. but the nature of the ecosystem doesn't matter for the paperclip maximization parable to apply.
royalemate357 t1_j9s125d wrote
> instead of "maximizing paperclips," "it" is just trying to maximize engagement and click-through rate. and just like the paperclips thing, "it" is burning the world down trying to maximize the only metrics it cares about
Isn't there a difference between the two, though? The latter concerns a human trying to pursue a certain goal (maximize user engagement) and giving the AI that goal. So arguably the latter is "aligned" (for some sense of the word) to the human that's using it to maximize their engagement, in that it's doing what a specific human intends it to do. Whereas the paperclip scenario is more like: a human tells the AI to maximize engagement, yet the AI has a different goal and chooses to pursue that instead.
DigThatData t1_j9s23ds wrote
> Isn't there a difference between the two, though? The latter concerns a human trying to pursue a certain goal (maximize user engagement) and giving the AI that goal.
In the paperclip maximization parable, "maximize paperclips" is a directive assigned to an AGI owned by a paperclip manufacturer; the AGI then concludes that things like "destabilize currencies to make paperclip materials cheaper" and "convert resources necessary for human life into paperclip factories" are good ideas. So no, maximizing engagement at the cost of the stability of human civilization is not "aligned," in exactly the same way that maximizing paperclip production isn't aligned.
royalemate357 t1_j9s2pf3 wrote
Hmm, I didn't realize that was the origin of the paperclip maximizer analogy, but it seems like you're right that some human had to tell it to make paperclips in the first place.