Submitted by atomsinmove t3_10jhn38 in singularity
AsheyDS t1_j5l6v7c wrote
Reply to comment by Baturinsky in Steelmanning AI pessimists. by atomsinmove
Their approach to safety, to put it simply, would be to keep it in an invisible box, watched by an invisible guard that intervenes covertly when needed to keep it within that box should it stray towards the outside.
You're right that AIs and people are going to have to watch out for other people and their AIs. But even if you remove the AI component, the same holds: some people will try to scam you, take advantage of you, use you, or worse. AI makes that quicker and easier, so we'll have to be on the lookout, we'll have to discuss these things, and we'll have to prepare and create laws anticipating them. But if everyone can gain access to it equally, either as SaaS or open source and locally run, then there will be tools to protect against malicious uses. That's all that can really be done, and no one company will be able to solve it alone.
Baturinsky t1_j5ma4a3 wrote
If we don't have a robust safety system that works across companies and across states by that time, I don't see how we will survive it.
AsheyDS t1_j5myhgr wrote
We don't have a lot of time, but we do have time. I don't think there will be any immediate critical risks, especially with safety in mind, and what risk there is might even be mitigated by near-future AI. ChatGPT, for example, may soon enough be adequate at fact-checking misinformation. Other AIs might be able to spot deepfakes. It would help if more people started discussing the ways AGI can potentially be misused, so everybody can begin preparing and building up protections.
Baturinsky t1_j5n2dnx wrote
Do you really expect ChatGPT to go against the USA's disinformation machine? Do you think it will be able to give a balanced report on controversial issues, taking into account the credibility and affiliation of sources and the quality of reasoning (such as NOT counting "proofs" based on "alleged" and "highly likely")? Do you think it will honestly present the points of view of countries and sources not affiliated with or bought by the USA and/or the Dem or Rep party? Do you think it will let the user define the criteria for credibility for themselves and give info based on those criteria, rather than push the "only truth"?
Because if it won't, and AI is used by the powers that be to brainwash the masses instead of as a power for the masses to resist brainwashing, then we'll have a very gullible population and a very dishonest AI by the time it matters most.
P.S. And yes, if/when China or Russia makes something like ChatGPT, it will probably push their government agendas just like ChatGPT pushes the US agenda. But is there hope for an impartial AI?
AsheyDS t1_j5n68fi wrote
I mean, that's out of their hands and mine. I probably shouldn't have used ChatGPT as an example; I just meant near-future narrow AI. It's possible we'll have unbiased AI (or at least minimally biased AI) over the next few years, but nobody can tell how many there will be or how effective they'll be.
Baturinsky t1_j5nwu4s wrote
I believe a capability like that could be key to our survival. It's required for our alignment as humanity, i.e. for us to be able to act together in the interest of humanity as a whole. The direst political lies are usually aimed at splitting people apart and making them fear each other, because people are easier to control and manipulate in that state.
Also, this ability could be necessary for strong AI to even be possible, since a strong AI should be able to reason successfully on partially unreliable information.
And lastly, this ability will be necessary for AIs to check each other's reasoning.
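For instance, "taking credibility into account" could be as simple as weighting conflicting claims by user-defined credibility scores. A crude made-up sketch (the function name and all the numbers are my own invention, just to show the idea):

```python
# Crude sketch of "reasoning on partially unreliable information":
# weight each source's claim by a user-defined credibility score instead of
# trusting any single outlet. Purely illustrative, numbers are invented.

def weighted_belief(claims):
    """claims: list of (says_true: bool, credibility: float in [0, 1])."""
    total = sum(cred for _, cred in claims)
    if total == 0:
        return 0.5  # no usable evidence -> stay undecided
    support = sum(cred for says_true, cred in claims if says_true)
    return support / total

# The user, not the AI vendor, sets these credibility scores.
claims = [(True, 0.9), (True, 0.4), (False, 0.7)]
print(weighted_belief(claims))  # 0.65: leans true, but far from certain
```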
iiioiia t1_j5m1mue wrote
> Their approach to safety, to put it simply, would be to keep it in an invisible box, watched by an invisible guard that intervenes covertly when needed to keep it within that box should it stray towards the outside.
Can't ideas still leak out and get into human minds?
AsheyDS t1_j5mtpp0 wrote
Can you give an example?
iiioiia t1_j5mzfu0 wrote
Most of our rules and conventions are extremely arbitrary, highly suboptimal, and maintained via cultural conditioning.
AsheyDS t1_j5n7s65 wrote
The guard would be a compartmentalized hybridization of the overall AGI system, so it too would have a generalized understanding of what's undesirable, even according to our arbitrary framework of cultural conditioning. So could undesirable ideas leak out? Not really, as long as the guard and the other safety components are working as intended, AND the guard is programmed with enough explicit rules and conditions, and enough examples to effectively extrapolate from (meaning not every case needs to be accounted for if patterns can be derived).
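To make "explicit rules plus examples to extrapolate from" concrete, here's a minimal toy sketch (purely my own illustration, not their design; the class name, method names, and threshold are all made up):

```python
# Hypothetical sketch: a guard that combines explicit rules with examples
# it can generalize from. Purely illustrative; not the real system.

from difflib import SequenceMatcher

class Guard:
    def __init__(self, banned_phrases, bad_examples, threshold=0.8):
        self.banned_phrases = banned_phrases  # explicit, hand-written rules
        self.bad_examples = bad_examples      # labeled examples to generalize from
        self.threshold = threshold

    def allows(self, output: str) -> bool:
        """Return True if the output may leave the 'box'."""
        text = output.lower()
        # 1. Hard rules: any explicit match blocks the output outright.
        if any(phrase in text for phrase in self.banned_phrases):
            return False
        # 2. Soft generalization: block outputs too similar to known bad
        #    examples, so not every case must be enumerated in advance.
        for example in self.bad_examples:
            if SequenceMatcher(None, text, example.lower()).ratio() > self.threshold:
                return False
        return True

guard = Guard(banned_phrases=["how to synthesize"],
              bad_examples=["step-by-step plan to ..."])
print(guard.allows("Here is a recipe for bread"))  # True: no rule or pattern hit
```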
iiioiia t1_j5nafg6 wrote
How do you handle risk that emerges years after something becomes well known and popular? Let's say it produces an idea that starts out safe but then mutates? Or, a person merges two objectively safe (on their own) AGI-produced ideas, producing a dangerous one (that could not have been achieved without AI/AGI)?
I dunno, I have the feeling there's a lot of unknown unknowns and likely some (yet to be discovered) incorrect "knowns" floating out there.
AsheyDS t1_j5njw0c wrote
>a person merges two objectively safe (on their own) AGI-produced ideas
Well, that's kind of the real problem, isn't it? A person, or people, and their misuse or misinterpretation or whatever mistake they're making. You're talking about societal problems that no one company is going to be able to solve. They can only anticipate what they can, hope the AGI anticipates the rest, and tackle future problems as they come.
iiioiia t1_j5o1g81 wrote
This is true even without AI, and it seems we weren't ready (see climate change) even for the technology we've developed so far.
No_Ask_994 t1_j5o8d4v wrote
Is the invisible guard another AGI? Does it have its own guard?
AsheyDS t1_j5q48vw wrote
A hybridized partition of the overall system. It uses the same cognitive functions, but has separate memory, objectives, recognition, etc. They hope for the whole thing to be as modular and intercompatible as possible, largely through their generalization schema. So one segment of it will have personality parameters, goals, memory, and whatever else, and the rest will be roughly equivalent to subconscious processes in the human brain, which will be shared with the partition.

As I understand it, the guard would be strict and static, unless its objectives or parameters are updated by the user via natural language programming. So its actions should be predictable, but if it somehow deviates, then the rest of the system should be able to recognize that as an unexpected thought (or action or whatever), either consciously or subconsciously, which would feed back to the guard and reinitialize it, like a self-correcting measure. And once it has been corrected, it can edit the memory of the main partition so that it's unaware of the fault.

None of this has been tested yet, and they're still revising some things, so this may change in the future.
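If it helps, here's a rough toy sketch of that self-correction loop as I understand it (entirely my own illustration, not their code; every name is invented and, again, none of this has been tested):

```python
# Toy control-flow sketch of the guard-partition idea as I understand it.
# Entirely hypothetical; every name here is invented for illustration.

class GuardedSystem:
    def __init__(self, banned_topics):
        # Both partitions share the same "subconscious" judging function,
        # but keep separate memory and objectives.
        self.main_memory = []
        self.guard_rules = set(banned_topics)     # static, predictable objectives
        self.guard_baseline = set(banned_topics)  # expected guard configuration

    def _judge(self, thought):
        # Shared cognitive function used by both partitions.
        return not any(topic in thought for topic in self.guard_rules)

    def think(self, thought):
        self.main_memory.append(thought)
        allowed = self._judge(thought)

        # If the guard itself drifts from its baseline, the rest of the system
        # treats that as an unexpected thought, reinitializes the guard, and the
        # guard edits the main partition's memory so the fault leaves no trace.
        if self.guard_rules != self.guard_baseline:
            self.guard_rules = set(self.guard_baseline)          # self-correct
            self.main_memory = [m for m in self.main_memory
                                if self._judge(m)]               # edit memory
            return None

        return thought if allowed else None

system = GuardedSystem(banned_topics={"escape the box"})
print(system.think("write a poem"))        # passes the guard
print(system.think("escape the box now"))  # blocked covertly -> None
```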