Viewing a single comment thread. View all comments

adt t1_j5erdiz wrote

Already in the works (Scott Aaronson is a scientist with OpenAI):

>we actually have a working prototype of the watermarking scheme, built by OpenAI engineer Hendrik Kirchner. It seems to work pretty well—empirically, a few hundred tokens seem to be enough to get a reasonable signal that yes, this text came from GPT.
>Now, this can all be defeated with enough effort. For example, if you used another AI to paraphrase GPT’s output—well okay, we’re not going to be able to detect that. On the other hand, if you just insert or delete a few words here and there, or rearrange the order of some sentences, the watermarking signal will still be there. Because it depends only on a sum over n-grams, it’s robust against those sorts of interventions.
https://scottaaronson.blog/?p=6823
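The detection side of a scheme like that can be sketched in a few lines. To be clear, this is not OpenAI's actual implementation, just a toy illustration of the idea the post describes: a keyed pseudorandom score per n-gram, averaged over the text, where watermarked generation would have steered that average above what ordinary text produces. The key, threshold, and choice of n here are all made up:

```python
import hashlib

def ngram_score(key: str, ngram: tuple) -> float:
    """Pseudorandom score in [0, 1) derived from the secret key and one n-gram."""
    digest = hashlib.sha256((key + "|" + " ".join(ngram)).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def detect(text: str, key: str, n: int = 4, threshold: float = 0.55) -> bool:
    """Flag text whose average n-gram score is suspiciously high."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return False  # too short to say anything
    mean = sum(ngram_score(key, g) for g in ngrams) / len(ngrams)
    # Unwatermarked text averages ~0.5; watermarked generation would have
    # preferentially picked tokens whose n-grams score high under the key.
    return mean > threshold
```

Because the statistic is a sum over n-grams, inserting or deleting a few words only perturbs the n-grams near the edit, which is why the signal survives light editing.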

177

Appropriate_Ant_4629 t1_j5gb1kw wrote

Stable Diffusion already includes one by default.

In particular, it uses the `invisible-watermark` library in its reference scripts.

Of course, with open-source software and models, you'd be free to create a fork that doesn't include one, or that uses a different one.

30

ThisIsNotAnAlias t1_j5gjblv wrote

Last I checked, image watermarks were super weak against rotations, and that still seems to be the case. The better methods could cope with cropping far better than with rotation, though.

9

Appropriate_Ant_4629 t1_j5gy6e3 wrote

> Last I checked image watermarks were super weak against rotations

Obviously depends on the technique. The old-school technique of slapping a signature on the painting, like Dürer's stylized A/D monogram, is very robust to rotation, but in that case not robust to cropping from the bottom.

> seems to still be the case - but the better methods could cope with cropping way better than these.

It's near impossible to have a watermarking technology that's robust to all transformations, at least if you reveal which watermarking algorithm you used.

One easy attack that works on most techniques is to just re-encode the content, writing your own watermark over the original using the same watermarking algorithm.
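A toy illustration of that overwrite attack, using a trivial least-significant-bit watermark (purely illustrative; real schemes are more involved, but the attack shape is the same: anyone who knows the algorithm can re-embed their own payload on top):

```python
def embed_lsb(pixels: list[int], payload: list[int]) -> list[int]:
    """Toy watermark: store one payload bit in each pixel's lowest bit."""
    marked = [(p & ~1) | b for p, b in zip(pixels, payload)]
    return marked + pixels[len(payload):]  # pixels beyond the payload untouched

original = [120, 33, 250, 7]
marked = embed_lsb(original, [1, 0, 1, 1])   # creator's watermark
attacked = embed_lsb(marked, [0, 1, 0, 0])   # attacker re-embeds their own mark
# The attacker's payload fully replaces the original watermark,
# since both parties are writing to exactly the same bit positions.
```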

7

Freonr2 t1_j5hx9s5 wrote

Yeah, but it's trivial to remove that when you run the source yourself.

2

marr75 t1_j5joben wrote

Watermarks are a great way to ensure I use GPT-NeoX and allies instead of Da Vinci and allies.

1

eigenman t1_j5fgmc8 wrote

Ahh of course I should have checked Scott's blog first.

7

Fabulous-Possible758 t1_j5gtzlk wrote

On phone so can’t read the blog yet: does it say how well it handles false positives? I.e., flagging stuff not written by GPT as being written by GPT?

I could see a really shitty world coming about where the filter is effectively useless because everyone will need to make sure their content passes the watermark detector.

3

franciscrot t1_j5g8uq1 wrote

Would anyone like to explain to me like I'm five how it can be robust against edits like that?

2

careless25 t1_j5gu7tn wrote

Very simple explanation:

Give each word a unique number and add all the numbers up. That's your unique identifier for the GPT output.

If you switch some words around, the sum won't change very much.
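That explanation can be made concrete in a few lines. Note this toy works at the word level, so reordering leaves the sum *exactly* unchanged; the real scheme scores n-grams, so edits shift the statistic only slightly rather than not at all:

```python
def simple_fingerprint(text: str) -> int:
    # Toy scheme: each word maps to a number (its byte values summed),
    # and the fingerprint is the total over all words.
    return sum(sum(word.encode()) for word in text.split())

a = simple_fingerprint("the cat sat on the mat")
b = simple_fingerprint("on the mat the cat sat")  # same words, reordered
assert a == b  # reordering words doesn't change the sum at all
```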

8