adt t1_j5erdiz wrote
Already in the works (Scott Aaronson is a scientist with OpenAI):
>we actually have a working prototype of the watermarking scheme, built by OpenAI engineer Hendrik Kirchner. It seems to work pretty well—empirically, a few hundred tokens seem to be enough to get a reasonable signal that yes, this text came from GPT.
>Now, this can all be defeated with enough effort. For example, if you used another AI to paraphrase GPT’s output—well okay, we’re not going to be able to detect that. On the other hand, if you just insert or delete a few words here and there, or rearrange the order of some sentences, the watermarking signal will still be there. Because it depends only on a sum over n-grams, it’s robust against those sorts of interventions.
https://scottaaronson.blog/?p=6823
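As the quote describes it, the scheme biases token selection with a pseudorandom function of the preceding tokens, and detection sums a per-n-gram score over the text. Here's a toy sketch of the detection side only (not OpenAI's actual implementation; the key, context length `K`, and scoring function are all made up for illustration):

```python
import hashlib

KEY = b"secret-key"  # hypothetical shared key; the real scheme's key is private
K = 3                # assumed n-gram context length

def ngram_score(ngram):
    """Pseudorandom score in [0, 1) derived from an n-gram and the key."""
    h = hashlib.sha256(KEY + " ".join(ngram).encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64

def detect(tokens):
    """Average per-n-gram score; ~0.5 for unwatermarked text, noticeably
    higher for text whose generation was biased toward high-scoring n-grams."""
    grams = [tuple(tokens[i:i + K]) for i in range(len(tokens) - K + 1)]
    return sum(ngram_score(g) for g in grams) / max(len(grams), 1)
```

Because the statistic is a sum over n-grams, inserting or deleting a few words only perturbs the handful of n-grams that overlap the edit; the rest of the signal survives, which is the robustness property the quote is pointing at.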
Appropriate_Ant_4629 t1_j5gb1kw wrote
Stable Diffusion already includes one by default:
Of course with open source software and models, you'd be free to create a fork that doesn't include one, or uses a different one.
ThisIsNotAnAlias t1_j5gjblv wrote
Last I checked, image watermarks were super weak against rotations; that still seems to be the case. But the better methods could cope with cropping much better than these.
Appropriate_Ant_4629 t1_j5gy6e3 wrote
> Last I checked image watermarks were super weak against rotations
Obviously depends on the technique. The old-school popular technique of "slap a signature in the painting" like Dürer's stylized A/D logo is very robust to rotations, but not robust to cropping from the bottom in that case.
> seems to still be the case - but the better methods could cope with cropping way better than these.
It's near impossible to have a watermark technology that's robust to all transformations, at least if you reveal which watermark algorithm you used.
One easy attack that works on most techniques is to just re-encode the content, writing your own watermark over the original using the same watermarking algorithm.
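To make the re-watermarking attack concrete, here's a toy scheme (not any real watermarking library) that hides a bit pattern in the least-significant bits of pixel values. If the attacker knows the algorithm, they can simply re-embed their own pattern with the same scheme, overwriting the original mark:

```python
def embed(pixels, bits):
    """Write each bit into the LSB of the corresponding pixel value."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)]

def extract(pixels, n):
    """Read the first n LSBs back out."""
    return [p & 1 for p in pixels[:n]]

original = [200, 201, 202, 203]
marked = embed(original, [1, 0, 1, 1])    # publisher's watermark
attacked = embed(marked, [0, 1, 0, 0])    # attacker re-embeds their own bits
# extract(attacked, 4) now yields the attacker's pattern, not the publisher's
```

Real schemes spread the mark redundantly across frequency-domain coefficients instead of raw LSBs, but the same logic applies: a public algorithm lets anyone write over the channel the watermark lives in.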
Freonr2 t1_j5hx9s5 wrote
Yeah, but it's trivial to remove that when you run the source yourself.
marr75 t1_j5joben wrote
Watermarks are a great way to ensure I use GPT-NeoX and allies instead of Da Vinci and allies.
eigenman t1_j5fgmc8 wrote
Ahh of course I should have checked Scott's blog first.
Fabulous-Possible758 t1_j5gtzlk wrote
On phone so can't read the blog yet: does it say how well it handles false positives? I.e., flagging stuff not written by GPT as being written by GPT?
I could see a really shitty world coming about where the filter is effectively useless because everyone will have to make sure their content passes the watermark detector.
SufficientType1794 t1_j5n1c3y wrote
That's just training a GAN with extra steps /s
Fabulous-Possible758 t1_j5sy0hf wrote
I mean... maybe we've invented AI not because machines are trainable but because humans are?
franciscrot t1_j5g8uq1 wrote
Would anyone like to explain to me like I'm five how it can be robust against edits like that?
careless25 t1_j5gu7tn wrote
Very simple explanation:
Give each word a unique number and add all the numbers up. That sum is your unique identifier for the GPT output.
If you swap some words around, the sum doesn't change at all, and inserting or deleting a few words only shifts it slightly.
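That explanation as a few lines of code (a deliberately simplified stand-in; the real scheme uses a keyed pseudorandom function over n-grams, not per-word values):

```python
def word_value(word):
    # deterministic small value per word (stand-in for the real PRF)
    return sum(ord(c) for c in word) % 100

def signal(text):
    """Sum of per-word values: unchanged by reordering, barely moved
    by inserting or deleting a couple of words."""
    return sum(word_value(w) for w in text.split())

a = signal("the cat sat on the mat")
b = signal("on the mat the cat sat")   # reordered: identical signal
```

Since addition is order-independent, shuffling sentences can't disturb the signal, and a small edit changes the total by at most the few values it touches.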
[deleted] t1_j5ip3dq wrote
[deleted]