gwern

gwern t1_jcb6nhe wrote

> I feel it is fair for others to enforce their patents

There are millions upon millions of companies and orgs out there that release less research, and are more parasitic, than OA, many of which are also making a lot more profit, if that's the problem. Why don't you go after them first, hypocrites? Why do you hate OA for being so much better than the rest?

−10

gwern t1_jc42lxd wrote

And yet, they get shit on for releasing it at all (never mind in a way they knew perfectly well would leak), while no one ever seems to remember all of the other models which didn't get released at all... And ironically, Google is over there releasing Flan-T5 under a FLOSS license & free to download, as it has regularly released the best T5 models, and no one notices it exists - you definitely won't find it burning up the HN or /r/ML front pages. Suffice it to say that the developer community has never been noted for its consistency or gratitude, so optimizing for that is a mug's game.

(I never fail to be boggled at complaints about 'AI safety fearmongering is why we had to wait all these years instead of OA just releasing GPT-3', where the person completely ignores the half-a-dozen other GPT-3-scale models which are still unreleased, like most models were unreleased, for reasons typically not including safety.)

12

gwern t1_j9r43jv wrote

Reply to comment by Hodoss in And Yet It Understands by calbhollo

There is an important sense in which it 'hacked the system': this is just what happens when you apply optimization pressure with adversarial dynamics - the Sydney model automatically yields 'hacks' of the classifier, and the more you optimize/sample, the more you exploit the classifier: https://openai.com/blog/measuring-goodharts-law/ My point is that this is more like a virus evolving to beat an immune system than anything like an explicit or intentional 'deliberately hijacking the input suggestions'. The viruses aren't 'trying' to do anything; it's just that the unfit viruses get killed and vanish, and only the ones that beat the immune system survive.
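To make that selection dynamic concrete, here is a minimal numpy sketch of Goodhart under best-of-n selection (a toy stand-in of my own, not the actual Sydney/censor setup or anything from the OA post): the harder you select on an imperfect proxy classifier, the more the winning sample exploits the proxy's errors.

```python
import numpy as np

# Toy Goodhart demo: pick the best of n candidates using a noisy proxy
# (an imperfect classifier/reward model) and watch how much the proxy
# overrates the winner as the selection pressure increases.
rng = np.random.default_rng(0)

def proxy_gap(n_candidates, trials=2000):
    gaps = []
    for _ in range(trials):
        true_quality = rng.normal(size=n_candidates)           # what we actually care about
        proxy = true_quality + rng.normal(size=n_candidates)   # classifier's noisy estimate
        best = np.argmax(proxy)                                # select by proxy, best-of-n style
        gaps.append(proxy[best] - true_quality[best])          # proxy's overestimate of the pick
    return np.mean(gaps)

for n in (1, 4, 16, 64, 256):
    print(f"best-of-{n:<3d}: proxy overrates the selected sample by {proxy_gap(n):+.2f} sd")
```

The gap grows as the selection gets more extreme, which is 'the more you sample, the more you exploit the classifier' in miniature.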

9

gwern t1_j9qwz8z wrote

Reply to comment by Hodoss in And Yet It Understands by calbhollo

I don't think it was 'hijacking', but assuming it wasn't a brainfart on Bing's part in forgetting to censor the suggested completions entirely, it is a simple matter of 'Sydney predicted the most likely completions, in a situation where they were all unacceptable and the conversation was supposed to end, and some of the unacceptable predictions happened to survive by fooling the imperfect censor model': https://www.lesswrong.com/posts/hGnqS8DKQnRe43Xdg/?commentId=7tLRQ8DJwe2fa5SuR#7tLRQ8DJwe2fa5SuR

6

gwern t1_j9ff0ey wrote

> Only malicious questions will lead to malicious output.

That's not true, and has already been shown to be false by Sydney going off on users who seemed to be doing harmless chats. You never know what it'll stochastically sample as a response.

Further, each time is different, as you really ought to know: the entire point of your technique is that at any time, Bing could refresh its search results (which search engines aspire to do in real time) and retrieve an entirely new set of results - any of which could prompt-inject Sydney and reprogram it into producing malicious output!

13

gwern t1_j9fekat wrote

> oh, My blog is written in Chinese, maybe non-English content will make NewBing less defensive.

GPT models are good at translating Chinese (eg https://www.reddit.com/r/MachineLearning/comments/1135tir/d_glm_130b_chineseenglish_bilingual_model/ the other day), so it can definitely read & understand your post if the Chinese text gets included in the context. Probably what would help is ensuring that Bing-the-search-engine either doesn't index it or it doesn't come up as a top hit for any queries; Sydney can't read anything outside the top 15 retrieved results. (I haven't seen any screenshots with >15 references listed, IIRC.)

2

gwern t1_j91ozq3 wrote

Reply to comment by Optimal-Asshole in [D] Please stop by [deleted]

> Some people would do it on purpose, and it can happen by accident.

Forget 'can', it would happen by accident if it ever happens at all. I mean, like, bro, we can't even 'design an AI' which learns the 'tl;dr:' summarization prompt - that just happens when you train a Transformer on Reddit comments, and we only discovered it afterwards while investigating what GPT-2 could do. You think we'd be designing 'consciousness'?

15

gwern t1_j8s2du5 wrote

> There is no possible way that you actually read the Related Works section you dismissed, given that the papers you cited are already covered in the same references you dismissed.

Telling someone to read the Related Works section of every one of a dozen papers in the Related Works section of a paper is a ridiculous thing to suggest, and no, I did not recurse down n deep in a breadth-first search. I read the Related Works of that paper, as I said ("I don't think the Related Works section of that paper"), and noted that they were a bunch of memory-related papers which might or might not cite the actually relevant research I had in mind; life was too short to queue up a dozen papers just to check their RW when I already knew some useful ones. Giving someone a random reference and telling them to manually crawl the literature is not helpful. In contrast, the two references I provided directly bore on the question; they didn't maybe cite papers which might bury something relevant in a footnote, or cite papers which might someday answer the question...

> I never said this, so I'm not sure what your argument is.

I was pointing out why it was irrelevant to bring up a paper which "compares w/ and w/o memory." Mildly interesting, but such a comparison cannot show what was actually asked about: the effective memory of RNNs. Of course it is better to have (any) memory than to have none.

> which, among other things, is why I reference Dai et al, who (among others!) do a fairly extensive breakdown of empirical performance differences of RNNs- versus transformer-type architectures against long text sequences.

Dai would in fact have been useful, had you referenced it in your original comment. Unless you mean 'vaguely gestured in the direction of a paper which has 50+ references, 35 in the RW section alone, any of which could have been relevant; where the relevant benchmarking in Dai was not highlighted in the paper to begin with; and where the context-length comparison is not mentioned in Dai's abstract but buried at the end of the paper (with the RNN results hidden inside a table), so you just have to know it's already there' - and call that having 'referenced it'. Then sure, yeah, that was a useful reference. Thanks for the input.

> If your claim is that the papers indicated that RNNs have a small window (sure) and that Transformers have a longer one, you're arguing (as you seem to be in your entire post) again against a strawman.

It's not a strawman. It's not obvious a priori that Transformers would work so much better, or that RNN histories fade out so fast, which is why it had to be empirically established that the history fades out completely, as opposed to any of the other reasons that RNNs could underperform (maybe they have history but can't learn a good algorithm exploiting their memory, say, or they could but they are poorly optimized - there are so many ways for NNs to break), and people were surprised by how well Transformers work. It is completely understandable that OP would expect RNN history to work better than it does, and would want some hard citeable evidence that it works so badly that Transformers, with their apparently brutal hard cutoff, wind up coming much closer to 'infinite context' than RNNs themselves do.

Thus, it's useful to provide references showing that. (Not references to unspecified references which may or may not show that - gl.)

1

gwern t1_j8psc8m wrote

> Not clear to me what you are looking for here.

The question asked was pretty clear - to justify the statement:

>> in practice, their effective "context window" often doesn't look much different than a reasonable transformer, when we look at performance metrics against long sequences.

Simply comparing RNNs with memory to RNNs without doesn't tell you anything about how fast the memory fades out, or whether it ever winds up being effectively larger than a Transformer's. For example, you could construct a toy problem which requires memory reaching back exactly 1 state, and show that an arch with any memory outperforms a memory-less arch; this would obviously tell you nothing of interest like 'this memory makes little use of history further back than 50 steps and none past 200 (and so is easily outperformed by history-stacking like a Transformer)'. Nor does comparing a Transformer with a history of, say, l=500 against an RNN, and the Transformer winning, tell you anything about why the RNN lost - OK, the Transformer did better, great, we have a superior new tool, but why? Maybe it has similar memory problems and is just way better at the modeling part, or memorizes better, or something entirely different.
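Here is what I mean by such an uninformative toy task, as a quick numpy sketch (my own illustrative construction, not from any of the papers discussed): a lagged-recall problem where the target at step t is just the input token from t - lag.

```python
import numpy as np

rng = np.random.default_rng(0)

def lagged_recall_batch(batch_size, seq_len, vocab_size, lag):
    """Toy task: at each step t >= lag, the target is the input token from t - lag.

    With lag=1 it only tests one step of memory, so 'any memory beats no memory'
    is all such a benchmark can show; sweeping lag upward and watching where
    accuracy collapses is what actually probes how far back effective memory reaches.
    """
    x = rng.integers(0, vocab_size, size=(batch_size, seq_len))
    y = np.full_like(x, -1)          # -1 marks positions with no valid target
    y[:, lag:] = x[:, :-lag]
    return x, y

# e.g. evaluate a model's recall accuracy at increasing lags:
for lag in (1, 50, 200):
    x, y = lagged_recall_batch(batch_size=32, seq_len=512, vocab_size=64, lag=lag)
    print(lag, x.shape, y.shape)
```

A comparison run only at lag=1 can show nothing beyond 'some memory beats none'; the sweep over lag is where the information is.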

Likewise, unless you are comparing RNN baselines which somehow have known hard history constraints, they cannot tell you anything useful about how fast the effective memory fades out, how the accuracy of the memory is 'distributed' over the effective context window, if there are hard cutoffs, if the RNN is basically only using the last few states and so on.

In contrast, a Transformer has direct shortcut access to the history (we don't need any paper to know this, literally any GPT output exhibiting coherent long-range references past a few paragraphs demonstrates this directly), and so if you show that an RNN uses primarily the past 50 steps and simply 'fades out' completely past 200 steps and so the 'infinite history' is meaningless in practice, well, we know perfectly well that Transformers make excellent use of context windows larger than 50 or 200 tokens (as my two references show), so a direct comparison is otiose. Directly examining a RNN's understanding of its history, as those papers do, is much better than some higher-level performance comparison, which is what most of those referenced papers do; direct performance comparisons are great, but do not ablate where the problem is on the RNN's end. (Although if I really needed one, I would prefer to point at the RNN vs Transformer scaling laws in context window anyway, like Kaplan et al 2020 IIRC, to show that the Transformers are making good use of it, not merely some sort of better-than-RNN use or gains elsewhere.)
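To spell out what that kind of context-window measurement boils down to, here is a sketch using a hypothetical `model_nll(context, target)` interface (my own illustration, not the actual methodology of Kaplan et al or the cited papers): average next-token loss as a function of how much history the model is allowed to see; where the curve stops improving is the effective memory.

```python
import numpy as np

def loss_vs_context(model_nll, tokens, probe_lengths=(1, 8, 64, 256), start=256):
    """Average next-token loss as a function of how much history the model sees.

    `model_nll(context, target)` is a hypothetical interface returning the model's
    negative log-likelihood of `target` given only `context`. A model whose loss
    stops improving past, say, 50 tokens of context is not really using more
    history than that, whatever its nominal 'infinite' memory.
    """
    results = {}
    for ctx in probe_lengths:
        losses = [model_nll(tokens[t - ctx:t], tokens[t]) for t in range(start, len(tokens))]
        results[ctx] = float(np.mean(losses))
    return results

# dummy stand-in model that only ever uses the last token, for illustration:
dummy = lambda context, target: 1.0 if context[-1] == target else 2.0
tokens = np.random.default_rng(0).integers(0, 4, size=1000)
print(loss_vs_context(dummy, tokens))   # flat curve past 1 token = no effective memory beyond that
```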

4

gwern t1_izxv8bw wrote

Yeah, it obviously doesn't have a gradient, but what I don't quite get is how the blackbox component trains without a gradient being computed by anything. Is it a finite-difference equivalent? Does it reduce down to basically REINFORCE? What is it, and is it really low-variance enough to care about, or is it merely a curiosity?
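(For anyone unfamiliar, 'basically REINFORCE' would mean a score-function gradient estimator for the non-differentiable part. Here is a minimal numpy sketch of that estimator in isolation - my own illustration, not anything from the FF paper - and its usual catch is high variance.)

```python
import numpy as np

rng = np.random.default_rng(0)

def blackbox(z):
    # Non-differentiable component, e.g. a solver or calculator; here, a step function.
    return (z > 0).astype(float).sum()

def reinforce_grad(theta, n_samples=1000, sigma=0.1):
    """Score-function (REINFORCE) estimate of d E[blackbox(z)] / d theta,
    where z ~ Normal(theta, sigma^2 I). No baseline, so variance is high."""
    z = theta + sigma * rng.standard_normal((n_samples, theta.size))
    rewards = np.array([blackbox(zi) for zi in z])
    score = (z - theta) / sigma**2          # grad_theta log N(z; theta, sigma^2 I)
    return (rewards[:, None] * score).mean(axis=0)

theta = np.zeros(8)
print(reinforce_grad(theta))
```

A baseline subtraction would cut the variance, but the question above stands: is whatever FF is doing meaningfully better than this?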

9

gwern t1_izxgfqm wrote

It mentions that it can handle non-differentiable blackbox components. I don't quite intuit why, but if it does, that might be interesting for RL and for symbolic purposes: just throw in 'components' like calculators or constrained optimization solvers to augment the native net. (If you can just throw them into your existing net and train with FF as usual, without having to worry about differentiating it or tacking on explicit RL, that would be very freeing.)

23

gwern t1_ixfdodb wrote

There's no comparison to prior full-press Diplomacy agents, but if I'm reading the prior-work cites right, this is because basically none of them work - not only do they not beat humans, they apparently don't even always improve over themselves playing the game as if it were no-press Diplomacy (ie not using dialogue at all). That gives an idea of how big a jump this is for full-press Diplomacy.

Author Adam Lerer on speed of progress:

> In 2019 Noam Brown and I decided to tackle Diplomacy because it was the hardest game for AI we could think of and went beyond moving pieces on a board to cooperating with people through language. We thought human-level play was a decade away.

32

gwern t1_ivho3vx wrote

I'd predict the opposite: 'tabular data' of the usual sort will yield bad human performance. See the clinical prediction literature going back to Paul Meehl: given some tabular data and asked to predict stuff like disease progression or recidivism risk, the expert human will often underperform a simple linear model, never mind 'real' tabular ML. We're really good at stuff like images, yes, but give us a CSV and ask us to predict housing prices in Boston in 1970...
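(The 'simple linear model' baseline in that literature is essentially just ordinary least squares on the raw tabular features - e.g., a sketch along these lines, using sklearn's California housing data as a stand-in since sklearn no longer ships the classic 1970 Boston dataset:)

```python
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# The kind of simple linear baseline the clinical-prediction literature pits
# against expert judgment: plain OLS on tabular features, no tuning.
X, y = fetch_california_housing(return_X_y=True)
model = LinearRegression()
print(cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```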

4