Nhabls

Nhabls t1_je9anrq wrote

Which team is that? The one at Microsoft that made up the human performance figures in a completely ridiculous way? Basically: "Pass rates were too high for humans on the hard problems that the model fails completely, so we just divided the number of accepted solutions by the entire user base." Oh yeah, brilliant.

The "human" pass rates are also composed of people learning to code trying to see if their solution works. Its a completely idiotic metric, why not go test randos on the street and declare that represents the human coding performance metric while we're at it

1

Nhabls t1_jcbnr3g wrote

That's Alpaca, a fine-tune of LLaMA, and you're just pointing to another of OpenAI's shameless behaviours. Alpaca couldn't be used commercially because OpenAI thinks it can forbid using the outputs of its models to train competing models. Meanwhile, they also argue that they can take any and all copyrighted data from the internet with no permission or compensation needed.

They think they can have it both ways. At this point I'm 100% rooting for them to get screwed as hard as possible in court over that contradiction.

19

Nhabls t1_jcbmn7g wrote

> If we want everything to be open sourced then chatgpt as it is now probably wouldn't be possible at all

All of the technical concepts behind ChatGPT are openly accessible and have been for the past decade, as was the work before it, and a lot of it came from big tech companies that operate for profit. The profit motive is not an excuse; the only explanation is unprecedented greed in the space.

Though it comes as no surprise from a company that thinks it can take any copyrighted data from the internet without permission while at the same time forbidding others from training models on data obtained from the company's own products. It's sleaziness at every level.

>But anyway I think basic theoretical breakthroughs like a new architecture for AI will still be shared among academia since those aren't directly related to money

This is exactly what hasn't happened: they outright refused to share any architectural details. No one was expecting the weights or even the code. That is what people are upset about, and rightly so.

5

Nhabls t1_jcbm504 wrote

> Google Deepmind did go that route of secrecy with AlphaGo

AlphaGo had a proper paper released. What are you talking about?

OpenAI's outright refusal to share its procedure for training GPT-4 very much breaks precedent and is horrible for the field as a whole. It shouldn't be glossed over.

14

Nhabls t1_j6xemzb wrote

GPT-3 didn't cost a billion to train

It does cost a LOT of money to run, which is why you're unlikely to "see better" in the short to medium term, unless you're into paying hundreds to thousands of dollars per month for this functionality.
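As a rough back-of-the-envelope sketch of where "hundreds to thousands per month" comes from (the per-token price and usage figures below are assumptions for illustration, not OpenAI's actual costs):

```python
# All figures are assumptions for illustration, not actual OpenAI numbers.
price_per_1k_tokens = 0.02   # dollars; roughly davinci-tier API pricing
tokens_per_query = 1_500     # prompt + completion for a typical request
queries_per_day = 2_000      # a heavy, application-level workload

daily_cost = queries_per_day * tokens_per_query / 1_000 * price_per_1k_tokens
monthly_cost = daily_cost * 30

print(f"${daily_cost:,.2f}/day -> ${monthly_cost:,.2f}/month")  # $60.00/day -> $1,800.00/month
```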

26

Nhabls t1_j6uokwb wrote

It's incredibly easy to make giant LLMs regurgitate training data near verbatim. There's very little reason to believe that this won't just start happening more frequently with image models as they grow in scale as well.
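As a minimal sketch of what a regurgitation probe looks like (the model choice and text snippet below are just stand-ins for illustration): feed the model a prefix of a suspected training document, decode greedily, and check whether it reproduces the rest verbatim:

```python
# A minimal memorization probe, sketched with GPT-2 as a stand-in model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A text the model has almost certainly seen during training.
document = ("We the People of the United States, in Order to form "
            "a more perfect Union, establish Justice,")
prefix, suffix = document[:52], document[52:]

inputs = tokenizer(prefix, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=False,                      # greedy: the single most likely continuation
    pad_token_id=tokenizer.eos_token_id,
)
continuation = tokenizer.decode(outputs[0], skip_special_tokens=True)[len(prefix):]

# If the greedy continuation matches the held-out suffix, the text is memorized.
print("verbatim match:", continuation.strip().startswith(suffix.strip()[:20]))
```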

Personally, I just hope it brings a reality check in the courts to these companies that think they can monetize generative models trained on copyrighted material without permission.

3