Comments
I_will_delete_myself t1_jcb0o9g wrote
OpenAI naming is like the People's Republic of China. Totally for the people and chosen by the people.
amhotw t1_jcbm37m wrote
OpenAI was open initially. From the history books, it looks like the RC was never the P's.
I_will_delete_myself t1_jcbmqn5 wrote
The whole point is to communicate that their naming doesn't match what they are actually doing.
amhotw t1_jcbpd9z wrote
I understand that. I am pointing out the fact that they started on different paths. One of them was actually matching its name with what it was doing; the other was a contradiction from the beginning.
Edit: Wow, people either can't read or don't read enough history.
[deleted] OP t1_jcebuo4 wrote
Not entirely true tbh, I'm willing to bet that most Chinese were supportive of the CCP when it first came to power
NoScallion2450 t1_jcal75g wrote
It seems like many researchers would feel the same way.
satireplusplus t1_jcbpgio wrote
They should rename themselves to ClosedAI. Would be a better name for what the company is doing now.
Caskla t1_jcb6kc8 wrote
Not sure that ignoring OpenAI is really an option at this point.
professorlust t1_jccjn4t wrote
If you can’t replicate their results, then they’re not useful for research
VelveteenAmbush t1_jcd760v wrote
They're purposefully withholding the information you'd need to use their results in research. This proposed research boycott is sort of a "you can't fire me, I quit" response.
professorlust t1_jcddvx5 wrote
Agreed
eposnix t1_jccpohv wrote
How many companies could realistically replicate their results though? We already have a pretty good idea of what's going on under the hood, but would knowing the intricacies of GPT-4 help anyone smaller than Google?
professorlust t1_jcdeiux wrote
The argument from a research perspective is that scale isn’t likely the Holy Grail.
It’s undoubtedly important, yes.
BUT for a researcher, the quest is to determine how important scale truly is AND to find ways to reduce dependence on scale.
BrotherAmazing t1_jcdloe7 wrote
You can still replicate results in private under a non-disclosure agreement or verify/validate results without it getting published to the world though.
I like open research, but research that happens in private can still be useful, and it is reality.
professorlust t1_jce19sb wrote
What researcher is signing an NDA?
That’s literally the opposite of what replication research is supposed to accomplish.
Operating under an NDA is for primary research, not replication
BrotherAmazing t1_jce3zky wrote
I would be happy to sign an NDA if Google allowed me to have access to verify, validate, and run some of their most prized models they keep secret and have not released, and it is incredibly rare for an NDA to last forever.
Also, a lot of research goes on behind closed doors among people who have signed NDAs. They still replicate each other’s work and verify and validate it, they just don’t publish it for you to read.
This thread isn't specifically about "replication research" across the broader international community either, is it? OP did not indicate that, and primary research a company performs and then successfully transitions into a system that empirically outperforms the competition is validation enough; it need not be replicated by competitors. In fact, the whole point is that you don't want anyone to replicate it, but it is still valid, useful research if you bring a product to market that everyone demands and finds useful.
When you work for Google or nearly any company and move away from academia, you don't automatically have the ability to publish everything the company has ever done that you learn about, or everything you do at the company. Are you really under that impression? Have you ever worked in the corporate world??
professorlust t1_jce4rv6 wrote
Check out Arxiv if you think only academic researchers publish.
BrotherAmazing t1_jch1gll wrote
I never said they don’t publish, re-read.
I can tell you firsthand that what they publish has to get approval, and a lot of things do not get approval to publish and are held as trade secrets. It boggles my mind that this sub clearly has so many people who have never worked on the corporate side of this industry and yet have these strong ideas that the corporate side is, or has ever been, fully transparent and allows employees to publish anything and everything. That is so far from the truth it's not funny.
For every model and paper published, there exists another model and many other papers that are not approved to be published; many exist in a different format as internal publications only. Other internal publications get watered down, and a lot of extra work is omitted, in order to get approval to publish. Or they publish "generation 3" to the world while they're working on "generation 5" internally.
VelveteenAmbush t1_jcbu8nr wrote
> While they also potentially don't release every model (see Google's PaLM, LaMDA) or only with non-commercial licenses after request (see Meta's OPT, LLaMA), they are at least very transparent when it comes to ideas, architectures, trainings, and so on.
They do this because they don't ship. If you're a research scientist or ML research engineer, publication is the only way to advance your career at a company like that. Nothing else would ever see the light of day. It's basically a better funded version of academia, because it doesn't seem to be set up to actually create and ship products.
Whereas if you can say "worked at OpenAI from 2018-2023, team of 5 researchers that built GPT-4 architecture" or whatever, that speaks for itself. The products you release and the role you had on the teams that built them are enough to build a resume -- and probably a more valuable resume at that.
the_mighty_skeetadon t1_jccdzgr wrote
Many of the interesting developments in deep learning have in fact made their way into Google + FB products, but those have not been "model-first" products. For example: ranking, personalization, optimization of all kinds, tech infra, energy optimization, and many more are driving almost every Google product and many FB ones as well.
However, this new trend of what I would call "Research Products" -- light layers over a model -- is a different mode of launching with higher risks, and those risks profile differently for Google-scale big tech than they do for OpenAI. Example: ChatGPT would tell you how to cook meth when it first came out, and people loved it. Google got a tiny fact about JWST semi-wrong in one tiny sub-bullet of a Bard example, got widely panned, and lost $100B+ in market value.
VelveteenAmbush t1_jccksp9 wrote
Right, Google's use of this whole field has been limited to optimizing existing products. As far as I know, after all their billions in investment, it hasn't driven the launch of a single new product. And the viscerally exciting stuff -- what we're calling "generative AI" these days -- never saw the light of day from inside Google in any form except arguably Gmail suggested replies and occasional sentence completion suggestions.
> it's a different mode of launching with higher risks, many of which have different risk profiles for Google-scale big tech than it does for OpenAI
This is textbook innovator's dilemma. I largely agree with the summary but think basically the whole job of Google's leadership boils down to two things: (1) keep the good times rolling, but (2) stay nimble and avoid getting disrupted by the next thing. And on the second point, they failed... or at least they're a lot closer to failure than they should be.
> Example: ChatGPT would tell you how to cook meth when it first came out, and people loved it. Google got a tiny fact about JWST semi-wrong in one tiny sub-bullet of a Bard example, got widely panned and lost $100B+ in market value.
Common narrative, but I think the real reason Google's market cap tanked at the Bard announcement comes down to two other things: (1) they showed their hand, and it turns out they don't have a miraculous ChatGPT-killer up their sleeves after all, and (2) the cost structure of LLM-driven search results is much worse than classical search tech, so Google is going to be less profitable in that world.
Tech journalists love to freak out about everything, including LLM hallucinations, bias, toxic output, etc., because tech journalists get paid based on engagement -- but I absolutely don't believe that stuff actually matters, and OpenAI's success is proving it. Google's mistake was putting too much stock in the noise that tech journalists create.
noiseinvacuum t1_jcd2xzt wrote
Completely agree with you on this. This will get much worse IMO, especially with the big investment from Microsoft in OpenAI and the fact that MS is now openly and directly challenging Google. This whole aggressive AI-alpha posturing from Satya Nadella has put Google in a difficult spot; I can't see how Google will continue to justify sharing its research openly to its investors.
Majesticeuphoria t1_jccmti9 wrote
Anthropic is another good org to support for AI safety research: https://www.anthropic.com/research
bert0ld0 t1_jcd5w2f wrote
From all of this I wonder where Apple is. Did they completely miss the boat?
Smallpaul t1_jcdn6lf wrote
What a wasted lead with Siri.
That said, Apple has an even higher reputation for polish and accuracy than Google does. They would need something different from ChatGPT. A lot more curated.
Beatboxamateur t1_jcdm84x wrote
Apple seems to be betting a lot on their upcoming XR projects, which will probably have a lot of AI integrated into the software, similar to Meta's vision. They're hugely hardware-focused, so I don't think they'll ever market some kind of LLM on its own; it'll almost always be built in to support their hardware.
Purplekeyboard t1_jcc7cuo wrote
But without OpenAI, who would have spent the billions of dollars they have burned through creating and then actually giving people access to models like GPT-3 and now GPT-4?
You can use GPT-3, and even versions of GPT-4, today. Or you can stand and look up at the fortress of solitude that is Google's secret mountain lair where models are created and then hoarded forever.
Fidodo t1_jccjaf0 wrote
Lesson is that to be successful you need to actually ship something. You can't stay in research land forever. Who would have thought?
LegacyAngel t1_jcd1idr wrote
>But without OpenAI, who would have spent the billions of dollars they have burned through creating and then actually giving people access to models like GPT-3 and now GPT-4?
Other companies are providing access. OpenAI is just being reckless.
usual disclaimer here
[deleted] OP t1_jcbei6j wrote
[deleted]
ScientiaEtVeritas t1_jcbiupk wrote
It's not only about the model releases but also the research details. With them, others can replicate the results and improve on them, and that might also lead to more commercial products and open-sourced models with less restrictive licenses. In general, AI progress is certainly fastest when everyone shares their findings. On the other hand, by keeping and patenting findings, you actively hinder progress.
RareMajority t1_jcbku9g wrote
Is it a good thing though for companies to be open-sourcing things like their weights? If there's enough knowledge in the open to build powerful un-aligned AIs that seems rather dangerous to me. I definitely don't want anyone to be able to build their own AGI to use for their own reasons.
ComprehensiveBoss815 t1_jcbowt3 wrote
OpenAI isn't even publishing the architecture or training method, let alone the weights. They are in full-on closed mode but have the gall to still ask people to give them free training data.
RareMajority t1_jcbqi56 wrote
I'm not referring to OpenAI here. Meta released the weights to Llama and now anyone can build an AI based on that model for any purpose and without any attempt at alignment. Maybe there's middle ground between the two approaches.
noiseinvacuum t1_jcd41kc wrote
From a research perspective, imo, it's 1000x better to follow the Meta model of releasing code + architecture openly and sharing weights with researchers than to be completely closed and call yourself Open. I understand that there are genuine risks with weights being available to adversaries, but I think it's still better for the progress of the very young field of AI.
wywywywy t1_jcaxwnu wrote
I'm more worried that other big players (Meta, Google, Alibaba, Nvidia, IBM, etc.) will follow suit and start withholding information :(
bartturner t1_jcbhi3u wrote
Exactly. That is why there should be push back on OpenAI behavior.
twilight-actual t1_jcd0wcs wrote
What exactly would that pushback be? Boycott? Post mean things?
About the only thing that could potentially prevent this is if the algorithms that we put into the public domain are protected by a license like the GPL, or something similar.
I haven't been following code releases, so I don't know if that's being done. And to be honest, I doubt most of the information flow is going by code. Rather, it's in the papers.
Is there a way to protect papers with a "GPL"? I honestly doubt it, because at that level we're dealing strictly with ideas. And the only way to protect an idea is to patent it.
Perhaps the community, as a whole, should start patenting all their ideas, and then assigning the patents to a public trust that ensures that any derivative technology is published freely, too, under the same patent type.
VelveteenAmbush t1_jcd6opg wrote
You could patent your algorithm and offer some sort of GPL-like patent license, but no one respects software patents anyway (for good reason IMO) and you'd be viewed as a patent troll if you tried to sue to enforce it.
GPL itself is a copyright license and does you no good if OpenAI is using your ideas but not your code. (Plus you'd actually want AGPL to force code release for an API-gated service, but that's a separate issue.)
Smallpaul t1_jcdnffe wrote
Software patents assigned to a public trust are a different idea than randomly suing people.
It might be set up to only sue companies that are not open.
VelveteenAmbush t1_jcdxc8v wrote
Maybe you're onto something.
I guess the trick is coming up with foundational patents that can't be traced back to a large tech company that would worry about being countersued. Like if you make these inventions at Google and then Google contributes them to the GPL-esque patent enforcer entity, and then that entity starts suing other tech co's, you can bet that those tech co's will start asserting their patents against Google, and Google (anticipating that) likely wouldn't be willing to contribute the patents in the first place.
Also patent litigation is really expensive, and you have to prove damages.
But maybe I'm just reaching to find problems at this point. It's not a crazy idea.
twilight-actual t1_jce1tou wrote
The cat's kinda out of the bag at this point. But a non-profit public trust that acted as a patent-store to enforce the public dissemination of any derivative works based on the ideas maintained by the patent-store could make a huge difference ten, twenty years down the road. It would need an initial endowment to get started, retain a lawyer or two to manage it.
And then publicize the hell out of it, evangelize the foundation over every college campus with a CS department. When students have established a new state of the art with ML, they can toss the design to the foundation in addition to arxiv and wherever else they might publish.
Smallpaul t1_jce114b wrote
Just to be clear, I was just elaborating on /u/twilight-actual’s idea.
twilight-actual t1_jcdqi1j wrote
You got it.
bartturner t1_jcetimy wrote
> Post mean things?
Not the terminology I would choose. But yes post things that they should not be doing this. Public opinion is a very, very powerful tool to get people to behave.
1998marcom t1_jcf3er8 wrote
Detail note: to the best of my knowledge, given what OpenAI is doing right now with their software, they could very well be using GPL code in their stack without violating any of the GPL clauses. A stricter license such as AGPL would, I guess, be needed so that the covered usage includes not only shipping software to the customer but also merely running it as a service.
Single_Ad_2188 t1_jcer0ba wrote
>It seems like the days for open research in AI are gone.
If Google had not released the paper "Attention Is All You Need", then GPT would not have been possible to create.
bartturner t1_jcet1m2 wrote
Exactly. I do not like that OpenAI looks to be changing the culture of sharing.
existential_one t1_jcgtu4h wrote
I think the culture of publishing has been dying and people will think OpenAI was the one to trigger it, but in reality other companies already started restricting publications. Deepmind being the biggest one.
bartturner t1_jcgu47z wrote
Love how much DeepMind shares with the papers. Same with Google Brain.
To me the issue is OpenAI. What makes it worse is they use breakthroughs from DeepMind, Google Brain and others and then do not share.
We call them filchers.
existential_one t1_jcgur9j wrote
I agree, but what I'm saying is that Deepmind is gonna stop publishing their good stuff. And it's not because of OpenAI.
IMO ML research papers weren't profitable before; companies benefited from the collective effort, plus publishing helped retain talent. But now we're seeing ML models have a huge impact on companies, and single incremental papers can actually improve the bottom line, so all companies are going to start closing their doors.
bartturner t1_jcgvl8h wrote
> I agree, but what I'm saying is that Deepmind is gonna stop publishing their good stuff. And it's not because of OpenAI.
I do not believe that will happen. But the behavior of OpenAI does not help.
But Google has been more of a leader than a follower so hopefully the crappy behavior by OpenAI does not change anything.
I think the sharing of the research papers was done for a variety of reasons.
First, I fully agree about keeping and retaining talent, which Google understood before others would be critical. That's why they were able to get DeepMind for $500 million, and it would easily be 20x that today.
But the other reason is data. Nobody has more data than Google and also access to more data.
Google has the most popular website in history, and the second most popular in addition. They also have the most popular operating system in history.
So if everyone had access to the same models it still keeps Google in a better position.
But the other reason is Google touches more people than any other company by a wide margin. Google now has 10 different services with over a billion daily active users.
Then the last reason is their hope that no one else gets something they cannot get. I believe Google's goal from day 1 has always been AGI. That is what search has been about since pretty much day 1.
They worry that someone will figure it out in some basement somewhere. Very unlikely. But possible. If they can help drive a culture of sharing then it is far less likely to happen.
sweatierorc t1_jcbg3ki wrote
But if they stop publishing, it will hurt adoption. SD 1.5 has become the benchmark txt2img model over DALL-E 2 or more recent SD models.
Another thing to consider is that not publishing will hurt recruitment. Character.ai's founders left Google to build their own company after working on LaMDA.
BrotherAmazing t1_jcdhklt wrote
All of these companies publish some things, they keep other things trade secrets, patent other things, and so on. Each decision is a business decision.
This thread is baffling to me because so many people seem to have this idea that, at one time, AI/ML or any tech companies were completely “open” and published everything of any value. This is nowhere close to reality.
justprotein t1_jceadxu wrote
No one said they were completely open, but there was a tradition of releasing papers, ideas, architectures, etc., which really helped the field and is now at risk because a set of people leveraged all this and wants everyone to regret being "open" with their research. I think the "Open" in OpenAI is trolling.
BrotherAmazing t1_jch3dkl wrote
I agree with your sentiment and have no problem with that.
There just seem to be more than one or two people here with the idea that corporate entities have generally been publishing a higher % of their R&D than they actually ever did. Some people (not saying you personally) seem to go farther and believe it is companies' duty to publish important IP and research.
I like them publishing and think it’s great, but just believe they never have a “duty” to do so if they don’t want to and have seen companies that “publish” behind the scenes hold a lot back too.
elehman839 t1_jcdjbmg wrote
Researchers and engineers seem to be moving from one organization to another pretty rapidly right now. Hopefully, that undermines efforts to keep technology proprietary.
BrotherAmazing t1_jcdh35r wrote
These companies already have massive amounts of trade secrets they withhold. They all do, and they have lawyers.
[deleted] OP t1_jcd9gt8 wrote
[deleted]
MysteryInc152 t1_jca93qy wrote
I don't think patent battles will go anywhere. DeepMind could simply stop releasing papers (or curtail it significantly) like they've already hinted they might do.
VelveteenAmbush t1_jcbw6mx wrote
DeepMind's leaders would love to hoard their secrets. The reason they don't is that it would make them a dead end for the careers of their research scientists -- because aside from the occasional public spectacle (AlphaGo vs. Lee Sedol) nothing would ever see the light of day. If they stopped publishing, they'd hemorrhage talent and die.
OpenAI doesn't have this dilemma because they actually commercialize their cutting-edge research. Commercializing its research makes its capabilities apparent to everyone, and being involved in its creation advances your career even without a paper on Arxiv.
sobe86 t1_jccmvfj wrote
This is hearsay, but my understanding was that Hassabis' goal was for Deepmind to be winning one Nobel Prize per year or something like this, so I don't think he's personally up for the closed research model.
VelveteenAmbush t1_jcd6bkq wrote
I think Hassabis' goal is to build a synthetic god and reshape the cosmos, and open research isn't necessarily conducive to that except as needed to keep researchers motivated and engaged.
MysteryInc152 t1_jcbwooc wrote
I agree there's a limit to how much they can withhold without releasing anything at all.
Hyper1on t1_jck7qjx wrote
DM already hoards its secrets; there are successful projects there which are not published. What they show you is what they decide needs to be public to get good PR.
NoScallion2450 t1_jca9r94 wrote
maybe but what about this? https://www.cnbc.com/2019/04/18/apple-paid-5-billion-to-6-billion-to-settle-with-qualcomm-ubs.html
BeautyInUgly t1_jcb88b2 wrote
peanuts compared to a patent war between Google / MSFT
oathbreakerkeeper t1_jcee0rj wrote
Where/when did they hint that?
xEdwin23x t1_jca7kxx wrote
Google has probably used stuff from OpenAI too (decoder-only GPT-style training, or CLIP, diffusion, or DALL-E ideas, maybe?). Anyway, it's clear they (and probably every large tech company with a big AI team) are in an arms race at this point. It's definitely not a coincidence that Google and OpenAI / Microsoft released on the same day, and we also heard Baidu is releasing sometime these days. Meta and others will probably follow suit. The publicity (and the market share for these new technologies) is worth too much.
NoScallion2450 t1_jca7ykx wrote
Not saying Google is better or OpenAI is better. But could they now be engaging in patent battles, given that there is significant commercial interest at stake? And what does OpenAI not releasing any details mean for AI research going forward?
xEdwin23x t1_jca8d58 wrote
It's probably not in their interest as they know they both will end up worse if they decide to follow that path.
Snoo-64902 t1_jcahhn6 wrote
They may be worse off, but the world will be better off.
NoScallion2450 t1_jca8qxo wrote
Why do you say so? For Google it's probably peanuts in terms of cost. And there is a clear case for them to make that transformers originated with them.
xEdwin23x t1_jca9kr0 wrote
OpenAI is not a small company either. It may be a "startup", but it's clearly backed by Microsoft, and between those two there are probably quite a lot of patents that Google has used in the past too.
NoScallion2450 t1_jcaci1w wrote
Well, that depends on whether OpenAI can prove Google is deriving commercial value from OpenAI's patented research. On the other hand, for OpenAI, I can see a clear case of using ideas from other labs (Google -- Attention Is All You Need).
But just to clarify, I am not on either side. Definitely a bit sad for AI research going forward. But I would be interested in knowing how the landscape changes.
MrTacobeans t1_jcagwir wrote
I don't know anything about this side of AI, but when it's boiled down to fancy algorithms, can those be patented?
Maybe that's the driving force of the "open" nature of AI. An algorithm can't be patented, but a product based on it can be. Kinda like how LLaMA has a non-commercial license, but if a community rebuilt it under a permissive license, that'd be totally kosher.
This may be why OpenAI is being hush about their innovations: if they're published, someone else can copy them without the legal woes.
The_frozen_one t1_jcb4t0y wrote
> Well that depends on whether OpenAI can prove Google is deriving commerical value from OpenAI's patented research.
That's not an issue, people make money due to patented technologies all the time. That's different from infringing on a patent. Either way, it would be an incredibly messy battle. Google invented the T in GPT, I can't imagine Google doesn't have a deep AI patent portfolio.
Kenyth t1_jcbc5p5 wrote
Baidu is set to announce its latest ChatGPT counterpart tomorrow Beijing time.
iJeff t1_jcd94db wrote
Do you happen to have any links to follow it?
utopiah t1_jcjoj65 wrote
in case you didn't follow https://www.reuters.com/technology/chinese-search-giant-baidu-introduces-ernie-bot-2023-03-16/ but nothing open source AFAICT.
iJeff t1_jck5f7y wrote
Thanks!
isthataprogenjii t1_jcfpxap wrote
lol
Jadien t1_jcafyim wrote
The large tech companies largely build their patent portfolios for defensive purposes. Two companies with big portfolios are mutually assured destruction should they start attempting to enforce them against one another.
BrotherAmazing t1_jcdmuqt wrote
That’s not true. If I am not infringing on any of your patents and you are clearly infringing on one or more of mine, cease and desist or lawsuit incoming and no “mutually assured destruction”.
Jadien t1_jce8fcr wrote
The idea is that at Google/Meta/Microsoft scale, the companies and their patent portfolios are so sprawling in what they do and cover that it is improbable that there aren't multiple infringements on both sides. It is in fact impossible to determine how much infringement your company is committing, because it is infeasible to even enumerate everything your company is doing, much less ensure that there is no intersection with a given patent portfolio. So it's a fair assumption that multiple infringements exist in both directions.
BrotherAmazing t1_jch23xt wrote
In the real-world cases I have been involved in (granted, it was only four), things did not at all play out that way. Once it went to court but the defendant settled on terms favorable to the plaintiff, once the defendant complied with the cease and desist before the lawsuit was initiated, and the other two times it actually went to trial and wasn't settled (which I'm told is rare), with the plaintiffs winning once and the defendants winning once.
What you say really is not true because once you win or lose in court, it cannot be tried again and it’s a settled matter, and that process indeed does legally settle whether there is infringement or not. No one sits around after the verdict is read and scratches their head, wondering whether they are infringing or not.
NoScallion2450 t1_jcagotv wrote
Well that is what I used to think as well about AI research. But the question is will that trend continue or change like in other fields. (https://www.cnbc.com/2019/04/18/apple-paid-5-billion-to-6-billion-to-settle-with-qualcomm-ubs.html)
bartturner t1_jcbgzb3 wrote
It is a pretty scummy move. They would not have been able to create ChatGPT without Google's breakthrough with transformers.
Luckily Google let them use it instead of protecting it.
That is how we keep moving everything forward.
anomhali t1_jcahd1f wrote
Open, my ass
tonsofmiso t1_jcaocla wrote
( ͡° ͜ʖ ͡°)
Quazar_omega t1_jcbzyzy wrote
Saved by a comma
Competitive_Dog_6639 t1_jcbhoi1 wrote
NLP researchers are breathing a massive sigh of relief, because if GPT-4 is unpublished they don't need to include it in benchmarks for their new papers 😆
LanchestersLaw t1_jcf41pg wrote
Bing Chat runs on GPT-4, and a full version with multimodality is available as a research preview.
FinancialElephant t1_jcbh9it wrote
I don't like that they're calling themselves OpenAI when they aren't open.
AGI_69 t1_jcccvea wrote
There is a Chrome extension to rename all "OpenAI" mentions to "ClosedAI". It's open source.
Chuyito t1_jcbu40y wrote
1. We are about to see a new push for a "robots.txt" equivalent for training data, e.g. if Yelp had a "datarules.txt" file indicating no training on its comments for private use. The idea being that you could specify a license which allows training on your data for open source, but not for profit (a purely hypothetical sketch follows after this list). The benefit for Yelp is similar to the original Netflix training data set we all used at some point.
2. It's going to create a massive push for open frameworks. I can see Nvidia going down the path of "appliances", similar to what IBM and many tech companies did for servers with pre-installed software. Many of those were open-source software, configured and ready to use/tune for your app, for when you want to adjust the weights on certain bias filters but not write the model from scratch. Having an in-house instance of your "assistant" will be preferable for many (e.g. if you are doing research on biofuels, ChatGPT will censor way too much in trying to push "green", and lose track of research in favor of policy).
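To make the robots.txt analogy concrete, here is a purely hypothetical sketch of what such a file might look like. The filename "datarules.txt" and every directive in it are invented for illustration; no such standard exists:

```
# datarules.txt (hypothetical, modeled on robots.txt conventions)
User-agent: *
Disallow-training: commercial      # no for-profit model training on this site's data
Allow-training: open-source        # permitted if the resulting model is open source

User-agent: academic-crawler
Allow-training: all                # e.g. grant research groups full access
```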
thecity2 t1_jcc8549 wrote
Yes to point 1! Not enough people are talking about this aspect. The data wars are on imo. How will Google protect their mountains of video data for example.
lapurita t1_jcf4fs6 wrote
Ugh I'm not psyched about this at all, it will just protect the big companies from competitors and result in worse products for everyone
-Rizhiy- t1_jcblvqs wrote
This is a moot point. Most companies use AI research without contributing back, that is what being a business generally is, nothing new here.
They just need to admit that they are a business now and want to earn money for their own benefit, rather than "benefit all of humanity". Changing the name would be a good idea too)
thecity2 t1_jcc80ex wrote
They already said they are for profit entirely now.
omniron t1_jcbvup9 wrote
All research gets used for productive entrepreneurial purposes. OpenAI is just kind of sad because they started with the mission of being open, literally in their name, and are now going in the opposite direction.
Google will eat their lunch though. Google has the world's largest collection of video, and that's the final frontier for large transformer-network AI.
canopey t1_jcbb1cu wrote
Is there an article or paper where I can read more about this "sudden" pivot from open to private research?
DigThatData t1_jcbdl00 wrote
it started with the GPT-2 non-release for "safety reasons"
[deleted] OP t1_jcdbcc8 wrote
[deleted]
Necessary-Meringue-1 t1_jccim91 wrote
With the ever increasing cost of training LLMs, I feel like we're entering a new phase in AI. Away from open science, back to aggressively protecting IP and business interests.
Microsoft, via OpenAI, is taking big steps in that direction. We'll see if others follow suit. I hope not, but I think they will.
[deleted] OP t1_jcdvsjo wrote
[deleted]
I_will_delete_myself t1_jcb193m wrote
Honestly this is like the movie scene where the hero becomes the villain they once pledged to fight. People have been pulling out because of the direction they are going.
pm_me_your_pay_slips t1_jcatwi5 wrote
They don't release that information because they don't want to lose their competitive advantage to other companies. It's a race towards AGI/transformative AI. It could also be a race for resources: e.g., convincing the US government to concentrate its funding on the leading AI project alone. This means any release of details may come only when OpenAI knows that training for the next generation of models is running without problems.
This is likely based on the idea that newer models can be used to design/build/train the next generation of models, leading to an exponential amplification of capabilities over time that makes any lead time over the competition a decisive factor.
underPanther t1_jcbf1l8 wrote
Firstly, I don't see it as research if it's not published. It's a commercial product if they don't share it and profit from it. If you can reimplement it and publish it, it's yours for the taking.
Secondly, there's so much interesting work outside of large language models.
I don't care too much about what OpenAI get up to. They have a management team trying to become billionaires. That's fine. I'm happy doing science in my living room. Different priorities.
VinnyVeritas t1_jcdelqr wrote
Ultimately AI will become a big boys club, where big corporate will hold all the cards.
OpenAI just made the first leap towards that dystopian near future.
KingsmanVince t1_jcaf167 wrote
Sorry for my lack of knowledge, but what do you mean by patents? Which things are the patents applied to? The model's weights? The model's source code? The model's theory (white papers)?
Researchers reuse others' ideas and rethink others' work all the time. So if people want to go against each other, they can just stop releasing white papers.
OptimizedGarbage t1_jcahlxb wrote
Google has patents on a lot of common deep learning methods, most notably dropout. They just don't enforce them (for now).
satireplusplus t1_jcbq2ik wrote
> most notably dropout.
Probably unenforceable, and math shouldn't be patentable. Might as well try to patent matrix multiplications (I'm sure someone tried). Also, dropout isn't even complex math. It's an elementwise multiplication with randomized 1's and 0's; that's all it is.
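To show just how little math is at stake, here is a minimal NumPy sketch of (inverted) dropout -- a random binary mask and an elementwise multiply. This is an illustrative toy, not the patented formulation:

```python
import numpy as np

def dropout(x, p=0.5, training=True):
    # Zero each element with probability p during training, and scale the
    # survivors by 1/(1-p) so the expected activation is unchanged.
    if not training:
        return x
    mask = (np.random.rand(*x.shape) >= p).astype(x.dtype)
    return x * mask / (1.0 - p)

x = np.ones((2, 4), dtype=np.float32)
print(dropout(x))  # roughly half the entries are 0, the rest are 2.0
```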
[deleted] OP t1_jcc8123 wrote
[deleted]
impossiblefork t1_jccknnx wrote
There are workarounds though.
Dropconnect isn't patent-encumbered (it degrades feature detectors/neurons by dropping connections instead of disabling them) and is, I think, better than dropout. A rough sketch of the difference is below.
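For contrast with the dropout snippet above, a minimal sketch of a DropConnect-style linear layer at training time: individual weights are zeroed rather than whole activations. The 1/(1-p) rescaling is a simplifying assumption borrowed from inverted dropout; the original formulation treats inference differently:

```python
import numpy as np

def dropconnect_linear(x, W, b, p=0.5):
    # DropConnect: randomly zero individual *weights* on each forward
    # pass, instead of zeroing whole activations as dropout does.
    mask = (np.random.rand(*W.shape) >= p).astype(W.dtype)
    return x @ (W * mask) / (1.0 - p) + b

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8))
W = rng.normal(size=(8, 4))
b = np.zeros(4)
print(dropconnect_linear(x, W, b).shape)  # (3, 4)
```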
Similarly, with transformers, Google has a patent on encoder-decoder architectures, so everyone uses decoder-only architectures, etc.
Some companies are probably going to patent critical AI/ML things, but that hasn't really happened yet, and I don't believe any patent-encumbered method is currently either critical or optimal.
Deep-Station-1746 t1_jcamy6n wrote
Patenting dropout feels a lot like NFTs - it's useless. So why bother?
Edit:
What I don't understand is how anyone can prove that someone is multiplying matrices together in some particular way, as long as they don't admit to it themselves.
That's like patenting a thought. If you think about a particular patented pair of pants™, can you be sued for propagating a patented neural activity through your bio network? It's absurd.
OptimizedGarbage t1_jcazllh wrote
You can sue people who use it for millions of dollars and drive them out of business. Which is exactly how Google uses most of its other patents, as a club to beat competitors with.
bartturner t1_jcbi0ry wrote
> Which is exactly how Google uses most of its other patents, as a club to beat competitors with.
That is ridiculous. Where has Google gone after anyone? They do it purely for defensive purposes.
DigThatData t1_jcbdeka wrote
i don't see the analogy here, i'm wondering if maybe you're misunderstanding: they have a patent over the technique. not "a dropout", all dropout.
[deleted] OP t1_jcbd87p wrote
[deleted]
sam__izdat t1_jccyxl4 wrote
As a spectator, it's the standard story that's played out a million times now. I see ML as pre-scientific. If capital is allowed to take full control and call all the shots, it's not moving past that "pre" any time soon. It'll be a digital Taylorist discipline for PR industry surveillance and optimizing Amazon packers' pee breaks, and the brief flurry of actually useful progress is probably done.
MrPineApples420 t1_jcda2ix wrote
Can we even call it OpenAI anymore? That's literally false advertising…
LanchestersLaw t1_jcf5x9c wrote
I think the most similar historical example is the Human Genome Project, where the government and private industry were both racing to be the first to fully decode the human genome, but the US government was releasing its data, and industry could use it to get even further ahead.
It's the classic prisoner's dilemma (a toy payoff matrix is sketched below). If both parties are secretive, research is much slower and might never complete: a small probability of finishing the project first, for a high private reward to the owner and a low reward for society. If one party shares and the other does not, the withholding party gets a huge comparative boost: a high probability of a high private reward. If both parties share, we have the best case: parties can split the work and share insights so less time is wasted, for a very high probability of a high private and high public reward.
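Spelled out as a toy payoff matrix (a minimal sketch; the numbers are purely illustrative, chosen only to show the dilemma's structure):

```python
# Payoffs as (lab A reward, lab B reward); higher is better for that lab.
payoffs = {
    ("share",    "share"):    (3, 3),  # work is split, insights flow: best overall
    ("share",    "withhold"): (0, 5),  # the withholder free-rides on open research
    ("withhold", "share"):    (5, 0),
    ("withhold", "withhold"): (1, 1),  # slow, duplicated, secretive research
}

for (a, b), (ra, rb) in payoffs.items():
    print(f"A {a:8s} / B {b:8s} -> A gets {ra}, B gets {rb}")
```

With these numbers, withholding is the dominant strategy for each lab individually, even though both sharing beats both withholding.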
I think for AI we need mutual cooperation, and to stop seeing ourselves as rivals. The rewards of AI cannot be privatized, for the shared mutual good of humanity in general ("humanity" regrettably does include Google and the spider piloting Zuckerberg's body). A mutually beneficial agreement with enforceable punishment for contract breakers is what we need to defuse tensions, not an escalation of tensions.
Outside_Donkey2532 t1_jcbw0z2 wrote
human greed
thecity2 t1_jcc8crf wrote
OpenAI is very doublespeak.
super_deap t1_jcdz573 wrote
RIP 💀 scientific progress for "the entire humanity" for the profits of a few. :(
The only way forward is if we as a collective AI community systematically fight against this type of censorship, or we might end up in an AI-dominated Orwellian world.
Ironic that I had read Life 3.0 by Max Tegmark, where he was one of the guys raising concerns about the future of AI and trying to build an organization called 'OpenAI' for the benefit of mankind.
TooManyDangPeople t1_jcibdsy wrote
There are so many high-paying jobs for ML researchers; just don't work for them. Don't support their company in any way, and support the competition.
[deleted] OP t1_jcaexq4 wrote
[deleted]
[deleted] OP t1_jcbfo3n wrote
[deleted]
Uptown-Dog t1_jcboymu wrote
I think that patents over IP relating to software or math (and several other fields) are evil, evil, evil. If we're using them to do anything we're doing things wrong.
jrejectj t1_jcbqgsn wrote
I'm not in this field, but will OpenAI become like Windows, which is largely adopted by normal users, while the others become like Linux, in terms of operating systems?
[deleted] OP t1_jcbqkhu wrote
[deleted]
TooManyLangs t1_jcbsew4 wrote
I imagine people ditching closedAI and Microsoft in a few months and starting to use alternatives instead (Google, open source, others). I don't use Bing or BingGPT, and I still use a chatbot every day, so...
[deleted] OP t1_jcc1d8w wrote
[removed]
[deleted] OP t1_jcc3bg1 wrote
[deleted]
akaTrickster t1_jcc3stp wrote
Yeah
minhrongcon2000 t1_jcdi9jh wrote
Firstly, since OpenAI has already released such a good chatbot, there is no point in enforcing patents for Google's and Meta's chatbots, since a patent requires you to publish your work for other parties to verify that it doesn't overlap with existing patents. Secondly, it's too late for Google to patent now, since the technology is already widely used :D
bartturner t1_jcgy1re wrote
That is NOT how patent law works. Maybe you are confusing with trademark?
MaximusPrimus01 t1_jcec11t wrote
They can enforce patents all they want. The true power of AI comes from it being open source. Community > corporations.
pyonsu2 t1_jcen9vb wrote
Proving this is possible is already valuable.
Soon-ish, open source communities will figure it out and build something even "better".
serge_cell t1_jcenssl wrote
Let's fight the fire with gasoline.
SvenAG t1_jceuwcf wrote
I don't think that we are generally in the era of closed research now - there are still many companies sharing their research, ideas and concepts. If you are interested, we are trying to build an open alternative currently: https://opengpt-x.de/en/
But we are still in early stages
[deleted] OP t1_jciywm5 wrote
[removed]
[deleted] OP t1_jcjrtf0 wrote
[removed]
Eaklony t1_jcbfrk2 wrote
There is a reason why the advanced countries in the world are capitalist countries. Money is the mechanism that controls the long-term advancement of pretty much anything in our society. I don't think anybody should be surprised that AI development can't stay fully open-sourced forever. Maybe some day it will, but certainly not before some major (global) social reformation.
bartturner t1_jcbh70h wrote
But it has been up to this point. ChatGPT is based on a technology breakthrough by Google.
There should be strong pushback on OpenAI's behavior. Otherwise we might end up with Google and others not sharing their incredible breakthroughs.
Eaklony t1_jcbkqgk wrote
That's not how capitalism works. To produce ChatGPT they needed a lot of money for a huge GPU farm, which had to be invested by people who expect profits from it. If we wanted everything to be open-sourced, then ChatGPT as it is now probably wouldn't be possible at all. But anyway, I think basic theoretical breakthroughs like a new architecture for AI will still be shared among academia, since those aren't directly related to money. Hopefully it would just be the detailed implementations of actual products that aren't open source.
Nhabls t1_jcbmn7g wrote
> If we want everything to be open sourced then chatgpt as it is now probably wouldn't be possible at all
All of the technology concepts behind ChatGPT are openly accessible and have been for the past decade, as was the work before; a lot of it came from big tech companies that work for profit. The profit motive is not an excuse, only unprecedented greed in the space.
Though it comes to no surprise from the company that thinks it can just take any copyrighted data from the internet without any permission while at the same time forbid others from training models from data they get from the company's products. It's just sleaziness at every level.
>But anyway I think basic theoretical breakthroughs like a new architecture for AI will still be shared among academia since those aren't directly related to money
This is exactly what hasn't happened: they refused outright to share any architectural details; no one was expecting the weights or even the code. This is what people are upset about, and rightly so.
bartturner t1_jcbmg57 wrote
> That's not how capitalism works.
Totally get that it makes no business sense that Google gives away so much stuff. Look at Android. They let Amazon use it for all their stuff.
But I still love it. I wish more companies rolled like Google. They feel like lifting all boats also lifts theirs.
Google, having been the AI leader for the last decade plus, has set a way of doing things.
OpenAI is not doing the same and that really sucks. I hope the others will not follow the approach by OpenAI and instead continue to roll like they have.
ComprehensiveBoss815 t1_jcbpqhb wrote
Then OpenAI should change their name to CapitalismAI and let a open source team of volunteers use the domain and project name.
VelveteenAmbush t1_jcbv79q wrote
The fact that they make their stuff available commercially via API is enough to make them 100x more "open" than the big tech companies.
master3243 t1_jccfe2n wrote
As a person that heavily relies on both CLIP (released 2 years ago) and Whisper (released just 6 months ago) in his research, I would disagree with the claim that "open research in AI [is] gone".
In addition, I've needed to run the usual benchmarking to compare my own work with several other models and was quite surprised when I was able to run my full benchmark on GPT-3 solely using the free credit provided by them.
Don't get me wrong, I criticize OpenAI for not completely sticking to the mission they built their foundation on (I mean, it's literally in the name FFS), but I wouldn't say they have completely closed off research from the public.
FigureClassic6675 t1_jce6d2r wrote
OpenAI is a research organization that has made significant contributions to the field of artificial intelligence. While the organization has not always released its research findings publicly, it has also collaborated with other research institutions and made some of its research open-source.
Regarding the issue of OpenAI benefiting from others' research, it is important to note that all research builds upon previous work in the field. OpenAI researchers are likely to have cited and built upon the work of others in their research, just as other researchers have likely cited and built upon OpenAI's work.
As for the question of whether Google/Meta should enforce their patents against OpenAI, that is ultimately a decision for them to make based on their own business interests and legal considerations. It is worth noting that many technology companies engage in patent litigation as a means of protecting their intellectual property and asserting their market position, but this is a complex and contentious issue with many different perspectives and implications. Ultimately, the best outcome would be for all parties involved to find a way to collaborate and share knowledge in a way that benefits everyone in the field of AI research.
[deleted] OP t1_jca9un1 wrote
[removed]
mrfreeman93 t1_jcbgwp7 wrote
I mean LLaMA was apparently trained on outputs from davinci-003 from OpenAI... the rule is whatever works
Nhabls t1_jcbnr3g wrote
That's Alpaca, a fine-tune of LLaMA, and you're just pointing to another of OpenAI's shameless behaviours. Alpaca couldn't be commercial because OpenAI thinks it can forbid using outputs from its models to train competing models. Meanwhile they also argue that they can take any and all copyrighted data from the internet with no permission or compensation needed.
They think they can have it both ways, at this point i'm 100% rooting for them to get screwed as hard as possible in court on their contradiction
crt09 t1_jcbv608 wrote
> Alpaca couldn't be commercial because openai thinks it can forbid usage of outputs from their model to train competing models.
I dont think they claimed this anywhere? It seems that the only reason for Alpaca not releasing weights is Meta's policy for releasing Llama weights.
https://crfm.stanford.edu/2023/03/13/alpaca.html
> We have reached out to Meta to obtain guidance on releasing the Alpaca model weights, both for the 7B Alpaca and for fine-tuned versions of the larger LLaMA models.
Plus they already released the data they got from the GPT API, so anyone who has Llama 7B; an ability to implement the finetuning code in Alpaca; and 100 bucks can replicate it.
(EDIT: they released the code. now all you need is a willingness to torrent Llama 7B and 100 bucks)
Nhabls t1_jcc2tg0 wrote
It's written right after that
>Second, the instruction data is based on OpenAI’s text-davinci-003, whose terms of use prohibit developing models that compete with OpenAI
HyperModerate t1_jcd0lnn wrote
The way AI is used to launder copyright and licensing is concerning. Copyrighted data is used to train a model. The model’s output, now also licensed, is used to finetune a second model, also separately licensed. Finally, this highly licensed model is considered for public release.
The attitude is basically the same as pirating, but there is no similar legal precedent.
To be clear, I think AI research should be open.
EnjoyableGamer t1_jcbe3b3 wrote
There is not much to release from OpenAI, just a big model with big data on existing methods. Google DeepMind did go that route of secrecy with AlphaGo; if anything, the easy access for anyone to try is cool and new.
In the long run it's their mistake, as research never stops. It won't build on GPT-4 but on other alternatives that I'm sure will come in the next months.
Nhabls t1_jcbm504 wrote
> Google Deepmind did go that route of secrecy with AlphaGo
AlphaGo had a proper paper released, what are you talking about?
This action by OpenAI to completely refuse to share their procedure for training GPT-4 very much breaks precedent and is horrible for the field as a whole. It shouldn't be glossed over
VelveteenAmbush t1_jcbv0rs wrote
GPT-4 is an actual commercial product though. AlphaGo was just a research project. No sane company is going to treat the proprietary technological innovations at the core of their commercial strategy as an intellectual commons. It's like asking them to give away the keys to the kingdom.
Nhabls t1_jcc30fs wrote
The original transformers (i.e., the foundational model architecture all GPTs are based on) were also commercial products (they're used for search, summarization, translation, etc.); we got them and the paper all the same.
[deleted] OP t1_jcc3o14 wrote
[deleted]
VelveteenAmbush t1_jcc4mvf wrote
Transformers aren't products, they're technology. Search, Maps, Ads, Translation, etc. -- those were the products. Those products had their own business models and competitive moats that had nothing to do with the technical details of the transformer.
Whereas GPT-4 is the product. Access to it is what OpenAI is selling, and its proprietary technology is the only thing that prevents others from commoditizing it. They'd be crazy to open up those secrets.
Nhabls t1_jccgj1w wrote
This a very silly semantic game that i have no interest in engaging with
VelveteenAmbush t1_jccizz1 wrote
It has nothing to do with semantics, it's basic corporate strategy.
ComprehensiveBoss815 t1_jcbpi7w wrote
I read the paper on AlphaGo, and I felt it had enough technical detail for me to reproduce it.
geeky_username t1_jcbh0mf wrote
OpenAI still has custom training methods, and whatever other tweaks they've made to the model
UFO_101 t1_jcbkyo1 wrote
Not releasing research is fantastic. We get slightly longer to figure out how prevent AGI from killing everyone.
BrotherAmazing t1_jcdgtka wrote
I don’t understand what OP is worried or complaining about. Every business can choose whether they wish to publish or release IP or withhold it and keep it as a trade secret. That is a business decision.
You are allowed to “benefit from” information other companies publish so long as you don’t break any laws.
OP implies OpenAI is infringing on patents and Google or Meta should enforce their patents and make OpenAI pay royalties, cease and desist, or face legal consequences. What patents is OpenAI infringing on? I have an INCREDIBLY hard time believing Google or Meta wouldn’t go after someone who was infringing on their patents if they became aware of it.
gwern t1_jcb6nhe wrote
> I feel it is fair for others to enforce their patents
There are millions upon millions of companies and orgs out there that release less research, and are more parasitic, than OA, many of whom are also making a lot more profits, if that's the problem. Why don't you go after them first, hypocrites? Why do you hate OA for being so much better than the rest?
ComprehensiveBoss815 t1_jcbpv2t wrote
Whataboutism in full effect.
NoScallion2450 t1_jcbc23l wrote
Discussing a topic does not mean someone is going after someone.
ScientiaEtVeritas t1_jcahkze wrote
I think we should value much more what Meta & Google are doing. While they also potentially don't release every model (see Google's PaLM, LaMDA) or only with non-commercial licenses after request (see Meta's OPT, LLaMA), they are at least very transparent when it comes to ideas, architectures, trainings, and so on.
OpenAI itself changed a lot from being open to being closed but what's worse is that OpenAI could be the reason that the whole culture around AI research changes as well, which is sad and pretty ironic when we consider its name. That's why I'm generally not very supportive of OpenAI. So, as a research community, we should largely ignore OpenAI -- in fact, they proactively opted out of it, and instead let's value and amplify open research from Meta, Google, Huggingface, Stability AI, real non-profits (e.g., EleutherAI), and universities. We need counterbalance now.