Just because something is free to access doesn't mean you have the right to do whatever you want with it, especially with regard to making derivative works without attribution or otherwise breaking license terms. This is what licenses and copyrights are for.

For example, if OpenAI scraped a code repository that uses a Creative Commons NonCommercial license and is using that code for monetary gain without the owner's consent, they're breaking that license. It'd have to be argued whether OpenAI using that code to train models that may generate similar-looking code counts as distributing the source, and whether having a user use that model under a paid service counts as a commercial violation of those terms.

The algorithm is IP, yes. But GPT-X is part model, part training data.

Next_Boysenberry1414 t1_jeao2qn wrote on March 30, 2023 at 5:11 PM

#2,491,890

ShareGPT open web resource...

So this "open web" thing is just a name? WTF does open mean?

khamelean t1_jeaogv3 wrote on March 30, 2023 at 5:14 PM

#2,491,972

Replying to FrowntownPitt (#2,491,805)

Just because something is free to access doesn’t mean you are allowed to remember it or learn from it in any way!!

Imthewienerdog t1_jeaoigh wrote on March 30, 2023 at 5:14 PM

#2,491,985

Replying to DorkRockGalactic (#2,488,515)

But that's not what this post is about at all? What?

YummyMummy2024 t1_jeaoukd wrote on March 30, 2023 at 5:16 PM

#2,492,062

Replying to FrowntownPitt (#2,491,805)

No doubt those licenses were ignored, but without evidence how do you make that copyright claim? Without evidence does that make it derivative? What do you think?

FrowntownPitt t1_jeaq30n wrote on March 30, 2023 at 5:24 PM

#2,492,337

Replying to YummyMummy2024 (#2,492,062)

I mean yeah I agree, enforcing something like this is going to be very very difficult. But there are several clear examples of something like DALL-E generating images very similar to or nearly identical to copyrighted IP.

IANAL, but I presume a claimant would be able to establish with reasonable certainty to a court that licensed works were used in a way that breaks the license, at which point OpenAI (or really any AI company) would be responsible for defending their practice or non-use of those licensed works.

AbeWasHereAgain t1_jeat78b wrote on March 30, 2023 at 5:44 PM

#2,493,066

Replying to khamelean (#2,491,972)

Go ask Vanilla Ice what happens when your music sounds a little too close to the original.

OpenAI, and Microsoft, are 100% violating terms of use for the vast majority of the stuff they scraped.

khamelean t1_jeautqo wrote on March 30, 2023 at 5:54 PM

#2,493,466

Replying to AbeWasHereAgain (#2,493,066)

All musicians learn from hearing other music.

There is a difference between learning and copying.

MuForceShoelace t1_jeaux6b wrote on March 30, 2023 at 5:55 PM

#2,493,499

I would like to sue every AI that used my data in its training. 1 million dollars per use.

Thorusss t1_jeavgqn wrote on March 30, 2023 at 5:58 PM

#2,493,630

Even if it is against the terms of service of ChatGPT, what are they going to do about it? There are no legal judgments on whether AI output is even copyrightable, and none on whether training on copyrighted material is fair use.

And OpenAI trained on a lot of copyrighted material, so they had better think twice about opening that can of worms.

The only thing they can try to do is limit Google's access to ChatGPT's output, but good luck with that if they want it to remain available to the general public.

_zir_ t1_jeavn7w wrote on March 30, 2023 at 5:59 PM

#2,493,670

Sounds like a disgruntled worker who probably got laid off or fired if they think this matters.

No_Character_8662 t1_jeaxen9 wrote on March 30, 2023 at 6:10 PM

#2,494,112

Replying to khamelean (#2,493,466)

So if I call something in my process "learning" I'm free to use it? I'm learning copies of your works on my printer to sell right now

Edit: to be clear I don't know what the answer is but that seems simplistic

AbeWasHereAgain t1_jeaxmj7 wrote on March 30, 2023 at 6:12 PM

#2,494,166

Replying to khamelean (#2,493,466)

lol - you don't think ChatGPT is spitting out insanely close replicas of other people's work daily?

AbeWasHereAgain t1_jeaxxpf wrote on March 30, 2023 at 6:14 PM

#2,494,233

Replying to [deleted] (#2,489,440)

lol - say you regularly violate use terms on open source software without saying you regularly violate use terms on open source software.

thurken t1_jeayknu wrote on March 30, 2023 at 6:18 PM

#2,494,379

Replying to YummyMummy2024 (#2,489,272)

We're talking about ethics here, not unethical legal loopholes

khamelean t1_jeazo62 wrote on March 30, 2023 at 6:25 PM

#2,494,647

Replying to AbeWasHereAgain (#2,494,166)

Nothing wrong with playing/singing other people’s songs, I sing along to the radio in my car all the time.

khamelean t1_jeaztbb wrote on March 30, 2023 at 6:26 PM

#2,494,690

Replying to No_Character_8662 (#2,494,112)

The learning isn’t the problem, the selling is.

AbeWasHereAgain t1_jeazzbi wrote on March 30, 2023 at 6:27 PM

#2,494,735

Replying to khamelean (#2,494,647)

ha ha ha - yeah, totally the same thing. Just an FYI, artists are required to pay when they do a cover.

Everything changes when you start making money off other people's work.

ShadoWolf t1_jeb05vb wrote on March 30, 2023 at 6:28 PM

#2,494,773

Replying to FrowntownPitt (#2,491,805)

You signed over your rights to your content when you signed up to Reddit, or Facebook, or Google.

It's not like OpenAI is using some shoestring-budget web scraper using Python and the beautifulsoup library.

They have partnerships... and requested the raw text data.

khamelean t1_jeb0muo wrote on March 30, 2023 at 6:31 PM

#2,494,902

Replying to AbeWasHereAgain (#2,494,735)

That’s exactly my point.

AbeWasHereAgain t1_jeb0qkx wrote on March 30, 2023 at 6:32 PM

#2,494,928

Replying to khamelean (#2,494,902)

What is your point?

[deleted] t1_jeb2g57 wrote on March 30, 2023 at 6:43 PM

#2,495,355

[removed]

DoobieBrotherhood t1_jeb2kru wrote on March 30, 2023 at 6:43 PM

#2,495,385

Replying to DorkRockGalactic (#2,488,515)

Because logic is out the window when it comes to hypocrisy nowadays. If you think we should limit GHG emissions, you can’t use any form of energy. If you think Russia was wrong to invade Ukraine and commit genocide, then you cannot be a citizen of any country that has ever been in a war.

khamelean t1_jeb2qq4 wrote on March 30, 2023 at 6:44 PM

#2,495,426

Replying to AbeWasHereAgain (#2,494,928)

It’s not a problem until you start making money off other people’s work.

Newfondahloose t1_jeb3tmh wrote on March 30, 2023 at 6:51 PM

#2,495,693

Replying to AbeWasHereAgain (#2,494,166)

It’s learning and using language to answer questions. There’s only so many ways you can answer the same question. Greed getting in the way of progress, as always. Guess professors should give a citation every time they give a verbal answer even though they are answering from memory.

Newfondahloose t1_jeb49yx wrote on March 30, 2023 at 6:54 PM

#2,495,801

Replying to khamelean (#2,494,690)

They are selling their own work. There’s only so many ways you can answer a question. Just because you’ve answered the question before, doesn’t mean someone else can’t come to the same conclusion when answering for themselves.

Newfondahloose t1_jeb4uk7 wrote on March 30, 2023 at 6:58 PM

#2,495,952

Replying to thurken (#2,494,379)

Ethics are different for everyone. I find it unethical to hold back society just because you want to be referenced or given 5 cents for your shitty, regurgitated blog post.

Newfondahloose t1_jeb59mb wrote on March 30, 2023 at 7:00 PM

#2,496,048

Replying to AbeWasHereAgain (#2,494,233)

There’s only so many ways to make a computer say “Hello, World!”. Don’t want it copied? Don’t make it public.

Suberizu t1_jeb5du5 wrote on March 30, 2023 at 7:01 PM

#2,496,073

What the hell, Google, you really need to do that shit with your resources?

AbeWasHereAgain t1_jeb5q6e wrote on March 30, 2023 at 7:03 PM

#2,496,160

Replying to Newfondahloose (#2,496,048)

ha ha ha - yeah, that's exactly what we are talking about here.

PS - that's exactly what OpenAI is complaining about here.

thurken t1_jeb60db wrote on March 30, 2023 at 7:05 PM

#2,496,224

Replying to Newfondahloose (#2,495,952)

That was kind of the opposite point. That OpenAI would have some nerve to be mad at Google for using ChatGPT to generate training data when they used everyone's data to get training data.

Space_Pirate_R t1_jeb6yrh wrote on March 30, 2023 at 7:11 PM

#2,496,445

Replying to khamelean (#2,495,426)

Are monetized AI artists paying royalties to everyone whose art was scraped off the web?

Anti-Queen_Elle t1_jeb70eh wrote on March 30, 2023 at 7:12 PM

#2,496,459

Replying to Next_Boysenberry1414 (#2,491,890)

"OpenAI" is the next "Valve can't count to 3" meme

Space_Pirate_R t1_jeb7vmw wrote on March 30, 2023 at 7:17 PM

#2,496,636

Replying to NLwino (#2,491,551)

In a shocking twist, posting data on social media constitutes implied permission for other users to process it in their browsers in order to read it.

However, in a second shocking twist, posting doesn't constitute implied permission for corporations to train AI with the contents of posts.

CalvinKleinKinda t1_jeb8blj wrote on March 30, 2023 at 7:20 PM

#2,496,737

Replying to FrowntownPitt (#2,492,337)

"Generating" literal smudged watermarks from copyrighted content.

Numai_theOnlyOne t1_jeb91dq wrote on March 30, 2023 at 7:25 PM

#2,496,910

Replying to YummyMummy2024 (#2,489,272)

Yeah, and it makes sense for a human, but I can see this being an issue with AI and how fast it can learn.

After all, suddenly whatever I posted anywhere is used to generate revenue, when it was originally aimed at people, for free, to get responses for free. AI, though, usually requires you to pay for it. So why shouldn't they pay me to use my data? Sure, maybe someone made money with my response, and I might buy some of their stuff; that's fine, because it was not only because of my input, unlike AI, which only works because of the data. Same with artists. They were posting stuff for free not so it could be used for free, but to present their art and land a job. You also can't just rip an image from the internet and use it in a commercial because "it was freely available on the internet".

Numai_theOnlyOne t1_jeb9lah wrote on March 30, 2023 at 7:28 PM

#2,497,022

Replying to No_Character_8662 (#2,494,112)

Tbh can we separate human learning from AI learning?

A human is an imperfect biological being that requires time and repetition to learn.

AI just needs a large pool of data and can do the same as millions of humans in a fraction of the time required.

I think that's not the same learning, and it's something that honestly should be questioned; after all, our content was created with humans in mind and not meant to be used for AI.

bookko t1_jeba5ac wrote on March 30, 2023 at 7:32 PM

#2,497,162

Replying to DorkRockGalactic (#2,488,515)

It'd be the same use case as academic books: the knowledge is everywhere, dating back to Pythagoras, but having it available in a usable manner is where the crux lies.

Sirisian t1_jeba6bn wrote on March 30, 2023 at 7:32 PM

#2,497,176

Rule 12, submit articles and sourced information.

Particular-Way-8669 t1_jebc3ih wrote on March 30, 2023 at 7:44 PM

#2,497,663

Replying to YummyMummy2024 (#2,489,272)

Everything that is free to access and not released under a copyright-friendly license is by definition the IP of whoever put it out. Even if you take a picture and put it on Facebook, it is your IP. Facebook might have a TOS that says they have the right to do certain things with what you post on their site. Sure. But you gave them permission by agreeing to it. OpenAI never received any permission from anyone. Period.

Particular-Way-8669 t1_jebcdng wrote on March 30, 2023 at 7:46 PM

#2,497,723

Replying to khamelean (#2,493,466)

There is a difference between a human who can be creative and a computer program that creates aggregations. Completely different thing. AI does not really learn. It adjusts its mathematical functions based on data.

Particular-Way-8669 t1_jebcnyt wrote on March 30, 2023 at 7:48 PM

#2,497,800

Replying to ShadoWolf (#2,494,773)

You signed those rights away to these sites. Not to OpenAI lol. It is still your IP. You cannot go and copy it just because it was posted there, because you will be hit with an infringement lawsuit. Reddit, Facebook, Google received your permission to use it in a certain way. And yes, Google or Facebook can potentially claim they used that data fairly for their models. OpenAI? Not a chance.

ThrillShow t1_jebdn9g wrote on March 30, 2023 at 7:54 PM

#2,498,032

Replying to Space_Pirate_R (#2,496,636)

I'm shocked by how many people unquestioningly accept the idea that AI should be entitled to the same rights as humans, as if a machine that scrapes huge portions of the internet for content is exactly the same as one person browsing.

khamelean t1_jebg7rh wrote on March 30, 2023 at 8:10 PM

#2,498,707

Replying to Space_Pirate_R (#2,496,445)

Are human artists paying royalties to everyone whose art they scraped off the web??

AcceptableGood5105 t1_jebg8h9 wrote on March 30, 2023 at 8:10 PM

#2,498,712

They’d better worry about AI bots becoming so mature one day that they start violating humans and human society instead of copyrights

khamelean t1_jebgej7 wrote on March 30, 2023 at 8:11 PM

#2,498,757

Replying to Particular-Way-8669 (#2,497,723)

No, there is no difference. Creativity is just combination and random mutation. It’s how humans are creative, it’s how machines are creative. It’s the same thing.

Particular-Way-8669 t1_jebh4n3 wrote on March 30, 2023 at 8:16 PM

#2,498,954

Replying to khamelean (#2,498,757)

This is utter bullshit. There was always some human who came up with something first, when there was nothing like it before. The AI technology we know does not have this ability, and never will. It is only data aggregation, nothing else. A human does not need data from other humans to be creative, and the very fact that someone climbed down from the trees and picked up the first fire is proof of that.

[deleted] t1_jebh53j wrote on March 30, 2023 at 8:16 PM

#2,498,956

Replying to khamelean (#2,498,707)

[deleted]

ShadoWolf t1_jebhlhj wrote on March 30, 2023 at 8:19 PM

#2,499,071

Replying to Particular-Way-8669 (#2,497,800)

Unfortunately you're wrong:

Your Content

The Services may contain information, text, links, graphics, photos, videos, audio, streams, or other materials (“Content”), including Content created with or submitted to the Services by you or through your Account (“Your Content”). We take no responsibility for and we do not expressly or implicitly endorse, support, or guarantee the completeness, truthfulness, accuracy, or reliability of any of Your Content.

By submitting Your Content to the Services, you represent and warrant that you have all rights, power, and authority necessary to grant the rights to Your Content contained within these Terms. Because you alone are responsible for Your Content, you may expose yourself to liability if you post or share Content without all necessary rights.

You retain any ownership rights you have in Your Content, but you grant Reddit the following license to use that Content:

When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit. You also agree that we may remove metadata associated with Your Content, and you irrevocably waive any claims and assertions of moral rights or attribution with respect to Your Content.

Any ideas, suggestions, and feedback about Reddit or our Services that you provide to us are entirely voluntary, and you agree that Reddit may use such ideas, suggestions, and feedback without compensation or obligation to you.

Although we have no obligation to screen, edit, or monitor Your Content, we may, in our sole discretion, delete or remove Your Content at any time and for any reason, including for violating these Terms, violating our Content Policy, or if you otherwise create or are likely to create liability for us.

khamelean t1_jebi03n wrote on March 30, 2023 at 8:21 PM

#2,499,193

Replying to Particular-Way-8669 (#2,498,954)

Combination + mutation. It allowed evolution through natural selection to give us every life form on earth. Creativity works exactly the same way.

Space_Pirate_R t1_jebi0au wrote on March 30, 2023 at 8:21 PM

#2,499,195

Replying to khamelean (#2,498,707)

Human artists learning from others' work is obviously "fair use." I don't think a corporation will successfully deploy that in defense of training a commercial AI.

Particular-Way-8669 t1_jebiwfy wrote on March 30, 2023 at 8:27 PM

#2,499,417

Replying to ShadoWolf (#2,499,071)

Why do you even bother copying something without reading it?

"You retain any ownership rights..."

End of story, I am right. It says exactly what I said it did. You grant Reddit (and only Reddit) rights to manipulate your content as written in the TOS. You do not grant them to anyone else. If Reddit partners with someone, then they would also be included if Reddit gave them that right. But this is not what happened. OpenAI scraped the internet. There was no partnership with Reddit or anyone whatsoever.

khamelean t1_jebj0yq wrote on March 30, 2023 at 8:28 PM

#2,499,445

Replying to Space_Pirate_R (#2,499,195)

Just looking at a piece of art is enough to encode it into a human’s neural network. Why should it be any different for an artificial neural network? If it’s free to access then it’s free to access.

NLwino t1_jeblcee wrote on March 30, 2023 at 8:42 PM

#2,499,998

Replying to ThrillShow (#2,498,032)

What do you think search engines need to do to give you the results?

Space_Pirate_R t1_jebovpk wrote on March 30, 2023 at 9:05 PM

#2,500,810

Replying to khamelean (#2,499,445)

I don't believe that an artificial neural network is morally or legally equivalent to a human. If I did believe that, then there would be more pressing issues than copyright infringement to deal with, such as corporate enslavement of AIs.

[deleted] t1_jebr73q wrote on March 30, 2023 at 9:20 PM

#2,501,379

Replying to NLwino (#2,499,998)

[deleted]

Space_Pirate_R t1_jebrw1m wrote on March 30, 2023 at 9:24 PM

#2,501,550

Replying to NLwino (#2,499,998)

People making copyright work available on the internet are granting an implied permission for search engines to index their work, because that's pursuant to the normal purposes of posting on the internet. People make work available on the internet for the purpose of allowing others to find it using search engines and view it using browsers.

However, making copyright work available on the internet does not constitute an implied permission or license to do literally anything with the posted work. People don't usually post work on the internet for the purpose of helping corporations train commercial AIs, and therefore no implied permission to do so is granted by the act of making copyright work available on the internet.

khamelean t1_jebrxm4 wrote on March 30, 2023 at 9:25 PM

#2,501,560

Replying to Space_Pirate_R (#2,500,810)

What does moral or legal equivalence to humans have to do with anything?

The point is that all AI has to do to learn from art is look at it. If someone makes their art free to look at, then it’s free for an AI to look at.

ShadoWolf t1_jebs6px wrote on March 30, 2023 at 9:26 PM

#2,501,625

Replying to Particular-Way-8669 (#2,499,417)

You retain ownership... but you more or less signed over all rights in what they can do with said information... it's right there in the highlighted text.

When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit << this is the part that lets them hand it over to companies like OpenAI

Space_Pirate_R t1_jebta1s wrote on March 30, 2023 at 9:34 PM

#2,501,885

Replying to khamelean (#2,501,560)

AIs don't have agency. The AI is a tool which is being operated by a corporate entity. The corporate entity is governed by existing laws, and requires a license to use a copyright work in the operation of their business.

Particular-Way-8669 t1_jebu396 wrote on March 30, 2023 at 9:39 PM

#2,502,084

Replying to ShadoWolf (#2,501,625)

Again, if Reddit trained their own AI on users' data or gave that data to OpenAI as part of a contract, then you would have a point. But this is not what happened. OpenAI did not ask anyone. They ran data-crawling scripts and stole data without asking. It is nothing like what Reddit is doing. You did not sign anything off to OpenAI.

khamelean t1_jebxbzr wrote on March 30, 2023 at 10:01 PM

#2,502,865

Replying to Space_Pirate_R (#2,501,885)

So companies have to pay a licensing fee to every artist whose work employees of that company have ever looked at?? Yeah, I don’t think that’s how it works.

johndburger t1_jeby23l wrote on March 30, 2023 at 10:06 PM

#2,503,008

Replying to khamelean (#2,493,466)

ChatGPT has “learned” some generalizations from the text that it’s processed, but it has also literally memorized (i.e. copied) billions of words from it.

johndburger t1_jebyej1 wrote on March 30, 2023 at 10:08 PM

#2,503,096

Replying to ShadoWolf (#2,501,625)

> this is the part that lets them hand it over to companies like OpenAI

Your claim is that OpenAI has negotiated usage rights from every single site it’s gotten data from? Do you have any evidence for this?

[deleted] t1_jebz2w2 wrote on March 30, 2023 at 10:13 PM

#2,503,250

Replying to khamelean (#2,502,865)

[deleted]

Space_Pirate_R t1_jebzo2e wrote on March 30, 2023 at 10:17 PM

#2,503,367

Replying to khamelean (#2,502,865)

No, because (as I mentioned earlier) there is a fair use exemption which allows humans to be educated using copyright works. However, there is no such exemption allowing corporations to train AI using copyright works.

khamelean t1_jec048i wrote on March 30, 2023 at 10:20 PM

#2,503,474

Replying to johndburger (#2,503,008)

Technically it remembers the relationships between words, those relationships are encoded in its neural network. It doesn’t just copy the text.

https://en.m.wikipedia.org/wiki/Transformer_(machine_learning_model)
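As a rough illustration of what "storing relationships between words" can mean, here is a minimal sketch using a toy bigram counter. This is an assumption made purely for illustration: it is far simpler than a transformer and is not how ChatGPT works, and the names `train_bigrams`, `generate`, and the sample corpus are hypothetical. The stored object is a table of which word follows which, not the text itself, yet on a small corpus the table can still reproduce a phrase from its source, which is the tension being argued in this thread.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.split()
    table = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        table[prev][nxt] += 1
    return table


def generate(table, start, length):
    """Greedily emit the most frequent follower of the last word."""
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:
            break  # no known follower, stop early
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical toy corpus, chosen only for this example.
corpus = "the cat sat on the mat and the cat sat on a rug"
model = train_bigrams(corpus)

print(dict(model["the"]))         # follower counts stored for "the"
print(generate(model, "the", 4))  # text regenerated from the counts alone
```

The model holds only follower counts, but greedy generation from "the" walks back through a phrase that appears verbatim in the corpus. Whether that kind of regeneration counts as copying is the legal question, not something the code settles.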

khamelean t1_jec1fmg wrote on March 30, 2023 at 10:29 PM

#2,503,771

Replying to Space_Pirate_R (#2,503,367)

Education is irrelevant in this context. The copyrighted works people consume through education are a tiny fraction of the total number of copyrighted works that most people experience through their lives. And all of those experiences contribute to that person’s capabilities.

The exemption for educational purposes is for presenting copyright material to students in an educational setting. It has nothing to do with copyright work that the student might seek out themselves.

Space_Pirate_R t1_jec5lal wrote on March 30, 2023 at 10:59 PM

#2,504,785

Replying to khamelean (#2,503,771)

Yes, humans experience copyright works and learn from them, and that's fair use. What does that have to do with training an AI?

A person or corporation training an AI is covered by normal copyright law, which requires a license to use the work.

khamelean t1_jec837g wrote on March 30, 2023 at 11:16 PM

#2,505,413

Replying to Space_Pirate_R (#2,504,785)

How is it any different to an employee “using” the work? Corporations don’t pay licensing when an employee gets inspired by a movie they saw last night.

Why do you keep mentioning corporations? An AI could just as easily be trained by an individual. I’ve written and trained a few myself.

Space_Pirate_R t1_jec8j1p wrote on March 30, 2023 at 11:20 PM

#2,505,515

Replying to khamelean (#2,505,413)

>Corporations don’t pay licensing when an employee gets inspired by a movie they saw last night.

The employee themselves paid to view the movie. The copyright owner set the amount of compensation knowing that the employee could retain and use the knowledge gained. No more compensation is due. This is nothing like a person or corporate entity using unlicensed copyright works to train an AI.

>Why do you keep mentioning corporations? An AI could just as easily be trained by an individual. I’ve written and trained a few myself.

Me too. I keep saying "person or corporation training an AI" to remind us that the law (and any moral judgement) applies to the person or corporate entity conducting the training, not to the AI per se, because the AI is merely a tool and is without agency of its own.

khamelean t1_jecbi7y wrote on March 30, 2023 at 11:41 PM

#2,506,203

Replying to Space_Pirate_R (#2,505,515)

“What does that have to do with a person or corporate entity training an ai?”

Training a human neural network is analogous to training an artificial neural network.

Whether the employee paid to watch a movie doesn’t matter; they could have just as easily watched something distributed for free. The transaction to consume the content is, as you said, irrelevant to the corporation.

An AI consuming a copyright work is no different to a human consuming a copyright work. If that work is provided for free consumption, why would the owner of the AI have to pay for the AI to consume it?

[deleted] t1_jeceuu2 wrote on March 31, 2023 at 12:06 AM

#2,507,106

Replying to khamelean (#2,506,203)

[deleted]

Space_Pirate_R t1_jecfcfy wrote on March 31, 2023 at 12:09 AM

#2,507,247

Replying to khamelean (#2,506,203)

>Training a human neural network is analogous to training an artificial neural network.

By definition, something analogous is similar but not the same. Lots of things are analogous to others, but that doesn't even remotely imply that they should be governed by the same laws and morality.

>An AI consuming a copyright work is no different to a human consuming a copyright work.

A human consuming food is no different to a dog consuming food. Yet we have vastly different laws governing human food compared to dog food. Dogs and AI are not humans, and that is the difference.

>If that work is provided for free consumption, why would the owner of the AI have to pay for the AI to consume it?

If that work is provided for free consumption, why would the owner of a building have to compensate the copyright owner to print a large high quality copy and hang it on a public wall in the lobby? The answer is that the person (not the AI) is deriving some benefit (beyond fair use) from their use of the copyrighted work, and therefore the copyright owner should be compensated.

beingsubmitted t1_jecici1 wrote on March 31, 2023 at 12:31 AM

#2,508,030

Replying to YummyMummy2024 (#2,489,272)

The algorithm is barely IP, and the data is the bigger part of its success.

ChatGPT is a reinforcement-learning-tuned transformer. The ideas and architecture it's built on aren't proprietary. The specific parameters are, but that's not actually that important. The size and number of layers, for example. Most people in AI can make some assumptions. Probably ReLU, probably Adam, etc. Then there are different knobs you can twiddle and with some trial and error you dial it in.

The size and quality of your training data is way more important, and in the case of ChatGPT, so is your compute power. Lots of people can design a system that big, it's as easy as it is to come up with big numbers, but training it takes a ton of compute power, which costs money, which is why just anyone hasn't already done it if it's so easy.

It should also be said that GPT is a bit of a surprise success. Before models this size, it was a big risk. You're gonna spend millions to train a model, and you won't know until it's done how good it will be.

Most advancements in AI are open source and public. Those all help advance the field, but at the same time, it's also about taking a bit of a risk, and waiting to see how it pans out before taking the next risk.

Also, there's transfer learning. If you spend a hundred million training a model, I can use your trained model and a fraction of the money to make my own.

It's like if you laboriously took painstaking measurements to figure out an exact kilogram and craft a 1kg weight. You didn't invent the kilogram, difficult as it was to make it. If I use yours to make my own, I'm not infringing on your IP.
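The transfer-learning point above can be sketched in a few lines. This is a minimal, hypothetical example in plain Python, not anything OpenAI or Google actually does: the `PRETRAINED` weights stand in for someone else's expensive model and are kept frozen, and only a tiny new output layer (`head`) is trained on a new task. All names, the learning rate, and the toy task (fitting |x|) are assumptions for illustration.

```python
# Stand-in for an expensively trained layer: (weight, bias) pairs, never updated.
PRETRAINED = ((1.0, 0.0), (-1.0, 0.0), (0.5, 1.0))


def features(x, frozen):
    """Frozen 'pretrained' layer: linear map followed by ReLU."""
    return [max(0.0, w * x + b) for (w, b) in frozen]


def predict(x, frozen, head):
    """New model = frozen features combined by a small trainable head."""
    return sum(h * f for h, f in zip(head, features(x, frozen)))


def mse(data, frozen, head):
    return sum((predict(x, frozen, head) - y) ** 2 for x, y in data) / len(data)


def fine_tune(data, frozen, head, lr=0.01, epochs=200):
    """Plain gradient descent on the head only; the frozen layer is reused as-is."""
    head = list(head)
    for _ in range(epochs):
        grads = [0.0] * len(head)
        for x, y in data:
            f = features(x, frozen)
            err = sum(h * fi for h, fi in zip(head, f)) - y
            for i, fi in enumerate(f):
                grads[i] += 2 * err * fi / len(data)
        head = [h - lr * g for h, g in zip(head, grads)]
    return head


# Toy "new task": learn y = |x| from a handful of points.
data = [(k / 4.0, abs(k / 4.0)) for k in range(-8, 9)]
head0 = [0.0, 0.0, 0.0]
before = mse(data, PRETRAINED, head0)
head1 = fine_tune(data, PRETRAINED, head0)
after = mse(data, PRETRAINED, head1)
print(before, after)  # the error drops although only three numbers were trained
```

Only the three head values are updated here, which is the cost argument in miniature: the expensive part is reused unchanged and the new training touches a small fraction of the parameters.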

khamelean t1_jecru6d wrote on March 31, 2023 at 1:44 AM

#2,510,716

Replying to Space_Pirate_R (#2,507,247)

The building owner is using a replication of the copyrighted work. The owner should absolutely compensate the original creator.

But the printing company that the building owner hires to print the poster doesn’t owe the original creator anything. Even though it is directly replicating copyrighted work, and certainly benefiting from doing so. If the printer were selling the copyrighted works directly then that would be a different matter and they would have to compensate the original copyright owner. So clearly context matters.

An AI doesn’t even make a replication of the original work as part of its training process.

If the AI then goes on to create a replication, or a new work similar enough to the original that copyright applies, and it is intended for use in a context where copyright would apply, then absolutely. That would constitute a breach of copyright.

It is the work itself that is copyrighted, not the knowledge/ability to create the work. It’s the knowledge of how to create the work which is encoded in the neural network.

Lots of people benefit from freely distributed content. Simply benefiting from it is not enough to justify requiring a license fee.

Hypothetically speaking, let’s say a few years down the line we have robot servants. I have a robotic care giver that assists me with mobility. Much as I may have a human care giver today.

If I go to the movies with my robot care giver, they will take up a seat so I would expect to pay for a ticket, just as I would for a human care giver. Do I then need to pay an extra licensing fee for the robot's AI brain to actually watch the movie?

What if it’s a free screening? Should I still have to pay for the robot brain to “use” the movie?

Is the robot “using” the movie in some unique and distinct way compared to how I would be “using” the movie?
