xtime595 t1_j7hyhbw wrote

Ah yes, I love it when corporations decide to halt scientific progress

21

_poisonedrationality t1_j7i5r45 wrote

You shouldn't confuse "scientific progress" with "commercial gain". I know a lot of companies in AI blur the line, but researchers who don't seek to make a profit aren't really the same as something like Stability AI, which is trying to sell a product.

Besides, it's not clear to me whether these AI tools will be used to benefit humanity as a whole or only to increase the control a few companies have over large markets. I really hope this case sets some decent precedents about how AI developers can use data they did not create.

4

EmbarrassedHelp t1_j7klnqy wrote

If Getty Images wins, then AI generation tools are going to become further concentrated to a handful of companies while also becoming less open.

8

HateRedditCantQuitit t1_j7nbrkz wrote

Not necessarily. If it turns out, for example, that language generation models trained on GPL code must be GPL, then it means that there's a possible path to more open models, if content creators continue creating copyleft content ecosystems.

1

currentscurrents t1_j7ioshb wrote

> Besides, it's not clear to me whether these AI tools will be used to benefit humanity as a whole

Of course they benefit humanity as a whole.

  • Language models allow computers to understand complex ideas expressed in plain English.
  • Automating art production will make custom art/comics/movies cheap and readily available.
  • ChatGPT-style AIs (if they can fix hallucination/accuracy problems) give you an oracle with all the knowledge of the internet.
  • They're getting less hype right now, but there are big advances in computer vision (CNNs/Vision Transformers) that are revolutionizing robotics and image processing.

>I really hope this case sets some decent precedents about how AI developers can use data they did not create.

You didn't create the data you used to train your brain, much of which was copyrighted. I see no reason why we should put that restriction on people trying to create artificial brains.

4

e_for_oil-er t1_j7kh4m0 wrote

Major corporations using ML to generate images instead of hiring artists, purely with the goal of increasing their profits and helping the richest guys get even richer. How does that help humanity?

0

VeritaSimulacra t1_j7ib4yu wrote

TIL science cannot progress without training ML models on Getty images

2

currentscurrents t1_j7innd5 wrote

Getty is just the test case for the question of copyright and AI.

If you can't train models on copyrighted data, they can't learn information from the web outside of specific openly-licensed websites like Wikipedia. This would sharply limit their usefulness. It also seems distinctly unfair, since copyright is only supposed to protect the specific arrangement of words or pixels, not the information they contain or the artistic style they're in.

The big tech companies can afford to license content from Getty, but us little guys can't. If they win it will effectively kill open-source AI.

16

trias10 t1_j7iv9iy wrote

Data is incredibly valuable; OpenAI and Facebook have proven that. Ever-bigger models require ever more data. And we live in a capitalist world, so if something is valuable, like data, you typically have to pay for it. So open-source AI shouldn't be a thing.

Also, OpenAI is hardly open source anymore. They no longer disclose their data sources, data-harvesting practices, or data methodologies, nor do they release their training code. They also don't release their trained models anymore.

If they were truly open source, I could maybe see defending them, but at the moment all I see is a company violating data privacy and licenses to get incredibly rich.

1

HateRedditCantQuitit t1_j7l1m4c wrote

>If you can't train models on copyrighted data this means that they can't learn information from the web outside of specific openly-licensed websites like Wikipedia. This would sharply limit their usefulness.

That would be great. It could lead to a future with things like copyleft data, where if you want to train on open stuff, your model legally *must* be open.

1

ninjasaid13 t1_j7irgdv wrote

I think he meant more about open source being threatened.

3

hgoel0974 t1_j7l1meo wrote

You seem like the type to argue that any ethics related restrictions on science are bad.

1

superluminary t1_j7ldt9p wrote

If the US doesn’t allow it then China is just going to pick this up and run with it. These things are technically possible to do now. The US can either be at the front, leading the AI revolution, or can dip out and let other countries pick it up. Either way it’s happening.

1