Submitted by Wiskkey t3_10vg97m in MachineLearning
JustOneAvailableName t1_j7le6dw wrote
Reply to comment by HateRedditCantQuitit in [N] Getty Images sues AI art generator Stable Diffusion in the US for copyright infringement by Wiskkey
> but commercial use requires opt-in consent from content creators
Requiring opt-in is effectively the same as banning it outright for commercial use.
TaXxER t1_j7ojt22 wrote
As much as I like ML, it’s hard to argue that training ML models on data without consent, let alone on copyrighted data, would somehow be OK.
JustOneAvailableName t1_j7oknmi wrote
Copyright is about redistribution, and we're talking about publicly available data. I don't want or need to give consent to specific people or companies to allow them to read this comment. Nor do I think it should now be up to Reddit to decide what is and isn't allowed.
TaXxER t1_j7omop6 wrote
Generative models do redistribute though, often outputting near copies:
https://arxiv.org/pdf/2203.07618.pdf
Copyright does not only cover republishing, but also covers derived work. I think it is a very reasonable position to consider all generative model output o for which some training set image Xi had a particularly large influence on o, to be derived work from Xi.
A similar story holds for code generation models and software licensing: Copilot was trained on lots of software repos whose licenses require all derived work to be licensed under an at least equally permissive license. Copilot may very well output a specific code snippet based largely on what it has seen in a particular repo, thereby potentially exposing the user to the licensing obligations that come with deriving work from that repo.
I’m an applied industry ML researcher myself, and am very enthusiastic about the technology and the state of ML. But I also think that, as a field, we have unfortunately been careless about ethical and legal aspects.