Submitted by romantimm25 t3_10mqm3g in MachineLearning
mil24havoc t1_j64lhxk wrote
IANAL, but copyright protects the paper's text, data, and code. Algorithms themselves can't be copyrighted. If you reimplement the algorithm, you can do whatever you want with it.
Edit to add: licenses on (trained) models haven't been tested in court as far as I'm aware. I can imagine this being very complicated. Can you copyright and license a linear regression fit to simple economic data? For example: log(gdp) = alpha + beta×population? That seems silly. So why would a Transformer (e.g.) be any different? If you add Gaussian noise to every weight in a Transformer, is the license still valid?
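To make the thought experiment concrete, here is a minimal sketch of that kind of regression and of the noise perturbation. The data points, the helper names `fit_ols` and `perturb`, and the noise scale are all made up for illustration; nothing here comes from a real dataset or a real model license.

```python
import random


def fit_ols(x, y):
    """Closed-form ordinary least squares for y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


def perturb(weights, sigma, rng):
    """Add independent Gaussian noise with standard deviation sigma to each weight."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


# Hypothetical toy data: population (arbitrary units) and log(gdp).
population = [1.0, 2.0, 3.0, 4.0, 5.0]
log_gdp = [2.1, 2.9, 4.2, 4.8, 6.0]

# The "trained model" is just two numbers.
alpha, beta = fit_ols(population, log_gdp)

# A slightly noisy copy: numerically different weights, nearly identical predictions.
rng = random.Random(0)
noisy_alpha, noisy_beta = perturb([alpha, beta], sigma=1e-3, rng=rng)
```

The perturbed pair is a different set of numbers that behaves almost the same, which is exactly why it is hard to say what a license on "the weights" would cover.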
romantimm25 OP t1_j64mzgp wrote
What I never understand is what it means to "reimplement" the algorithm.
I mean, where is the line between being too similar to the original and being completely different?
Of course, there are the obvious cases where one changes a "for loop" to a "while loop". But does switching a library that the paper's code depends on mean that the implementation is different enough?
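For reference, this is the kind of trivial change I mean. The two functions are hypothetical toy examples, not taken from any paper's code; they differ only in the loop construct and compute the same thing.

```python
def sum_squares_for(values):
    """Sum of squares using a for loop."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_while(values):
    """Same computation, with the for loop mechanically turned into a while loop."""
    total = 0
    i = 0
    while i < len(values):
        total += values[i] * values[i]
        i += 1
    return total
```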
mil24havoc t1_j64ogl0 wrote
It basically means you read the paper and write the code to do what the paper describes yourself.
If you start with their code base, then your work is derivative of that copyrighted work and the question becomes a bit more complicated.
Yes, the line is fuzzy. However, it's typically very easy to stay on the "not copyright or license infringing" side of the line if you make an honest effort to rewrite the code from scratch and simply use their code base to check your understanding of the algorithm.
Again, IANAL, but changing a for loop to a while loop is probably not sufficient to distinguish their work from yours. Rewriting the code in another language may be. Rewriting it in the same language but making substantial changes to (for example) the user interface, data preprocessing, training data, hyperparameters, etc. may be.
Edit: courts and lawyers usually aren't too concerned with technical details. Think of it like a book. The same story gets told over and over again by different authors who use different words to tell it. Your implementation needs to tell the same story but in different words, basically.
red_dragon t1_j64qkqz wrote
Isn't the main issue with the weights? Are the weights proprietary?
mil24havoc t1_j64s1uy wrote
The weights are part of the model, not the algorithm. Whether these can be copyrighted is (a) unclear and (b) should have no bearing on the status of the algorithm itself.
Edit: The output of an algorithm has been ruled by courts to not be copyrightable. A Transformer is, itself, the "output" of an algorithm (e.g., SGD). Therefore, IMHO (IANAL), a Transformer cannot be copyrighted. We'll see if the judges who start taking these cases are savvy enough to rule correctly. Similarly, recipes cannot be copyrighted and they're quite similar to algorithms.