Submitted by [deleted] t3_11v4h5z in MachineLearning
starstruckmon t1_jct0s11 wrote
Reply to comment by throwaway957280 in [P] The next generation of Stanford Alpaca by [deleted]
They are. It's less to do with copyright and more to do with the fact that you agreed to the T&C before using their system (and then broke them). It's similar to the LinkedIn data scraping case, where the court ruled that it wasn't illegal for them to scrape (nor did it violate copyright), but they still got in trouble (and had to settle) for violating the T&C.
One way around this is to have two parties: one generating and publishing the dataset (which doesn't violate the T&C) and another, independent party (who didn't sign the T&C) fine-tuning a model on the dataset.
RoyalCities t1_jctcu1m wrote
Wouldn't it be possible to set up a large community Q/A repository then? Just crowdsource whatever it outputs and document it collectively.