starstruckmon t1_jct0s11 wrote on March 19, 2023 at 9:57 AM

Reply to comment by throwaway957280 in [P] The next generation of Stanford Alpaca by [deleted]

They are. It has less to do with copyright and more to do with the fact that you agreed to the T&C before using their system (and then broke them). It's similar to the LinkedIn data scraping case, where the court ruled that it wasn't illegal for them to scrape (nor did it violate copyright), but they still got in trouble (and had to settle) for violating the T&C.

One way around this is to have two parties: one generating and publishing the dataset (which doesn't violate the T&C), and another, independent party (who didn't sign the T&C) fine-tuning a model on the dataset.

RoyalCities t1_jctcu1m wrote on March 19, 2023 at 12:29 PM

Couldn't it be possible to set up a large community Q/A repository then? Just crowdsource whatever it outputs and document it collectively.

[deleted] OP t1_jd0nazd wrote on March 20, 2023 at 11:44 PM

[removed]

BraianP t1_jdfjbq4 wrote on March 24, 2023 at 12:47 AM

So, Open Assistant?