unua_nomo t1_j2eyhnh wrote 2 years ago

Reply to comment by misconfigbackspace in There's now an open source alternative to ChatGPT, but good luck running it by ravik_reddit_007

I mean there are already open source datasets available, such as the Pile.

I can't see any argument for why a model trained on open source data wouldn't likewise be open source. And if you could argue that an ML model might produce IP-infringing content, that would be the responsibility of the individual producing and subsequently distributing that content.

As for data becoming stale, that wouldn't necessarily be an issue for plenty of applications, and even then there's no reason you couldn't just crowdfund 80k a year to train an updated model with newer content folded in.

misconfigbackspace t1_j2ez1sa wrote 2 years ago

> such as the Pile.

TIL. Thanks.