Submitted by JohnyWalkerRed t3_123oovw in MachineLearning
big_ol_tender t1_jdvu92g wrote
Thank you for posting this. I’ve raised this issue on a number of threads and even opened an issue on the Alpaca repo. Everyone seems to ignore this and I’m worried about downstream issues with these models, and would love an open source alternative (I have been exploring making one myself).
JohnyWalkerRed OP t1_jdwjvxy wrote
Yeah, the Databricks Dolly post is funny to me because they are an enterprise software company and Dolly is not really useful in the context they operate in. I guess they just wanted to get some publicity.
Looks like Open Assistant, when mature, could enable this. Although it seems the precursor to an Alpaca-like dataset is an RLHF model, which itself needs a human-labeled dataset, so that bottleneck needs to be solved too.
Taenk t1_jdwlejh wrote
The Open Assistant project is working on that as well.
rshah4 t1_jdxhz3d wrote
I agree with the sentiments here and don’t think it’s OK to use some of these datasets that appear to violate OpenAI’s terms. I dealt with it by making a funny video: https://youtu.be/31u88EDmIwc