Submitted by bikeskata t3_10r7k0h in MachineLearning
Monoranos t1_j6xmoc2 wrote
Reply to comment by mr_birrd in [N] OpenAI starts selling subscriptions to its ChatGPT bot by bikeskata
I'm not saying the whole internet is free to run, but using people's data without consent raises privacy and ethical concerns. Profiting from potentially stolen data raises questions of legality and morality.
mr_birrd t1_j6xo5qk wrote
No, it doesn't raise ethical concerns. You literally have to agree to the use of your data, and at least in Europe you should be able to opt out of everything if you want. You should 100% know this; those are the rules of the game. Just because you don't read the terms of agreement doesn't make it unethical for companies to read your data. Sure, if the data is then used by insurers who won't cover you because you'll get sick with high probability, that's another matter. But don't act surprised.
Monoranos t1_j6xp59x wrote
Just read my edit about the GDPR and explicit consent.
"in Europe should be able to opt out of everything if you want." Great point, I wonder how would OpenAI react if people want them to remove their data. Is it even possible ?
mr_birrd t1_j6xps33 wrote
Do you even know which dataset it was trained on?
Monoranos t1_j6xs7m3 wrote
I don't believe they disclosed the data ChatGPT was trained on. If you know, do you mind sharing? :)
mr_birrd t1_j6xtb3u wrote
Edit: ChatGPT uses GPT-3. Look up the dataset it was trained on.
Google it; they're fully transparent about it. If you find a text of your own in there, maybe ask if they can remove it. First of all, though, the data is only used for stochastic gradient descent: the model has no idea about the content it read, it can only model probabilities of words. That is, it learned to speak, but only in the sense that it mostly outputs what makes sense in a Bayesian way.
So the model is already trained, and it didn't even read all of the data; these huge models often see each training sample at most once, because they learn that "well".
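To make the "it only models probabilities of words" point concrete, here's a minimal sketch: a hypothetical toy bigram model, not anything from OpenAI's codebase. It's trained by SGD on cross-entropy, and the only thing it keeps is a table of weights for P(next word | current word); the training text itself is never stored.

```python
# Toy next-word model (hypothetical illustration, not OpenAI's code).
# It learns nothing but P(next word | current word): a table of weights
# updated by stochastic gradient descent. The corpus is read, used for
# gradient updates, and never stored.
import numpy as np

vocab = ["the", "cat", "sat", "mat"]
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

logits = np.zeros((V, V))  # one row of next-word logits per context word

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

corpus = [("the", "cat"), ("cat", "sat"), ("the", "mat")]  # (context, next)
lr = 0.5

for step in range(200):
    for ctx, nxt in corpus:
        p = softmax(logits[idx[ctx]])
        grad = p.copy()
        grad[idx[nxt]] -= 1.0           # gradient of cross-entropy wrt logits
        logits[idx[ctx]] -= lr * grad   # SGD update

# After training, "the" is followed by "cat" or "mat" with ~0.5 each:
print({w: round(float(p), 2) for w, p in zip(vocab, softmax(logits[idx["the"]]))})
```

Sampling from those probabilities is all the "speaking" amounts to; GPT-3 does the same thing at vastly larger scale with a transformer instead of a lookup table.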
Also, the way I read the legal text you quoted, opting out in the future doesn't retroactively make past data processing unlawful. The model is already trained, so they don't have to remove anything.
They also usually have a whole ethics section in their papers; maybe go check it out. Ethics isn't some unknown topic, and big companies like this in particular have people working on it in their teams.
Monoranos t1_j6xumt3 wrote
Even if they have full transparency, that doesn't mean they are GDPR compliant. I tried to look into it further but wasn't successful.
mr_birrd t1_j6xvaec wrote
Well, the thing is, you aren't the first one to think about this. They've been doing this for a long time and know that what they do is legal here. They wouldn't waste millions training it just to throw it away afterwards.
myrmil t1_j6xw2sq wrote
Yeah, they sure wouldn't. Kappa