biggieshiba

biggieshiba t1_jdojnn6 wrote

I don't understand why anyone would care. In a few years, half the internet will be AI-generated. If someone uses GPT-4 to generate a sentence that gets posted on Wikipedia, how will you know before using it? Don't you think many models will be trained on that sentence?

Plus, how would they know? Training data is not easy to extract from a model. Unless you are a direct OpenAI competitor, they won't ever care or even look at you (well, maybe their superAI will).

Lastly, the dataset is full of errors. It would be better to generate it again, or even to pay people, which would be quite cheap for 50k examples. This is quite a bad dataset when you really look at it: empty inputs or outputs, unclear instructions, instructions not fit for the model... The fact that it is bad and small is very encouraging BTW, since the resulting model performs pretty well.
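For anyone who wants to check this kind of thing themselves, here is a rough sketch of how the empty-field problems could be flagged. It assumes each example is a dict with `instruction` and `output` keys, which is my guess at the layout and not something confirmed for this dataset; `find_bad_records` and the sample records are made up for illustration. Unclear or ill-suited instructions can't be caught this way and would still need a human look.

```python
def find_bad_records(records):
    """Return (index, reason) pairs for records with an empty instruction or output.

    Assumes each record is a dict with "instruction" and "output" keys
    (a hypothetical layout, adjust to the real dataset).
    """
    problems = []
    for i, rec in enumerate(records):
        # Treat missing keys, None, and whitespace-only strings as empty.
        instruction = (rec.get("instruction") or "").strip()
        output = (rec.get("output") or "").strip()
        if not instruction:
            problems.append((i, "empty instruction"))
        elif not output:
            problems.append((i, "empty output"))
    return problems


# Made-up sample records, only to show the call.
sample = [
    {"instruction": "Name a primary color.", "output": "Red"},
    {"instruction": "Summarize the text.", "output": "   "},
    {"instruction": "", "output": "Some answer"},
]

for index, reason in find_bad_records(sample):
    print(index, reason)
```

Dropping the flagged indices before training would be the cheap first pass; regenerating or paying people to fix the rest is the real cleanup.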

2

biggieshiba t1_iu8524i wrote

Hello, I'm learning as a hobbyist and want to take my trained model to production.

I know front-end and back-end coding, but serving and scaling a model in production seems daunting. I'm looking at AWS right now, but it doesn't seem like the easiest tool for deploying ML models. I thought it would be much easier to deploy a model! (Real-world performance is another problem I will have to study soon.)

3