Submitted by _underlines_ t3_zstequ in MachineLearning
gelukuMLG t1_j23znll wrote
Reply to comment by EthansWay007 in [D] When chatGPT stops being free: Run SOTA LLM in cloud by _underlines_
I think it saves the highly rated responses and feeds them into a dataset, then uses reinforcement learning that gives those responses a positive reward.
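
To make that guess concrete, here is a toy sketch of the idea in Python. It is not how ChatGPT is actually trained, just an illustration under my own assumptions: the helper names (`build_reward_dataset`, `reinforce_step`), the rating threshold of 4, the +1 reward, and the tiny softmax "policy" over two fixed responses are all made up for the example.

```python
import math

def build_reward_dataset(feedback, min_rating=4):
    """Keep only highly rated (prompt, response) pairs and attach a positive reward.

    The threshold and the reward value of 1.0 are illustrative assumptions.
    """
    return [
        {"prompt": f["prompt"], "response": f["response"], "reward": 1.0}
        for f in feedback
        if f["rating"] >= min_rating
    ]

def softmax(logits):
    """Turn a dict of logits into a dict of probabilities."""
    m = max(logits.values())
    exps = {k: math.exp(v - m) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

def reinforce_step(logits, response, reward, lr=0.5):
    """One REINFORCE-style update on a toy softmax policy over fixed responses."""
    probs = softmax(logits)
    # Gradient of log pi(response) w.r.t. logit k is 1[k == response] - pi(k).
    return {
        k: v + lr * reward * ((1.0 if k == response else 0.0) - probs[k])
        for k, v in logits.items()
    }

# Hypothetical user feedback: two candidate replies to the same prompt.
feedback = [
    {"prompt": "hi", "response": "Hello! How can I help?", "rating": 5},
    {"prompt": "hi", "response": "What.", "rating": 1},
]

dataset = build_reward_dataset(feedback)

# Start with a policy that has no preference between the two replies.
logits = {f["response"]: 0.0 for f in feedback}
before = softmax(logits)

# Reward the highly rated responses that made it into the dataset.
for example in dataset:
    logits = reinforce_step(logits, example["response"], example["reward"])
after = softmax(logits)
```

After the update, the probability of the highly rated reply goes up and the low-rated one goes down, which is the basic effect the positive reward is supposed to have.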