gelukuMLG t1_j23znll wrote on December 29, 2022 at 2:24 PM

I think it saves the highly rated responses into a dataset, then uses reinforcement learning and gives those responses a positive reward.
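
To make that guess concrete, here is a rough Python sketch of the idea. It is only a toy: the names `build_reward_dataset` and `reinforce_step`, the rating threshold, the reward of 1.0, and the two-candidate "policy" are all made up for illustration and say nothing about how any real system is actually trained.

```python
import math

def build_reward_dataset(rated_responses, min_rating=4):
    """Keep responses rated at or above min_rating and tag them with a positive reward.

    The threshold and the reward value of 1.0 are assumptions for this sketch.
    """
    return [
        {"prompt": r["prompt"], "response": r["response"], "reward": 1.0}
        for r in rated_responses
        if r["rating"] >= min_rating
    ]

def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(logits, chosen, reward, lr=0.5):
    """One REINFORCE-style update on a softmax policy.

    Uses grad log p(chosen) = one_hot(chosen) - probs, scaled by the reward.
    """
    probs = softmax(logits)
    return [
        logit + lr * reward * ((1.0 if i == chosen else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]

# Toy data: two candidate replies to the same prompt, with user ratings.
ratings = [
    {"prompt": "hi", "response": "Hello!", "rating": 5},
    {"prompt": "hi", "response": "go away", "rating": 1},
]
dataset = build_reward_dataset(ratings)

candidates = ["Hello!", "go away"]
logits = [0.0, 0.0]  # the toy policy starts with no preference
for example in dataset:
    chosen = candidates.index(example["response"])
    logits = reinforce_step(logits, chosen, example["reward"])
```

After the loop, the toy policy assigns a higher probability to the highly rated reply than to the poorly rated one, which is the effect a positive reward is supposed to have.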