spiritus_dei OP t1_j77u2ic wrote on February 4, 2023 at 7:09 PM

That might be why RLHF (reinforcement learning from human feedback) is ultimately doomed to fail.