bigabig

bigabig t1_j6z3a6d wrote on February 2, 2023 at 10:18 PM

Reply to [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta

I thought this was also because you don't need as much supervised training data, since you 'just' have to train the reward model in a supervised fashion?
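
To make the "supervised" part concrete, here is a minimal sketch of the pairwise ranking loss commonly used for reward models trained on human-ranked pairs. It assumes the reward model has already produced a scalar score for each of two responses to the same prompt. The function name and arguments are hypothetical and only for illustration, not taken from any particular codebase.

```python
import math


def pairwise_reward_loss(score_preferred: float, score_rejected: float) -> float:
    """Return -log(sigmoid(score_preferred - score_rejected)).

    The loss is small when the preferred response scores well above the
    rejected one, and large when the ordering is reversed.
    """
    margin = score_preferred - score_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores: the labeler preferred the first response.
loss_correct_order = pairwise_reward_loss(2.0, 0.0)
loss_wrong_order = pairwise_reward_loss(0.0, 2.0)
```

Each ranked pair gives one such supervised training signal for the reward model; the RL step then optimizes the language model against the learned reward instead of needing a labeled target for every output.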

bigabig t1_j6yc5gr wrote on February 2, 2023 at 7:28 PM

Reply to [N] Microsoft integrates GPT 3.5 into Teams by bikeskata

Is the automatic transcription done with OpenAI Whisper?