Submitted by FerretDude t3_y8y8cm in MachineLearning

Hey all!

My name is Louis Castricato. I lead CarperAI, a large FOSS group that recently released a library for doing distributed RLHF.

We just announced a project today during Scale's TransformX conference to reimplement InstructGPT, make all the datasets available under the MIT license, and release our checkpoints/models.

I'm super interested in the democratization of large-scale RLHF, as I feel it's a relatively unexplored space in the open-source community.

To that end, we'd love to get the subreddit and community more involved in our task selection process for our instruct model. We'll be hosting a panel on this in a few weeks, so I'm curious, r/machinelearning: what kinds of tasks would you love to see an instruct model tuned on if you had infinite resources?

Here is our instruct announcement: https://carper.ai/instruct-gpt-announcement/

And a link to our discussion panel on the CarperAI Discord: https://discord.gg/cCR3xEAt?event=1029746950305751141

Excited to hear your thoughts!

49

Comments


visarga t1_it323xj wrote

I'd like to see information extraction from semi-structured documents like receipts, invoices, forms, contracts, screenshots (apps), etc. The format would be question answering: you prompt with a document transcribed as text plus a question, and get the value in return.
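A rough sketch of what such a prompt could look like, written in Python. The function name `build_extraction_prompt`, the `Document:` / `Question:` / `Answer:` labels, and the sample receipt are all assumptions made for illustration; nothing here is a format proposed by CarperAI.

```python
def build_extraction_prompt(document_text: str, question: str) -> str:
    """Combine a transcribed document and a question into one QA-style prompt.

    The section labels are hypothetical; any consistent layout would do.
    """
    return (
        "Document:\n"
        f"{document_text.strip()}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Made-up receipt text, standing in for OCR output of a real document.
receipt = """
ACME MARKET
Milk 2.49
Bread 3.10
TOTAL 5.59
"""

prompt = build_extraction_prompt(receipt, "What is the total amount?")
print(prompt)
```

The model would be expected to continue after `Answer:` with just the extracted value.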

6

FerretDude OP t1_it363tv wrote

Yeah, I think a more general format for information extraction could potentially be useful.

3

ivalm t1_it6c261 wrote

Task-oriented dialogues following a description of the task.

An example from the medical domain:

————

Ask what symptoms the patient is feeling. For each symptom ascertain symptom duration, if it is worsening or improving, if there are alleviating or aggravating factors. For symptoms with uncertain location ascertain symptom location.

Doctor: Hi, what brings you in here today?

Patient: I’ve been having a sore throat and pain in my face for the past week.

Doctor: [start generation]

—————

In particular, such tasks (after multiple dialogue turns) are hard because the model needs to stay coherent throughout the conversation. This is something that, e.g., davinci-2 is much better at than davinci.
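The prompt layout in the example above could be assembled with something like the following Python sketch. The helper name `build_dialogue_prompt` and its parameters are hypothetical; it only shows how a task description and the running dialogue might be joined so the model generates the next turn.

```python
from typing import List, Tuple


def build_dialogue_prompt(
    task_description: str,
    turns: List[Tuple[str, str]],
    next_speaker: str = "Doctor",
) -> str:
    """Join a task description and prior turns, ending at the speaker to generate."""
    lines = [task_description.strip(), ""]
    for speaker, utterance in turns:
        lines.append(f"{speaker}: {utterance}")
        lines.append("")
    # Leave the final speaker label open so the model continues from here.
    lines.append(f"{next_speaker}:")
    return "\n".join(lines)


task = "Ask what symptoms the patient is feeling. For each symptom ascertain symptom duration."
history = [
    ("Doctor", "Hi, what brings you in here today?"),
    ("Patient", "I've been having a sore throat and pain in my face for the past week."),
]

dialogue_prompt = build_dialogue_prompt(task, history)
print(dialogue_prompt)
```

For a multi-turn evaluation, each generated doctor turn and each patient reply would be appended to `history` before building the next prompt, which is where coherence over the conversation gets tested.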

3

Ok-Zombie2406 t1_it7vw7b wrote

Is there any way to find bias with RLHF models?

2