
endless_sea_of_stars t1_jde88qi wrote

Wonder how this compares to the Toolformer implementation.

https://arxiv.org/abs/2302.04761

Their technique was to use few-shot (in-context) learning to annotate a dataset with API calls, then fine-tune the model on that annotated dataset. During inference, the code would detect the API call, make the call, append the results to the text, and keep going.
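Roughly, the inference loop looks like this (my own minimal sketch, not their code; `model.generate` and the exact bracket syntax are stand-ins for the paper's setup):

```python
import re

CALL = re.compile(r"\[(\w+)\((.*?)\)\]")   # e.g. "[Calculator(400/1400)]"

def generate_with_tools(prompt: str, model, tools: dict) -> str:
    """Generate text; whenever the model emits an API call, execute it,
    splice the result into the text, and keep generating."""
    text = prompt
    while True:
        completion = model.generate(text)    # hypothetical LM interface
        m = CALL.search(completion)
        if m is None:
            return text + completion         # no (more) tool calls
        name, args = m.group(1), m.group(2)
        result = tools[name](args)           # the actual API request
        # Keep text up to and including the call, append its result, go again.
        text += completion[: m.end()] + f" -> {result}"

tools = {"Calculator": lambda expr: str(eval(expr))}  # toy handler only
```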

The limitation of that methodology is that you have to fine-tune the model for each new API. Wonder what OpenAI's approach is?

Edit:

I read through the documentation. Looks like it's done through in-context learning: they just prepend the API's description to your call and let the model figure it out. That also means you get charged for the tokens used in the API description, and those tokens count against the context window. It's unclear whether any fine-tuning was done on the model to better support APIs or whether they're just using the base model's capabilities.
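Conceptually something like this (a toy sketch of the idea; OpenAI's actual prompt format isn't public):

```python
PLUGIN_DESCRIPTION = """\
Tool: weather
Description: get the current weather for a city.
Usage: weather(city) -> JSON with temp_c and conditions
"""

def build_prompt(user_message: str) -> str:
    # The description rides along on every request, so you pay for these
    # tokens each time and they eat into the context window.
    return PLUGIN_DESCRIPTION + "\nUser: " + user_message + "\nAssistant:"

print(build_prompt("Do I need an umbrella in London today?"))
```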

54

iamspro t1_jderz7f wrote

I tried fine-tuning vs. few-shot for my own implementation, and in the end few-shot was just much easier, despite the context-window drawback. A huge advantage is that you can dynamically add/remove/update APIs in an instant.
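E.g. keep the descriptions in a plain dict and rebuild the prompt per request; swapping an API is a one-line change, no retraining (hypothetical sketch):

```python
tool_registry = {
    "search":   "search(query) -> top web results as text",
    "calendar": "calendar(date) -> events for that date",
}

def system_prompt() -> str:
    lines = [f"- {name}: {desc}" for name, desc in tool_registry.items()]
    return "You can call these tools:\n" + "\n".join(lines)

# Adding or removing an API takes effect on the very next request:
tool_registry["stocks"] = "stocks(ticker) -> latest price"
del tool_registry["calendar"]
```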

29

endless_sea_of_stars t1_jdezatt wrote

I suspect future versions will do both. They will "bake in" some basic APIs, like a simple calculator, calendar, and fact lookups, and use in-context learning for third-party APIs.

18

iamspro t1_jdf0f1o wrote

Good point, and that baking-in could also cover the overall sense of how to get the syntax right.

6

countalabs t1_jdibk1j wrote

The "fine-tuning" in the OpenAI API can be few-shot. The other approach, putting the instruction or example in context, should be called zero-shot.

1

iamspro t1_jdj4wzl wrote

Fine-tuning is distinct afaik... using OpenAI's language for it[1]:

zero-shot: no examples in the prompt, just an input (and/or instruction)

few-shot: one or more examples of input+output in the prompt, plus new input

fine-tuning: updating the model with examples (which can then be used with zero- or few-shot as you wish)

[1] https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api (part 5)
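Toy illustration (sentiment classification):

```python
zero_shot = "Label the sentiment of: 'I loved this movie.'\nSentiment:"

few_shot = (
    "Review: 'Terrible plot.' Sentiment: negative\n"
    "Review: 'A masterpiece.' Sentiment: positive\n"
    "Review: 'I loved this movie.' Sentiment:"
)

# Fine-tuning instead bakes such input/output pairs into the weights;
# the fine-tuned model can then be prompted zero- or few-shot as usual.
```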

3

_faizan_ t1_jdwdwsm wrote

Is there an open implementation of Toolformer, or did you roll your own for fine-tuning? They did mention in their paper that they gave a few in-context examples of tool usage and then used GPT-J to label more text, which they finally used for fine-tuning. Did you follow a similar approach? I have been looking to reproduce Toolformer but am not sure where to even start.

1

wind_dude t1_jdf5yhj wrote

Looking at their limited docs, I feel it's a little simpler than Toolformer; probably more like the BlenderBot models for search, plus prompt engineering:

- Matching intent from the prompt to a description of the plugin service

- Extracting relevant terms from the prompt to send as query params, based on the description of the endpoint

- Incorporating the API response into the model's response


"The file includes metadata about your plugin (name, logo, etc.), details about authentication required (type of auth, OAuth URLs, etc.), and an OpenAPI spec for the endpoints you want to expose.The model will see the OpenAPI description fields, which can be used to provide a natural language description for the different fields.We suggest exposing only 1-2 endpoints in the beginning with a minimum number of parameters to minimize the length of the text. The plugin description, API requests, and API responses are all inserted into the conversation with ChatGPT. This counts against the context limit of the model." - https://platform.openai.com/docs/plugins/introduction

9

signed7 t1_jdfcly9 wrote

It's a shame that 'Open'AI has become so closed. Would be so cool to see a proper paper with technical details on how this works...

10

meister2983 t1_jdgghu6 wrote

The Microsoft Research paper assessing the intelligence of GPT-4 effectively did this. If you just define APIs for the model to use under certain conditions, it will write the API call. Once you do that, it's straightforward for a layer on top to detect the API call, actually execute it, and write the result back.
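The "layer on top" can be almost trivial; a rough sketch, where the call format is whatever you instructed the model to use and `chat_model` is a hypothetical chat interface:

```python
import json, re

HANDLERS = {"weather": lambda args: {"temp_c": 11, "rain": True}}  # stub

CALL_LINE = re.compile(r"^CALL (\w+) (\{.*\})$", re.MULTILINE)

def run_turn(messages: list, chat_model) -> str:
    """If the model's reply contains an API call, execute it and hand the
    result back so the model can write it into its final answer."""
    reply = chat_model(messages)
    m = CALL_LINE.search(reply)
    if m is None:
        return reply                       # ordinary answer, no call
    name, args = m.group(1), json.loads(m.group(2))
    result = HANDLERS[name](args)          # actually execute the call
    messages = messages + [
        {"role": "assistant", "content": reply},
        {"role": "system", "content": "RESULT " + json.dumps(result)},
    ]
    return chat_model(messages)            # model writes the result back in
```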

5

daugaard47 t1_jdkkyds wrote

Wish they had stayed open source, but I can understand why they sold out. There would have been no way they could handle the amount of traffic/demand if they had remained a non-profit. But as someone who works for a non-profit, I don't understand how they legally changed to a for-profit over a week's time. 😐

2

godaspeg t1_jdgih6t wrote

In the "Sparks of AGI" GPT-4 paper (I can totally recommend having a look, it's crazy), the authors talk about the amazing ability of the uncensored GPT-4 version to use tools. This probably suits OpenAI's simple plugin approach quite well, so I have high expectations.

5

drcopus t1_jdhjddx wrote

Imo doing everything in-context seems more hacky; I would rather see a Toolformer approach, but I understand that it probably requires more engineering and compute.

I reckon the in-context approach probably makes the plugins less stable, since the model has to nail the syntax. ChatGPT is good at coding, but it makes basic errors often enough to notice.
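One mitigation (my own sketch, not anything OpenAI documents) is to validate the emitted call and re-prompt with the error; here `emit_call` is a hypothetical callback that asks the model for a JSON tool call:

```python
import json

def parse_call_with_retry(emit_call, max_tries: int = 3) -> dict:
    """Ask the model for a JSON tool call; re-ask if it's malformed."""
    error = None
    for _ in range(max_tries):
        raw = emit_call(error)  # pass the parse error back as a hint
        try:
            call = json.loads(raw)
            if not (isinstance(call, dict) and {"tool", "args"} <= call.keys()):
                raise ValueError("missing 'tool'/'args' keys")
            return call
        except (json.JSONDecodeError, ValueError) as e:
            error = f"Bad call ({e}); reply with JSON containing 'tool' and 'args'."
    raise RuntimeError("model never produced a well-formed call")
```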

2