
farmingvillein t1_ixdp4t8 wrote

Very neat! Would love to see a version built with fewer filters (secondary models)--i.e., more grounded in a singular, "base" model // less hand-tweaking--but otherwise very cool. (Although wouldn't surprise me if simply upgrading the model size went a long way here.)


Acceptable-Cress-374 t1_ixg6ngd wrote

Listened to a podcast with Andrej Karpathy recently, and his intuition for the future of LLMs is that we'll see more collaboration and stacking of models, sort of a "council of GPTs" approach, where models trained on particular tasks work together towards a goal.
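That "council" idea can be sketched as a simple dispatcher that routes each task to a model specialized for it. This is purely illustrative: the task names, the specialist functions, and the routing rule are all invented for the example.

```python
# Hypothetical "council of models": a dispatcher routes each task to a
# specialist model. The specialists here are stand-in functions; in a
# real system they would be separately trained models.

def summarizer(text: str) -> str:
    # Stand-in for a model fine-tuned on summarization.
    return f"summary({text})"

def translator(text: str) -> str:
    # Stand-in for a model fine-tuned on translation.
    return f"translation({text})"

SPECIALISTS = {
    "summarize": summarizer,
    "translate": translator,
}

def council(task: str, text: str) -> str:
    """Dispatch the input to the specialist trained for this task."""
    model = SPECIALISTS.get(task)
    if model is None:
        raise ValueError(f"no specialist for task: {task}")
    return model(text)
```

In practice the routing itself could be another learned model, which is part of what makes this a "stacking" approach rather than a single end-to-end system.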

Whatever the future holds, I'm betting we'll see constant improvements over the next few years before we see a revolutionary new one-model approach.


farmingvillein t1_ixgd88a wrote

Yeah, understood, but that wasn't really what was going on here (unless you take a really expansive definition).

They were basically doing a ton of hand-calibration across a very large number of models to achieve the desired end-goal performance--if you read the supplementary materials, you'll see that they did a lot of very fiddly work to select model output thresholds, build training data, etc.

On the one hand, I don't want to sound overly critical of a pretty cool end-product.

On the other, it really looks a lot more like a "product", in the same way that any gaming AI would be, than a singular (or close to it) AI system which is learning to play the game.


graphicteadatasci t1_ixh2mk4 wrote

But they specifically created a model for playing Diplomacy - not a process for building board-game-playing models. With the right architecture and processes they could probably do away with most of that hand-calibration, but the goal here was to create a model that does one thing.


farmingvillein t1_ixictvr wrote

Hmm. Did you read the full paper?

They didn't create a model that does one thing.

They built a whole host of models, with high levels of hand calibration, each configured for a separate task.


kaj_sotala t1_ixj2dte wrote

Do you happen to remember which podcast that was? Sounds interesting.
