AsthmaBeyondBorders t1_itabm60 wrote on October 22, 2022 at 2:34 AM

Reply to comment by FirstOrderCat in U-PaLM 540B by xutw21

Look at the post you are replying to.

A wall is when we can't improve the results of the last LLMs.

New LLMs, both with different models and bigger scale, not only improve on the performance of the last LLMs on tasks we already know they can do, but we also know there are emergent skills that we may still find by scaling up. The models become capable of doing something completely new just because of scale. When we scale up and stop finding emergent skills, then that's a wall.

FirstOrderCat t1_itacdp4 wrote on October 22, 2022 at 2:40 AM

>A wall is when we can't improve the results of the last LLMs.

The wall is a lack of breakthrough innovations.

Latest "advances" are:

- build Nx larger model

- tweak prompt with some extra variation

- fine-tune on another dataset, potentially leaking benchmark data to training data

- receive marginal improvement in benchmarks irrelevant to any practical task

- give your new model some epic-cringe name: path-to-mind, surface-of-intelligence, eye-of-wisdom

But none of these "advances" can actually replace humans on real tasks, with the exception of style transfer of images and translation.

AsthmaBeyondBorders t1_itaclhq wrote on October 22, 2022 at 2:42 AM

The problem is you don't know what emergent skills are yet to be found because we didn't scale enough. And a "breakthrough" may well be one of the emergent skills we haven't reached yet.

FirstOrderCat t1_itad0zs wrote on October 22, 2022 at 2:46 AM

>The problem is you don't know what emergent skills are yet to be found because we didn't scale enough.

Yes, and you don't know whether such skills will be found, or whether we have already hit the wall.

AsthmaBeyondBorders t1_itad4q2 wrote on October 22, 2022 at 2:47 AM

There is a very old solution to finding that out: scale and check instead of guessing.

FirstOrderCat t1_itaeypy wrote on October 22, 2022 at 3:03 AM

This race may be over.

On the graph, the guy is proud of gaining 2 points on some synthetic benchmark while spending 4 million TPUv4 hours = $12M.
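For what it's worth, the cost figure above works out to about $3 per TPUv4 hour. A quick sketch of that arithmetic, taking both numbers from the comment as given (they are the commenter's claims, not verified figures; the function name is just illustrative):

```python
# Figures taken from the comment above; they are claims, not verified numbers.
tpu_hours = 4_000_000        # "4 million TPUv4 hours"
total_cost_usd = 12_000_000  # "$12M"


def implied_hourly_rate(cost_usd: float, hours: float) -> float:
    """Return the per-hour price implied by a total cost and an hour count."""
    return cost_usd / hours


rate = implied_hourly_rate(total_cost_usd, tpu_hours)
print(f"Implied rate: ${rate:.2f} per TPUv4 hour")
```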

At the same time, we hear that Google is cutting expenses and considering layoffs, and the LLM part of Google Research will be first in line, because it doesn't provide much value to the Ads/Search business.

AsthmaBeyondBorders t1_itagbfz wrote on October 22, 2022 at 3:15 AM

This model had up to 21% gains in some benchmarks, as you can see there are many benchmarks. You may notice this model is still 540B, just like the older one, so this isn't about scale; it is about a different model which can be as good and better than the previous ones while cheaper to train.

You seem to know a lot about Google's internal decisions and strategies as of today, good for you. I can't discuss stuff I have absolutely no idea about, and clearly you have insider information about where Google is going and what they are doing. That's real nice.

FirstOrderCat t1_itahazn wrote on October 22, 2022 at 3:25 AM

>This model had up to 21% gains in some benchmarks, as you can see there are many benchmarks

Meaning they gained less than 2 points in many others.

> it is about a different model which can be as good and better than the previous ones while cheaper to train.

The model is the same; they changed the training procedure.

> You seem to know a lot about Google's internal decisions and strategies as of today

This is public information.

justowen4 t1_itau79p wrote on October 22, 2022 at 5:45 AM

Epic commenting, you two. The winner is... AsthmaBeyondBorders!