
ftc1234 t1_it5pgx3 wrote

LLMs are not the be-all and end-all. They're good at understanding context when generating content. But can they reason by themselves without seeing the pattern ahead of time? Can they distinguish between the quality of the results they generate? Can they hold an opinion that isn't just the mean of the output probability distribution?

There is a lot more to intelligence than finding patterns in a context. We agree that we are on a non-linear path of AI advancement. But a lot of that has to do with advances in GPUs, and that's kinda stalled with the death of Moore's law. We are nowhere close to simulating the 100 trillion neural connections we have in a human brain.

7

justowen4 t1_it645n5 wrote

In case you missed it, LLMs surprised us by being able to scale beyond expectations. The underestimation was because LLMs came from the NLP world with simple word2vec-style word associations. In 2017 the groundbreaking "Attention Is All You Need" paper showed that the simple transformer architecture alone, with lots of GPU time, can outperform other model types. Why? Because it's not an NLP word-association network anymore, it's a layered context calculator that uses words as ingredients. Barely worth calling them LLMs unless you redefine language to be integral to human intelligence.
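
To make the "layered context calculator" point concrete, here is a minimal numpy sketch of the scaled dot-product self-attention at the heart of that paper. Shapes and values are toy, and the learned projections and multiple heads of a real transformer are omitted:

```python
# Toy sketch of scaled dot-product self-attention ("Attention Is All You Need").
# Every position's output is a context-weighted mix of all other positions.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # Real transformers learn separate query/key/value projections and stack
    # many heads and layers; here Q = K = V = x purely for illustration.
    d_k = x.shape[-1]
    scores = x @ x.T / np.sqrt(d_k)        # pairwise similarity between positions
    weights = softmax(scores, axis=-1)     # each row is a distribution over context
    return weights @ x                     # contextualized representations

tokens = np.random.randn(4, 8)             # 4 toy "words", 8-dim embeddings
print(self_attention(tokens).shape)        # (4, 8)
```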

9

ftc1234 t1_it64mud wrote

I know what LLMs are. They are a surprising development, but the history of technology is littered with surprising discoveries and inventions, and very few of them are of the earth-shattering variety. I don't believe that LLMs are, for the reasons I stated. CNNs were all the rage before LLMs. And respected giants in the field such as Yann LeCun have also stated that LLMs are important but they aren't everything.

4

AdditionalPizza OP t1_it6o96e wrote

They may not be the be-all and end-all, but they sure look like a very significant step at the very least.

But as I've said in the comments before, this post is about the time before AGI. We don't need AGI to see massive disruptions in society. I believe LLMs are the way we will get there, but language models are already "good enough" to increase productivity across enough IT sectors that we will start seeing some really big changes soon.

Advancements like this are going to lead to more powerful LLMs too. I highly suggest reading this article from DeepMind, as the implications are important.

4

ftc1234 t1_it7c7j6 wrote

The problem is often the last-mile issue. Say you use LLMs to generate a T-shirt style or a customer-service response. Can you verify correctness? Can you verify that the response is acceptable (e.g., not offensive)? Can you ensure that it isn't biased in its response? Can you make sure it's not misused by bad actors?

You can't represent all of that with just patterns. You need reasoning. LLMs are still a tool to be exercised with caution by a human operator. They can dramatically increase the output of a human operator, but their limitations are such that they're still bound by the throughput of that operator.

The problems we have with AI are akin to the problems we had with the internet. The internet was born and adopted in a hurry, but it had so many side effects (e.g., the dark web, cyber attacks, exponential social convergence, a conduit for bad actors, etc.). We aren't anywhere close to solving those side effects. LLMs are still so limited in their capabilities. I hope society will choose to be thoughtful in deploying them in production.

2

AdditionalPizza OP t1_it7dt3m wrote

All I can really say is that issues like that are being worked on as we speak, and have been since inception. What I'm proposing we question a little more is the assumption that it will take years and years to solve some of them.

But I'm also not advocating that fully automated systems will replace all humans in a year. I'm saying a lot of humans won't be useful at their current jobs when an AI under human oversight replaces them, and their skill level won't advance quickly enough in other fields to keep up, rendering them unemployed.

3

ftc1234 t1_it7f3se wrote

I am postulating something in the opposite direction of your thesis. The limitations of LLMs and modern AI are such that the best they can do is enhance human productivity; they're not enough to replace it. So we'll see a general improvement in the quality of human output, but I don't foresee large-scale unemployment anytime soon. There may be a shift in the employment workforce (e.g., a car mechanic may be forced to close shop and operate alongside robots at the Tesla gigafactory), but large-scale replacement of human labor will take a lot more advancement in AI. And I have doubts about whether society will even accept such a situation.

2

AdditionalPizza OP t1_it7hczg wrote

Yeah we have totally opposite opinions haha. I mean we have the same foundation, but we go different directions.

I believe increasing human productivity with AI will undoubtedly lead to a quicker rate with which we achieve more adequate AI and then the cycle continues until the human factor is unnecessary.

While I'm not advocating full automation of all jobs right away, I am saying there's a bottom rung of the ladder that will be removed, and when there are only so many rungs, eventually the ladder won't work. As in, chunks of corporations will be automated and there won't be enough jobs elsewhere to absorb the majority of the unemployed.

2

visarga t1_it6nwso wrote

> But can they reason by themselves without seeing the pattern ahead of time? Can they distinguish between the quality of the results they generate? Can they hold an opinion that isn't just the mean of the output probability distribution?

Yes, it's only gradually ramping up, but there is a concept of learning from verification. For example, AlphaGo learned from self-play, where it was trivial to verify who won the game. In math it is possible to plug the solution back in to verify it, in code it is possible to run it or apply test-driven feedback, and with robotics it is possible to run sims and learn from the outcomes.
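
As a rough sketch of that "run it and learn from the outcome" loop for code (the `model`, the task format, and the helper names are hypothetical placeholders, not any particular system):

```python
# Sketch: sample candidate programs, run the task's unit tests, and keep
# only the candidates that pass as extra training data. The automatic
# pass/fail signal stands in for human labeling.
def collect_verified_solutions(model, tasks, samples_per_task=8):
    verified = []
    for task in tasks:                               # task: {"prompt", "unit_tests"}
        for _ in range(samples_per_task):
            candidate = model.generate(task["prompt"])
            if passes_tests(candidate, task["unit_tests"]):
                verified.append((task["prompt"], candidate))
                break
    return verified                                  # fine-tune on these later

def passes_tests(code, tests):
    namespace = {}
    try:
        exec(code, namespace)                        # NOTE: sandbox this in practice
        for test in tests:
            exec(test, namespace)                    # tests raise on failure
        return True
    except Exception:
        return False
```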

When you move to purely textual tasks it becomes more complicated, but there are approaches. For example, if you have a collection of problems (multi-step, complex ones) and their answers, you can train a model to generate intermediate steps and supporting facts. Then you use those intermediate steps to generate the final answer, which you can verify. This trains a model to discover step-by-step solutions on its own and solve new problems; see the sketch below.
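
A sketch of that idea, rejection-sampling step-by-step solutions against known answers (`model.generate` and `extract_final_answer` are hypothetical placeholders):

```python
# Sketch: generate candidate step-by-step solutions and keep only the chains
# whose final answer matches the known answer; the verified chains become
# training data for step-by-step problem solving.
def mine_reasoning_chains(model, problems, samples=16):
    kept = []
    for prob in problems:                      # prob: {"question", "answer"}
        for _ in range(samples):
            steps = model.generate(prob["question"] + "\nLet's work step by step.")
            if extract_final_answer(steps) == prob["answer"]:
                kept.append((prob["question"], steps))
                break
    return kept                                # only answer-verified chains survive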

Another approach is to use models to curate the training data. For example, LAION-400M is a dataset curated from noisy text-image pairs by generating alternative captions and then picking the best one, either the original or one of the generated captions. So we use the model to grow our training data, which will boost future models in places that are out of distribution.
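
A sketch of that curation step, scoring each candidate caption against the image with a CLIP-style similarity model (the names are illustrative, not the exact LAION pipeline):

```python
# Sketch: for every image, compare the original caption with a few generated
# alternatives and keep whichever the scoring model rates as the best match.
def curate_captions(scorer, captioner, pairs, n_alternatives=4):
    curated = []
    for image, original_caption in pairs:
        candidates = [original_caption]
        candidates += [captioner.generate(image) for _ in range(n_alternatives)]
        best = max(candidates, key=lambda c: scorer.similarity(image, c))
        curated.append((image, best))
    return curated                             # cleaner pairs for the next model
```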

So it's all about being creative but then verifying somehow and using the signal to train.

2

ftc1234 t1_it7ak7b wrote

I think you understand the limitations of the approaches you've discussed. Generating intermediate results and trying out possible outcomes is not reasoning; it's akin to a Monte Carlo simulation. We do real reasoning every day (e.g., is there time to eat breakfast or do you have to run to the office for the meeting; do you call the plumber this week or do you wait until next month for the full paycheck). LLMs are just repeating patterns, and that can only take you so far.

1

visarga t1_it8o018 wrote

> Generating intermediate results and trying out possible outcomes is not reasoning.

Could be. People do something similar when faced with a novel problem. It doesn't count if you've memorised the best action from previous experience.

1