ElvinRath t1_j67uali wrote

Writing a description of every step instead of just clicking seems like a downgrade to me.

Oh well, as someone said on the Discord, combined with voice recognition it could be useful for people with disabilities.

13

manubfr t1_j67ybjg wrote

With enough data and a smarter model you could probably ask it first to break down all tasks and then execute them sequentially without human intervention. That’s what Adept ACT-1 is trying to do.

I fully expect that a lot of complex digital tasks will one day be fully automated. You will enter a high-level description of what you want, the model will propose options for you to pick, then calculate the compute budget requirements for your selected options and give you a few quotes.

So for example, “order a burger fries coke now” will essentially be free, while “write and design a 40-page comic book about the story of Theseus in the style of Frank Miller then publish it on amazon” will come back with options (maybe that task costs $20 or something, likely cheaper).

Automating entire workflows is, to me, the most exciting and realistic outcome of LLMs in the next few years.

7

visarga t1_j68znkm wrote

> Automating entire workflows is, to me, the most exciting and realistic outcome of LLMs in the next few years.

They can also use YouTube screencasts - there are millions - to learn about solving tasks with desktop and web apps. YouTube is a treasure trove of procedural data: how to do things step by step, with commentary.

7

visarga t1_j68zvbi wrote

> Writing a description of every step instead of just clicking seems like a downgrade to me.

Use an LLM to write the step-by-step prompts as well, like SayCan:

> We show how low-level tasks can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally extended instructions, while value functions associated with these tasks provide the grounding necessary to connect this knowledge to a particular physical environment.
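
To make the quoted idea concrete, here is a rough toy sketch of how that kind of selection could work: each candidate skill gets a language-model score (how useful it is for the instruction) and a value-function score (how feasible it is right now), and the skill with the highest product wins. The skill names, the numbers, and the `select_skill` helper are all made up for illustration; this is not the SayCan code.

```python
# Toy illustration of SayCan-style skill selection.
# All skill names, scores, and the helper below are hypothetical.

def select_skill(llm_scores, affordances):
    """Return the skill that maximizes LLM relevance times affordance value."""
    combined = {
        # A skill with no affordance estimate is treated as infeasible (0.0).
        skill: llm_scores[skill] * affordances.get(skill, 0.0)
        for skill in llm_scores
    }
    best = max(combined, key=combined.get)
    return best, combined


# Assumed LLM scores: how relevant each skill is to "clean up the spill".
llm_scores = {"pick up sponge": 0.6, "go to sink": 0.3, "pick up apple": 0.1}

# Assumed value-function scores: how likely each skill is to succeed
# from the current state (the sponge is not within reach yet).
affordances = {"pick up sponge": 0.1, "go to sink": 0.9, "pick up apple": 0.8}

best, combined = select_skill(llm_scores, affordances)
print(best)
```

With these invented numbers the language model prefers "pick up sponge", but the low feasibility score pushes the choice to "go to sink", which is the grounding effect the quote describes.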

3