blose1

blose1 t1_jdu4cln wrote

Reply to comment by visarga in [D] GPT4 and coding problems by enryu42

What? Where exactly am I mistaken? Both of my statements are true. There is a 0% chance you can pass an olympiad task without knowledge; a human with all the knowledge WILL reason and come up with a solution BASED on the knowledge he has AND the experience of others that is part of that knowledge. If that weren't true, no human would solve any olympiad. Sorry, but what you wrote in the context of my comment is just ridiculous, and looks like a reply to something I didn't write.

4

blose1 t1_jdso3tt wrote

I literally told you my use case and it failed on that, and it failed on a similar problem 1-2 months ago when I was using the 3.5 version. For my class of problems nothing changes; it fails the same way. I think you have your eyes shut and aren't reading what people write. I'm not talking about easy CRUD problems with thousands of solutions online; ChatGPT does OK on those kinds of tasks and has solved a lot of them for me too.

10

blose1 t1_jdsid0p wrote

It's the same on out-of-distribution problems: it will just confidently say false things. I'll tell it what is wrong and explain why, and it will "correct" the code in a way that makes it wrong/not working in a different way. I recently built a thing, and you can't find anything similar to it anywhere in open source, nor any tutorial/solution to this problem online, and ChatGPT failed to deliver.

At the end of the day it's just statistics based on all available knowledge on the internet.

6

blose1 t1_jdoj8kl wrote

GPT models struggle with out-of-distribution programming tasks, which means they can't create novel ideas. I tested this myself many times and it's not a prompt engineering issue. I think LLMs could act as great teachers but not researchers; teachers just teach what we already know, while researchers create the novel knowledge that teachers use.

7

blose1 t1_j4td3lq wrote

>Recognize the Gmail icon if I say "send an email"

This is not testing intelligence; this is testing whether a human was trained on computer usage, knows what e-mail is, and has used Gmail before.

Someone from a tribe in Africa would fail your test while being human and intelligent; train him on this task like you would train a current-gen multimodal system and he will pass your benchmark. Train an LLM in combination with an image model and an RL model, train it on instruction following using the inputs you described, and now it understands what it sees and can follow what you want it to do.

1

blose1 t1_j3zngmt wrote

Nah, it was clearly like this:

Some exec at MS thinks: "So the company is worth 0 billion now, and we already lost 1 billion, so how much is the company worth? Let me write a program to solve this mathematical problem."

```c
uint8_t company_value = 0;
company_value = company_value - 1;
printf("%u\n", company_value);
```

"Ok, so the company is worth 255 billion now, got it. Guys, I know how much OpenAI is worth!"
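The punchline relies on unsigned 8-bit wraparound: subtracting 1 from 0 wraps modulo 2^8 and lands on 255. A quick way to check it in Python, emulating `uint8` with modular arithmetic:

```python
# Emulate uint8 arithmetic: values live in [0, 255] and wrap modulo 2**8.
company_value = 0
company_value = (company_value - 1) % 256
print(company_value)  # 255
```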

11

blose1 t1_j1apar1 wrote

Even with int8 you need at least 175 GB of VRAM to run one model instance, the time to launch and load it on demand will be higher than using the OpenAI API, and your performance will be lower. Forget about running the current generation of LLMs like OPT/BLOOM in the cloud for real-world cases; they are crap. I've tested them: they loop all the time and can't match ChatGPT's results. You will not get ChatGPT's performance from them without the human-assisted RL step that OpenAI did. So wait for the next gen of open source models or just use ChatGPT.
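The 175 GB figure falls straight out of the parameter count: at one byte per weight (int8), a 175B-parameter model needs ~175 GB for the weights alone, before activations or framework overhead. A rough back-of-the-envelope sketch (numbers are approximate, weights only):

```python
# Weight memory for a 175B-parameter model at different precisions.
# Ignores activations, KV cache, and framework overhead.
params = 175e9
bytes_per_param = {"fp32": 4, "fp16": 2, "int8": 1}
for dtype, nbytes in bytes_per_param.items():
    print(f"{dtype}: {params * nbytes / 1e9:.0f} GB")
# Even int8 is ~175 GB -- multiple datacenter GPUs, not one card.
```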

5

blose1 t1_j14q7ul wrote

Have you actually tried both on the same tasks? It seems like a lot of people here read a paper and a blog post and draw their conclusions without even using the tool. I've used both on the same tasks, compared them on hundreds of real-world cases, and yes, it's a fine-tuned GPT-3, but with human-assisted RL, and it runs circles around GPT-3 in question answering, CoT, and code generation.

2

blose1 t1_j0shnwm wrote

> Hallucinations should be easily solvable by incorporating a factual knowledge database, like in RETRO.

> accountants, para-legals, teachers etc

No, a RETRO-style DB will not help with hallucinations; it will only help with simple fact checking. It won't help with the generative nature of responses, because those aren't facts and you can't check whether they are true against any DB.
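To illustrate the distinction (a toy sketch, not how RETRO actually works; RETRO conditions generation on retrieved text chunks): a lookup can verify an atomic claim, but an open-ended generated sentence has no single entry to check against.

```python
# Toy illustration: lookup-based checking only works for atomic facts.
knowledge_db = {"capital of France": "Paris"}

def check_fact(key: str, claim: str) -> bool:
    """Verify a claim against the DB; requires the claim to map to a key."""
    return knowledge_db.get(key) == claim

print(check_fact("capital of France", "Paris"))  # True
# An open-ended generation like "this design will scale well" has no
# DB key to look up, so no retrieval step can confirm or refute it.
```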

> imo 50% of white collar jobs are going away in the next 10 years.

You are hallucinating.

0

blose1 t1_izczrn8 wrote

I asked ChatGPT about your comment:

>someone asked on reddit "is this for real?" about an article with the title "Building A Virtual Machine inside ChatGPT", how would you respond to him?

A: It's difficult to say without more context. It's possible that the article in question is discussing the use of a large language model, such as GPT-3, to create a virtual machine within a chat environment. However, without being able to view the article and verify its contents, it's impossible to say for sure whether the claim is true or not. In general, it's always a good idea to approach claims with a healthy dose of skepticism and to verify information before accepting it as fact.

>article is indeed about creating a virtual machine inside large language model and simulating running different commands while this model returns accurate responses, specifically using you - ChatGPT, this person is indeed skeptical, knowing now that additional context, how would you respond to that person?

A: It's certainly possible to use a large language model, such as ChatGPT, to simulate running commands within a virtual machine. Language models are able to generate text based on a given input, so they can be used to generate responses to a wide range of prompts. In the case of a virtual machine, the prompts might be commands that a user could enter, and the responses would be the output of those commands. However, it's important to note that these responses are generated by the model and are not the result of the model actually executing the commands. In other words, while the model may be able to generate responses that are similar to the output of a real virtual machine, it is not actually running the commands and therefore is not a true virtual machine.

13