Submitted by sideways t3_103hwns in singularity
bernard_cernea t1_j30sckp wrote
The highest-voted comment on that blog, by jbash, quoted here for your convenience:
> I really don't care about IQ tests; ChatGPT does not perform at a human level. I've spent hours with it. Sometimes it does come off like a human with an IQ of about 83, all concentrated in verbal skills. Sometimes it sounds like a human with a much higher IQ than that (and a bunch of naive prejudices). But if you take it out of its comfort zone and try to get it to think, it sounds more like a human with profound brain damage. You can take it step by step through a chain of simple inferences, and still have it give an obviously wrong, pattern-matched answer at the end. I wish I'd saved what it told me about cooking and neutrons. Let's just say it became clear that it was not using an actual model of the physical world to generate its answers.
> The other examples are cherry-picked. Having prompted DALL-E and Stable Diffusion quite a bit, I'm pretty convinced those drawings are heavily cherry-picked; normally you get a few that match your prompt, plus a bunch of stuff that doesn't really meet the specs, not to mention a bit of eldritch horror. That doesn't happen if you ask a human to draw something, not even if it's a small child. And you don't have to iterate on the prompt so much with a human, either.
> Competitive coding is a cherry-picked problem, about as easy as a coding challenge gets... the tasks are tightly bounded, described in terms that almost amount to code themselves, and come with comprehensive test cases. On the other hand, "coding assistants" are out there annoying people by throwing really dumb bugs into their output (which is just close enough to right that you might miss those bugs on a quick glance and really get yourself into trouble).
> Self-driving cars bog down under any really unusual driving conditions in ways that humans do not... which is why they're being run in ultra-heavily-mapped urban cores with human help nearby, and even then mostly for publicity.
> The protein work is getting along toward generating enzymes, but I don't think it's really there yet. The Diplomacy bot is indeed scary, but it still operates in a very limited domain.
> ... and none of them has the agency to decide why it should generate this or that, to systematically generate things in pursuit of an actual goal in a novel or nearly unrestricted domain, or to adapt flexibly to the unexpected. That's what intelligence is really about.
> I can't say when somebody will patch together an AI with a human-like level of "general" performance. Maybe it will be soon. Again, the game-playing stuff is especially concerning. And there's a disturbingly large amount of hardware available. Maybe we'll see true AGI even in 2023 (although I still doubt it a lot).
> But it did not happen in 2022, not even approximately, not even "in a sense". Those things don't have human-like performance in domains even as wide as "drawing" or "computer programming" or "driving". They have flashes of human-level, or superhuman, performance in parts of those domains... along with frequent abject failures.
ebolathrowawayy t1_j32jtql wrote
> The other examples are cherry-picked. Having prompted DALL-E and Stable Diffusion quite a bit, I'm pretty convinced those drawings are heavily cherry-picked; normally you get a few that match your prompt, plus a bunch of stuff that doesn't really meet the specs, not to mention a bit of eldritch horror.
Clearly he barely used SD.
AsuhoChinami t1_j32o2r3 wrote
Yeah. You can debate the overall intelligence of AI, but AI art and image generation are now very good. There is simply no getting around this.
ebolathrowawayy t1_j32ps2r wrote
His unwillingness to engage with the tools in front of him led him to mischaracterize image gen, and it makes me suspect most of his other arguments are just as poorly grounded.
Yes, ChatGPT has some pretty serious flaws, but they seem to be solved by other models. I won't be surprised when gpt-4 comes out and is indistinguishable from an extremely smart human.
Left-Shopping-9839 t1_j31u10m wrote
Oh, you sound like someone who has actually used the tools. All the hype comes from people who have only read about them. Just try putting code written by copilot straight into production!
[deleted] t1_j321dkw wrote
[deleted]
Left-Shopping-9839 t1_j3243r8 wrote
If you actually do real software development, you would know this isn't possible. By 'you' I mean anyone, not specifically you. I have spent hours tracing strange errors back to the fact that I didn't check the copilot code closely enough. It does a great job providing code that is 90% correct, but it often slips in undeclared variables and the like. This is not 'intelligence'. It's just an awesome code-completion tool that makes a lot of mistakes but still saves a lot of typing.
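To illustrate the kind of slip being described (a made-up sketch, not actual copilot output; the function and names here are invented for the example):

```typescript
// Hypothetical example of a "90% correct" suggestion. The sliding-window
// logic is fine, but the suggested version omitted the declaration of
// `total`. TypeScript catches that at compile time; plain JavaScript would
// silently create a global, and the bug could survive a quick review.
function movingAverage(values: number[], window: number): number[] {
  const result: number[] = [];
  for (let i = 0; i + window <= values.length; i++) {
    let total = 0; // <- the line the suggestion left out
    for (let j = i; j < i + window; j++) {
      total += values[j];
    }
    result.push(total / window);
  }
  return result;
}
```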
[deleted] t1_j32ca6v wrote
[deleted]
Left-Shopping-9839 t1_j32drm7 wrote
Agree 100%. In my company (and I think most others are this way), your code has to pass tests. That's what is missing from the copilot model: they would need to track feedback all the way to production, and the fixes applied there, to know whether a code suggestion was good. That is the sort of learning loop which needs to be in place to even start to claim intelligence. Hopefully they are working on this. I use copilot and honestly I love it. It's not going to be replacing humans in its current iteration, yet the hype train keeps rolling. LLMs are mockingbirds. They are impressively good, but still mockingbirds. DALL-E, imo, is shit.
footurist t1_j329a5h wrote
Yes, that's what I concluded early on when reading about this for coding.
One thing it's really, really good for, though, is providing a context-sensitive solution (sometimes) to give you a head start.
Left-Shopping-9839 t1_j32f2t0 wrote
I use copilot for everything and I love it. There are times when it spits out code that looks exactly like what I'm thinking of and does it better than I could. In those moments I could easily claim the singularity has arrived. The next time, it creates something that uses some library of functions that I don't even have imported, and that sometimes doesn't even exist, LOL. So even if they work out the simple stuff, it's still a long way from being anything other than awesome code completion.
ebolathrowawayy t1_j32k5fs wrote
I haven't used copilot but chatgpt is great for generating a starting point for libraries you've never used before. Literally better than the official docs in most cases.
Edit: Have you used both chatgpt and copilot? How do they differ for code gen?
Left-Shopping-9839 t1_j33pp82 wrote
Thanks for the tip. I'll try that.
footurist t1_j32nl8k wrote
Yes, the inventing part can be incredibly funny. Just recently it had me fooled completely. It was making up believable functions like there's no tomorrow: "updateTargetValue", "reverseDirection". Lmao.
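A sketch of what that kind of hallucination looks like in practice (the `Slider` class is invented for illustration; only the two method names come from the comment above):

```typescript
// A minimal, made-up class standing in for some real library type.
class Slider {
  constructor(public value: number) {}
  setValue(v: number) { this.value = v; }
}

const slider = new Slider(0);
slider.setValue(42); // a method that actually exists

// A hallucinated completion confidently calls methods that were never
// defined anywhere; they fail to compile the moment you uncomment them:
// slider.updateTargetValue(42);
// slider.reverseDirection();
```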
ZenMind55 t1_j32wqxu wrote
People don't understand the concept of exponential increases. All of the things they mentioned have only come about in the last couple of years. Saying "it's not as smart as you think" will become incorrect very soon.
Just like in the image, it's a small gap between a dumb human and Einstein, and AI will bridge that gap sooner than we think.
CommercialNo6364 t1_j33edb6 wrote
>Diplomacy bot
What is it?
CommercialNo6364 t1_j33efyd wrote
OK, it's Meta's Cicero, I guess.
DukkyDrake t1_j34ncft wrote
It doesn't matter if people convince themselves or others that some AI tool is AGI. The only thing that matters is whether the tool is competent at important tasks; giving it a particular name doesn't change its competency.
All we have are superintelligent tools that are good at unimportant things. They're unreliable because they don't really understand anything. It will take serious engineering to integrate unreliable tools into important systems, and that will limit their spread in the physical world.