I just gave GPT-4 an IQ test. It scored a 130.

I just finished giving GPT-4 the same IQ test that ChatGPT was given back when its IQ was estimated to be in the 80s. I also gave ChatGPT the test a couple of months ago; it scored up to around 100 if it was prompted to use Chain-of-Thought reasoning (and replicated the score in the 80s if no instructive prompt was given).

GPT-4 did much better on most problems. It got the majority of problems right, and it got them right for the right reasons. I didn't do anything other than act as an interface between GPT-4 and the test. Here's the prompt if anyone wants to replicate it:

"You are being tested for intelligence. You will perform to the best of your ability. You will always show your work and reasoning BEFORE you give an answer. First, work out how to solve the problem, then solve it after. Every problem will come in the form of a TRUE/FALSE statement. You will need to determine which it is. Do you understand?"

I know these IQ tests, especially the online variety, are not known for their accuracy. But having watched GPT reason through each question, the difference between ChatGPT and GPT-4 is night and day.

I'm not saying that this means GPT-4 is the be-all and end-all of AI. But what this demonstrates is that continued scaling of these systems leads to continued improvements. That means we'll be witnessing a technological race to find the limit.

Buckle up.

Comments