milktoasttraitor t1_jdzuw0z wrote on March 28, 2023 at 12:23 PM

Reply to comment by rfxap in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-

If you look at the prompt they show, they clearly gave it hints which tell it the exact approach to use in order to solve the problem. The problem is also a very slight derivative of another existing, very popular problem on the platform (“Unique Paths”).

This is impressive in another way, but not in the way they were trying to show. They didn’t show the other questions it got right, so no way of telling how good or bad the methodology was overall or what hints they gave it. For that question at least, it’s not good and it makes me skeptical of the results.