Submitted by RadioFreeAmerika t3_122ilav in singularity
elehman839 t1_jdt94ba wrote
Reply to comment by ArcticWinterZzZ in Why is maths so hard for LLMs? by RadioFreeAmerika
Here's a neat illustration of this. Ask ChatGPT to multiply any two four-digit numbers. For example:
Input: 3742 * 7573
Output: The product of 3742 and 7573 is 28350686
The correct answer is 28338166. The leading digits (283) and the final digit (6) are right, and the four digits in between are wrong (5068 instead of 3816). So it gets the first bit right, the last bit right, and the middle bit wrong. This seems to be very consistent.
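If you want to check this kind of thing yourself, here is a small sketch that marks which digits of a claimed product disagree with the real one. The function name `compare_digits` is just something I made up for illustration, and it assumes the claimed answer has the same number of digits as the true product.

```python
def compare_digits(claimed: int, a: int, b: int) -> str:
    """Return a mask with '^' under every digit of `claimed` that differs from a * b."""
    correct = str(a * b)
    guess = str(claimed)
    if len(guess) != len(correct):
        raise ValueError("claimed answer has a different number of digits")
    # Walk both digit strings left to right and flag the mismatches.
    return "".join(" " if g == c else "^" for g, c in zip(guess, correct))


a, b = 3742, 7573
claimed = 28350686  # the output quoted above

print(a * b)                           # true product
print(claimed)                         # claimed product
print(compare_digits(claimed, a, b))   # '^' marks the wrong digits
```

For this example the carets land under the four middle digits only.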
Why is this? In general, computing the first digits and the last digits requires less computation than the middle digits. For example:
- Determining that the last digit should be a 6 is easy: notice that the last digits of the multiplied numbers are 2 and 3 and 2 * 3 = 6.
- Similarly, it is easy to see that 3000-something times 7000-something will usually start with a 2, because 3 * 7 = 21 and the product lands somewhere between 21 and 32 million.
- But figuring out that the middle digits of the answer are 38 is far harder, because every digit of one input has to be combined with every digit of the other, and the carries have to be tracked along the way.
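The three points above can be made concrete with a rough sketch. It counts how many single-digit products feed each column of the result in ordinary long multiplication (before carries), and shows the cheap shortcuts for the last digit and the leading estimate. The name `column_contributions` is hypothetical, and this only illustrates the arithmetic; it says nothing about how ChatGPT works internally.

```python
def column_contributions(a: int, b: int) -> list[int]:
    """Count the digit-by-digit products landing in each output column.

    Columns are listed least significant first, and carries are ignored.
    """
    digits_a = [int(d) for d in reversed(str(a))]
    digits_b = [int(d) for d in reversed(str(b))]
    counts = [0] * (len(digits_a) + len(digits_b) - 1)
    for i in range(len(digits_a)):
        for j in range(len(digits_b)):
            # The product of digit i and digit j contributes to column i + j.
            counts[i + j] += 1
    return counts


a, b = 3742, 7573

# Last digit: only the last digits of the inputs matter.
last_digit = ((a % 10) * (b % 10)) % 10

# Leading estimate: only the leading digits of the inputs matter.
leading_estimate = (a // 1000) * (b // 1000)

print(last_digit)                   # 6
print(leading_estimate)             # 21, so the product is "20-something" million
print(column_contributions(a, b))   # [1, 2, 3, 4, 3, 2, 1]
```

The outermost columns each depend on a single product, while the middle column depends on four, plus whatever carries arrive from the columns to its right.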
So I think what we're seeing here is ChatGPT hitting a "compute per emitted token" limit. It has enough compute to get the leading digits and the trailing digits, but not the middle digits. Again, this seems to be quite reliable.