Viewing a single comment thread. View all comments

sdmat t1_je93qz6 wrote

The arithmetic thing is interesting, GPT4 usually fails at multiplying a pair of 3 digit numbers as pure token prediction:

> Multiply 438 * 939 > > The product of 438 and 939 is 411,182.

But it does actually know how to perform long form multiplication and with the right prompt can get the correct answer this way:

>Multiply 438 by 939 using long form multiplication. Check each step for correctness as you perform it and revise if there is an error. Add trailing zeros to the parts. Perform the final addition in steps by successively adding each part. > >Alright, let's multiply 438 by 939 using long-form multiplication. I'll write out the steps and verify their correctness as I go along. > > 438 > x 939 > _________ >Step 1: Multiply 438 by 9 (the digit in the ones place of 939) > > 438 > x 9 > _________ > 3942 (9 * 438 = 3942, Correct) >Step 2: Multiply 438 by 30 (3 in the tens place of 939, so we multiply by 3 and add a trailing zero) > > 438 > x 3 > _________ > 1314 (3 * 438 = 1314, Correct) >Now, add a trailing zero to this product: 13140 > >Step 3: Multiply 438 by 900 (9 in the hundreds place of 939, so we multiply by 9 and add two trailing zeros) > > 438 > x 9 > _________ > 3942 (9 * 438 = 3942, Correct) >Now, add two trailing zeros to this product: 394200 > >Step 4: Add the three products from steps 1, 2, and 3 > > 3942 (Step 1) > +13140 (Step 2) > +394200 (Step 3) > __________ >Step 5: Add the products in steps > >First, add Step 1 and Step 2: > > 3942 > +13140 > _________ > 17082 >Now, add this sum to the product from Step 3: > > 17082 > +394200 > _________ > 411282 >So, the product of 438 and 939 is 411,282.

It still often makes mistakes doing this, but it clearly has a good grasp of the concept.

It's able to correctly perform the high level long multiplication procedure for large numbers (haven't had the patience to see just how large) but is let down by the reliability of the arithmetic in constituent steps.

A lot of humans have the same problem.

5