ChatGPT could not do simple maths! But Lumo got it right. Why??
Understanding the Math Capabilities of AI Language Models: A Comparative Look at ChatGPT and Lumo
Artificial Intelligence has revolutionized the way we process and interpret data, yet questions about its precision, especially in tasks like basic arithmetic, remain relevant. Recently, a user shared an experience comparing two AI tools, ChatGPT and Lumo. The experiment sheds light on their respective strengths and limitations in handling straightforward calculations, and shows why it helps to understand how these tools work internally.
The Scenario
The user provided both AI systems with a list of budgetary items, each accompanied by an amount, and asked them to add up the amounts. When the user checked ChatGPT’s initial response against a calculator, the totals did not match. When prompted for a second attempt, ChatGPT corrected its calculation and gave an accurate sum. Repeating the process with a new list yielded similar results: an initial mistake, followed by a corrected answer after a request to “calculate carefully.”
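The calculator check described above can also be scripted. The following is a minimal sketch, assuming the budget is available as label and amount pairs. The item names and figures are hypothetical, since the post does not list the user's actual numbers, and `total` is an illustrative helper, not part of either tool.

```python
from decimal import Decimal

# Hypothetical budget items; the original post does not give the real figures.
budget = [
    ("Rent", "1200.00"),
    ("Groceries", "345.50"),
    ("Car expenses", "89.99"),
    ("Utilities", "110.25"),
]


def total(items):
    """Add up the amounts exactly.

    Decimal is used instead of float so that currency values such as
    89.99 are not subject to binary floating-point rounding.
    """
    return sum((Decimal(amount) for _, amount in items), Decimal("0"))


if __name__ == "__main__":
    print(total(budget))
```

Because the sum is computed deterministically, running it twice on the same list always gives the same result, which is what makes it usable as a reference when checking an AI-generated total.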
ChatGPT’s Explanation
In explaining the initial error, ChatGPT highlighted several factors:
- Human-like Processing of Numbers:
Since ChatGPT is trained to mimic natural language understanding, it processes lists of numbers much as a human reads and interprets text. This approach is error-prone, especially with lengthy or complex lists: it can misread entries, double-count them, or skip items because of formatting nuances.
- Handling Duplicates and Entry Recognition:
The model may misinterpret repeated or similar entries, such as a duplicated “Car expenses” line, leading to an inflated total if both are included unintentionally.
- Lack of Explicit Verification Procedures:
Unlike spreadsheet programs or calculators that automatically verify sums, ChatGPT doesn’t inherently double-check its arithmetic unless explicitly prompted. Its natural language reasoning can introduce mistakes, especially when performing multi-step calculations internally.
- Trade-offs Between Speed and Precision:
Designed to provide rapid responses, ChatGPT sometimes sacrifices meticulous accuracy. Errors can stem from internal processing slips rather than fundamental computational inability.
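Two of the failure modes listed above, duplicated entries and missing verification, are easy to check for outside the model. The sketch below is one possible approach under stated assumptions: the entries and the claimed total are hypothetical, and `find_duplicate_labels` and `check_claimed_total` are illustrative names, not functions from ChatGPT, Lumo, or any library.

```python
from collections import Counter
from decimal import Decimal


def find_duplicate_labels(items):
    """Return labels that occur more than once, ignoring case and outer spaces."""
    counts = Counter(label.strip().lower() for label, _ in items)
    return sorted(label for label, n in counts.items() if n > 1)


def check_claimed_total(items, claimed):
    """Recompute the sum and compare it with a total reported by an AI tool.

    Returns a (matches, actual_total) tuple.
    """
    actual = sum((Decimal(amount) for _, amount in items), Decimal("0"))
    return actual == Decimal(claimed), actual


# Hypothetical entries with a repeated "Car expenses" line, as in the post's example.
items = [
    ("Car expenses", "89.99"),
    ("Rent", "1200.00"),
    ("Car expenses", "89.99"),
]

if __name__ == "__main__":
    print(find_duplicate_labels(items))
    # "1289.99" stands in for a total that counted the repeated line only once.
    print(check_claimed_total(items, "1289.99"))
```

Whether a repeated line is a genuine second expense or an accidental copy is a decision for the user, so the sketch only flags duplicates and does not remove them.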
The Silver Lining
Importantly, when given clear instructions to slow down and verify calculations, ChatGPT was capable of producing correct totals consistently. In contrast, the user reported that Lumo handled the same task with speed and precision, arriving at the correct sum on the first attempt.
Implications for Users
This comparison highlights key considerations for those integrating AI into routine tasks:
- Explicit Instructions Enhance Accuracy:
Asking AI to proceed step-by-step or double-check computations can