The Issues of Large Language Models indicated by Addition Experiments on GPT4

Bibliographic Information

Other Title
  • GPT-4による足し算実験から示唆されるLarge Language Modelsの課題

Abstract

In this study, I evaluate GPT-4, developed by OpenAI, on simple addition of multi-digit numbers. Although GPT-4 performs impressively on a wide range of tasks, its results on ten-digit addition were inconsistent: it solved every three-digit addition correctly but reached only 60% accuracy on ten-digit additions. Prompts encouraging a step-by-step addition procedure did not improve this accuracy. I suggest that the limitation may stem from an inability of large language models (LLMs) to extract what is common across different concepts, as the process of addition requires. This difference between human cognition and LLMs may be crucial for the further development of these models.
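For readers who want to try a comparable measurement, here is a minimal sketch in Python. It is not the procedure used in the study: the function names (`add_by_columns`, `make_problems`, `accuracy`) and the `answer_fn` callback are assumptions made for illustration, and the abstract does not state the prompts or sample sizes. The sketch shows the column-by-column addition procedure, which applies the same rule to every digit position whatever the length of the numbers, and a small harness that scores any answering function on n-digit problems.

```python
import random


def add_by_columns(a: str, b: str) -> str:
    """Add two non-negative decimal strings digit by digit, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter number with zeros
    carry = 0
    digits = []
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 10))  # digit written in this column
        carry = total // 10             # carried into the next column
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


def make_problems(n_digits: int, count: int, seed: int = 0):
    """Generate `count` pairs of random integers with exactly n_digits digits."""
    rng = random.Random(seed)
    low, high = 10 ** (n_digits - 1), 10 ** n_digits - 1
    return [(rng.randint(low, high), rng.randint(low, high)) for _ in range(count)]


def accuracy(problems, answer_fn) -> float:
    """Fraction of problems for which answer_fn(a, b) returns the exact sum."""
    correct = sum(1 for a, b in problems if answer_fn(a, b).strip() == str(a + b))
    return correct / len(problems)


if __name__ == "__main__":
    # answer_fn is a stand-in here; to test a model, replace it with a
    # (hypothetical) function that sends "a + b" to the model and returns its reply.
    for n in (3, 10):
        problems = make_problems(n, count=20)
        score = accuracy(problems, lambda a, b: add_by_columns(str(a), str(b)))
        print(f"{n}-digit accuracy: {score:.0%}")
```

Because the column procedure is identical at every position, it scores the same on three-digit and ten-digit problems. That length independence is the property the abstract reports GPT-4 did not show.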
