Understanding Why Large Language Models Struggle to Count Letters: The Case of “Strawberry”
In recent discussions, you may have seen a common joke pointing out that large language models (LLMs) often fail to accurately count specific letters within words—such as the number of R’s in “Strawberry.” But what’s the underlying reason for these seemingly simple mistakes?
At their core, LLMs process text by segmenting it into smaller units called “tokens,” which are then transformed into mathematical representations known as “vectors.” These vectors serve as the foundational input for the model’s subsequent computations, enabling it to generate responses and understand context.
However, this process has limitations. These models are not explicitly trained to recognize or count individual characters within a word. Because their internal representations capture patterns and relationships at the level of tokens rather than individual characters, they do not retain a character-by-character view of the word. Consequently, they often cannot reliably determine, for example, how many R’s are present in “Strawberry.”
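A minimal sketch can make the token-versus-character gap concrete. The subword vocabulary below is hypothetical (real tokenizers learn vocabularies of tens of thousands of pieces via algorithms like BPE), but the split of “Strawberry” into a few multi-character pieces mirrors how such tokenizers typically behave:

```python
# Hypothetical toy vocabulary (NOT a real LLM tokenizer) mapping
# subword pieces to integer token IDs, for illustration only.
TOY_VOCAB = {"Str": 1012, "aw": 2048, "berry": 777}

def toy_tokenize(word):
    """Greedily match the longest known subword piece at each position."""
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in TOY_VOCAB:
                tokens.append((piece, TOY_VOCAB[piece]))
                i = j
                break
        else:
            raise ValueError(f"no token for {word[i:]!r}")
    return tokens

pieces = toy_tokenize("Strawberry")
print(pieces)                       # [('Str', 1012), ('aw', 2048), ('berry', 777)]
print([tid for _, tid in pieces])   # the model receives only these integer IDs

# Counting characters on the raw string is trivial...
print("Strawberry".lower().count("r"))  # 3
# ...but the model never sees the string itself, only [1012, 2048, 777],
# and nothing in those IDs directly encodes how many R's each piece contains.
```

The key point of the sketch: the character-counting operation is easy on the raw string, but the model operates downstream of tokenization, where that string has already been replaced by opaque IDs (and then vectors).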
For a visual explanation of this concept, check out this detailed diagram here: https://www.monarchwadia.com/pages/WhyLlmsCantCountLetters.html.
Understanding these nuances can shed light on the strengths and limitations of large language models and why they sometimes excel at language tasks but stumble at seemingly straightforward ones like character counting.