Understanding the Limitations of Large Language Models: Why They Struggle to Count Letters
In the realm of Artificial Intelligence, Large Language Models (LLMs) like GPT-3 are often celebrated for their impressive conversational abilities and text generation. However, a common question arises: Why do these models sometimes fail at simple tasks, such as counting the number of ‘R’s in the word “Strawberry”?
Deciphering How LLMs Process Text
At their core, LLMs process language differently than humans. When given input, these models first segment the text into smaller units called “tokens.” These tokens might represent whole words, parts of words, or even characters, depending on the tokenization scheme used.
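To make this concrete, here is a toy sketch of subword tokenization. The vocabulary and the resulting splits are hypothetical; real tokenizers (such as BPE) learn their vocabularies from data, so actual token boundaries vary by model.

```python
# A toy illustration of subword tokenization. The "vocabulary" below is
# made up -- real tokenizers learn theirs from large text corpora.
def toy_tokenize(text: str) -> list[str]:
    # Pretend-vocabulary of learned subword units, tried longest-first.
    vocab = ["Straw", "berry", "St", "raw", "ber", "ry"]
    tokens = []
    i = 0
    while i < len(text):
        for piece in vocab:
            if text.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            # Fall back to a single character when no subword matches.
            tokens.append(text[i])
            i += 1
    return tokens

print(toy_tokenize("Strawberry"))  # ['Straw', 'berry']
```

Notice that once "Strawberry" becomes two opaque units, the individual letters are no longer directly visible to later stages of the model.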
Once tokenized, each piece is transformed into a numerical representation known as a “vector.” Think of these vectors as high-dimensional numeric summaries that capture some aspects of the token’s meaning, context, or structure. These vectors are then passed through multiple layers of the model to produce a response.
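A minimal sketch of this lookup step, with invented numbers: each token is exchanged for a learned vector, and from that point on the model operates only on the vectors. Real embeddings have hundreds or thousands of dimensions and are learned during training.

```python
# Sketch: each token maps to a dense vector ("embedding").
# All values here are made up for illustration.
embeddings = {
    "Straw": [0.12, -0.48, 0.33, 0.91],
    "berry": [0.77, 0.05, -0.62, 0.14],
}

tokens = ["Straw", "berry"]
vectors = [embeddings[t] for t in tokens]

# The model only ever sees these numbers -- nothing in them explicitly
# records that the token "berry" contains two 'r' characters.
print(vectors)
```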
Why Counting Characters Isn’t in Their Skill Set
The key point is that LLMs are optimized for understanding and generating coherent language, not for detailed, character-by-character counting. Since the vector representations do not encode explicit, precise information about individual characters within a token, the models lack a direct means of knowing, for example, how many ‘R’s are present in “Strawberry.”
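By contrast, any program with direct access to the characters can count them trivially, which highlights that the difficulty lies in the model's representation, not in the task itself:

```python
# Counting letters is trivial when you can see the characters directly.
word = "Strawberry"
count = word.lower().count("r")  # lowercase so 'R' and 'r' both match
print(count)  # 3
```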
This limitation explains why LLMs—designed to grasp the semantic and syntactic nuances of language—often stumble on tasks that require exact letter counts or similar low-level, character-by-character text manipulation.
Visual Aid for Better Understanding
For a more detailed explanation, including helpful diagrams that illustrate how tokenization and vectorization work, check out this resource: https://www.monarchwadia.com/pages/WhyLlmsCantCountLetters.html. Although images can’t be embedded here, the visual explanations offer valuable insights into the inner workings of these models.
Conclusion
While LLMs excel at understanding language context and generating human-like text, their internal representations are not designed for granular tasks like counting specific letters. Recognizing this helps us better appreciate both the strengths and limitations of these powerful AI systems.