For most of my life, I have pondered a question that sits at the very center of how
our brain works: how do we understand language?
The question isn’t how we repeat language, nor how we recognize its surface
patterns, but how we understand its meaning in context. Ask ten experts in this
field, known as Natural Language Understanding (NLU), and you will get ten
different definitions, because academia offers many competing theories. Most of
these originated in the 1950s or earlier, and in the absence of working solutions
they remain valid competitors.
But to most people, understanding is simple. It is the moment when words
connect to meaning. You do not ‘predict’ meaning (a popular paradigm in today’s
machine learning community): you experience it. Most of us simply assume that
we receive meaning directly, not the sounds that carry it!
Long before artificial intelligence (AI) became a marketing term, I was studying
how the brain works by learning what it normally does, how we can improve it,
what happens when it fails and what scanning technology can teach us. In short,
I studied cognitive science. I read about neuroscience, psychology, philosophy,
linguistics, computer science and anthropology – the building blocks of cognitive
science. As a computer engineer and software developer, I saw logic and
computer processing applied everywhere. For decades I
worked in large technology companies, running teams and building systems, but
always with this core fascination: what does it mean for a machine to truly
understand the meaning of language?
Over time I was persuaded by influential ideas from the history of the field: the
semiotics model of C.S. Peirce, the call for rigorous science made by John R.
Pierce of Bell Labs, and most recently, the linguistics framework developed by
Robert D. Van Valin, Jr., known as Role and Reference Grammar (RRG). These
ideas come from great scientists with solid evidence behind their work.
Today’s popular AI approach, Generative AI, is very different, but I argue that
human languages are not statistical, nor powered by big data. Our languages are
demonstrably not the result of predicting the next word based on a training set.
Children readily learn language in stages without needing to consume billions of
books! Children’s brains do not need to calculate probabilities across millions of
sentences, because brains are not good at that, and language works with
phrases, not word-sequence probabilities. People don’t consult corpora
before speaking. A child hears a limited number of utterances and forms
associations between sound and meaning in which the meaning is experienced
as sensory input. Our brain’s innate capability to recognize the world allows for
generalization that language can then exploit by using the patterns of language.
This observation led me to build Pattern-atom theory, or Patom theory (PT). Its
cognitive architecture is based on a brain model that accounts for many
otherwise unexplained aspects of brain function and brain damage.
In short, PT models a brain as one that just stores, matches and uses
hierarchical, bidirectional patterns. These patterns are atomic, in that a brain can
store a pattern only once. Sensory information pours in and is organized by what
has been experienced before. Motor control moves our body by controlling a
large number of muscles in sequence with feedback via proprioception, inner ear
balance and our dominant sensors of vision and hearing.
Patom theory sits a level above ‘how the brain works’: it focuses on ‘what the
brain does,’ which gives the theory predictive power as well as descriptive power.
It models how we remember and represent things: patterns laid on top of
patterns, creating meaning-based networks that allow flexible understanding with
bottom-up recognition from senses and top-down recognition of what we expect.
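To make these three properties concrete, here is a toy sketch in Python. The class and method names are invented for illustration; this is not the actual PT implementation, only a minimal picture of patterns stored exactly once (atomic), larger patterns built from references to smaller ones (hierarchical), and matching that runs both bottom-up from input and top-down from expectation (bidirectional).

```python
# Toy illustration of Patom-style pattern storage. Invented names;
# not the actual PT implementation.

class PatternStore:
    def __init__(self):
        self._ids = {}       # pattern -> id (guarantees each pattern is stored once)
        self._patterns = []  # id -> pattern (a leaf symbol, or a tuple of child ids)

    def intern(self, pattern):
        """Store a pattern once; return its unique id."""
        if pattern not in self._ids:
            self._ids[pattern] = len(self._patterns)
            self._patterns.append(pattern)
        return self._ids[pattern]

    def sequence(self, *symbols):
        """Intern a hierarchical pattern as a tuple of child-pattern ids."""
        return self.intern(tuple(self.intern(s) for s in symbols))

    def match_bottom_up(self, symbols):
        """Recognize a full stored pattern from raw input symbols."""
        ids = tuple(self._ids.get(s) for s in symbols)
        return self._ids.get(ids)  # None if never experienced before

    def expect_top_down(self, prefix):
        """Given a partial input, predict what stored patterns expect next."""
        pref = tuple(self._ids.get(s) for s in prefix)
        expectations = []
        for pat in self._patterns:
            if isinstance(pat, tuple) and pat[:len(pref)] == pref and len(pat) > len(pref):
                expectations.append(self._patterns[pat[len(pref)]])
        return expectations

store = PatternStore()
give = store.sequence("give", "me", "the", "ball")
assert store.sequence("give", "me", "the", "ball") == give  # atomic: stored once
print(store.expect_top_down(["give", "me"]))  # prints ['the']
```

Interning a phrase twice yields the same id, so the pattern exists only once in memory; the same store then answers both recognition queries (bottom-up) and expectation queries (top-down).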
AI is a developing and immature field, and as a trained cognitive scientist I found
that many of the things I had been taught did not work in practice. Using parts of
speech to model phrases, for example, creates an unsolvable combinatorial
explosion that held back progress in NLU for decades. Many ideas that form the
bedrock of NLU need to be broken up.
The first time I implemented PT on my computer and watched it recognize
phrases it had never seen before, I realized we had crossed an important line. It
was proof of concept that a machine could understand accurately, not by
probability, but by cognition. A direct mapping from language to meaning was
within reach, but some cases still did not resolve to a single meaning. What was
missing was a full incorporation of RRG. The right kind of
pattern matching gives rise to machines that can understand in a way similar to
human understanding.
At the time, almost nobody wanted to hear this, because they were invested in
expensive projects that depended on improvements in deep learning: a statistical
model built on neural networks that, unlike a human brain, operates
unidirectionally rather than bidirectionally.
The field was heading toward bigger models, bigger datasets, bigger compute
but bigger does not mean better. A parrot can repeat a phrase beautifully, but the
parrot may not know what it means. A child, even one who has heard far less
language, knows exactly what “give me the ball” means because the pattern is
grounded, meaning it includes the knowledge of what a ball looks like, what it
does, how heavy it is, and so forth. They also know what ‘give’ and ‘me’ mean.
Grounding is not optional for human-like AI. Grounding is the connection
between experience, perception, and meaning and is best retained in its source
modalities, connected within our brain.
For years I have published my ideas in articles and interviews and have spoken
at conferences to demonstrate multi-language cognitive models. I showed how
the same internal memory patterns could support English, Chinese, French,
Japanese, Korean, and more. Unlike systems of prediction, I showed how
understanding language enables a new generation of AI to help us. I showed the
path to grow a cognitive architecture incrementally to drive applications in exactly
the same way PT expects a brain to do.
What was missing? I had built the engine, but not the vehicle. I needed a real-
world application that could demonstrate what Cognitive AI can do when it is
helping real people in real situations and expands its scale at the same time.
That moment came when I met Chris Lonsdale.
Chris approached the problem from a different angle. Where my work focused on
language understanding, his work focused on human understanding — how the
brain acquires language naturally. It was a case of theoretical science meeting
applied science. Listening to him describe the neurological principles behind
rapid language acquisition, I immediately recognized the parallels. He was
describing, in practical terms, what I had been building from theory: an
environment where meaning takes priority over memorization, where multi-
sensory input forms deeper memory pathways, and where patterns of
understanding grow naturally from layers of simpler meanings. It became clear
that we were two sides of the same coin. Chris was building the optimal
environment for a human learner, and I was building the optimal architecture to
help human learners.
When Chris told me he wanted to build the next evolution in brain-based
language learning, one that could respond to learners in real time, adapt to their
level, and give them meaningful conversational opportunity without fear of
embarrassment, I knew exactly what role Cognitive AI could play. It could be the
bridge: interactions that make learning feel natural and adaptive in context. It
could replicate the way a parent supports a child’s language growth, not by
correcting them with grammar rules, but by understanding their intentions and
guiding them appropriately.
This is the foundation of Speech Genie.
Speech Genie is not a chatbot, and it is not an LLM with a character skin. It is the
first real-world implementation of cognitive artificial intelligence built for language
acquisition. The AI inside Speech Genie does not predict the next word based on
statistical probability. It understands what the learner is trying to communicate. It
notices patterns, recognizes meaning and adapts its responses to what the
learner can process based on their acquisition of vocabulary and phrases to
date.
And most importantly, it helps the learner build the mental patterns of the new
language the same way a child builds patterns in their mother tongue.
Most AI systems today can produce fluent language, but they do not know what
they are saying. Speech Genie’s Cognitive AI is different. It knows what the
words mean, how they relate to each other, and what the learner is likely trying to
express. This allows it to give feedback that is meaningful, not mechanical. It can
guide pronunciation, highlight misunderstandings and design interactions that
accurately meet the learner at their current level of ability. When a learner says
something slightly wrong, the system knows the intended meaning and helps
steer them gently toward the proper expression — just like a real parent would.
If you pay attention, you will notice that the human brain does something
remarkable with language. It recognizes patterns in their entirety, not as a
disconnected collection of things. It stores meaning in ways that allow infinite
flexibility from finite examples. A child does not hear every possible sentence
before learning to speak. They hear and recognize small auditory patterns that
link to meanings and, in turn, generalize with them into a growing and limitless
set of expressions. PT is built around this exact mechanism, requiring patterns
to be stored in biologically plausible ways; generalization and creativity emerge
as side-effects.
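As a hedged illustration of this mechanism, the Python toy below (invented names and categories; not the Speech Genie implementation) shows how one heard phrase, stored once with a slot over a grounded meaning category, matches sentences the learner has never heard.

```python
# Toy sketch of generalization from finite examples. The category names
# and the slot notation are invented for this illustration.

# Grounded categories learned from experience (each word stored once).
categories = {"ball": "OBJECT", "cup": "OBJECT", "book": "OBJECT"}

# One stored phrase pattern generalized from a single heard utterance:
# "give me the ball" -> ("give", "me", "the", OBJECT-slot).
pattern = ("give", "me", "the", "OBJECT")

def understands(utterance):
    """Match an utterance against the stored pattern, filling the slot."""
    words = utterance.split()
    if len(words) != len(pattern):
        return False
    for word, expected in zip(words, pattern):
        if word == expected:
            continue  # literal word match
        if categories.get(word) == expected:
            continue  # slot match via a grounded category
        return False
    return True

assert understands("give me the ball")     # the heard example
assert understands("give me the cup")      # never heard, still understood
assert not understands("give me the run")  # no grounded OBJECT meaning
```

One stored pattern plus a handful of grounded word meanings already covers an open set of new sentences, which is the sense in which generalization falls out of the storage mechanism rather than being computed statistically.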
For this reason, Speech Genie does not require massive datasets. It learns and
adapts through structured patterns, not statistical brute force. This makes the
system lightweight, efficient, and more aligned with human cognition. It also
makes it safer and more predictable because the system is grounded in
meaning, not in correlations. The problem of hallucinations stems from the
limitations of the statistical approach. In contrast, by recognizing
what the learner is saying with the correct patterns, Speech Genie can guide
them towards clarity.
When Chris and I began combining his brain-based learning methods with
Cognitive AI, something clicked immediately. His work explains how humans
acquire language: through relaxed listening, comprehensible input, multi-sensory
cues, mouth-shape mimicry, gestures, and contextual immersion. My work
explains how a machine can understand language based on how languages
work, and guide a learner without relying on large-scale memorization.
The result is a system that makes language acquisition feel intuitive, natural, and
emotionally safe. And we learn best when stress is minimized.
One of the biggest obstacles adult learners face is fear. Fear of speaking. Fear of
making mistakes. Fear of looking foolish. Fear shuts down cognitive flexibility and
reduces the brain’s ability to recognize new patterns. Speech Genie removes that
fear entirely. When you speak to the Genie, you are in a judgment-free
environment. You can practice without embarrassment. You can experiment. You
can make mistakes and receive instant feedback that feels helpful, not punitive.
The AI Genie is designed to support, not to judge. It listens carefully,
understands meaning, and guides you gently.
From the AI’s perspective, the feedback it gives is grounded in cognition, not
statistics. It does not say, “People usually say this phrase next.” Instead, it says,
“I understand the meaning you are trying to express. I think THIS is what you
mean, is that correct?” This difference is subtle but profound. It is the difference
between information and intelligence, between imitation and understanding.
As we build Speech Genie, I keep coming back to a recurring thought: this is the
first time in the AI world where we can apply a meaning-based model to real
human learning. We are not trying to fool people into thinking the system is
intelligent. We are trying to give people a tool that genuinely helps their ability
grow. We are helping them build memory patterns that are needed to achieve
true fluency.
Like Chris, I have seen enormous potential in helping people unlock language.
Language is one of humanity’s greatest tools for connection, creativity, and
success. When someone acquires a new language, their world expands. They
gain opportunities they did not have before. They communicate with new
communities. They access new cultures, new ideas, new relationships. And
language is an ability that grows with use. It never stops giving back.
Speech Genie is designed to accelerate this process by aligning human learning
with the support of a machine. The human learns through natural acquisition. The
machine guides through cognitive comprehension. The combination allows
progress that feels easy, authentic, and deeply rewarding.
When people ask me why I chose to bring this technology to the world now, the
answer is simple. For the first time, the technology is ready. The cognitive
foundations are proven. The understanding engine works. And with Chris’s
lifetime of work on brain-based learning, we finally have the perfect environment
in which to deploy it. We are building something that not only teaches languages
but changes how people think about learning itself.
Speech Genie is not the end of Cognitive AI. It is the beginning. It is the first step
toward a future where machines can act as genuine cognitive partners. By not
overwhelming us with data, it lets our brains use their natural ability to learn. By
understanding us and helping us understand the new language the way we
learned our first, it makes progress feel effortless. By recognizing meaning in the
examples, the system helps us learn in the best way, under the guidance of the
Genie and as designed by experts in language learning. The system is designed
to support us as we grow.
If you join us in this journey, you are not just learning a language. You are
participating in the birth of a new approach to human improvement. It’s an
approach that values meaning over prediction, understanding over imitation, and
cognition over correlation.
I invite you to be part of this movement because it is built to make your journey
effortless. The approach is new and impressive, but most of all it can
fundamentally change the way humans interact with knowledge, with machines,
and with each other.
Language is the gateway to understanding, whichever language you choose.
Understanding is a foundation for human intelligence. And human intelligence, in
every meaningful sense, begins with patterns of meaning.
Speech Genie brings those patterns to life.
John Ball, Cognitive Scientist
