New research says all the language in your head takes up as much space as a picture would on a hard drive — about 1.5 megabytes.
A team of researchers from the University of Rochester and the University of California, Berkeley, estimates that all the data your brain needs to encode language, at least in the case of English, adds up to only around 1.5 megabytes. The team reached this figure by applying information theory to tally the amount of data needed to store the various parts of the English language.
Quick download
As infants, we learn to speak by listening to those around us. We don’t yet have a clear idea of how this process takes place, but we do know that it’s not a simple case of storing words alongside their definitions, as you’d see in a dictionary. The way our minds handle words and concepts suggests as much: for example, we form associative links between the concept of flight and the words “bird,” “wing,” or even “robin.” Our brains also store how words are pronounced, how to physically produce those sounds as we speak, and how words combine with other words.
In an effort to map out how much ‘space’ this information takes up in our brain, the authors set out to express each of the ways the brain might store a language as a quantity of data. To do so, they turned to information theory, a branch of mathematics that deals with how information is encoded in sequences of symbols.
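For a rough sense of how information theory turns knowledge into a number of bits, the sketch below shows two textbook quantities: the bits needed to single out one of several equally likely options, and the Shannon entropy of an uneven distribution. It is a general illustration only, not the authors' method; the function names and the toy probabilities are assumptions made up for this example.

```python
import math

def bits_for_choice(n_options):
    """Bits needed to single out one of n equally likely options."""
    return math.log2(n_options)

def entropy_bits(probabilities):
    """Shannon entropy: average bits per symbol for a given distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Picking one of 8 equally likely options takes 3 bits.
print(bits_for_choice(8))                # 3.0

# A skewed (toy) distribution over three symbols needs fewer bits on
# average than the log2(3) ≈ 1.58 bits an even split would require.
print(entropy_bits([0.5, 0.25, 0.25]))   # 1.5
```

The more predictable something is, the fewer bits it takes to store, which is why estimates like these depend on assumptions about how likely each alternative is.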
The researchers assigned a quantifiable size estimate to each aspect of English. They began with phonemes (the sounds that make up spoken words), noting that humans use approximately 50 of them. Each phoneme, they estimate, would take around 15 bits to store, or about 750 bits in total.
Next came vocabulary. The team took 40,000 as the number of words an average person knows, which translates into 400,000 bits of data, or about 10 bits per word. Word frequency is also an important element of speech, one the team estimated would take around 80,000 bits to ‘code’ in our brains. Syntax rules were allocated another 700 bits.
Semantics for those 40,000 words was by far the largest contributor the team factored in: roughly 12 million bits. Semantics, boiled down, is the link between a word or a symbol and its meaning. The sounds that make up the words themselves were counted under ‘vocabulary’; the semantics category represents the brain’s database of the meanings those sounds convey.
“It’s lexical semantics, which is the full meaning of a word. If I say ‘turkey’ to you, there’s information you know about a turkey. You can answer whether or not it can fly, whether it can walk,” says first author Frank Mollica at the University of Rochester in New York.
Added together, the estimates come to roughly 12.5 million bits, or approximately 1.56 megabytes, which is surprisingly little. That is about the capacity of a single 3.5-inch floppy disk (the ‘save’ icon), which held 1.44 megabytes.
“I thought it would be much more,” Mollica agrees.
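The arithmetic behind the headline figure can be checked with a short sketch. It uses only the rounded estimates quoted in this article and assumes decimal megabytes (1 MB = 1,000,000 bytes); the variable and function names are made up for illustration.

```python
# Rounded best-guess estimates in bits, as quoted in the article.
PHONEME_COUNT = 50
BITS_PER_PHONEME = 15

estimates_bits = {
    "phonemes": PHONEME_COUNT * BITS_PER_PHONEME,  # 750 bits
    "vocabulary": 400_000,
    "word_frequency": 80_000,
    "syntax": 700,
    "semantics": 12_000_000,
}

total_bits = sum(estimates_bits.values())
total_bytes = total_bits / 8                 # 8 bits per byte
total_megabytes = total_bytes / 1_000_000    # assumption: decimal megabytes

def share(component):
    """Fraction of the total contributed by one component."""
    return estimates_bits[component] / total_bits

print(f"{total_bits:,} bits")                # 12,481,450 bits
print(f"{total_megabytes:.2f} MB")           # 1.56 MB
print(f"semantics share: {share('semantics'):.1%}")
```

On these figures, word meanings account for more than 96% of the total, so the overall estimate rises or falls almost entirely with the semantics number.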
Keep in mind that these results are estimates, and that the team worked only with English. Even so, the result is useful as a ballpark figure for how much information language acquisition stores in our brains. Mollica says the estimates are broad enough that they might carry over to other languages as well.
The paper “Humans store about 1.5 megabytes of information during language acquisition” has been published in the journal Royal Society Open Science.