While grammar is the skeleton, vocabulary is the flesh of a language. No surprise, then, that the perennial question on a language student’s mind is: how should I learn vocabulary? But more fundamental still is the question: how do I choose which words to learn? The answer is vocabulary frequency.
The methods for learning vocabulary are many and varied, but there is only one thing you need for a solid vocabulary study strategy: frequency.
A focus on frequency is essential for a practical vocabulary in any foreign language. Let’s look at the research to decide what this means for you as a learner and how you can harness the power of frequency.
There are several schools of thought on how to choose vocabulary to learn in your target language.
The first method is most familiar from school, where you were instructed to learn word lists supplied by a teacher or lifted from a textbook.
The question is, are these words actually useful?
For Japanese, Genki is a great example of this approach. Genki 1 and 2 are great textbooks, but they are made for university students.
But I and many others started studying Japanese while working, which significantly decreases the number of conversations about your university major.
It is not so much a question of whether it is useful to know these words, rather it is a question of timing.
The second approach is to hoard every new word you meet. While seductive, it is rarely effective.
By utilising this approach, you are essentially trying to be a vocabulary sponge. You patiently jot down every unfamiliar word that you encounter.
The trouble is, by the end of the first day you likely have a list longer than your arm. Attempting to cram all this into your head is simply not sustainable.
Besides, how many of the items on this list will you see again?
The logical evolution is from hoarder to connoisseur.
Now you weigh each word before adding it to the list in an attempt to prevent yourself from becoming overwhelmed.
But how do you make the decision as to which word should be learned and which ignored? You might guess which are best, or tie yourself in knots tallying up each time you meet a word or even choose words on the grounds of taste.
All of this is most unsatisfactory. We need a better way.
Vocabulary Frequency: The Research
The scientific evidence for selecting words based on frequency is overwhelming.
The idea of learning more frequent words before less frequent ones is intuitive. But it’s best to examine that intuition before investing study hours in a strategy built on it.
First, much of the research conducted on this topic is from the perspective of learning English, but it is reasonable to assume it holds true no matter your target language.
A native English speaker knows somewhere between 15,000 and 20,000 lemmas. Sorry, what is a lemma? It’s a fancy term for a group of related word forms. Take book and books: they count as one lemma, despite technically being two separate words.
Needless to say, 15,000 groups is rather a lot. In fact, it means adding an average of 1,000 word-families to your vocabulary every year from the age of 3 to 25.
To attempt to replicate that in a fraction of the time is unrealistic, if not downright delusional. So, how can we cut this mammoth task down into bitesize chunks? Like this…
Hypothetically, if you were an English learner with a vocabulary of only 800 of the most common words in English you would be able to comprehend around 75% of normal spoken English.
Clearly, not all words were created equal; some are far more useful than others. So how many should we be aiming for?
How many words is enough?
Right, 800 words gives you 75% of spoken English, but actually how useful is that?
Research shows that students need to aim for a 95% vocabulary coverage to secure reasonable unassisted comprehension in a language.
What gets you from 75% to 95%? Turns out they’ve done the research for that too.
In short, around 3,000 word-families. This secures you an average vocabulary coverage of 95% in any given common context. Just don’t forget that 3,000 word-families is not the same as 3,000 individual words; it is many more.
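To make "coverage" concrete, here is a minimal Python sketch of how the percentages above are calculated: the share of running words in a text that you already know. The function names and the toy English sentence are my own illustration, not taken from the research, and the sketch splits on spaces rather than grouping words into families, so treat it as a rough picture only. A real Japanese text would also need to be split into words by a proper tokeniser first.

```python
from collections import Counter


def coverage(tokens, known_words):
    """Share of running words in a text that appear in known_words."""
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


def words_needed(tokens, target=0.95):
    """Smallest number of most-frequent words whose combined share reaches target."""
    counts = Counter(tokens)
    total = len(tokens)
    covered = 0
    # Walk the words from most to least frequent, adding up their share.
    for n, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered / total >= target:
            return n
    return len(counts)


# Toy example (an assumption for illustration, not real corpus data).
sample = "the cat sat on the mat and the dog sat on the cat".split()

print(coverage(sample, {"the", "cat", "sat"}))
print(words_needed(sample, target=0.75))
print(words_needed(sample, target=0.95))
```

Even in this tiny sample, a handful of frequent words accounts for most of the text, while the last few percent require every remaining rare word. That is the same pattern, in miniature, behind the jump from 800 words to 3,000 word-families.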
How should I choose vocabulary? (Spoiler: Vocabulary Frequency)
We know the target, but how do you enact this vocabulary strategy?
You may already be attempting to leverage the power of frequency, but you are likely to be using eyeball frequency. By that I mean targeting words that seem to crop up repeatedly.
For this to be really effective, though, you will have to keep track of those words in some way. Of course, you could just use your memory, but that means letting a lot of targets slip through the net. Alternatively, we are back to the tallying from earlier.
While even this would be a significant improvement on some of the other methods mentioned, it is disappointingly inexact.
You’re also likely to get pigeonholed into specific topics or writing that you enjoy. Whether your thing is science or fantasy, you’ll end up with a pretty niche vocabulary.
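If you did want to do the tallying by machine rather than by hand, it takes only a few lines of Python. This is a minimal sketch under my own assumptions: the `notebook` list stands in for whatever words you have jotted down, and the `min_sightings` threshold is a hypothetical cut-off, not a researched figure.

```python
from collections import Counter


def tally(words_seen):
    """Count every noted word and return (word, count) pairs, most frequent first."""
    return Counter(words_seen).most_common()


def worth_learning(words_seen, min_sightings=2):
    """Keep only the words met at least min_sightings times."""
    return [word for word, count in tally(words_seen) if count >= min_sightings]


# Hypothetical notebook of words met while reading.
notebook = ["犬", "食べる", "猫", "食べる", "行く", "犬", "食べる"]

print(tally(notebook))
print(worth_learning(notebook))
```

The counts are now exact, but they still only reflect what you happened to read, so the niche-vocabulary problem remains.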
Truly, the holy grail would be a frequency list. Imagine that: a list of vocabulary ordered from most to least frequent.
Luckily, exactly that exists.
A Frequency Dictionary of Japanese: The Japanese Vocabulary Frequency Tool
Throwing ‘Japanese vocabulary frequency list’ into Google will return plenty of results: the 101 most frequent words, or verbs, et cetera, et cetera.
Again, these are better than nothing, but they’re likely just compiled by native speakers.
I do not mean to do down well-meaning native speakers, but the surprising fact is that a native speaker’s impression of which words are more frequent is highly subjective. Their likes and dislikes, education and experiences colour their judgement.
So, you can’t rely on native speakers for an accurate frequency list of vocabulary.
Enter the ‘Frequency Dictionary of Japanese’.
This study-altering book uses a scientific approach to create an impartial list of 5,000 of the most frequent Japanese words.
To create this list, the compilers used the Corpus of Spontaneous Japanese and the Balanced Corpus of Contemporary Written Japanese, together totalling 107 million words. The frequency data these provide is taken from texts and transcribed speeches covering a range of styles and genres.
Thus, you can have confidence that you really are learning the most frequently used Japanese words.
What’s better, each entry consists of a Japanese word written in kanji, with romanisation and an English translation. Better still, each entry has an example Japanese sentence. Perfect material for all those sentence miners out there (more on that another day).
If this sounds good, you can pick up a copy from bookshop.org. Not only will you be supporting local bookshops, but if you use this link I’ll get a small commission, so you’ll be supporting me too: get it from bookshop.org
Learning any language requires a sound strategy for success.
Trying to learn every Japanese word is a monumental waste of time. The power of frequency provides a short cut to a strong functional vocabulary.
Of course, once you’ve achieved that, by all means continue to build your vocabulary with words useful for topics of personal interest.
If that wasn’t convincing enough, I have one final point to present. Yet more research has revealed that more frequent words are also easier to learn. In fact, frequency is a better predictor of learning ease than word length.
Frequency is the true secret to “exploding your comprehension.” Best of all, it is evidence-based, so you don’t need to just take my word for it. If you want to look into the research yourself, take a look at the source list below.
For me, the vocabulary mountain was always terrifying, but this discovery melted my fear and allowed me to make strides towards my Japanese goal.
That’s it: a straightforward strategy and a clear goal. It’s now just up to you to use it.
- Sagar-Fenton, B / McNeil, L, 2018, How many words do you need to speak a language?, bbc.co.uk
- Tono, Y / Yamazaki, M / Maekawa, K, 2013, A Frequency Dictionary of Japanese, Routledge
- Nation, I.S.P., 2013, Learning Vocabulary in Another Language, Cambridge University Press (ebook)
- Koirala, C, 2015, The word frequency effect on second language learning, research-publishing.net