On the linguistic party bus

Now that the back-to-school season has settled into its new dorm, I’m going to take an academic concept and drag it into a more practical, hands-on context.

As for the academic part, I recently wrote about the so-called “80/20” rule, which is often called the Pareto principle, named for economist Vilfredo Pareto. More to the point, I also mentioned another scholar, linguist George Zipf. Zipf noted that the most commonly used word in a language is used about twice as often as the second most commonly used word, about three times as often as the third most commonly used word, and so on. It’s not exactly winner-take-all for the popular words, but it’s still a very skewed, top-heavy distribution.
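For the numerically inclined, here is a rough sketch of that pattern. It assumes the textbook form of Zipf's observation, in which a word's frequency is roughly proportional to 1 divided by its rank; real languages only approximate this, and the function name is just something made up for illustration.

```python
def zipf_relative_frequencies(n_words):
    """Frequency of ranks 1..n_words relative to the top-ranked word.

    Assumes an idealized Zipf pattern: frequency proportional to 1 / rank.
    """
    return [1.0 / rank for rank in range(1, n_words + 1)]


freqs = zipf_relative_frequencies(5)
for rank, share in enumerate(freqs, start=1):
    # share is this word's frequency as a fraction of the top word's
    print(f"rank {rank}: {share:.2f} of the top word's frequency")
```

Under this idealized assumption, the fifth-ranked word already turns up only a fifth as often as the first.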

Zipf later applied his observation to other fields, including economics, as a matter of fact. Hey, can’t let Pareto have all the fun! But let’s stick with the linguistic origins of Zipf’s gig. Since no two words in a language are of equal importance (as measured by frequency), it’s natural for students of a language to wonder if this fact can be used to their advantage.

The strategy, of course, would be to learn words in their order of importance.

I’m sure many linguists have studied this angle. Me, I’m just a guy in a beach chair, but I’m always looking for a shortcut, so I’ll share my experience with this concept.

We’ll start on home turf, and note that you can easily look up word frequency lists for English. Of course, the data behind these lists will be from various sources and contexts, but let’s ignore that for now and just note the 10 most commonly used words as listed at the World-English.org website.

The words are: the, of, to, and, a, in, is, it, you, that.

The word “side” is No. 97 in frequency, while the word “yes” is way down the list at No. 486.
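If you’re curious how such a ranking gets built, the basic recipe is just counting. Here is a minimal sketch; the sample sentence is invented for illustration and has nothing to do with the data behind the World-English.org list, which presumably draws on a far larger body of text.

```python
import re
from collections import Counter

# A made-up sample; a real frequency list would use a large body of text.
sample_text = "The cat sat on the mat and the dog sat on the rug"

# Lowercase the text and pull out runs of letters (and apostrophes) as words.
words = re.findall(r"[a-z']+", sample_text.lower())

# Count each word, then rank from most to least frequent.
counts = Counter(words)
ranked = counts.most_common()

for rank, (word, count) in enumerate(ranked[:3], start=1):
    print(f"No. {rank}: {word!r} appears {count} times")
```

Swap in a different source text and the ranking can shift, which is one reason lists from different sources don’t always agree.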

These kinds of lists are fascinating, since, to me at least, they’re often counter-intuitive. If you look at them with a student’s eye, most of us studying a new language would learn to say “yes” in our first day or two of class, while something like “side,” on the other hand, might be a semester or two down the road.

The overall picture, to me, at least, is that the notion of building a vocabulary based strictly on statistical frequency might be more elegant in theory than in reality. That’s not a flaw in the lists, of course, since the lists have all sorts of purposes, and we impose our agenda on them, and even misuse them, at our own risk.

Indeed, anybody publishing data is up against the unfortunate fact that if you point to the moon, people will just look at your finger.

Anyway, moving into more exotic linguistic fare, it won’t surprise you that some scholars have built impressively detailed frequency lists of Chinese characters. It also won’t surprise you that many students of Chinese are interested in anything that can improve their studying efficiency. I don’t know anyone who has learned Chinese based on a brute-force approach to memorizing characters by frequency of use, but I’m sure that somebody has done it.

As for me, having dabbled with Chinese for a few years, I’ve found that the only way I can remember anything is to use the lasso of association to herd related words into groups. This gives my feeble mind some conceptual hooks, some common themes, to aid in recall. Without this approach, things atomize into random points of data, and they are easily swept away, one grain at a time, by the winds of time.

Not good.

So this is where we grab the lasso and get on the word fraternity’s party bus. Consider these words: nurse, passport, maintain, hair conditioner, naval escort, and cherish. Pretty random, right? I’d say so.

Yet in Chinese, these words are linguistically related to each other: each is built around the same character, 护 (hù), which carries the sense of protecting or guarding. Having learned one, it’s a fairly easy step to just adopt the whole bunch of them, even if some of them don’t seem like very high-priority words on their own.

As a result, whatever progress I’ve made has been lumpy, as I wobble down the street with these lopsided bags of vocabulary that I gathered. It’s hardly the clean and efficient inventory of words I thought I’d build when I started looking into frequency lists.

Well, such is the difference between sitting in the planning room and actually serving in the trenches. Having escorted your attention to this condition, I hope that you’ll maintain this passport to nursing your cherished learning.

Ed Stephens Jr. | Special to the Saipan Tribune
Visit Ed Stephens Jr. at EdStephensJr.com. His column runs every Friday.
