Scrabble Christmas ornaments made by Jennifer Bormann, 2011
In Scrabble, there is a finite supply of resources (letter tiles) that players use to create value (points) for themselves. Similarly, in the real world matter cannot be created, so much of human effort goes into rearranging the particles that exist into more desirable combinations. The way we keep track of how desirable those new combinations are in the economy is with money. Fiat currency has no intrinsic value; it is simply declared to be worth a certain amount. Sometimes this value changes in response to other currencies. Other times, governments try to hold it fixed. The “law of Scrabble” has remained unchanged since 1938, when it was introduced, but that may be about to change.
Like any well-intentioned dictator, Scrabble inventor Alfred Butts tried to base the value of his fiat money–er, tiles–on a reasonable system: the frequency of their appearance on the front page of the New York Times. As the English language and the paper of record have evolved over the years, though, the tiles’ stated value has remained static. This has opened the door for arbitrage opportunities, although some players try to enforce norms to discourage this type of play:
What has changed in the intervening years is the set of acceptable words, the corpus, for competitive play. As an enthusiastic amateur player I’ve annoyed several relatives with words like QI and ZA, and I think the annoyance is justified: the values for Scrabble tiles were set when such words weren’t acceptable, and they make challenging letters much easier to play.
That is a quote from Joshua Lewis, who has proposed updating Scrabble scoring using his open source software package Valett. He goes on to say:
For Scrabble, Valett provides three advantages over Butts’ original methodology. First, it bases letter frequency on the exact frequency in the corpus, rather than on an estimate. Second, it allows one to selectively weight frequency based on word length. This is desirable because in a game like Scrabble, the presence of a letter in two- or three-letter words is valuable for playability (one can more easily play alongside tiles on the board), and the presence of a letter in seven- or eight-letter words is valuable for bingos. Finally, by calculating the transition probabilities into and out of letters it quantifies the likelihood of a letter fitting well with other tiles in a rack. So, for example, the probability distribution out of Q is steeply peaked at U, and thus the entropy of Q’s outgoing distribution is quite low.
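To make the entropy point concrete, here is a minimal Python sketch of the idea Lewis describes: count which letter follows each letter, then compute the entropy of that outgoing distribution. This is not Valett's actual code. The function name `outgoing_entropy` and the toy word list are assumptions made up for illustration, and a real run would use a full Scrabble word list.

```python
from collections import Counter, defaultdict
from math import log2


def outgoing_entropy(words):
    """Return {letter: entropy in bits of the distribution of letters that follow it}."""
    transitions = defaultdict(Counter)
    for word in words:
        word = word.upper()
        # Pair each letter with the one right after it.
        for current, following in zip(word, word[1:]):
            transitions[current][following] += 1

    entropy = {}
    for letter, counts in transitions.items():
        total = sum(counts.values())
        # Shannon entropy: sum of p * log2(1/p) over the observed next letters.
        entropy[letter] = sum(
            (n / total) * log2(total / n) for n in counts.values()
        )
    return entropy


# Toy word list, purely illustrative.
toy_words = ["QUIT", "QUEEN", "QUAD", "TAN", "TEN", "TIN", "TON"]
entropy = outgoing_entropy(toy_words)
```

In this toy list Q is always followed by U, so its outgoing entropy is zero, while T is followed by four different letters equally often and comes out at two bits. A low value signals a letter that only fits next to a few others, which is the playability cost Lewis wants the tile value to reflect.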
Lewis’s idea seems to fit with a recent finding by Peter Norvig of Google. Norvig was contacted last month by Mark Mayzner, who studied the same kind of information as the Valett package but did it back in the early 1960s. Mayzner asked Norvig whether his group at Google would be interested in updating those results from five decades ago using the Google Corpus Data. Here’s what Norvig has to say about the process:
The answer is: yes indeed, I (Norvig) am interested! And it will be a lot easier for me than it was for Mayzner. Working 60s-style, Mayzner had to gather his collection of text sources, then go through them and select individual words, punch them on Hollerith cards, and use a card-sorting machine.
Here’s what we can do with today’s computing power (using publicly available data and the processing power of my own personal computer; I’m not relying on access to corporate computing power):
1. I consulted the Google books Ngrams raw data set, which gives word counts of the number of times each word is mentioned (broken down by year of publication) in the books that have been scanned by Google.
2. I downloaded the English Version 20120701 “1-grams” (that is, word counts) from that data set given as the files “a” to “z” (that is, http://storage.googleapis.com/books/ngrams/books/googlebooks-eng-all-1gram-20120701-a.gz to http://storage.googleapis.com/books/ngrams/books/googlebooks-eng-all-1gram-20120701-z.gz). I unzipped each file; the result is 23 GB of text (so don’t try to download them on your phone).
3. I then condensed these entries, combining the counts for all years, and for different capitalizations: “word”, “Word” and “WORD” were all recorded under “WORD”. I discarded any entry that used a character other than the 26 letters A-Z. I also discarded any word with fewer than 100,000 mentions. (If you want you can download the word count file; note that it is 1.5 MB.)
4. I generated tables of counts, first for words, then for letters and letter sequences, keyed off of the positions and word lengths.
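Step 3 of the list above is the part that does the real condensing, and it can be sketched in a few lines of Python. This is a guess at the shape of the computation, not Norvig's own code. It assumes each row of the 1-gram files is tab-separated as word, year, match count, volume count; the names `condense` and `average_length` are made up for this example; and the average is assumed to be weighted by how often each word is mentioned.

```python
import re
from collections import Counter

# Only entries made up entirely of the 26 letters A-Z (in either case) are kept.
LETTERS_ONLY = re.compile(r"[A-Za-z]+")


def condense(rows, min_count=100_000):
    """Sum counts over all years and capitalizations, then drop rare words."""
    totals = Counter()
    for row in rows:
        # Assumed row layout: word <TAB> year <TAB> match_count <TAB> volume_count
        ngram, _year, match_count, _volumes = row.rstrip("\n").split("\t")
        if LETTERS_ONLY.fullmatch(ngram):
            totals[ngram.upper()] += int(match_count)
    return {word: n for word, n in totals.items() if n >= min_count}


def average_length(counts):
    """Average word length, weighting each word by its number of mentions."""
    total = sum(counts.values())
    return sum(len(word) * n for word, n in counts.items()) / total


# Tiny made-up sample with a low threshold so the filtering is visible.
sample_rows = [
    "word\t1999\t3\t2",
    "Word\t2000\t4\t3",
    "WORD\t2001\t5\t1",
    "co-op\t2000\t9\t4",
    "rare\t2000\t1\t1",
]
condensed = condense(sample_rows, min_count=5)
```

On the sample rows, the three capitalizations of "word" are merged into a single entry, "co-op" is discarded for its hyphen, and "rare" falls below the threshold. On the real files one would stream each unzipped file line by line rather than hold 23 GB in memory.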
Here is the breakdown of word lengths that resulted (average=4.79):
Sam Eifling then took Norvig’s results and translated them into updated Scrabble values:
While ETAOINSR are all, appropriately, 1-point letters, the rest of Norvig’s list doesn’t align with Scrabble’s point values….
This potentially opens a whole new system of weighing the value of your letters…. H, which appeared as 5.1 percent of the letters used in Norvig’s survey, is worth 4 points in Scrabble, quadruple what the game assigns to the R (6.3 percent) and the L (4.1 percent) even though they’re all used with similar frequency. And U, which is worth a single point, was 2.7 percent of the uses—about one-fifth of E, at 12.5 percent, but worth the same score. This confirms what every Scrabble player intuitively knows: unless you need it to unload a Q, your U is a bore and a dullard and should be shunned.
However, Norvig’s counts include repeats like “THE”, a word that is not much fun to play in Scrabble and certainly does not get played as often as it appears in the text corpus (which would be 1 out of every 14 turns). With the help of his friend Kyle Rimkus, Eifling conducted a letter-frequency survey of words from the Scrabble dictionary and came up with these revisions to the scoring system:
Image from Slate
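The gap between Norvig's numbers and a dictionary survey comes down to whether each word is counted once or once per mention. A small Python sketch shows the effect. The word counts are invented for illustration, `letter_shares` is a hypothetical helper, and this is not Eifling and Rimkus's actual procedure, which is not described in detail here.

```python
from collections import Counter


def letter_shares(word_counts):
    """Return each letter's share of all letters, weighting words by their counts."""
    letters = Counter()
    for word, count in word_counts.items():
        for ch in word.upper():
            letters[ch] += count
    total = sum(letters.values())
    return {ch: n / total for ch, n in letters.items()}


# Made-up counts: "THE" dominates running text, as it does in a real corpus.
corpus_counts = {"THE": 97, "QUIZ": 2, "JAZZ": 1}

# A dictionary survey counts every playable word exactly once.
dictionary_counts = {word: 1 for word in corpus_counts}

by_usage = letter_shares(corpus_counts)
by_dictionary = letter_shares(dictionary_counts)
```

Weighted by usage, T, H and E swamp everything else; counted once per word, Z becomes the most common letter in this tiny list. Neither view is wrong, but only the second one reflects what a player can actually lay down on the board.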
Eifling points out that Q and J seem quite undervalued in the present scoring system. So what is an entrepreneurial player to do? “Get rid of your J and your Q as quickly as possible, because they’re just damn hard to play and will clog your rack. The Q, in fact, is the worst offender,” he says.
Now, as with any proposed policy update that challenges long-standing norms, there has been some pushback against these recent developments. Stefan Fatsis at Slate quotes the old guard of Scrabble saying that the new values “take the fun out” of the game. Fatsis seems to hope that the imbalance between stated and practical values will persist:
Quackle co-writer John O’Laughlin, a software engineer at Google, said the existing inequities also confer advantages on better players, who understand the “equity value” of each tile—that is, its “worth” in points compared with the average tile. That gives them an edge in balancing scoring versus saving letters for future turns, and in knowing which letters play well with others. “If we tried to equalize the letters, this part of the game wouldn’t be eliminated, but it would definitely be muted,” O’Laughlin said. “Simply playing the highest score available every turn would be a much more fruitful strategy than it currently is.”
In political economy this is known as rent-seeking behavior. John Chew, doctoral student in mathematics at the University of Toronto and co-president of the North American Scrabble Players Association, went so far as to call Valett a “catastrophic outrage.”
Who knew that the much-beloved board game could provoke such strong feelings? With a fifth edition of the Scrabble dictionary due in 2014, an official response to these new findings seems possible but highly unlikely. A more probable outcome is that we begin to see “black market” Scrabble valuations that incorporate the new data, much like the underground economies that emerge in states with strict official control over the value of their money. Yet again, evidence for politics in everyday life.
For more fun with letter games, data, and coding, check out Jeff Knupp’s guide to “Creating and Optimizing a Letterpress Cheating Program in Python.”