More Baby Name Regulation

We just talked about this less than two weeks ago: countries that have lists of banned baby names, or lists of permissible names. Azerbaijan will soon join the second category, with one important difference. The Azeri government’s justification for the new rule is that some monikers are politically unacceptable:

The “green” list will include names which can be freely given to children. The “yellow” list will contain unwelcome names – these might be ones likely to be mocked or that sound bad in other languages. The names from the third category, the “red” list, will be forbidden. They might refer to people who are considered aggressors against the Azeri people or have double or obscene meaning in the Azeri language.

A government committee has been compiling the list of acceptable names (currently about 8,000) for several years. The most common bans are for “foreign sounding” names like Dmitry and Lenin. “National” names like Pioneer and Tractor are on the green list.

[via PRI’s The World]

Accidents, Worker Safety, and Coming Due

Over the holidays my dad posed a two-part question after dinner: If a vehicle goes 200,000 miles without a mechanical failure, does that mean that it is more likely to have a failure soon? And does the same hold true for an employee who goes 25 years without a lost-time accident at work?

In answer to the first question I responded “no.” Failure rates of electrical and mechanical devices are typically modeled with an exponential distribution. This distribution is “memoryless” in the sense that the probability of having an accident at time t=1 given no accidents up until time t=0 is the same as the probability of having an accident at time t=2 given no accidents through time t=1.
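To see the memoryless property numerically, here is a minimal Python sketch. The failure rate (one failure per 150,000 miles on average) and the 10,000-mile window are made-up numbers chosen for illustration, not estimates for any real vehicle.

```python
import math

def survival(miles, rate):
    """P(no failure in the first `miles` miles) under an exponential model."""
    return math.exp(-rate * miles)

def conditional_failure(start, window, rate):
    """P(failure within `window` miles after `start`, given no failure up to `start`)."""
    survived = survival(start, rate)
    return (survived - survival(start + window, rate)) / survived

# Hypothetical rate: one failure per 150,000 miles on average
rate = 1 / 150_000

new_car = conditional_failure(0, 10_000, rate)
old_car = conditional_failure(200_000, 10_000, rate)

# Both probabilities agree up to floating-point error
print(new_car, old_car)
```

Under this model the 200,000 failure-free miles carry no information: the chance of a failure in the next 10,000 miles is the same for the old car as for a new one.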

For the employee question the answer is a bit more complex. Here I would assign workers to two types: accident prone or safe. The longer a worker is accident free, the higher the probability that they are in the safe category. (For those playing at home, I would use a beta-binomial model with Bayesian updating.) A more realistic analysis might include aging effects such as deterioration of eyesight and balance on the one hand, and gains in wisdom and on-the-job experience on the other. For all I know such a study may exist.
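Here is a rough Python sketch of the Bayesian updating idea. It simplifies by treating each year as a single accident/no-accident trial, and the Beta(1, 9) prior (a 10 percent expected chance of an accident in any given year) is a hypothetical starting point, not an empirical figure.

```python
def expected_accident_rate(prior_a, prior_b, accident_years, safe_years):
    """Posterior mean of the yearly accident probability under a Beta prior.

    A Beta(prior_a, prior_b) prior updated with yearly outcomes becomes
    Beta(prior_a + accident_years, prior_b + safe_years).
    """
    a = prior_a + accident_years
    b = prior_b + safe_years
    return a / (a + b)

# Hypothetical prior: Beta(1, 9), so the prior mean is 0.1
before = expected_accident_rate(1, 9, accident_years=0, safe_years=0)

# The same worker after 25 accident-free years: posterior is Beta(1, 34)
after = expected_accident_rate(1, 9, accident_years=0, safe_years=25)

print(before, after)
```

Unlike the exponential model for the vehicle, every accident-free year here lowers the estimated risk, which is the sense in which people are not memoryless.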

At its core this is the same idea as an athlete “coming due” in baseball–false–or having a “hot hand” in basketball–also false. The interesting part about all this to me is that for two superficially similar problems different models are required by virtue of the fact that humans are involved. Mechanical parts are memoryless; people are not. That makes all the difference.

The Britiſh are Leaving: Law and Legislation for the English “S”

Longfellow’s famous poem, written in 1861, never appeared with the long “s”

On Wednesday we looked at a few extinct English letters. During that discussion you may have thought about the long s, resembling an “f” without the crossbar, frequently used in 18th-century texts. You have probably noticed that ſ is used mostly at the beginning and middle of words, but seemingly never at the ends. As it turns out, there is a whole structure of rules governing the use of “s”-sounds in English.

The BabelStone blog took a long hard look at this issue, starting with codified rules for English:

  • The long ſ muſt never be uſed at the End of a Word, nor immediately after the short s. (Thomas Dyche’s A Guide to the English Tongue, 1785)
  • A long ſ muſt never be placed at the end of a word, as maintainſ, nor a ſhort s in the middle of a word, as conspires. (Nathan Bailey’s An Universal Etymological English Dictionary, 1756)
  • All the ſmall Conſonants retain their form, the long ſ and the ſhort s only excepted. The former is for the moſt part made uſe of at the beginning, and in the middle of words ; and the laſt only at their terminations. (James Barclay’s A Complete and Universal English Dictionary, 1792)

Readers of this blog are aware that codified rules do not always match common usage. This is the difference between legislation and law, at the root of our understanding of micro-institutions. So how were s and ſ used in common practice? BabelStone took a look at this too, using a Google Books search and summarizing the empirical results. Here are the rules for English-speaking countries:

  • short s is used at the end of a word (e.g. his, complains, ſucceſs)
  • short s is used before an apostrophe (e.g. clos’d, us’d)
  • short s is used before the letter ‘f’ (e.g. ſatisfaction, misfortune, transfuſe, transfix, transfer, ſucceſsful)
  • short s is used after the letter ‘f’ (e.g. offset), although not if the word is hyphenated (e.g. off-ſet) [see Short S before and after F for details]
  • short s is used before the letter ‘b’ in books published during the 17th century and the first half of the 18th century (e.g. husband, Shaftsbury), but long s is used in books published during the second half of the 18th century (e.g. huſband, Shaftſbury) [see Short S before B and K for details]
  • short s is used before the letter ‘k’ in books published during the 17th century and the first half of the 18th century (e.g. skin, ask, risk, masked), but long s is used in books published during the second half of the 18th century (e.g. ſkin, aſk, riſk, maſked) [see Short S before B and K for details]
  • Compound words with the first element ending in double s and the second element beginning with s are normally and correctly written with a dividing hyphen (e.g. Croſs-ſtitch, Croſs-ſtaff), but very occasionally may be written as a single word, in which case the middle letter ‘s’ is written short (e.g. Croſsſtitch, croſsſtaff).
  • long s is used initially and medially except for the exceptions noted above (e.g. ſong, uſe, preſs, ſubſtitute)
  • long s is used before a hyphen at a line break (e.g. neceſ-ſary, pleaſ-ed), even when it would normally be a short s (e.g. Shaftſ-bury and huſ-band in a book where Shaftsbury and husband are normal), although exceptions do occur (e.g. Mans-field)
  • double s is normally written as double long s medially and as long s followed by short s finally (e.g. poſſeſs, poſſeſſion), although in some late 18th and early 19th century books a different rule is applied, reflecting contemporary usage in handwriting, in which long s is used exclusively before short s medially and finally [see Rules for Long S in some late 18th and early 19th century books for details]
  • short s is used before a hyphen in compound words with the first element ending in the letter ‘s’ (e.g. croſs-piece, croſs-examination, Preſs-work, bird’s-neſt)
  • long s is maintained in abbreviations such as ſ. for ſubſtantive, and Geneſ. for Geneſis (this rule means that it is practically impossible to implement fully correct automatic contextual substitution of long s at the font level)
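A few of the rules above are mechanical enough to sketch in Python. The function name `apply_long_s` is my own, and the sketch covers only a subset: short s at the end of a word, before an apostrophe or hyphen, and next to an ‘f’; long s everywhere else. It follows the later 18th-century practice before ‘b’ and ‘k’, leaves capital S alone, and makes no attempt at line breaks or abbreviations, which (as the last rule notes) cannot be handled automatically.

```python
LONG_S = "\u017f"  # the character ſ

def apply_long_s(text):
    """Replace lowercase s with long s according to a subset of the rules."""
    out = []
    for i, ch in enumerate(text):
        if ch != "s":
            out.append(ch)
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        # Short s when no letter follows: end of word, apostrophe, or hyphen
        ends_word = not next_ch.isalpha()
        # Short s immediately before or after the letter f
        beside_f = next_ch == "f" or prev_ch == "f"
        out.append("s" if ends_word or beside_f else LONG_S)
    return "".join(out)

for word in ["success", "satisfaction", "offset", "off-set", "possession", "cross-piece"]:
    print(word, "->", apply_long_s(word))
```

Run on the examples from the list, this reproduces ſucceſs, ſatisfaction, offset, off-ſet, poſſeſſion and croſs-piece, though a faithful typesetter would also need the date-dependent and hand-specific rules.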

When did ſ drop out of common usage? According to Google N-grams, there was a sharp decline in its popularity around the turn of the 19th century. I wonder whether new printing technologies or textbooks had anything to do with this.


Micro-Institutions Everywhere: The English Alphabet

We take our ABC’s for granted, learning 26 letters in a precise order from our youngest days. When introduced to a second or third language later in life we may realize that even tongues similar to English contain slightly different alphabets–the Spanish ñ, the French ç–despite the fact that they evolved from the same roots. Historical variation in the English alphabet seems largely glossed over in contemporary education, but identifying some of the “missing letters” can help explain a few historical puzzles.

First, there’s ampersand, considered the 27th letter of the English alphabet until about 150 years ago. Its name comes from its position at the end of the ABC’s:

The word “ampersand” came many years later when “&” was actually part of the English alphabet. In the early 1800s, school children reciting their ABCs concluded the alphabet with the &. It would have been confusing to say “X, Y, Z, and.” Rather, the students said, “and per se and.” “Per se” means “by itself,” so the students were essentially saying, “X, Y, Z, and by itself and.” Over time, “and per se and” was slurred together into the word we use today: ampersand. When a word comes about from a mistaken pronunciation, it’s called a mondegreen.

Before the introduction of the Latin alphabet after the Roman conquest of Britain, Anglo-Saxon had an alphabet all its own known as futhorc. In the ensuing battle of cultural power politics, Anglo-Saxon lost out. Collateral damage included the letter “thorn,” pictured at right, pronounced with the hard “th” sound. It was replaced by the humble Y, always ready to do double duty in that ambiguous no-man’s-land between consonants and vowels. This explains the anachronistic use of Y in titles like “Ye Olde English Shoppe”–it’s just another spelling of “the.”

On Friday we’ll take a look at another missing letter, the long s (resembling “f”). For a sneak peek and a list of nine other extinct English letters, check out this article from MentalFloss (via @johndcook).

Converting and Standardizing Country Names/Codes in R

We have run into this issue before: you have n ≥ 2 datasets with 1 < k ≤ n different coding schemes for the cross-sectional unit. You need to get them all standardized so you can merge the data and increase the measurement error, control for a reviewer’s favorite variable, or run some models.

Last week I was about to spend some time merging alphanumeric ISO codes with their COW counterparts, when I ran across the new countrycode package in R.* The package uses regular expressions to convert between the following supported formats:

  • Correlates of War character
  • CoW-numeric
  • ISO3-character
  • ISO3-numeric
  • ISO2-character
  • IMF numeric
  • FIPS 10-4
  • FAO numeric
  • United Nations numeric
  • World Bank character
  • official English short country names (ISO)
  • continent
  • region

The author is Vincent Arel-Bundock, a doctoral student in comparative politics at Michigan. Thanks Vincent!
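To give a feel for the regular-expression approach, here is a toy Python sketch. It is not the countrycode package or its API; the four-row table, the patterns, the `convert` function and the column labels `iso3c` and `cowc` are all hypothetical, written for this illustration only.

```python
import re

# Hypothetical lookup table: one regex per country plus two code columns
# (ISO3 character and Correlates of War character)
TABLE = [
    {"regex": r"united\s+states|^usa?$", "iso3c": "USA", "cowc": "USA"},
    {"regex": r"united\s+kingdom|great\s+britain", "iso3c": "GBR", "cowc": "UKG"},
    {"regex": r"german", "iso3c": "DEU", "cowc": "GMY"},
    {"regex": r"france", "iso3c": "FRA", "cowc": "FRN"},
]

def convert(name, destination):
    """Return the code in column `destination` for a free-text country name."""
    for row in TABLE:
        if re.search(row["regex"], name, flags=re.IGNORECASE):
            return row[destination]
    return None  # no pattern matched

print(convert("United States of America", "cowc"))
print(convert("Federal Republic of Germany", "iso3c"))
print(convert("Great Britain", "cowc"))
```

The appeal of matching on patterns rather than exact strings is that “Germany” and “Federal Republic of Germany” land on the same row, which is most of the battle when merging datasets that spell country names differently.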


* New here meaning I didn’t know about it before and its documentation is dated Jan. 20, 2013.

Regulating Baby Names

In America we have a tradition of ridiculous baby names dating back to our Puritan founders. Without regulation, we end up with names like Noun, Comma, and even Semicolon. There’s even a whole book of Bad Baby Names. Citizens of other countries, including Iceland, Germany, Sweden, China and Japan, aren’t subjected to such risks:

In the case of Iceland, it’s about meeting certain rules of grammar and gender, and saving the child from possible embarrassment. Sometimes, although not in every case, officials also insist that it must be possible to write the name in Icelandic.

There is a list of 1,853 female names, and 1,712 male ones, and parents must pick from these lists or seek permission from a special committee.

Maybe we need stronger regulation to prevent disasters like Hilary, now considered the most poisoned name in US history. More at the BBC.

Interviews with Over 50 IR Scholars

Readers of this blog may enjoy Theory Talks, which I recently discovered thanks to a link on Twitter that I cannot remember now. Here’s how the site describes itself:

Theory Talks is an interactive forum for discussion of debates in International Relations with an emphasis of the underlying theoretical issues. By frequently inviting cutting-edge specialists in the field to elucidate their work and to explain current developments both in IR theory and real-world politics, Theory Talks aims to offer both scholars and students a comprehensive view of the field and its most important protagonists.

The interviews tend to follow a pattern of questions, which I like because you can compare views between scholars in different interviews. The three questions they ask in every interview I have read so far are:

  1. What is, according to you, the biggest challenge / principal debate in current IR? What is your position or answer to this challenge / in this debate?
  2. How did you arrive at where you currently are in IR?
  3. What would a student need to become a specialist in IR like yourself?

Here are some interviews with big names to get you started:


Micro-Institutions Everywhere: Around Your Waist

US soldier in WWI. Note the belt is used to carry gear, and is held up by a strap over the shoulder. The pants lack belt loops, which would not be invented until the 1920s.

Men have always worn belts with their trousers, right? Wrong. Until the First World War, belts served one of two purposes. They could be a way for a ruler to accessorize, or an easy way for soldiers to carry around gear (a belt with pouches all around it for ammunition is essentially a manly fanny pack, after all). No one thought of a belt as a way to keep his pants up.

There are three reasons that belts became de rigueur around 1920 and not before. The first is that until then tailoring was relatively inexpensive:

Logic or no logic, the fact remains that it was easier to develop special and general relativity than to imagine trousers secured with leather belts inserted into belt loops. That does not, however, mean that pre-20th century pants have been dropping off. Trousers were highly cut and waist-fitted to the contours of their wearers, as such tailoring adjustments cost pennies. Then, in the 1820s suspenders have been invented. From then onwards, even mass manufactured trousers could be worn without individual fitting (though tailors’ services still cost pennies).

WWI brought with it the need for mass production of uniforms. Tailoring all those trousers was not feasible, and quartermasters sought to economize on cloth:

Mass production of uniforms for nationally conscripted armies in the time of war shortages forced national governments to trim as much material as possible. The trousers were made with such a low cut that suspenders became loose, and they needed to tie these funny trousers with a wide belt that was worn over the coat. Men discharged from the army got used to this silly fashion. Because the belts did not sit well on trousers, belt loops were introduced in the early 1920s.

Thirdly, those snappy waistcoats and vests that everyone wore before this were not just for fashion–they were hiding suspenders. Suspenders were regarded as undergarments, akin to a woman’s corset. After WWI the waistcoat’s popularity waned, and with it the use of suspenders. You can read more about these developments here, from which the above quotations were taken.

A daily habit for millions of men around the world turns out to be just another contingent fact of history. Even the things we take for granted are shaped by politics and norms. Who knew you had a micro-institution around your waist?

The New Netflix Strategy: Gambling on House of Cards

One week ago Netflix introduced its first original series, House of Cards. The series details the life and crimes of (fictional) US Congressman Francis Underwood and his wife Claire, who runs a nonprofit. What is unique about the series is that the entire season–13 episodes–was released all at once. Netflix and streaming services like it have acclimated us to watching shows in bulk like this. Is the new model sustainable?

I hope so, and Atlantic Wire reporter Rebecca Greenfield thinks the answer is yes:

With Netflix spending a reported $100 million to produce two 13-episode seasons of House of Cards, they need 520,834 people to sign up for a $7.99 subscription for two years to break even. To do that five times every year, then, the streaming TV site would have to sign up more 2.6 million subscribers than they would have. That sounds daunting, but at the moment, Netflix has 33.3 million subscribers, so this is an increase of less than 10 percent on their current customer base. Of course, looking at Netflix’s past growth, that represents pretty reasonable growth for the company that saw 65 percent growth from 20 million to over 33 million world-wide streaming customers. Much of that growth, however, comes from new overseas markets. But, even in the U.S., from one year ago, Netflix saw about 13 percent streaming viewer growth jumping from 24 million to 27 million.
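The break-even arithmetic in that passage is easy to check. Here is a quick Python sketch; the function name is my own, and the inputs are just the figures quoted above. One thing the check turns up: the 520,834 figure appears to come from rounding the monthly price up to $8, since the exact $7.99 price gives a slightly higher number.

```python
import math

def breakeven_subscribers(cost, monthly_price, months=24):
    """Subscribers needed for `months` of fees to cover `cost`, rounded up."""
    return math.ceil(cost / (monthly_price * months))

cost = 100_000_000  # reported cost of two seasons of House of Cards

at_list_price = breakeven_subscribers(cost, 7.99)     # 521486
at_rounded_price = breakeven_subscribers(cost, 8.00)  # 520834, the quoted figure

# Five such productions, compared with the quoted 33.3 million subscribers
five_series = 5 * at_rounded_price                    # 2604170
share_of_base = five_series / 33_300_000

print(at_list_price, at_rounded_price, five_series, round(share_of_base, 3))
```

Either way the conclusion holds: five productions need roughly 2.6 million extra subscribers, a little under 8 percent of the current base.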

The five times per year figure comes from a plan that Netflix CEO Reed Hastings revealed in an interview with GQ. Paying for subscription television like this is not a new idea–it’s a similar business model to HBO. But Netflix seems to have the execution right, at least with this first foray.

Perhaps the biggest difference from conventional television is that it doesn’t matter how many people watched House of Cards during its debut week. As Hastings said in a letter to investors two weeks ago:

Linear channels must aggregate a large audience at a given time of day and hope the show programmed will actually attract enough viewers despite this constraint. With Netflix, members can enjoy a show anytime, and over time, we can effectively put the right show in front of members based on their viewing habits. Thus we can spend less on marketing while generating higher viewership.

For linear TV, the fixed number of prime-time slots mean that only shows that hit it big and fast survive, thus requiring an extensive and expensive pilot system to keep on deck potential replacement shows. In contrast, Internet TV is an environment where smaller or quirkier shows can prosper because they can find a big enough audience over time. In baseball terms, linear TV only scores with home runs. We score with home runs too, but also with singles, doubles and triples.

Because of our unique strengths, we can commit to producing and publishing “books” rather than “chapters”, so the creators can concentrate on multi-episode story arcs, rather than pilots. Creators can work on episode 11 confident that viewers have recently enjoyed episodes 1 to 10. Creators can develop episodes that are not all exactly 22 or 44 minutes in length. The constraints of the linear TV grid will fall, one by one.

I look forward to seeing more of this strategy, and as I proceed with House of Cards you may even get a post on its politics.

The Roman Internet


Schematic of the Roman internet, er, road network

Terence Eden asks why the Romans didn’t invent the internet:

What I find interesting is that there was nothing fundamentally to stop the Romans – or any other ancient civilization – from creating such a network. The Greeks experimented with it in 4BCE but it seems it never really caught on. Tower building is easy, as is flag waving or other mechanical forms of signalling. Their technology was certainly capable of building a proto-Internet. That would have had some profound changes to our history.

It’s an interesting counterfactual, but I think the better answer is that the Romans did invent the internet in a manner of speaking. The Roman road network was an incredible innovation for its time. Thanks to the ORBIS project at Stanford, we can gain a better appreciation for how significantly Roman roads reduced the cost and time required to travel. Robert Gonzalez describes how many variables you can manipulate in the simulations:

Tell us, would you like to travel to Rome by road, river or open sea? Would you stick to the coasts or set a course through the mainland? During which month would you journey? Would you opt for the fastest route (bearing in mind that the shortest course does not always translate to the quickest passage) or the cheapest? Speaking of expenses, how much would this journey cost you, anyway? (Please give your answer in denarii.)

You can try the interactive maps for yourself here.