What Can Software Developers Learn from Tigers?

Object-oriented programming is a powerful way of modeling the world. Objects encapsulate data and behavior, and can interact and be composed in many useful ways.

As developers, one question we often consider is which types of objects (in both the technical and the non-technical senses of the word) to privilege by building them into our systems. Is an Address a proper object, or is it just a bit of data that can be encapsulated under a User? Are PowerUsers and NewUsers different enough to merit their own classes? (Probably not.)
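
As a concrete sketch of that choice, here is a minimal, hypothetical example in Python; the class and field names are invented for illustration, not taken from any real system:

```python
from dataclasses import dataclass

# Option 1: Address is a first-class value object that other code can reuse.
@dataclass(frozen=True)
class Address:
    street: str
    city: str
    postal_code: str

@dataclass
class User:
    name: str
    address: Address

# Option 2: the address is just data inside User, and the "kind" of user is
# an attribute rather than a PowerUser / NewUser subclass hierarchy.
@dataclass
class FlatUser:
    name: str
    street: str
    city: str
    postal_code: str
    is_power_user: bool = False
```

Neither option is wrong in the abstract; the question is which distinctions the system needs to treat as first-class.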

But as our systems evolve, it can become difficult for existing object models to respond to new requirements. That’s apparently what is currently happening with tigers. Taxonomists have traditionally recognized nine tiger subspecies, so preservation efforts are spread across the six that are not extinct. A new proposal based on DNA analysis suggests that there should be only two.

As one researcher quoted in the piece says, “It’s really hard to distinguish between tigers…. The taxonomies are based on data from almost a hundred years ago.” Although your object model may not have legacy code that is quite that old, this case demonstrates the importance of reconsidering what traits and behaviors you allow to be first-class citizens in your system.

In the case of tigers, reducing the number of recognized subspecies to two (one inhabiting continental Asia, the other the Indonesian archipelago) would allow conservationists to try more flexible strategies. One example mentioned in the article is moving tigers within the same (redefined) subspecies from one area to another; the updated definition of the continental Asian tiger would include the Amur tiger of Russia, the Bengal tiger of India, and the South China tiger. To a non-expert, it seems that interbreeding between these population groups would also help increase their numbers and perhaps their genetic diversity.

Figuring out how to classify real-life entities can be very difficult. For tigers, what characteristics define a subspecies? A century ago the answer might have been physical appearance; today we can look at the genetic level. In software, we have to think hard about the taxonomies we choose, because they quickly become metaphors we live by.

The lesson for developers is that making your object model too fine-grained can introduce unexpected constraints when requirements change. To paraphrase Keynes, “When my information changes, I alter my code.”
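
To make the parallel concrete, here is a hypothetical sketch (in Python, with invented names) of what that kind of reclassification looks like in code, collapsing an over-specified hierarchy into coarser classes plus plain data:

```python
from dataclasses import dataclass

# Before: one class per recognized subspecies. Cross-population operations
# become awkward as soon as the taxonomy itself is questioned.
class AmurTiger: ...
class BengalTiger: ...
class SouthChinaTiger: ...

# After: two coarse-grained groups, with the old distinctions demoted to
# plain data that is cheap to reinterpret when requirements change.
@dataclass
class Tiger:
    species: str     # "continental" or "island"
    population: str  # e.g. "Amur", "Bengal", "South China"
    region: str
```

Moving a tiger between regions, or merging two populations, is now an attribute update rather than a change of class.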

Micro-Institutions Everywhere: Species and Regime Types

In a two-for-one example of micro-institutions, Jay Ulfelder blogs this paragraph from a paper by Ian Lustick:

One might naively imagine that Darwin’s theory of the “origin of species” to be “only” about animals and plants, not human affairs, and therefore presume its irrelevance for politics. But what are species? The reason Darwin’s classic is entitled Origin of Species and not Origin of the Species is because his argument contradicted the essentialist belief that a specific, finite, and unchanging set of categories of kinds had been primordially established. Instead, the theory contends, “species” are analytic categories invented by observers to correspond with stabilized patterns of exhibited characteristics. They are no different in ontological status than “varieties” within them, which are always candidates for being reclassified as species. These categories are, in essence, institutionalized ways of imagining the world. They are institutionalizations of difference that, although neither primordial nor permanent, exert influence on the futures the world can take—both the world of science and the world science seeks to understand. In other words, “species” are “institutions”: crystallized boundaries among “kinds”, constructed as boundaries that interrupt fields of vast and complex patterns of variation. These institutionalized distinctions then operate with consequences beyond the arbitrariness of their location and history to shape, via rules (constraints on interactions), prospects for future kinds of change.

Jay follows this up with an interesting analogy to political regime types–the “species” that political scientists study:

Political regime types are the species of comparative politics. They are “analytic categories invented by observers to correspond with stabilized patterns of exhibited characteristics.” In short, they are institutionalized ways of thinking about political institutions. The patterns they describe may be real, but they are not essential. They’re not the natural contours of the moon’s surface; they’re the faces we sometimes see in them.

I have no comment other than that I think Jay is right, and it reminds me of a Robert Sapolsky lecture on the dangers of categorical thinking. And yes, Sapolsky is a biologist. We’ll go right to the best part (19:40-22:05) but the whole lecture is worth watching:

Four Metaphors for the Internet (and Politics)

Does the internet have a political disposition? Or is it inert, providing a tabula rasa for political ambitions already familiar to the offline world? Likely the truth lies somewhere between these two poles. Building on an earlier post, here I discuss four possible metaphors for understanding the politics of the internet: biology, institutionalism, the commons, and geography.

Biology

A Darwinian biological metaphor of the internet carries a certain appeal, resembling as it does the dominant paradigm of a “harder” science. With the advent of the internet so recent in our past, it is easy to apply the term “evolution” to its growth from a few interconnected government computers to a global communications system.

A strong interpretation of the biological theory views the internet as a sentient creature. Author Kevin Kelly adopts this perspective in his book What Technology Wants.

This approach has its weaknesses. First, the evolutionary metaphor glosses over how little we understand about the emergence of technological communications systems. It treats technological development as exogenous to politics, even though there were clear political motivations behind early computers (WWII) and early computer networks (the Cold War). Second, biological evolution has a clear theoretical mechanism, natural selection, that technological evolution lacks. The internet and its structure reflect the deliberate actions of numerous designers, whereas theories of biological evolution have no place for a designer.

Institutionalism

The institutional approach helps address both weaknesses of the biological metaphor. The main premise of “internet institutionalism” is that the structure of the internet can help us understand its political uses. Mike Barthel, building on ideas developed by Alexander Galloway, argues that the physical structure of the internet (decentralized and as yet ungoverned) lends itself to a form of libertarianism. However, Barthel’s thesis mainly concerns politics on the internet (e.g. comments on political news sites and blogs) rather than a politics of the internet (the general behavior of online communities).

Commons

The metaphor of the commons has been a powerful tool for understanding political behavior since it was first introduced. A commons is a shared resource from which no one who seeks access can be excluded. Centralized governance of a commons is extremely difficult, if not impossible, but decentralized governance mechanisms have been shown to emerge in many varied settings (see in particular the work of Elinor Ostrom).

The best picture of the commons for understanding the internet is the ocean. The sea is accessible from every continent, free to anyone who can afford the cost of the technology necessary to navigate it. Like the ocean, the internet remains largely ungoverned outside of a few limited areas. Coincidentally, one type of “outlaw” behavior on the internet also uses a familiar maritime image: piracy.

The main weakness of the commons metaphor is its strong association in the minds of many scholars (deservedly) with Hardin’s “tragedy of the commons.” The internet is generally not at risk of overuse. Quite the opposite: because of network effects, the more people use the internet, the more valuable it becomes to other users. The strengths of this approach may be easier to see through the fourth and final analogy.

Geography

The final metaphor, geography, leverages political scientists’ familiarity with the way that borders and land formations can affect political behavior.

Specifically, this metaphor draws on an understanding of geography introduced by James C. Scott. Like the physical geography of Southeast Asia that Scott describes, the internet has places that are more or less accessible to governance. The United States has some of the most advanced internet governance policies, so it is like the lowlands that were easily controlled by governors. Western Europe is somewhat less governed, like the highlands to which tribes would flee to avoid central control.

The main difference between the internet and physical geography is that changing a domain name or using an alias IP address, unlike migrating from one place to another, is virtually costless.

The geographical metaphor of the internet encompasses the stronger aspects of all three earlier theories. Like a biological creature, geography can shift over time due to exogenous effects. As with the institutional approach, the physical structure of the internet has important connections to the way that people conduct their politics. And as with the ocean, physical geography can have areas under varying degrees of governance. Geography cannot be privileged above the other explanations yet, but it could be useful as political scientists start to think more deeply about internet politics.

Statistics in Social Science–Fad or Research Frontier?

Over the weekend a NYT op-ed on “physics envy” in the social sciences made the rounds.* At the same time, a NYT blog post from January about the rapid expansion of statistical techniques was also circulating (via @brendannyhan). I find the pairing interesting because the first piece criticizes empirical models while the second sings their praises.

These two pieces do not necessarily present a fundamental contradiction, since the second (earlier) piece does not have social science exclusively or primarily in view. However, it is worthwhile to spend a bit of time trying to reconcile these two perspectives–the critique of statistical social science on the one hand, and the surging popularity of statistics more generally.

The simplest possibility is that physics envy is driving the popularity of statistical social science. This seems improbable for two reasons, though. First, social scientists have been “teching up” for quite a while, employing more and more advanced models over the past twenty years.** Second, the current hype (not intended derisively) around statistics is much broader than the social sciences, including such fields as biology and finance in addition to economics, political science, and the like.

A second possibility is that the hype is driven by results: businesses are investing in analytics because understanding customer and market data is useful. There is some risk that the current popularity of statistics will inflate a bubble, with sophisticated techniques applied even when the juice is not worth the squeeze. But right now there is probably plenty of low-hanging fruit to be had simply by analyzing data that companies already have or could easily obtain. One objection is that we have not seen dramatic, conclusive discoveries in empirical social science: for every published statistical finding there is approximately one opposite finding, sometimes produced with the same statistical method or the same data. I do not endorse this criticism wholeheartedly, but it is enough to show that the second explanation cannot reconcile the two viewpoints with which we began.

The third alternative is that social scientists (and others) see promise in where contemporary empirical work is heading, not just satisfaction with the answers it has already produced. The “physics envy” piece fails to recognize the exciting developments at the frontiers of statistical science and computational social science. Methods such as ensemble Bayesian model averaging or computationally aided mechanism design are serious attempts to account for the complexity of social behavior. Ignoring these developments in order to rehash an old, tired argument is counterproductive.
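
For readers who have not encountered the first of those methods, the underlying idea of model averaging can be sketched in a few lines. This is not the ensemble Bayesian model averaging algorithm from the forecasting literature, just a minimal illustration (with made-up forecasts) of weighting competing models by how well they predicted past outcomes:

```python
import numpy as np

def average_forecasts(past_forecasts, outcomes, new_forecasts):
    """Weight each model by its past predictive fit, then combine forecasts.

    past_forecasts: (n_models, n_past) probability forecasts for past events
    outcomes:       (n_past,) observed 0/1 outcomes
    new_forecasts:  (n_models,) each model's forecast for a new event
    """
    p = np.clip(np.asarray(past_forecasts, float), 1e-6, 1 - 1e-6)
    y = np.asarray(outcomes, float)
    # Log-likelihood of the observed outcomes under each model's forecasts.
    loglik = (y * np.log(p) + (1 - y) * np.log(1 - p)).sum(axis=1)
    weights = np.exp(loglik - loglik.max())
    weights /= weights.sum()
    return float(np.dot(weights, new_forecasts))

# Three hypothetical models forecasting the same three past events.
past = [[0.8, 0.6, 0.7],   # model A
        [0.5, 0.5, 0.5],   # model B
        [0.2, 0.9, 0.4]]   # model C
observed = [1, 1, 0]
print(average_forecasts(past, observed, new_forecasts=[0.7, 0.5, 0.3]))
```

The published methods estimate such weights far more carefully, but the payoff is the same: no single model has to carry the explanatory load alone.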

 

____________________________________

*Erik Voeten comments on it at TMC, but I’m waiting until after this post to read his thoughts.

**As should become clear later in the post, this refers to the frontier of the discipline(s), not the general level of statistical acuity.

IQ, GDP, and the Abuse of Acronyms

Continuing the theme of ethical statistics from the previous two posts, I would be remiss if I did not mention one of the best books on the subject: The Mismeasure of Man, by Stephen Jay Gould. In the book, Gould aptly dissects the flawed measures of intelligence that have been used to give social hierarchies, racism, and eugenics a supposedly “scientific” basis. He spends about half of the book on craniometry (the measurement of skulls) but gets to IQ in chapter five. A brief summary of that chapter, and some further reading on IQ and economic success, will illuminate our ongoing discussion.

The book is packed with interesting anecdotes about how scientific research is actually done in practice, and how researchers can unconsciously fall prey to their own biases. One such account concerns a project by Catherine M. Cox to estimate the IQ of great historical figures from their biographies. The project was chock full of methodological problems, the foremost of which was the availability of data. A second problem was bias in the arithmetic: Cox had her five scorers start each subject at a base IQ of 100 and add points as they were impressed by accounts of the subject’s precocity. As a result, there were no scores below 100, and the scores largely reflected the amount of available information about each individual.

Cox “measured” IQ at two points in time: childhood (A1) and young adulthood (A2). The result?

Cox published disturbingly low A1 IQ figures for some formidable characters, including Cervantes and Copernicus, both at 105. Her dossiers show the reason: little or nothing is known about their childhood, providing no data for addition to the base figure of 100. (p. 185)
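
A toy simulation (entirely made-up numbers, not Cox’s actual data) shows why a scoring rule that starts at 100 and can only add points produces exactly this pattern: the final score is the 100-point floor plus a function of how much biographical material the raters had to work with.

```python
import random

def cox_style_score(documented_anecdotes, points_per_anecdote=2, n_raters=5):
    """Start a subject at 100 and let each rater only add points.

    The score can never fall below 100, and it rises with the amount of
    surviving biographical material rather than with anything innate.
    """
    scores = []
    for _ in range(n_raters):
        score = 100
        for _ in range(documented_anecdotes):
            # A rater may or may not credit any given anecdote.
            score += random.choice([0, points_per_anecdote])
        scores.append(score)
    return sum(scores) / len(scores)

random.seed(0)
# A richly documented figure vs. one whose childhood is barely recorded.
print(cox_style_score(documented_anecdotes=25))  # well above 100
print(cox_style_score(documented_anecdotes=1))   # stuck near the floor
```

The quantity being “measured” here is record-keeping, not intelligence.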

What does this have to do with politics? It seems that IQ is still popping up as an explanatory variable, even in cross-national research. (Gould adeptly describes the problem with using measures on individuals as the basis for between-group comparisons, but that is beyond the scope of the current post.) Take a look at the figure below.

[Figure omitted. Source: New Palgrave Dictionary of Economics]

Garrett Jones, in The New Palgrave Dictionary of Economics, discusses this project optimistically:

Can IQ be measured across countries, even in developing countries? And if so, do these tests have similar real-world reliability to IQ tests given within OECD countries?
The answer to both questions is yes, with some modest grounds for caution….
The psychologist Richard Lynn and the political scientist Tatu Vanhanen (henceforth LV) assembled two collections of IQ scores by scouring the academic and practitioner literatures for reported IQ in a total of 113 countries (2002, 2006). They included some IQ standardisation samples and some national tests of mathematical ability, but most of the studies they used were ‘opportunity samples’, studies of an ostensibly typical classroom or school in a particular country.

The “cautions” that Jones mentions have to do with the reliability of the figures, but he accepts them because the IQ scores are correlated with GDP. If ever there were a case of begging the question, this appears to be it. What we have is a case of reification (“thing-ification”) of IQ: “because we measure it, it must be real; because it correlates with what we think it should, it must be right.”

Unfortunately, there is very little evidence that IQ actually measures anything innate. In fact, measured IQ has been going up “substantially and consistently, all over the world” since the tests started being administered in the first half of the twentieth century (the phenomenon is known as the Flynn effect). Scientists “correct” for this by renormalizing scores to a mean of 100, hiding from the casual observer the fact that a measured IQ of 100 nowadays would be roughly equivalent to 130 a century ago. There is more to say on the subject, but this post is long enough already, so I will leave you with some additional reading and highly recommend Gould’s book to anyone who works in quantitative science of any kind.
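
As a quick aside before the reading list, the arithmetic behind that renorming is straightforward. Using the roughly three-points-per-decade gain commonly cited for the Flynn effect (an approximation; the true rate varies by test and era), a score that is average on today’s norms sits well above average on a test standardized a century ago:

```python
FLYNN_GAIN_PER_DECADE = 3  # approximate average gain in IQ points per decade

def score_on_old_norms(score_today, decades_elapsed, gain=FLYNN_GAIN_PER_DECADE):
    """Convert a score on current norms to the scale of an older standardization.

    Each renorming resets the population mean to 100, hiding the drift;
    undoing it just adds the accumulated gain back in.
    """
    return score_today + gain * decades_elapsed

print(score_on_old_norms(100, decades_elapsed=10))  # -> 130 on century-old norms
```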

Further reading: 

“Rising Scores on Intelligence Tests.” Ulric Neisser, American Scientist.

“Old-Fashioned Play Builds Serious Skills.” Alix Spiegel, NPR. (Money quote: “In fact, good executive function is a better predictor of success in school than a child’s IQ.” The measure of ‘good executive function’? A child’s ability to stand still. Via @newsyc20.)