How Much Math is Enough?

Regular readers know that education policy is not my forte (although I have expressed some opinions), but there was a confluence of articles over the weekend that I feel the need to discuss here. The first was Andrew Hacker’s outrageous op-ed in the New York Times suggesting that freshman college algebra is unnecessary. It contained such doozies as this:

It’s not hard to understand why Caltech and M.I.T. want everyone to be proficient in mathematics. But it’s not easy to see why potential poets and philosophers face a lofty mathematics bar. Demanding algebra across the board actually skews a student body, not necessarily for the better.

OK, not everyone is going to be an engineer. But algebra? Come on. Virtually everyone who goes to college should have encountered algebra in high school. Now, I might be biased by this because lately I have found myself wishing I had taken much more math as an undergraduate. But even though calculus might not be necessary for everyone, the ability to solve for a single variable seems essential in everyday life. (I could provide examples, but they would cover only a small subset of the things for which algebra can be used.)

John Patty has a better response than anything I could come up with at the moment, so go read it. If anything Patty is softer on Hacker than I would be, but he does a great job pointing out the logical flaws in Hacker’s argument.

And finally, here is a piece from 2008 on the innumeracy of college professors. The bias against math and science in humanities departments is palpable. To be fair, many physicists and mathematicians have a similar antipathy toward the liberal arts. But if we are going to require art history, music appreciation and the like to broaden the minds of students (a requirement that I am not opposed to), then it would be a crime to omit such basic life skills as algebra. I am not saying that we all need to become engineers, but encouraging students to forgo math does not bode well for our future.

Cigarette Taxes and Unintended Consequences

One of the best questions you can ask a social scientist is, “and then what?” Thinking about second-order effects is essential to smart research and policy-making. Research on the unintended consequences of cigarette taxes helps to illustrate this point:

Besides resulting in a shift in purchasing choices, cigarette sin taxes also indirectly result in illegal activities like smuggling. Each of the fifty U.S. states taxes cigarettes at different levels, and this uneven price distribution opens up market avenues for nefarious wrongdoers. Cigarette trafficking does not appear to be extremely prevalent within the U.S., but it’s estimated to be a $1.5 billion industry in Canada. Traffickers commonly smuggle cheap cigarettes purchased in the United States across the border. It’s one of the few illegal drug trades where the U.S. is an exporter, not an importer.

Interestingly as well, a recent study just published in the journal Substance Abuse Treatment, Prevention, and Policy reported that boosts in state cigarette prices were associated with increases in binge drinking among persons aged 21-29, specifically a 4.06% increase for every dollar increase in cigarette price. Drinking also rose among those aged 65 and older.

Not all of the effects are negative, however. Recently I was talking to a colleague who will take up a position at University College London at the end of the year. I asked whether the poor reputation of British food was deserved, and he replied that it has improved substantially in recent years. Since smoking was banned in pubs, he explained, they had to improve the cuisine to keep customers there longer. How’s that for an unintended consequence?
Update: Going through my RSS feed, I see that Adam Ozimek wrote a related post on Bloomberg’s soda ban a few days back.

Wednesday Nerd Fun: How to Win at Jeopardy

Alex: “You know Roger, you could set a new one day record.”

Roger: “What’s the old one?”

Roger Craig should have known the answer, because he held the old record.

Craig says it works like Moneyball — a reference to the book and movie about the statistical techniques used by Oakland Athletics general manager Billy Beane to build a winning baseball team. Craig’s system also relied heavily on statistics.

“I actually downloaded this site called the Jeopardy! Archive, which is a fan-created site of all the questions and answers that are on the show.”

“Something like 211,000 questions and answers that have appeared on Jeopardy!,” says Esquire writer Chris Jones, a self-proclaimed “game-show nerd” who’s familiar with Craig’s tactics.

Using data-mining and text-clustering techniques, Craig grouped questions by category to figure out which topics were statistically common — and which weren’t.
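Craig’s actual pipeline isn’t public in detail, but the frequency side of the idea can be sketched in a few lines of Python (the categories and clues below are invented stand-ins for the archive data):

```python
from collections import Counter

# Invented (category, clue) pairs standing in for the J! Archive data.
clues = [
    ("U.S. PRESIDENTS", "This president appears on the $5 bill."),
    ("WORD ORIGINS", "This Greek-derived word means 'love of wisdom'."),
    ("U.S. PRESIDENTS", "He served two non-consecutive terms."),
    ("OPERA", "This Verdi opera is set in ancient Egypt."),
    ("U.S. PRESIDENTS", "The teddy bear is named for this president."),
]

# Rank categories by how often they appear: the high-frequency
# topics are the ones a contestant should study first.
category_counts = Counter(category for category, _ in clues)
top_topics = category_counts.most_common()
print(top_topics[0])  # ('U.S. PRESIDENTS', 3)
```

Clustering similar category names together (Craig’s text-clustering step) would come before this counting step, since the show rarely reuses category titles verbatim.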

“Obviously it’s impossible to know everything,” Jones says. “So he was trying to decide: What things did he need to know? He prepared himself in a way that I think is probably more rigorous than any other contestant.”

You can find the full article here, and the link comes to us via Tyler Cowen. To see Roger at work, check out these clips:

PolMeth 2012 Round-Up, Part 2

A Map from Drew Linzer’s Votamatic

Yesterday I discussed Thursday’s papers and posters from the 2012 Meeting of the Political Methodology Society. Today I’ll describe the projects I saw on Friday, again in the order listed in the program. Any attendees who chose a different set of panels are welcome to write a guest post or leave a comment.

First I attended the panel for Jacob Montgomery and Josh Cutler’s paper, “Computerized Adaptive Testing for Public Opinion Research” (pdf; full disclosure: Josh is a coauthor of mine on other projects, and Jacob graduated from Duke shortly before I arrived). The paper applies a strategy from educational testing to survey research. On the GRE, if you get a math problem correct, the next question will be more difficult. Similarly, when testing for a latent trait like political sophistication, a respondent who can identify John Roberts likely also recognizes Joe Biden. Leveraging this technique can greatly reduce the number of survey questions required to accurately place a respondent on a latent dimension, which in turn can reduce non-response rates and/or survey costs.
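The intuition is a binary search over item difficulty. Here is a toy sketch (the 0–10 “sophistication” scale and the respondent function are invented for illustration; this is not the authors’ item response model):

```python
def place_respondent(answers, low=0, high=10):
    """Binary-search-style placement on a 0-10 scale.

    `answers` is a function: given an item difficulty, it returns
    True if the respondent gets an item of that difficulty right.
    """
    while low < high:
        item = (low + high + 1) // 2  # ask an item midway through the range
        if answers(item):
            low = item        # correct: the respondent is at least this high
        else:
            high = item - 1   # miss: the respondent is below this item
    return low

# A respondent who can handle anything up to difficulty 6
# is placed correctly after only 4 questions instead of 11:
print(place_respondent(lambda difficulty: difficulty <= 6))  # 6
```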

Friday’s second paper was also related to survey research: “Validation: What Big Data Reveal About Survey Misreporting and the Real Electorate” by Stephen Ansolabehere and Eitan Hersh (pdf). This was the first panel I attended that provoked a strong critical reaction from the audience. There were two major issues with the paper. First, the authors contracted out the key stage in their work–validating data by cross-referencing other data sets–to a private, partisan company (Catalist) in a “black box” way, meaning they could not explain much about Catalist’s methodology. At a meeting of methodologists this is very disappointing, as Sunshine Hillygus pointed out. Second, their strategy for “validating the validator” involved purchasing a $10 data set from the state of Florida, deleting a couple of columns, and seeing whether Catalist could fill those columns back in. Presumably they paid Catalist more than $10 to do this, so I don’t see why that would be difficult at all. Discussant Wendy Tam Cho was my favorite for the day, as she managed to deliver a strong critique while maintaining a very pleasant demeanor.

In the afternoon, Drew Linzer presented on “Dynamic Bayesian Forecasting of Presidential Elections in the States” (pdf). I have not read this paper, but thoroughly enjoyed Linzer’s steady, confident presentation style. The paper is also accompanied by a neat election forecast site, which is the source of the graphic above. As of yesterday morning, the site predicted 334 electoral votes for Obama and 204 for Romney. One of the great things about this type of work is that it is completely falsifiable: come November, forecasters will be right or wrong. Jamie Monogan served as the discussant, and helped to keep the mood light for the most part.

Jerry Reiter of the Duke Statistics Department closed out the afternoon with a presentation on “The Multiple Adaptations of Multiple Imputation.” I was unaware that multiple imputation was still considered an open problem, but this presentation and a poster by Ben Goodrich and Jonathan Kropko (“Assessing the Accuracy of Multiple Imputation Techniques for Categorical Variables with Missing Data”) showed me how wrong I was. Overall it was a great conference and I am grateful to all the presenters and discussants for their participation.

PolMeth 2012 Round-Up, Part 1

Peter Mucha’s Rendering of Wayne Zachary’s Karate Club Example

Duke and UNC jointly hosted the 2012 Meeting of the Society for Political Methodology (“PolMeth”) this past weekend. I had the pleasure of attending, and it ranked highly among my limited conference experiences. Below I present the papers and posters that were interesting to me, in the order that I saw/heard them. A full program of the meeting can be found here.

First up was Scott de Marchi’s piece on “Statistical Tests of Bargaining Models.” (Full disclosure: Scott and most of his coauthors are good friends of mine.) Unfortunately there’s no online version of the paper at the moment, but the gist of it is that calculating minimum integer weights (MIW) for the bargaining power of parties in coalition governments has been done poorly in the past. The paper uses a nice combination of computational, formal, and statistical methods to substantially improve on previous bargaining models.

Next I saw a presentation by Jake Bowers and Mark Fredrickson on their paper (with Costas Panagopoulos) entitled “Interference is Interesting: Statistical Inference for Interference in Social Network Experiments” (pdf). The novelty of this project–at least to me–was viewing a treatment as a vector. For example, given units of interest (a,b,c), the treatment vector (1,0,1) might have different effects on a than (1,1,0) due to network effects. In real-world terms, this could be a confounder for an information campaign when treated individuals tell their control-group neighbors about what they heard, biasing the results.
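A toy numeric version of that example (the network, effect sizes, and additive spillover term are all invented for illustration, not taken from the paper):

```python
# Unit a's outcome depends on the whole treatment vector, not just
# its own assignment, because treatment spills over from neighbors.
network = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

def outcome(unit, assignment, direct=2.0, spillover=1.0):
    """Potential outcome of `unit` under a full treatment vector."""
    y = direct * assignment[unit]
    y += spillover * sum(assignment[n] for n in network[unit])
    return y

# a is treated under both vectors, but its outcomes differ:
print(outcome("a", {"a": 1, "b": 0, "c": 1}))  # 2.0 (neighbor b untreated)
print(outcome("a", {"a": 1, "b": 1, "c": 0}))  # 3.0 (spillover from b)
```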

The third paper presentation I attended was “An Alternative Solution to the Heckman Selection Problem: Selection Bias as Functional Form Misspecification” by Curtis Signorino and Brenton Kenkel. This paper presents a neat estimation strategy when only one stage of data has been/can be collected for a two-stage decision process. The downside is that estimating parameters for a k-order Taylor series expansion with n variables grows combinatorially, so a lot of observations are necessary.* Arthur Spirling, the discussant for this panel, was my favorite discussant of the day for his helpful critique of the framing of the paper.
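To see how fast that expansion grows: a full polynomial expansion of n variables up to order k has C(n + k, k) terms, counting the intercept. A quick sketch:

```python
from math import comb

def n_poly_terms(n_vars, order):
    """Terms in a full polynomial expansion of n_vars variables up to
    the given order, intercept included: C(n_vars + order, order)."""
    return comb(n_vars + order, order)

# Terms needed for 5, 10, and 20 variables at orders 2 through 4 --
# already in the thousands for a modest regression:
for k in (2, 3, 4):
    print(k, [n_poly_terms(n, k) for n in (5, 10, 20)])
```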

Thursday’s plenary session was a talk by Peter Mucha of the UNC Math Department on “Community Detection in Multislice Networks.” This paper introduced me to the karate club example, the voter model, and some cool graphs (see above).

At the evening poster session, my favorite was Jeffrey Arnold’s “Pricing the Costly Lottery: Financial Market Reactions to Battlefield Events in the American Civil War.” The project compares the price of gold in Confederate graybacks and Union greenbacks throughout the Civil War as they track battlefield events. As you can probably guess, the paper has some cool data. My other favorite was Scott Abramson’s labor-intensive maps for his project “Production, Predation and the European State 1152–1789.”

I’ll discuss the posters and papers from Friday in tomorrow’s post.


*Curtis Signorino sends along a response, which I have abridged slightly here:

Although the variables (and parameters) grow combinatorically, the method we use is actually designed for problems where you have more regressors/parameters than observations in the data.  That’s obviously a non-starter with traditional regression techniques.  The underlying variable selection techniques we use (adaptive lasso and SCAD) were first applied to things like trying to find which of thousands of genetic markers might be related to breast cancer.  You might only have 300 or a 1000 women in the data, but 2000-3000 genetic markers (which serve as regressors).  The technique can find the one or two genetic markers associated with cancer onset.  We use it to pick out the polynomial terms that best approximate the unknown functional relationship.  Now, it likely won’t work well with N=50 and thousands of polynomial terms.  However, it tends to work just fine with the typical numbers of regressors in poli sci articles and as little as 500-1000 observations.  The memory problem I mentioned during the discussion actually occurred when we were running it on an IR dataset with something like 400,000 observations.  The expanded set of polynomials required a huge amount of memory.  So, it was more a memory storage issue due to having too many observations.  But that will become a non-issue as memory gets cheaper, which it always does.

This is a helpful correction, and perhaps I should have pointed out that there was a fairly thorough discussion of this point during the panel. IR datasets are indeed growing rapidly, and this method helps avoid an almost infinite iteration of “well, what about the previous stage…?” questions that reviewers could pose.

Micro-Institutions Everywhere: Walking Paths

I was delighted to discover an example of micro-institutions at work this week right in my own backyard, er, campus. Several of my classes have been held in the Social Psychology building on Duke’s West Campus. Most traffic to this building comes from the nearby Perkins library, to the southwest.

None of the sidewalks allows for the most direct route, which leads from an outdoor stairway diagonally across a small lawn. None until now, that is. This week I observed the scene in the photo: a sidewalk being put in exactly the same place as the well-worn footpath left by pedestrians observing the triangle inequality. (Please forgive the poor quality of the picture.) The new sidewalk is a perfect example of a central planner responding to the emergent order of individual actions, rather than resisting it. Micro-institutions really work.

Wednesday Nerd Fun: TV Writers Podcast

From June Thomas:

The format of the Nerdist Writers Panel is pretty straightforward. Host Ben Blacker, a writer with credits on Supah Ninjas and Supernatural, interviews TV writers—often in groups of three, but occasionally one-on-one—about how they broke into the business, their experiences working on various shows, and how different showrunners, writers rooms, and networks operate. The discussions are usually taped in front of an audience (the shows benefit nonprofit tutoring program 826LA), and attract an impressive array of guests, including Breaking Bad’s Vince Gilligan, Lost’s Damon Lindelof, Justified’s Graham Yost, and Community’s Dan Harmon. Blacker is a skilful interviewer who keeps the conversation moving, asks follow-ups when they’re needed, and doesn’t shy away from sensitive topics.

You can find the complete Writers Panel podcast archives at Nerdist or iTunes.

Does State Spending on Mental Health Lower Suicide Rates?

That’s the title of a new paper (gated) in the Journal of Socio-Economics by Justin Ross, Pavel Yakovlev, and Fatima Carson. Here’s the abstract:

Using recently released data on public mental health expenditures by U.S. states from 1997 to 2005, this study is the first to examine the effect of state mental health spending on suicide rates. We find the effect of per capita public mental health expenditures on the suicide rate to be qualitatively small and lacking statistical significance. This finding holds across different estimation techniques, gender, and age groups. The estimates suggest that policies aimed at income growth, divorce prevention or support, and assistance to low income individuals could be more effective at suicide prevention than state mental health expenditures.

Their paper asks an interesting question, and apparently they are among the first to attempt an answer using empirical data. Suicide is one of the oldest topics of interest for social scientists, going back to Émile Durkheim’s 1897 study Suicide.

The main problem with the paper’s analysis is the use of observational data to make a causal claim.* As the authors themselves point out, state mental health spending is remarkably stable to the point that if a year of data were missing it could be interpolated by averaging the years before and after. There’s really no exogenous change observed in the sample period–no instance of a state dramatically increasing or reducing its spending is mentioned–so the comparisons are mostly between rather than within states. This setup fails to provide evidence for the authors’ claims such as, “a one percent increase in public mental health expenditures per capita would reduce the incidence of suicide among that group by 0.91 per 100,000 women in this age group [25-64].”

Fig. 1: Ross, Yakovlev, and Carson (2012)

Given the large between-state differences, a cleaner design might have looked at suicide risk for individuals who moved from one state to another. Of course, this introduces the problem that individuals who commit suicide never move to another state afterward. Furthermore, this individual-level data would likely be difficult to collect. However, even a small survey of individuals would be a nice complement to this paper’s focus on aggregate statistics. (The authors are careful to point out that their paper does not assess the effectiveness of mental health treatment on suicide outcomes.)

On the positive side, it is nice to see a null finding published in a journal. Findings that are “qualitatively small and lacking statistical significance” are not often seen in print even when they are justified. I only wonder in this case whether the findings will hold up.

Some other posts that I have written on mental health issues can be found here and here.


*Yes, the same could be said of much social science. That doesn’t make it OK, nor does it mean that NSF Political Science should be defunded.

Petition for TSA to Obey the Law

Thousands Standing Around in Denver

From Jim Harper (via Josh Cutler):

A year ago this coming Sunday, the US Court of Appeals for the DC Circuit ordered the Transportation Security Administration to do a notice-and-comment rulemaking on its use of Advanced Imaging Technology (aka “body-scanners” or “strip-search machines”) for primary screening at airports. (The alternative for those who refuse such treatment: a prison-style pat-down.) It was a very important ruling, for reasons I discussed in a post back then. The TSA was supposed to publish its policy in the Federal Register, take comments from the public, and issue a final ruling that responds to public input.

And the wording of the petition:

In July 2011, a federal appeals court ruled that the Transportation Security Administration had to conduct a notice-and-comment rulemaking on its policy of using “Advanced Imaging Technology” for primary screening at airports. TSA was supposed to publish the policy in the Federal Register, take comments from the public, and justify its policy based on public input. The court told TSA to do all this “promptly.” A year later, TSA has not even started that public process. Defying the court, the TSA has not satisfied public concerns about privacy, about costs and delays, security weaknesses, and the potential health effects of these machines. If the government is going to “body-scan” Americans at U.S. airports, President Obama should force the TSA to begin the public process the court ordered.

You can sign the petition, or read more about the ineffectiveness of current TSA procedures here.

Wednesday Nerd Fun: Python for iOS

This one is short and sweet. Would you like to be able to write Python code on an iOS device? Now you can, with this app.

I have spent some time playing around with the app this week, and it seems to have two main uses. The first is entertaining yourself while waiting in line/riding the bus/whatever dead time you have where you would like to do something slightly more productive than checking Twitter. The second is making small edits with an iPad or iPhone while you happen to be away from your computer. (See the screenshot below showing the file loading functionality, more here.) In other words, this would not be my primary programming environment, but it is a nice complement.

If you have never given programming a chance, having this app might make learning Python seem more like a game. If that’s what it takes for you to plunge into coding, go for it!
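As a taste of the kind of thing a beginner might type into it, here is a tiny guessing game (my own invented example, not something bundled with the app):

```python
import random

def guessing_game(guesses, secret=None):
    """Run a number-guessing game against a list of guesses.

    Returns which guess (1-based) found the secret, or None.
    """
    if secret is None:
        secret = random.randint(1, 10)
    for attempt, guess in enumerate(guesses, start=1):
        if guess == secret:
            return attempt
        # A real game would print "higher"/"lower" hints here.
    return None

print(guessing_game([3, 7, 5], secret=5))  # 3 -- found on the third guess
```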