Micro-Institutions Everywhere: Jury Duty

I could have also called this post “The Math of Jury Duty,” but it would be hard to fit two things Americans hate more into a single title. Readers who have responded to a jury summons will know how tedious the process can be. If you take the trouble to show up at the courthouse, you could spend the entire day waiting around only to be sent home. Many people would rather find ways to avoid showing up in the first place. This leads to a problem: how many summonses does the court need to send?

On the surface the problem seems relatively straightforward. Judges submit their cases weeks in advance, so the district clerk’s office knows what’s coming. If a case needs twelve jurors, and each side can strike up to six people during jury selection, then a total of twenty-four potential jurors are needed.

Unfortunately, the math isn’t quite that easy. Some people who receive a summons are excused for hardship, as in the case of a single parent with young children at home. Others don’t qualify because of criminal records. An individual receiving a summons has the option of postponing the date of service. A summons may be mailed to an incorrect address. In Harris County, the number of returned summonses runs into the double digits, since Houstonians tend to move a lot in the three years between calls to service. And, even though it’s a crime, many people simply ignore a summons when it arrives. The numbers add up. Roughly forty percent of people who are sent a summons don’t respond to it.

This leads to a need for “overbooking,” as airlines do with tickets. In fact, according to the post from the University of Houston’s Andrew Boyd, Harris County has to send out three to four times as many summonses as it needs jurors. (Boyd’s explanation is also available in an audio version.) But the court has the inverse of the airlines’ problem: it needs at least a minimum number of people to show up, whereas an airline can seat at most a maximum number.
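To make the overbooking idea concrete, here is a minimal sketch in Python. It is not how Harris County actually does it; it assumes each summoned person shows up independently with the same probability, which is a simplification of everything described above. The function names, the 95% confidence target, and the show-up probability are all hypothetical choices for illustration. The only number taken from the post is the pool of twenty-four.

```python
from math import comb


def prob_at_least(n, p, k):
    """Probability that at least k of n summoned people show up.

    Assumes (hypothetically) that each person shows up independently
    with the same probability p, i.e. a binomial model.
    """
    if k <= 0:
        return 1.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def summonses_needed(k, p, confidence=0.95):
    """Smallest number of summonses so that at least k people show up
    with probability >= confidence, under the binomial assumption."""
    n = k
    while prob_at_least(n, p, k) < confidence:
        n += 1
    return n


# 24 potential jurors needed (from the post); a 60% show-up rate is an
# illustrative assumption based on the "roughly forty percent" figure.
needed = summonses_needed(24, 0.6)
print(needed)
```

Lowering the show-up probability to account for excusals, disqualifications, and postponements pushes the answer up quickly, which is the direction of the three-to-four-times figure quoted above.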

Further complicating the court’s process is the fact that it cannot compensate people for showing up but not serving in the same way that airlines pay passengers to wait for the next flight. Jurors who respond in Harris County are paid $6 for their first day of service, and $28 thereafter. As Commissioner Jerry Eversole said, “At $28 you can pay your parking and have lunch, and that’s pretty much what the jury day consists of.”

So how do the courts get the people they need?

As a rule, courts haven’t reached that level of sophistication [NB: of airline overbooking], but many employ some level of statistical analysis. In Harris County, for example, it’s known that people respond to jury summonses at different rates at different times of the year — something the district clerk’s office takes into account when issuing summonses. And there’s another big cause of uncertainty: settling a case on the courthouse steps. If last minute negotiations lead to a settlement on the day of the trial, there’s no need for any jurors in that court. Even this type of uncertainty can be accounted for in mathematical models, but that’s uncommon.

So the next time you find yourself sitting in the jury assembly room waiting for your number to be called, remember that the courts face a challenging engineering problem with a lot of uncertainty. And don’t forget to bring a good book.

Getting Started with Prediction

From historians to financial analysts, researchers of all stripes are interested in prediction. Prediction asks the question, “given what I know so far, what do I expect will come next?” In the current political season, presidential election forecasts abound. This dates back to the work of Ray Fair, whose book is ridiculously cheap on Amazon. In today’s post, I will give an example of a much more basic (and, I hope, relatable) question: given the height of a father, how do we predict the height of his son?

To see how common predictions about children’s traits are, just Google “predict child appearance” and you will be treated to a plethora of websites and iPhone apps with photo uploads. Today’s example is more basic and will follow three questions that we should ask ourselves for making any prediction:

1. How different is the predictor from its baseline?
It’s not enough to have a single bit of information from which to predict. We also need to know something about the baseline of the information we are interested in (often the average value) and how far our predictor is from that baseline. The “predictor” in this case will refer to the height of the father, which we will call U. The “outcome” will be the height of the son, which we will call V.

To keep this example simple, let us assume that U and V are normally distributed. In other words, their distributions look like the familiar “bell curve” when they are plotted. To see how different our given observations of U or V are from their baseline, we “standardize” them into X and Y:

X = {{u - \mu_u} \over \sigma_u }

Y = {{v - \mu_v} \over \sigma_v },

where \mu is the mean and \sigma is the standard deviation. In our example, let \mu_u = 69, \mu_v=70, and \sigma_v = \sigma_u = 2.
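As a quick illustration, the standardization can be sketched in a few lines of Python. The function and constant names are my own for this example; the means and standard deviations are the ones assumed above.

```python
def standardize(value, mean, sd):
    """Express a value as a number of standard deviations from its mean."""
    return (value - mean) / sd


# Values assumed in the example above (heights in inches).
MU_U, MU_V = 69.0, 70.0      # average father and son heights
SIGMA_U = SIGMA_V = 2.0      # standard deviations

x = standardize(72, MU_U, SIGMA_U)  # a 72-inch father
y = standardize(70, MU_V, SIGMA_V)  # a son of exactly average height

print(x)  # 1.5
print(y)  # 0.0
```

A 72-inch father is 1.5 standard deviations above the average father, while an average son sits at exactly zero.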

2. How much variance in the outcome does the predictor explain?
In a simple one-predictor, one-outcome (“bivariate”) example like this, we can answer question #2 by knowing the correlation between X and Y, which we will call \rho (and which is equal to the correlation between U and V in this case). For simplicity’s sake let’s assume \rho={1 \over 2}. In real life we would probably estimate \rho using regression, which is really just the reverse of predicting. We should also keep in mind that correlation is only useful for describing the linear relationship between X and Y, but that’s not something to worry about in this example. Using \rho, we can set up the following prediction model for Y:

Y= \rho X + \sqrt{1-\rho^2} Z.

Plugging in the values above we get:

Y= {1 \over 2} X + \sqrt{3 \over 4} Z.

Z is explained in the next paragraph.

3. What margin of error will we accept?
No matter what we are predicting, we have to accept that our estimates are imperfect. We hope that on average we are correct, but that just means that all of our over- and under-estimates cancel out. In the above equation, Z represents our errors. For our prediction to be unbiased there has to be zero correlation between X and Z. You might think that is unrealistic and you are probably right, even for our simple example. In fact, you can build a decent career by pestering other researchers with this question every chance you get. But just go with me for now. The level of incorrect prediction that we are able to accept affects the “confidence interval.” We will ignore confidence intervals in this post, focusing instead on point estimates but recognizing that our predictions are unlikely to be exactly correct.
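If you want to convince yourself that the model for Y behaves as advertised, a small simulation helps. The sketch below is only an illustration: it assumes that X and Z are independent standard normal draws (a stronger assumption than zero correlation, made here for convenience), and the function names, sample size, and seed are arbitrary choices of mine. It generates pairs from the model with \rho = 1/2 and then estimates the correlation back from the simulated data.

```python
import random
from math import sqrt

RHO = 0.5  # correlation assumed in the example


def simulate_pairs(n, rho=RHO, seed=0):
    """Draw n (x, y) pairs from Y = rho*X + sqrt(1 - rho^2)*Z.

    Assumption for this sketch: X and Z are independent standard normals.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        z = rng.gauss(0, 1)
        y = rho * x + sqrt(1 - rho**2) * z
        pairs.append((x, y))
    return pairs


def correlation(pairs):
    """Sample (Pearson) correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pairs) / n
    return cov / sqrt(var_x * var_y)


pairs = simulate_pairs(100_000)
estimated_rho = correlation(pairs)
print(round(estimated_rho, 2))
```

The estimate should land close to 0.5. The \sqrt{1-\rho^2} factor on Z is what keeps Y on the same standardized scale as X, with a variance of about one.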

The Prediction

Now that we have set up our prediction model and nailed down all of our assumptions, we are ready to make a prediction. Let’s predict the height of the son of a man who is 72″ tall. In probability notation, we want

\mathbb{E}(V|U=72),

which is the expected son’s height given a father with a height of 72″.

Following the steps above we first need to know how different 72″ is from the average height of fathers. Looking at the standardizations above, we get

X = {U-69 \over 2}, and

Y = {V - 70 \over 2}, so

\mathbb{E}(V|U=72) = \mathbb{E}(2Y+70|X=1.5) = \mathbb{E}(2({1 \over 2}X + \sqrt{3 \over 4}Z)+70|X=1.5),

which reduces to 1.5 + \sqrt{3}\,\mathbb{E}(Z|X=1.5) + 70. As long as we were correct earlier that Z does not depend on X and has an average of zero, the middle term vanishes and we get a predicted son’s height of 71.5 inches: slightly shorter than his dad, but still above average.
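The whole calculation fits in a short Python function. This is a sketch under the assumptions of this post (the means, standard deviations, and \rho chosen above, plus an error term that averages zero regardless of X); the function name is made up for the example.

```python
# Values assumed in this post (heights in inches).
MU_U, MU_V = 69.0, 70.0        # average father and son heights
SIGMA_U, SIGMA_V = 2.0, 2.0    # standard deviations
RHO = 0.5                      # father-son correlation


def predict_son_height(father_height):
    """Point prediction E(V | U = father_height) under the model above."""
    x = (father_height - MU_U) / SIGMA_U   # standardize the predictor
    expected_y = RHO * x                   # E(Z | X) is assumed to be 0
    return MU_V + SIGMA_V * expected_y     # convert back to inches


print(predict_son_height(72))  # 71.5
print(predict_son_height(66))  # 68.5
print(predict_son_height(69))  # 70.0
```

Note that the pull works in both directions: a father three inches below the fathers’ average gets a predicted son only an inch and a half below the sons’ average.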

This phenomenon of the outcome (son’s height) being closer to the average than the predictor (father’s height) is known as regression to the mean, and it is the source of the term “regression” that is used widely today in statistical analysis. The term dates back to one of the earliest large-scale statistical studies, published by Sir Francis Galton in 1886 and entitled “Regression towards Mediocrity in Hereditary Stature,” which fits perfectly with today’s example.

Further reading: If you are already comfortable with the basics of prediction, and know a bit of Ruby or Python, check out Prior Knowledge.