How to Get FIPS Codes from Latitude and Longitude

FIPS (Federal Information Processing Standards) codes are unique identifiers for geographic units within the US. Although they have technically been withdrawn as a standard, they are still widely used in political science and other applications for geographic categorization of data. For example, the CBS/New York Times monthly polling series includes the FIPS code for the county in which each respondent lives.
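
As a quick illustration of what a county-level code looks like: it is five digits, a two-digit state code followed by a three-digit county code. Spreadsheets and CSV exports often drop the leading zero, so it can help to normalize codes before merging. The helper below is a minimal sketch; the name split_county_fips is my own invention and not part of any library.

```ruby
# Hypothetical helper: pad a county FIPS code to five characters
# and split it into its state and county parts.
def split_county_fips(code)
  fips = code.to_s.rjust(5, '0')   # restore a leading zero if it was lost
  { state: fips[0, 2], county: fips[2, 3] }
end

parts = split_county_fips(6075)    # a code that lost its leading zero
puts parts[:state]
puts parts[:county]
```

Keeping the codes as strings rather than integers avoids losing the leading zero again later on.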

Say you have some other data with latitude and longitude indicators that you would like to combine with FIPS-coded data. I have written a short Ruby script below that looks up the FIPS code for each latitude/longitude pair. It assumes that your data is in .csv format, since that is a fairly generic format and you can usually convert data stored in another form. You will also need the Ruby geokit gem:

gem install geokit

Once you have the data ready and the gem installed, you are good to go. Just fill out the lines with comments and run the following from IRB (or however you like to run your Ruby scripts):

require 'geokit'
require 'csv'

filename = # path to your csv file, e.g. 'data.csv'
fipslist = []

CSV.foreach(filename) do |row|
  lat = # latitude column, e.g. row[0]
  long = # longitude column, e.g. row[1]
  ll = Geokit::LatLng.new(lat, long)
  fcc = Geokit::Geocoders::FCCGeocoder.reverse_geocode(ll)
  puts fcc.district_fips
  fipslist << fcc.district_fips
end

You can then do anything you want with the fipslist object, including writing it out to a file. If you want to share improvements or have questions, please use the comments section below.
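
For example, here is one way you might write the codes back out next to the original rows. This is only a sketch: the helper name write_with_fips, the output file name, and the sample values are all made up for illustration. In practice the rows would come from your own file (for instance via CSV.read(filename)) and fipslist from the loop above.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper (not part of the script above): append each FIPS
# code to its input row and write the combined rows to out_path.
def write_with_fips(rows, fipslist, out_path)
  CSV.open(out_path, 'w') do |csv|
    rows.zip(fipslist).each do |row, fips|
      csv << (row + [fips])
    end
  end
end

# Illustrative values only, standing in for real input rows and lookups.
rows = [['37.77', '-122.42'], ['40.71', '-74.01']]
fipslist = ['06075', '36061']
out_path = File.join(Dir.tmpdir, 'with_fips.csv')
write_with_fips(rows, fipslist, out_path)
```

This relies on fipslist having one entry per input row, in the same order, which is what the loop produces as long as no rows are skipped.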

PyCon 2012 Video Round-Up

The videos from PyCon 2012 are posted. Here are the ones I plan to watch, along with their summaries:

Checking Mathematical Proofs Written in TeX

ProofCheck is a set of Python scripts which parse and check mathematics written using TeX. Its homepage is http://www.proofcheck.org. Unlike computer proof assistants which require immersion in the equivalent of a programming language, ProofCheck attempts to handle mathematical language formalized according to the author’s preferences as much as possible.

Sketching a Better Product

If writing is a means for organizing your thoughts, then sketching is a means for organizing your thoughts visually. Just as good writing requires drafts, good design requires sketches: low-investment, low-resolution braindumps. Learn how to use ugly sketching to iterate your way to a better product.

Bayesian Statistics Made (as) Simple (as Possible)

This tutorial is an introduction to Bayesian statistics using Python. My goal is to help participants understand the concepts and solve real problems. We will use material from my (nb: Allen Downey’s) book, Think Stats: Probability and Statistics for Programmers (O’Reilly Media).

SQL for Python Developers

Relational databases are often the bread-and-butter of large-scale data storage, yet they are often poorly understood by Python programmers. Organizations even split programmers into SQL and front-end teams, each of which jealously guards its turf. These tutorials will take what you already know about Python programming, and advance into a new realm: SQL programming and database design.

Web scraping: Reliably and efficiently pull data from pages that don’t expect it

Exciting information is trapped in web pages and behind HTML forms. In this tutorial, you’ll learn how to parse those pages and when to apply advanced techniques that make scraping faster and more stable. We’ll cover parallel downloading with Twisted, gevent, and others; analyzing sites behind SSL; driving JavaScript-y sites with Selenium; and evading common anti-scraping techniques.

Some of it may be above my head at this stage, but I think it’s great that the Python community makes all of these resources available.