Cities from spreadsheets
My first Kiplinger project of the new year: “The 12 Best Cities for High-Paying Jobs,” with guest appearances by Trenton, Cedar Rapids and a few other places you might not expect.
It’s probably not evident from the final product, but a huge amount of data goes into these city slideshows: spreadsheets running to many thousands of rows from the Census Bureau, the Bureau of Labor Statistics (BLS) and other sources. I’ve done about a dozen and gotten “really good at spreadsheets,” in the bemused words of my assigning editor. The process intrigues me more and more as I refine it. You start with a mess of statistics and, many formulas and filters later, pull out a concrete list. Voilà!
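For the curious, here is a rough sketch in Python of that filter-then-rank idea. It is not the actual method behind the slideshow: the column names, the cutoff and the sample rows are all made-up placeholders, and the real work happens in spreadsheet formulas rather than code.

```python
import csv
import io

# Made-up placeholder rows standing in for a merged Census/BLS spreadsheet.
# Column names and values are hypothetical, for illustration only.
SAMPLE_CSV = """city,median_pay,unemployment_rate
City A,52000,6.1
City B,61000,9.4
City C,58000,5.2
City D,47000,4.8
"""


def top_cities(csv_text, max_unemployment, n):
    """Drop rows above an unemployment cutoff, then rank the rest by median pay."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # The "filter" step: keep only cities at or below the cutoff.
    kept = [r for r in rows if float(r["unemployment_rate"]) <= max_unemployment]
    # The "formula" step: sort the survivors from highest to lowest pay.
    kept.sort(key=lambda r: float(r["median_pay"]), reverse=True)
    # The concrete list: the first n city names.
    return [r["city"] for r in kept[:n]]


shortlist = top_cities(SAMPLE_CSV, max_unemployment=7.0, n=2)
print(shortlist)
```

A real ranking would weigh more columns and many more rows, but the shape is the same: narrow the field, sort what is left, keep the top of the list.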
Next step: Learning to scrape from websites and PDFs, so I can expand past released data sets. I’ll refer to ProPublica’s exhaustive guide for that one.
MS Excel 2010 (Office 14) handles data imports from text, which is what you get when you strip data from a PDF, better than prior versions!