Using CartoDB 2.0 to geocode private prisons in comparison to state-by-state incarceration rates

The last post I wrote about CartoDB used the CartoDB API and Leaflet.js to produce a hover tooltip map of census data. Looking back, while it’s quite useful to learn CartoDB’s API, it’s now possible to achieve those same visual cues and interactivity without spending time wading through complex, undocumented JavaScript. You can now do everything I did here using little more than CartoDB’s frontend CMS.

Using plot.ly to analyze and visualize data in a variety of interactive charts, graphs

It’s been a while since I’ve had the leisure to play with new open-source tools for data journalism. It’s been even longer since I’ve written a tutorial. Today I wanted to explore a fairly WYSIWYG web app, Plotly. At face value, most of its features appear to be not much more than what you can do in Google Visualization Playground or an open-source library like Highcharts.js: basic chart types on the fly, hover interactions. What particularly stands out about Plotly is its ability to run many statistical analyses on your dataset before you go through the process of visualizing it.

For example, you can calculate percent change from a set of chronological raw numbers to display accurate trend data by simply choosing “Data Analysis>Percent Change” in the Grid View. I tried this out with some historical data scraped by the folks at on the percentage of the Savannah-area workforce in the hospitality industry, even adding in a fit or “trend” line to verify the upward tick:
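For readers who want to see what a fit or “trend” line does under the hood, here is a minimal sketch of an ordinary least-squares line in plain Python. It is not Plotly's own implementation; the function name `linear_trend` and the sample numbers are hypothetical, and the points are assumed to be evenly spaced in time.

```python
def linear_trend(ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Assumes the observations are evenly spaced, so x is simply 0, 1, 2, ...
    """
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross-deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical yearly values, not the Savannah workforce data.
slope, intercept = linear_trend([1, 3, 5, 7])
print(slope > 0)  # a positive slope means the series is trending upward
```

A positive slope is the numerical version of the "upward tick" a trend line shows visually.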

Making the case for the “post-platform” journalist

Yesterday, I presented a five-minute lightning talk at the Center for Collaborative Journalism on what I’m tentatively calling the “post-platform” journalist. From “print journalist” to “data journalist” to “multimedia journalist” and even to “social journalist,” the qualifiers that get placed before the word “journalist” abound in almost clichéd numbers. Each of these “types” of journalism typically seeks to clarify the platform on which the individual tells the story rather than the content of the story itself. But in an era of information abundance and the democratization of publishing, we’ve seen the rapid rise of almost limitless numbers of platforms that require a range of almost limitless skills.

So long, Savannah, and a new entrepreneurial path

Perhaps my favorite Instagram shot ever snapped for the @savannahnow Instagram account, which we grew to nearly 2,000 followers in a year’s time.

This week, I put in my two weeks’ notice at Savannah’s Morning News Media and our parent company, Morris Publishing Group, to move back after the holiday to New York City to join an up-and-coming NYC-based startup.

In my year-and-a-half at the helm of the SMN’s digital content and product development strategies, including launching, I learned invaluable management skills, conflict resolution and the courage to push fearlessly for innovation, even when tradition and red tape got in the way.

Why we built the new and Do mobile app

After launching the expanded standalone print edition of Do, our weekly arts-and-entertainment section for Savannah Morning News, back in March, we had a feeling the new physical product would soon outgrow its current digital home at We just didn’t know how soon that day would come.

So, after only three hectic months of strategy meetings, planning sessions, company proposals, product development, site design and end-user implementation, we’re happy to announce the launch of our new Do website at, as well as an accompanying interactive mobile Do app for iOS and Android devices.

“Newspapers are the new startups”

“Newspapers are the new startups . . . we’re starting to see a lot of great changes as technologies improve and cultures change.”
-John Levitt, Director of Sales and Marketing,

Levitt’s is one of the most insightful takes on the publishing industry I’ve heard in a while. It’s going to take a lot of restructuring and a ground-up approach, but I’m excited to be a part of it as we embrace the start-up culture in Savannah.

SavSwap: Tackling the online classified ads market

Innovative, quality journalism takes money to produce. Historically, one of the largest revenue streams for news organizations has been the classified ad market, a revenue stream that has all but dried up in today’s era of Craigslist and eBay. As an online editor, developer, manager and digital strategist for Savannah Morning News and, a midsized news organization owned by Morris Publishing Group, I’ve sat through countless digital strategy meetings discussing how we as a company can win back a sliver of the online classified ad market, if for no other reason than audience growth, and with the long-term goal of driving revenue to support our company’s journalistic efforts.


Visualizing 2012 census estimates using CartoDB and Leaflet

I’ve been tinkering around with some new mapping tools lately, and figured I’d put them to good use by displaying the 2011-2012 population estimates released last week by the U.S. Census Bureau. The inherently geographical nature of the census makes it a data set just begging to be mapped.

Rather than the de facto Google Maps JavaScript API V3, I decided to go with CartoDB and Leaflet to see what I could produce.

As I mentioned in a recent post, CartoDB offers an excellent Fusions-esque interface, although it allows for far less front-end customization and requires more under-the-hood programming. Nonetheless, CartoDB can make pretty maps right out of the box, which you can then fully customize using the CartoDB API and basic SQL statements. There’s one caveat, however: The service only allows you to upload five tables for free. That could be a dealbreaker for cash-strapped news organizations and freelance data journalists.

Anyhow, I downloaded a .zip shapefile package of all 159 Georgia counties from the U.S. Census Bureau, then brought the package into CartoDB using the service’s default upload interface. Using Excel, I calculated the percent change from the most recent population estimates to last year’s estimates. I then added the resulting values as a column in my CartoDB table, which you can see here.
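The percent-change step I did in Excel can also be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `percent_change` and the two population figures are hypothetical, not values from the Census Bureau file.

```python
def percent_change(previous, current):
    """Percent change from the earlier estimate to the later one."""
    if previous == 0:
        raise ValueError("previous estimate must be non-zero")
    return (current - previous) * 100.0 / previous


# Hypothetical county: 200,000 residents last year, 203,000 this year.
change = percent_change(200_000, 203_000)
print(round(change, 2))
```

A negative result means the county lost population between the two estimates, which is what drives the red half of a diverging color scheme.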

After playing a bit with the API, I was able to format a diverging choropleth map from my table with the following style parameters, written using 0to255 to ensure an equidistant color scheme:

#statewidepop {
#statewidepop [percent_change<=5.5] {
#statewidepop [percent_change<=4] {
#statewidepop [percent_change<=3] {
#statewidepop [percent_change<=2.25] {
#statewidepop [percent_change<=1.5] {
#statewidepop [percent_change<=0.75] {
#statewidepop [percent_change<=0.3] {
#statewidepop [percent_change<=0] {
#statewidepop [percent_change<=-0.5] {
#statewidepop [percent_change<=-1] {
#statewidepop [percent_change<=-2] {
#statewidepop [percent_change<=-3] {
#statewidepop [percent_change<=-4] {
#tl_2009_13_county[percent_change<=-5] {
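To make the logic of those stacked `<=` rules easier to follow, here is a small Python sketch of how a county ends up in one class. It assumes the usual cascade behavior, in which a later matching rule overrides an earlier one, so the rules have to run from the highest break to the lowest. The break values are copied from the selectors above; the function name `winning_break` is hypothetical, and no colors are shown because the rule bodies are not reproduced here.

```python
# Class breaks copied from the selectors above, in the same descending order.
BREAKS = [5.5, 4, 3, 2.25, 1.5, 0.75, 0.3, 0, -0.5, -1, -2, -3, -4, -5]


def winning_break(percent_change, breaks=BREAKS):
    """Return the break whose `<=` rule applies last, or None for the base style."""
    winner = None
    for b in breaks:  # walk the rules top to bottom
        if percent_change <= b:
            winner = b  # a later matching rule overrides an earlier one
    return winner


print(winning_break(1.0))   # falls in the class capped at 1.5
print(winning_break(-6.0))  # falls in the lowest class, capped at -5
print(winning_break(6.0))   # above every break, so only the base rule applies
```

Reversing the order of the rules would send nearly every county to the top class, since almost every value is less than or equal to 5.5.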

Check out the resulting map:

The map above shows the percent change in population from July 2010 to July 2011 in all 159 Georgia counties, as estimated by the U.S. Census Bureau. The darker the green, the higher the positive percent change. The darker the red, the higher the negative percent change. Click on a county to see its percent change.

Building a responsive site in less than 20 minutes

An ever-so-sleek responsive portfolio site I designed for a friend in less than 20 minutes using Skeleton as a foundation.

With all this talk lately of the new era of responsive design, I realized today that I’ve yet to create anything that’s actually responsive. Given that I’ve only pondered using it in the implementation of complex, database-driven news sites, the task of tweaking every level of CSS to fit perfectly into a responsive grid system has so far seemed too daunting to tackle.

Using data-viz to make a wire story stand out from the pack

I’ve been interested lately in finding examples of online-only, collaborative, non-profit newsrooms that have used data visualization to add value to stories that otherwise wouldn’t be unique, and in doing so have beaten out legacy news organizations that published a text narrative alone. Take, for example, this data-rich story and interactive map displaying statewide testing results published by NJSpotlight Friday. While the news that only 8 out of 10 graduating seniors had passed New Jersey’s current standardized test in 2011 was widely reported across the state last week, including by the Star-Ledger in Newark and by The Press of Atlantic City, only NJSpotlight took advantage of the story’s strong data element to produce a more concise, data-driven visual narrative.

Overlaying a bubble chart onto a Google map

Others may hate, but I’m a big fan of using bubbles to display data. When implemented correctly (i.e. scaled in terms of area instead of diameter), bubbles can be an aesthetically appealing and concise way to represent the value of data points in an inherently visual format. Bubbles are even more useful when they include interactivity, with events like mouseover and zoom allowing users to drill down and compare similar-sized bubbles more easily than they can in static graphics. So, when I was recently working on a class project on autism diagnoses in New York City, I decided to use bubbles to represent the percentage of students with individualized education plans at all 1,250 or so K-8 New York City schools.
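Scaling by area rather than diameter comes down to taking a square root, since a circle's area grows with the square of its radius. Here is a minimal Python sketch of that idea; the function name `bubble_radius`, the 30-pixel maximum and the sample values are all hypothetical and are not taken from the class project.

```python
import math


def bubble_radius(value, max_value, max_radius=30.0):
    """Radius that makes the bubble's AREA proportional to the value."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    if value < 0:
        raise ValueError("value must be non-negative")
    # Area is proportional to radius squared, so the radius scales
    # with the square root of the value's share of the maximum.
    return max_radius * math.sqrt(value / max_value)


# A value one quarter of the maximum gets half the radius, which
# works out to one quarter of the area.
print(bubble_radius(25, 100))
```

Scaling the radius or diameter linearly instead would make a value twice as large look four times as big.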

Why calculus matters when it comes to data-driven stories

A quick refresher from my data visualization professor here at Columbia a couple of weeks ago reminded me why I was forced to spend all those grueling hours calculating standard deviation back in high school.

See, when you’re using a data set to tell a story, the first step is to understand what that data says. And to do that, you’ve got to have a good idea of the range and variation of the values at hand. Not only can figuring that information out help you determine whether there’s any statistical significance to your data set, but it can also pinpoint outliers and possible errors that may exist within the data before you begin the work of visualizing it.

Thanks to powerful processing programs like Excel, we can figure out the variability of data sets pretty easily using the program’s built-in standard deviation function (remember this intimidating-looking equation from calculus class?). Still, it always helps to know how to calculate the information out by hand, if only to get a conceptual idea of why numbers such as the standard deviation (the variability of a data-set) and the z-value (the number of standard deviations a given value is away from the mean) even matter in the first place when it comes to data visualization.

So, to brush up on my formulas and also better understand the numbers behind an actual story assignment for one of my classes, I recently hand-calculated the standard deviation and z-values for a set of data on state-by-state obesity rates. From my calculations, I was able to use the standard deviation (3.24) to determine that most states fell near the middle of the bell curve around the national average obesity rate (27.1 percent). In addition, the z-values helped me understand which states stood out from the pack as possible outliers (Mississippi is by far the most obese with a 2.13 z-value, Colorado the least obese with a -1.9 z-value). To get an idea of how those formulas look hand-calculated in Excel, check out my spreadsheet here. And keep these formulas in mind while working on your next data story. They can potentially save you time and effort by helping you figure out what your data set says before you have to go through the often-lengthy process of visualizing it.
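For anyone who would rather check the arithmetic in code than in a spreadsheet, here is a minimal Python sketch of the same two calculations. It assumes the population form of the standard deviation (divide by n); a spreadsheet's default function may use the sample form (divide by n - 1) and give a slightly different number. The function name `z_scores` and the sample values are hypothetical and are not the obesity data.

```python
import statistics


def z_scores(values):
    """Return each value's distance from the mean in standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        raise ValueError("all values are identical, so z-scores are undefined")
    return [(v - mean) / sd for v in values]


# Hypothetical rates, chosen so the mean is 5 and the standard deviation is 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
for value, z in zip(sample, z_scores(sample)):
    print(value, round(z, 2))
```

Values with a large positive or negative z-score are the ones worth a second look as possible outliers or data-entry errors.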

What makes “the world’s best designed website”

With the Pulitzer Prize announcements coming up later this afternoon, you’d think I’d be writing about who’s up for the “Best Deadline Reporting” or “Best Public Service Journalism” prizes. But instead I want to talk about a different media award doled out during the past week:’s designation as the “world’s best designed website” by the Society for News Design. Put simply, I can’t say I disagree.

Using data analysis to assess a digital news startup’s future

In a March 2011 blog post, Editor-in-Chief Henry Blodget gleefully announced that the startup tech and business news site had turned a net profit of $2,127 during fiscal year 2010 – just enough, in his words, “to buy a MacBook Pro.” While that may sound like chump change for most any business, in the rough-and-tumble world of startup media companies, it’s almost like getting a six-figure salary. As Blodget puts it,

Making $2,127 feels about 2,127 times as good as losing money. And it makes us confident that, if we keep working hard, and we keep getting better, we’ll be able to build a successful business and a truly great product someday.

In less than two years since its launch in February 2009, had achieved what most other media startups only dream of: turning a profit. Blodget also pointed out that the company had taken in $4.8 million in revenue during 2010, almost all of which came from online advertising. Since then, executives from the company have yet to provide any further updates as to how the site fared financially in 2011. But a close look at the numbers shows that, assuming the company’s ad rates and inventory held steady, Business Insider nearly doubled its annual revenue to somewhere closer to $8.5 million in 2011. That makes it one of the most successful online media sites of its scale.

Critique: “Agreement Groups in the United States Senate”

Take a look at this fascinating visualization of U.S. Senate agreement groups made by Ph.D. student Adrian Friggeri. Using a complex agreement algorithm based upon data from, the visualization displays how much all 100 senators of each U.S. Congress during the last 15 years have crossed the aisle, or stuck to party lines, on Senate-floor votes.

From a design standpoint, the visualization is nearly flawless. The thin red and blue lines help the user form an instant party association, and the light gray bars in the background distinguish each Congress from the next without leading to visual clutter. What’s perhaps most impressive is that, despite the fact that the visualization contains far more than 100 different data points, the information is still fairly easy to access and the interface still feels simple. Because each Senator’s entire individual trajectory is highlighted on mouseover, users can get a glimpse of how willing their respective Senator has been to negotiate a compromise across party lines over the years.

Most of all, the visualization does what all good visualizations should do: tells a story without text. As we can see, the number of Democrats who have crossed the aisle is notably larger than that of their GOP counterparts. This becomes ever more clear when we drill down to look at each party’s trajectory individually, where the connections can be seen more clearly. Perhaps what I would’ve liked to have seen in addition, however, is some sort of summary or average value of the disparity between the two parties on agreement rates, even if just a number at the bottom of the visualization. As it stands, the user has to dissect the visualization a good bit to tell that Democrats have a higher “agreement rate” than Republicans.

The networked line structure reminds me a lot of the Wall Street Journal’s “What They Know” visualization, except that this visualization has a good bit less clutter and complexity, and much better styling choices.

Response to Norman, “Emotional Design”

Good aesthetics are more than just fluff when it comes to design. They are a core part of a product’s functionality. Such is the argument Donald A. Norman makes in his insightful 2005 book Emotional Design: Why We Love (or Hate) Everyday Things. For Norman, attractive things work better by boosting the mood of the user and therefore allowing him or her to think more clearly and operate the product more efficiently.

Undergirding Norman’s thesis that aesthetics directly influence operability is his distinction between the three basic levels of human cognition: the visceral (jumping at a sudden sound in a quiet room), the behavioral (relaxing in the solitude of a quiet room) and the reflective (thinking to oneself about why a quiet room is more enjoyable). As Norman asserts, these three levels of thought processing “interact with one another, each modulating the others” (7). You cannot escape the effect that one level of thought processing has on the others. As such, a visceral reaction to an external stimulus influences the subsequent behavioral reactions we have, which in turn influence our reflective conclusions about the stimulus itself. If we have a negative visceral reaction to a poorly designed website, our mood is negatively affected in a way that hinders our ability to navigate and use the site, even if there’s nothing wrong with the navigation or user interface from a technical standpoint. All our brain can focus on is the poor design. This reaction is similar to the way humans form first impressions of others; if an individual makes a poor first impression (a visceral reaction), we are less likely to act on his or her future actions or speech (a behavioral reaction), which in turn affects the entire way we think about that person (a reflective reaction).