Although geared primarily toward the production of static graphics for print publications, Dona M. Wong’s The Wall Street Journal Guide to Information Graphics (2010) provides a wealth of salient and time-honored tips and guidelines that any student of data visualization would be well-advised to follow.
With all this talk lately of the new era of responsive design, I realized today that I’ve yet to create anything that’s actually responsive. Given that I’ve only pondered using it in the implementation of complex, database-driven news sites, the task of tweaking every level of CSS to fit perfectly into a responsive grid system has so far seemed too daunting to tackle.
I’ve been interested lately in finding examples of online-only, collaborative, non-profit newsrooms that have used data visualization techniques to add value to stories that otherwise wouldn’t necessarily be unique, and in doing so beat out legacy news organizations that published a text narrative alone. Take, for example, this data-rich story and interactive map displaying statewide testing results published by NJSpotlight Friday. While the news that only 8 out of 10 graduating seniors had passed New Jersey’s current standardized test in 2011 was widely reported across the state last week, including by the Star-Ledger in Newark and by The Press of Atlantic City, only NJSpotlight took advantage of the story’s strong data element to produce a more concise, data-driven visual narrative.
Others may hate, but I’m a big fan of using bubbles to display data. When implemented correctly (i.e., scaled by area rather than diameter), bubbles can be an aesthetically appealing and concise way to represent the value of data points in an inherently visual format. Bubbles are even more useful when they include interactivity, with events like mouseover and zoom allowing users to drill down and compare similar-sized bubbles more easily than they can in static graphics. So, when I was recently working on a class project on autism diagnoses in New York City, I decided to use bubbles to represent the percentage of students with individualized education plans at all 1250 or so K-8 New York City schools.
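Since a circle’s area grows with the square of its radius, scaling bubbles by diameter makes a value twice as large look four times as big. Here’s a minimal sketch of area-correct scaling (the function name and parameters are hypothetical, not from my actual project):

```python
import math

def bubble_radius(value, max_value, max_radius=50):
    """Scale a bubble's radius so its AREA is proportional to the value.

    Scaling the radius (diameter) linearly would exaggerate large values,
    since area grows with the square of the radius; taking the square
    root corrects for that.
    """
    return max_radius * math.sqrt(value / max_value)

# A value twice as large yields a bubble with twice the area,
# not twice the diameter.
r1 = bubble_radius(10, 100)
r2 = bubble_radius(20, 100)
print(round((r2 / r1) ** 2, 2))  # area ratio -> 2.0
```

Charting libraries that support square-root scales (D3’s `scaleSqrt`, for instance) do this same correction for you.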
A quick refresher from my data visualization professor here at Columbia a couple of weeks ago reminded me why I was forced to spend all those grueling hours calculating standard deviation back in high school.
See, when you’re using a data set to tell a story, the first step is to understand what that data says. And to do that, you’ve got to have a good idea of the range and variation of the values at hand. Not only can figuring that information out help you determine whether there’s any statistical significance to your data set, but it can also pinpoint outliers and possible errors that may exist within the data before you begin the work of visualizing it.
Thanks to powerful spreadsheet programs like Excel, we can figure out the variability of data sets pretty easily using the built-in standard deviation function (remember this intimidating-looking equation from math class?). Still, it always helps to know how to calculate it by hand, if only to get a conceptual idea of why numbers such as the standard deviation (a measure of a data set’s variability) and the z-value (the number of standard deviations a given value falls from the mean) even matter in the first place when it comes to data visualization.
So, to brush up on my formulas and also better understand the numbers behind an actual story assignment for one of my classes, I recently hand-calculated the standard deviation and z-values for a set of data on state-by-state obesity rates. From my calculations, I was able to use the standard deviation (3.24) to determine that most states fell within the middle of the bell curve around the average national obesity rate (27.1 percent). In addition, the z-values helped me understand which states stood out from the pack as possible outliers (Mississippi is by far the most obese with a 2.13 z-value, Colorado the least obese with a -1.9 z-value). To get an idea of how those formulas look hand-calculated in Excel, check out my spreadsheet here. And keep these formulas in mind while working on your next data story. They can potentially save you time and effort by helping you figure out what your data set says before you have to go through the often-lengthy process of visualizing it.
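The same hand calculations can be sketched in a few lines of Python. Note that the rates below are illustrative stand-ins, not the actual state-by-state data set; only the mean (27.1 percent) mirrors the national figure from the story:

```python
import math

def mean(values):
    return sum(values) / len(values)

def std_dev(values):
    """Sample standard deviation: the square root of the sum of squared
    deviations from the mean, divided by n - 1."""
    m = mean(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))

def z_score(value, values):
    """How many standard deviations a value sits from the mean."""
    return (value - mean(values)) / std_dev(values)

# Illustrative obesity rates (percent) for a handful of states --
# chosen so the mean matches the 27.1 national average.
rates = [24.0, 27.1, 30.5, 21.4, 34.5, 25.1]
print(round(std_dev(rates), 2))       # 4.74
print(round(z_score(34.5, rates), 2)) # 1.56
```

In Excel the equivalents are `STDEV` (or `STDEV.S`) and `STANDARDIZE`; Python’s built-in `statistics.stdev` does the same job as the hand-rolled function above.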
With the Pulitzer Prize announcements coming up later this afternoon, you’d think I’d be writing about who’s up for the “Best Deadline Reporting” or “Best Public Service Journalism” prizes. But instead I want to talk about a different media award doled out during the past week: BostonGlobe.com’s designation as the “world’s best designed website” by the Society for News Design. Put simply, I can’t say I disagree.
A close look at the numbers shows that, assuming the company’s ad rates and inventory held steady, BusinessInsider.com nearly doubled its annual revenue in 2011 to about $8.5 million.
Take a look at this fascinating visualization of U.S. Senate agreement groups made by Ph.D. student Adrian Friggeri. Using a complex agreement algorithm based upon data from GovTrack.us, the visualization displays how much all 100 senators in each U.S. Congress during the last 15 years have crossed the aisle –– or stuck to party lines –– on Senate-floor votes.
From a design standpoint, the visualization is nearly flawless. The thin red and blue lines help the user form an instant party association, and the light gray bars in the background distinguish each Congress from the next without leading to visual clutter. What’s perhaps most impressive is that, despite the fact that the visualization contains far more than 100 different data points, the information is still fairly easy to access and the interface still feels simple. Because each senator’s entire individual trajectory is highlighted on mouseover, users can get a glimpse of how willing their respective senator has been to negotiate a compromise across party lines over the years.
Most of all, the visualization does what all good visualizations should do: tell a story without text. As we can see, the number of Democrats who have crossed the aisle is notably larger than that of their GOP counterparts. This becomes even more clear when we drill down to look at each party’s trajectory individually, where the connections can be seen more clearly. What I would’ve liked to see in addition, however, is some sort of summary or average value of the disparity between the two parties’ agreement rates, even if just a number at the bottom of the visualization. As it stands, the user has to dissect the visualization a good bit to tell that Democrats have a higher “agreement rate” than Republicans.
The networked line structure reminds me a lot of the Wall Street Journal’s “What They Know” visualization, except that this visualization has a good bit less clutter and complexity, and much better styling choices.
Good aesthetics are more than just fluff when it comes to design. They are a core part of a product’s functionality. Such is the argument Donald A. Norman makes in his insightful 2005 book Emotional Design: Why We Love (or Hate) Everyday Things. For Norman, attractive things work better by boosting the mood of the user and therefore allowing him or her to think more clearly and use the product more efficiently.
Undergirding Norman’s thesis that aesthetics directly influence operability is his distinction between the three basic levels of human cognition: the visceral (jumping at a sudden sound in a quiet room), the behavioral (relaxing in the solitude of a quiet room) and the reflective (thinking to oneself about why a quiet room is more enjoyable). As Norman asserts, these three levels of thought processing “interact with one another, each modulating the others” (7). You cannot escape the effect that one level of thought processing has on the others. As such, a visceral reaction to an external stimulus influences the subsequent behavioral reactions we have, which in turn influence our reflective conclusions about the stimulus itself. If we have a negative visceral reaction to a poorly designed website, our mood is negatively affected in a way that hinders our ability to navigate and use the site, even if there’s nothing wrong with the navigation or user interface from a technical standpoint. All our brain can focus on is the poor design. This reaction is similar to the way humans form first impressions of others; if an individual makes a poor first impression (a visceral reaction), we are less receptive to his or her future actions or speech (a behavioral reaction), which in turn colors the entire way we think about that person (a reflective reaction).
Interaction designer Dan Saffer artfully captures both the practical and the theoretical aspects of his profession in his 2006 book Designing for Interaction: Creating Smart Applications and Clever Devices. From its title, Saffer’s book may sound like a simple “how-to” guide to creating web apps with interactivity. Yet while it is certainly that to an extent, the book is more broadly a treatise and exploration of the ideology and terminology behind interaction design.
Saffer sets out to answer seemingly simple questions such as “What is interaction design?” and “What is the value of interaction design?” with thoughtful, reflective analyses. The principal purpose of interaction design, he argues, is “its application to real problems” and its ability to “solve specific problems under a specific set of circumstances” (5). As such, interaction design is inherently attached to the physical world, and is “by its nature contextual,” changing and evolving in its definition over the course of time and space (4). Paradoxically, however, Saffer argues that the core principles of good interaction design are “technologically agnostic,” and don’t change along with the ebbs and flows of technological innovation: “Since technology frequently changes, good interaction design doesn’t align itself to any one technology or medium in particular” (7). How can Saffer simultaneously assert that interaction design is tethered to its particular context in time, while in the same breath arguing that it remains unchanging in its core values? Don’t the two statements on some level contradict one another? Saffer would likely respond by saying that only the “principles” behind interaction design – helping people communicate with one another and, to a lesser degree, with computers – remain constant amid technological upheaval. But this view verges on downplaying the power of future technological change to fundamentally alter every known aspect of the way we communicate. What if, in ten years, a technology comes along that automates communication in a way that leads to a paradigmatic shift in the role of the interaction designer? Although Saffer is correct in his assertion that even the rise of the Internet has not so far altered the core principles of interaction design, that doesn’t mean such constancy will always be the case.
When trying to reach a mass audience, what’s the best platform for sharing your content? Well, the obvious answer is as many places as you can. But according to a post by bitly analyzing traffic patterns, links shared on YouTube have a lifespan of 7.3 hours, compared to 2.8 hours on Twitter and 3.4 hours on Facebook. Why such a disparity? Why does YouTube content last so much longer?
Is it because video has a longer lifespan than all other forms of content? Or is it because YouTube offers a different user experience than other social media platforms? While YouTube content is slower to peak, it lasts far longer in the online ecosystem than content posted on platforms such as Twitter and Facebook. The most obvious explanation for this phenomenon would be that video is a medium that inherently holds our attention longer and releases it more slowly. We tend to go back, rewatch and share video more than we do text-based content, giving video a longer lifespan.
But there’s also another possible explanation for YouTube’s lengthier half-life: the nature of its network structure. Facebook and Twitter are more aggregators than YouTube, which is a platform for user-generated content rather than just a portal. Because of their vast user bases and high engagement, Facebook and Twitter by their nature attract attention more quickly. But that attention is often only surface attention, which may be why those networks have a shorter half-life than YouTube. People go to YouTube videos more frequently as a destination, whereas other social media platforms act only as portals.
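As a rough illustration of what those half-lives imply, here’s a sketch that assumes attention decays exponentially (a simplification for illustration; bitly measured its half-lives empirically, not from this model):

```python
def fraction_remaining(hours, half_life):
    """Fraction of a link's remaining attention after `hours`, modeling
    attention decay as exponential with the given half-life."""
    return 0.5 ** (hours / half_life)

# Six hours after posting, a YouTube link (7.3 h half-life) retains far
# more of its audience than a Twitter link (2.8 h half-life).
youtube = fraction_remaining(6, 7.3)   # about 0.57
twitter = fraction_remaining(6, 2.8)   # about 0.23
print(round(youtube, 2), round(twitter, 2))
```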
In keeping with our recent weekly reading about the growing ‘gamification’ of data, I wanted to focus my critique this week on a map-styled, data-driven game made by a group of researchers at Rutgers University called Salubrious Nation. The game attempts to engage users more deeply with public health data by luring them in with an addictive system of points and rewards.
In terms of functionality, the game play operates fairly simply. A map presents demographic data about every county in the 48 states of the continental U.S. The game then chooses one county at random and asks the player to guess a public health statistic about it, like binge drinking, teenage pregnancy, diabetes or obesity rates. The game features two types of interaction: the user can mouse over any county to see demographic data about it (population, poverty rate, life expectancy, etc.), and move a slider at the top to enter a guess for the county up for play. As you move the slider up and down, you can get hints about how close you are by looking at whether the surrounding counties are above or below the value you’ve chosen. Based upon how close the player’s guess is to the actual statistic, the player earns a corresponding amount of points. After eight rounds of play, the game ends, and the player is told how his or her performance matches up to others who’ve played the game before. Apparently, I scored higher than 62 percent of other players. Woohoo! Just enough of a dopamine rush to get me to play again.
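The closeness-based scoring could be sketched like this (a hypothetical formula for illustration; the game’s actual point system isn’t documented here):

```python
def score_guess(guess, actual, max_points=100):
    """Hypothetical scoring: award more points the closer the guess is
    to the actual statistic (both in percent). A sketch of the general
    mechanic, not Salubrious Nation's real formula."""
    error = abs(guess - actual)
    # Full points for a perfect guess, scaled down linearly with
    # relative error, floored at zero.
    return max(0, round(max_points * (1 - error / actual)))

# Guessing 25% when the true obesity rate is 30% earns most of the points.
print(score_guess(25.0, 30.0))  # -> 83
```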
What’s cool about this game is that it makes data something to get immersed in for the fun of it, and you learn along the way. Over time, you begin to notice patterns emerging as you learn the tricks and strategies of the game. You figure out that the Western half of the country tends to have a higher rate of binge drinking. You learn that diabetes and obesity are worst in the South. As one of the game’s creators, Nick Diakopoulos, explains, the gamification of health data provides a good opportunity for users to focus on data they might otherwise ignore: “Considering the selective attention issue, where people are more likely to pay attention to things that they already agree with, this result suggests an opportunity to get players to look at aspects of the data that they might not otherwise be inclined to look at.”
I can only find a few possible qualms with the game. One is that it runs on Flash, meaning it can’t be used on most smartphones or tablets. Another is that the yellow-to-orange color scheme seems a bit disorienting on the eyes. Perhaps the developers would’ve been wiser to choose softer colors – possibly even a red-to-green graduated scale with a neutral middle value. Another thing that irked me, although I see no simple solution, is that county-level guessing is so geographically specific that it’s hard for most people (including myself) to know which particular counties in Oklahoma or Kansas have the highest obesity rates.
Lately I’ve been trying to get my feet wet with Django, an open-source Python web framework that’s well-suited to producing complex news apps under tight deadlines. I haven’t had enough free time yet to get into the nitty gritty of it, but I’m getting there slowly. What first piqued my interest in Django was a brilliant news app I ran across a couple of months ago called Curbwise, which was built with Django by the news developer team at the Omaha World-Herald/Omaha.com.
Curbwise advertises itself as “your one-stop shop for the latest on real-estate in Douglas County.” But Curbwise is much more than the standard, run-of-the-mill real-estate section of most local news websites. It allows the user to fine-tune which neighborhoods he or she wants to view, and compare demographic data and housing prices side-by-side. Using a complex, clickable system of Google maps with a clean design and corresponding tables, you can drill down to see all sorts of individual data charted out in an appealing red color scheme, along with a listing of houses that are currently on the market in the neighborhood. You can even click on individual properties to see the historical and current valuations not only of the property in question, but of all the properties nearby. The warm yellow used to display the property tracts on the map invites the user to mouse over all the houses to see highly stylized infoWindows with more information. It’s really hard to find anything about the navigation, interface and design to complain about. The only thing that might possibly make the app better is adding interactivity to the static charts on the neighborhood and property pages.
Obviously, all of this data is of immense value to users on an evergreen basis, not just in a transitory news cycle. What’s also impressive is that it’s useful both for interested home buyers looking to browse the marketplace and for current homeowners who want to see the valuation of their home compared to nearby homes. For a small fee, the app even lets you download a custom report with all of that information upon entering your address. And, just in case a homeowner suspects his or her home may be overvalued, the interface includes a handy guide to protesting the valuation with local government agencies.
On the whole, Curbwise is the epitome of a solid, innovative app built by a news organization that works to protect consumers and inform the public. Even better, the money made off custom-report sales provides the paper with an additional revenue stream that likely helps offset the loss in print advertising in recent years.
For my final critique, I decided to look at a more straightforward and well-known visualization on gender wage gaps created by The New York Times back in 2010. The “Why is Her Paycheck Smaller” visualization shows how simple, mostly static scatter plots can sometimes be the most efficient and informative way to tell a story.
Functionality-wise, the visualization is not terribly impressive. Not only does it run on clunky, often-inoperable Flash, but it has little in terms of interactivity. All you basically do is click on each of the occupations to see where the dots for that occupation fall, and then mouse over the dots to see more specific information. The clean, crisp design, on the other hand, makes the colored dots stand out, basking in the surrounding minimalism. The notations help explain possible outliers without cluttering the graph, and the charts on the bottom right put the data into a larger context neatly and concisely.
For its time, this visualization probably was cutting-edge. And despite technology that looks less sophisticated now, it communicates just as powerfully as any of the best visualizations do in 2012. The “Why Is Her Paycheck Smaller” visualization shows that, no matter the technology, good charting, design and editing make for a strong story. It’s easy to get caught up in the technologies, but sometimes less is more.
An interesting question came up at last Wednesday’s Doing Data Journalism (#doingdataj) panel hosted by the Tow Center for Digital Journalism here at Columbia’s J-School: Should there be data specialists in the newsroom, or can everyone be a data journalist? For New York Times interactive editor Aron Pilhofer, who participated in the panel, the question is not so much should everyone do data as will everyone do data. And for Pilhofer, the answer to that question clearly seems to be no:
I kind of naively thought that at one time you could train everybody to be at least a base level of competency with something like Excel, but I’m not of that belief anymore. I think you do need specialists.
I’ve always hated the idea of having technology or innovation ‘specialists’ in a work environment that should ideally be collaborative. So, at first I tended to disagree with Pilhofer’s argument. But what won me over was the reasoning behind his claim. For Pilhofer, it’s not that the technology, human talent or open source tools aren’t there for everyone to scrape, analyze and process data –– in fact, it’s now easier than ever to organize messy data with simple and often free desktop applications like Excel and Google Refine. The problem is that there’s a cultural lack of interest within newsrooms, often at an editorial level, in producing data-driven stories. As Pilhofer says in what appears to be an indictment of upper-level editors for disregarding the value of data,
The problem is that we continue to reward crap journalism that’s based on anecdotal evidence alone . . . But truly if it’s not a priority at the top to reward good data-driven journalism, it’s going to be impossible to get people into data because they just don’t think it’s worth it.
I totally agree, but with one lurking suspicion. As with the top-level editors, many traditional users –– or ‘readers,’ as one might call them –– still at least think they like to read pretty, anecdotal narratives, and tend not to care as much whether the hard data backs them up. In other words, it’s an audience problem just as much as it is a managerial or institutional one. Some legacy news consumers just aren’t data literate. Because they’re not accustomed to even having such data freely available to them, they don’t value having it. As the old saying goes, “You can’t miss what you never had.” Yet as traffic and engagement statistics continually confirm, as soon as users have open data readily available to them through news apps and data visualizations, they spend more time accessing the data than they do reading the print narrative.
Today I’d like to critique the recent redesign of my former newspaper’s website, The Telegraph/Macon.com, in Macon, Ga. For years we’d suffered with a cluttered, portal-styled homepage that devoted more screen real estate to aggregated national news and real estate widgets than it did to original, local content (see a small snippet of old site here). The share buttons were completely out of date, the commenting system was clunky and the overall feel of the site was one of chaos. Even when we did do special features including complex packages and maps, they looked out of place against the stark ad-cluttered grid of the site. While the deep red color at the top of the page gave the site a sense of branding, it ultimately left too much white space to make a strong impression.
The new Macon.com is a marked improvement over its predecessor. For one thing, The Telegraph masthead in place of the Macon.com header logo signals a greater emphasis on local reporting and a shift away from trying to be a portal to all things Middle Georgia. Not using the word ‘Macon’ also helps brand the site to users outside the confines of the Macon city limits, in the growing suburbs of Warner Robins, Centerville and North Bibb County.
From a design standpoint, the new site has a lot going for it. Perhaps most strikingly, the bold, gradated masthead gives a feeling of coherence and robustness to the site. I’m also a big fan of combining the traditional newspaper headline font alongside the modern-looking serif used for the phrase “Middle Georgia’s News Source.” What also stands out, though you can’t see it at the current moment, is the decidedly photo-friendly nature of the new design. Images on articles fill the full column width, creating a strong visual impact. On the homepage, the feature story usually includes a relatively large 600px by 400px image, with a black opaque strip along the bottom containing the headline and excerpt in contrasting white text. Going back to font choice, what also makes the site stand out is its mix of serif and sans-serif faces across the site. In addition, the grey bars that underlie section headings create an organized, grid-like feeling without being too distracting.
Most of all, unlike many other news sites, the new Macon.com doesn’t feel like everything has been squished together. The content navigation bar is kept to a maximum of eight items in a large sans-serif font, with a jQuery drop-down to display child categories. The top page-nav bar also shows visual restraint by sticking to only six items, rather than a massive compendium of every single vertical the paper offers the community. Finally, the choice to move the “Latest headlines” section from the middle of the page to the right-hand column is a wise one, as our eyes naturally look to the left or right before the middle.
The latest delegate counts show that Mitt Romney nearly doubled his delegate total after Super Tuesday’s primary elections. Click on the table to see how the delegates from Tuesday’s vote broke down.
For a lot of self-indulgent reasons, I secretly love The Huffington Post. But well-designed visualizations and interactive interfaces have never been the news organization’s strong suit. While their live coverage of Tuesday night’s GOP primary in Michigan had all the flavor of a classic HuffPo report – updates faster than you can send a Tweet, snarky comments, and dramatic headlines – what stood out to me was how they integrated real-time election results into a mapping format. And not only was the map visually appealing, with clean lines, distinctive color choices and a refreshing sense of minimalism, but it also did a good job of allowing the user to know what was going on across the state as the results were being tallied. The legend makes it clear which candidates are leading using numbers, while the map allows viewers to see which part of the state Santorum and Romney have claimed.
Having this geographic breakdown is particularly important in Michigan. For one, the notorious swing state is vastly different demographically from one area to another. People in unionized Detroit vote nothing like the more conservative folks in the state’s rural north. Moreover, knowing who received what votes where matters even more in Michigan because it’s Romney’s home state. If Romney hadn’t come away with big margins in and around his hometown south of Detroit, it would have seriously hurt his momentum going forward. The Michigan vote becomes even more important in light of his recent insistence that the auto companies should not have received a bailout and that the country should “let Detroit go broke.” But it doesn’t seem as though Romney’s comments lost him the urban areas entirely, as he easily carried Detroit and Grand Rapids by huge margins.
This map displays the results from Tuesday night’s Michigan GOP primary by county. The darker the shade of blue, the higher the percentage of voters for Mitt Romney, who narrowly won the race despite Michigan being his native state. Click on each county to see a breakdown of how Michigan Republicans cast their ballots.
SOURCE: Michigan Dept. of State, Feb. 28, 2012, 11:34 EST
So I’ll admit it: I’ve always kind of had a design crush on the Guardian‘s website, and I may or may not have tried to emulate it in various other news websites I’ve developed. What I love most about the Guardian’s design is simply its proprietary typeface. That slightly “Georgia”-looking serif with the curved nodules and cut-off “G’s” instantly alerts users that they’re interacting with the Guardian brand. Another strong aspect of the site is that it succeeds where many legacy news organizations fail: it cleanly integrates an array of different content, from videos, to mugshots for columnists, to vertical celebrity shoots, to landscape scenes of world political affairs and crises. Though it may seem obvious, the coordinated color schemes on the site give the user visual cues about which section she’s reading or encountering. Color is perhaps the Guardian’s strongest visual element.
What also makes the Guardian site, in my view, the almost perfect model for for-profit news sites is its interactivity. Designers don’t have to worry about whether the body text of articles will make the page look too visually distracting, as users can simply hover over a picture to read the excerpt. It also likely increases audience engagement, by drawing clicks and hovers from people who may not have engaged with those stories otherwise.
I could go on and on for days about what a groundbreaking model the Guardian’s website is––like how its use of white space around the header gives users a sense of minimalism, or the way in which the site displays its ads. But I won’t. All I’ll say is that it’s so user-friendly that it’s hopped over the pond to circulate in America.
In contrast with Norman –– who argues flatly for programmers to adopt a more immersive, task-centered approach to computer design rooted in cultural conventions –– Manovich contends in his paper on human-computer interfaces that designers should instead seek to embrace the new language of the computer medium, the language of the interface. The failure of programmers to make use of the full power of the interface as a language in and of itself, Manovich argues, can be traced back to two competing impulses: representation and control. The desire to make computing “represent” or “borrow ‘conventions’ of the human-made physical environment” often inevitably limits the full range of “control” or flexibility the computer interface can offer. But although Manovich clearly leaves some room for common ground between the impulses of representation and control, he tends at times to paint them as almost mutually exclusive. While he is no doubt correct in his assumption that “neither extreme is ultimately satisfactory by itself,” he particularly laments the arbitrary shoveling of old cultural conventions onto the role of the computer as a control mechanism.
Rather than seek to imitate pre-existing communication mediums, Manovich asserts that programmers should embrace the “new literary form of a new medium, perhaps the real medium of a computer – its interface” (92). Only when a user has learned this “new language” can he or she have a truly immersive computing experience. This stands in sharp contrast to Norman, who champions a more populist message of usability and rails against the notion that “if you have not passed the secret rites of initiation into programming skills, you should not be allowed into the society of computer users.”