Data Visualization on Steroids

I’ve talked about Hans Rosling’s presentation at TedTalks before, but given my recent post on IBM’s new data visualization site and my newfound ability to embed video on my blog, I’m going to promote this amazing talk once again.

Rosling talks about world health, but in the process gives a master class on data visualization. You’ll never look at Excel’s chart plotting function the same way again.

At one point (around 13 minutes into the video), Rosling displays five-dimensional data on a two-dimensional figure by using the x- and y-axes, data symbol size, data symbol color, and a trail that shows how the data changes over time. Wow.

Rosling has been invited back to speak at this year’s TedTalks. Another speaker of interest to readers of this blog is MIT Media Lab’s design simplicity guru John Maeda. I’ll see you all there (kidding). Any TedTalks speakers who would like to speak at a research center in Berkeley, e-mail me (not kidding, although not delusionally optimistic either).

San Francisco Dreams, Fast Food Nightmares

The IBM Visual Communication Lab has created an interesting website called Many Eyes that allows people to get an intuitive understanding of data through different visualization techniques. The site provides a wide array of formats for data display, including the recently developed format of Tree Maps that I’ve found useful for analyzing my hard disk content using SequoiaView.

The Many Eyes service is free and easy to use. One can both upload data and explore visualizations of data others have uploaded. I’ll show you two that I’ve created.

Someone uploaded to Many Eyes the nutritional content of items from McDonald’s menu, so I decided to create a chart that I thought would highlight the good, bad and ugly. Using a scatterplot, the horizontal axis shows trans fat content, the vertical axis shows saturated fat content, and the size of each data symbol shows the cholesterol content. Click on the figure to the right to go to the visualization I created that allows you to see which data points are for which products and to create your own visualizations from this data set.

Plotting the trans fat, saturated fat, and cholesterol of the McDonald’s menu in this way makes several facts obvious:

  • The Deluxe Breakfasts should come with a defibrillator. They are the large-sized data points in the top right that max out all unhealthy categories.
  • Beware of products that proclaim low trans fat. Look for the small dot at the top left which is the Double Quarter Pounder with Cheese. It has very low trans fat and very low cholesterol, but more saturated fat than any other product they sell.
  • There are some products with small-size data points in the lower left corner indicating healthy goodness. Next time that I’m stuck at an airport and the only place to eat is McDonald’s, I’ll be eating an Asian Salad with Grilled Chicken or the Premium Grilled Chicken Sandwich. Well…more likely I’ll wait until I get home and use the visualization described below to decide where to eat.
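For anyone who wants to build a similar bubble chart outside of Many Eyes, here is a minimal sketch of the encoding, not how Many Eyes does it internally (I don't know that). The item names and numbers are made-up placeholders rather than values from the McDonald’s data set, and `bubble_radius` and `to_bubbles` are hypothetical helper names. The one design choice worth noting is that the radius scales with the square root of the value, so that the area of each symbol, which is what the eye compares, stays proportional to cholesterol.

```python
import math


def bubble_radius(value, max_value, max_radius=20.0):
    """Return a radius whose circle AREA is proportional to value."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    # Area grows with radius squared, so take the square root of the ratio.
    return max_radius * math.sqrt(max(value, 0.0) / max_value)


def to_bubbles(rows, max_radius=20.0):
    """Map (name, trans fat, saturated fat, cholesterol) rows to plot symbols."""
    max_chol = max(row[3] for row in rows)
    return [
        {
            "label": name,
            "x": trans_fat,        # horizontal axis: trans fat
            "y": saturated_fat,    # vertical axis: saturated fat
            "r": bubble_radius(cholesterol, max_chol, max_radius),
        }
        for name, trans_fat, saturated_fat, cholesterol in rows
    ]


# Placeholder rows (NOT real menu data): name, trans fat (g),
# saturated fat (g), cholesterol (mg).
items = [
    ("Item A", 0.0, 2.0, 50.0),
    ("Item B", 1.5, 10.0, 200.0),
    ("Item C", 3.0, 20.0, 800.0),
]

bubbles = to_bubbles(items)
for bubble in bubbles:
    print(bubble)
```

The resulting list can be handed to whatever plotting tool you prefer; only the scaling rule matters here.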

I love eating out, so I uploaded all of the ratings data for San Francisco restaurants from the San Francisco Chronicle’s Top 100 Bay Area Restaurants. Each restaurant was given ratings for Overall Quality, Food, Service, Price, Atmosphere, and Noise Level. Higher is better for all ratings except for Price where higher means more expensive, and Noise Level where higher means noisier (unless, of course, you like deafening crowds).

I created a visualization that readily displays the Overall Quality, Food and Price ratings. Click on the figure to the right to see which restaurants correspond to which data point. Restaurants farther to the right in the plot have better food, ones higher up are pricier, and ones with larger data point sizes have better overall ratings. (Note that I had to add some randomness to each data point so that individual restaurants could be seen; otherwise, too many data points fell on top of each other, as exhibited in my first attempt to visualize this data.)
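Adding randomness like this is usually called jittering. Below is a rough sketch of the idea, assuming uniform noise and a jitter amount of 0.15 rating points; both are illustrative choices, not the settings I used for the chart. The `jitter` function and the sample ratings are hypothetical.

```python
import random


def jitter(points, amount=0.15, seed=0):
    """Offset each (x, y) point by uniform noise in [-amount, +amount]."""
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]


# Made-up (Food, Price) ratings: three restaurants share the same values
# and would be drawn exactly on top of each other without jitter.
ratings = [(3.0, 2.0), (3.0, 2.0), (3.0, 2.0), (3.5, 3.0)]

jittered = jitter(ratings)
for original, moved in zip(ratings, jittered):
    print(original, "->", moved)
```

The noise should stay small relative to the spacing between rating levels, so that a jittered point is never mistaken for the neighboring rating.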

Comparing different aspects of restaurants using the Chronicle website is difficult and the advantages that some restaurants have over others are not obvious. This Many Eyes chart makes certain facts about these top restaurants very clear:

  • La Folie and Fleur de Lys reign supreme. Personally, I’m a little surprised by this because my one visit to La Folie wasn’t nearly as impressive an experience as at Gary Danko or Masa’s.
  • Ton Kiang provides the best value of all restaurants according to the Chronicle ratings. You’ll find its data point in the lower right: a high Food rating and a low Price rating. From personal experience, I can also add that it ranks high on the Sidewalk Waiting rating for the long waits on Sunday mornings to get in for dim sum.
  • What are Kokkari and House of Prime Rib doing on this list? They are the data points to the far left indicating low Food ratings. A quick change of the axes using the Many Eyes tools shows that they are also low on Service but have decent Atmosphere ratings. High prices, mediocre food, poor service and good atmosphere: not a great combination for Top 100 restaurants.

I did a little other manipulation of the data axes to pull out interesting information. The figure on the right shows Food rating along the horizontal axis and Overall Quality along the vertical axis (again, click on the figure to explore the individual data points). You’ll notice that the Food rating has a strong relationship to the Overall Quality rating. If the horizontal axis is changed to Service, Price, or Atmosphere (go ahead, click on the figure and change the axes yourself), you’ll find that these categories are not so strongly related to the Overall Quality rating, indicating how strongly food quality impacts the overall rating of a restaurant (as well it should).

If you pay more at one of these restaurants, are you more likely to get better food? Nope, the figure on the right shows that there is a slight trend to getting better food the more that you pay, but not much. A similar plot of Price versus Service indicates a similar disconnection. What you mostly pay for, according to this data, is ambiance: plotting Price against Atmosphere does show that the Atmosphere rating tends to increase as the Price rating increases.

To understand these relationships more precisely, I decided to do some statistical analysis on my own (not an ability available on the IBM website). The chart below shows the correlation matrix for the restaurant ratings. Numbers vary from -1 to 1, where 0 means that two factors are uncorrelated, 1 means that the two factors are perfectly correlated, and a negative number means that one factor tends to decrease as the other increases. Each number compares the categories associated with its row and column.

Ratings Correlation

Some interesting insights from the correlation matrix are:

  • The rating category most correlated with Overall Quality is Food, with a correlation of 0.89.
  • Price is more correlated with Atmosphere (0.6) than with Food (0.45). This means that by paying more, you have a better chance of improving the look of the restaurant than the quality of the food.
  • I’m very surprised by how little Price is correlated with Service (0.22). Paying more appears to have little effect on the quality of the wait staff. This may be due to the expectations of the reviewer: higher-priced restaurants might have gotten penalized more in the Service rating for service faux pas (such as not providing clean utensils between courses) than lower-priced restaurants.
  • The noise level of a restaurant had next to no impact (-0.18) on the overall rating assigned to the restaurant.
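If you want to run this kind of analysis on your own ratings data, a correlation matrix takes only a few lines. The sketch below assumes the Pearson correlation coefficient, and the ratings in it are made-up placeholders, not the Chronicle data; `pearson` and `correlation_matrix` are hypothetical helper names.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


def correlation_matrix(columns):
    """Correlate every rating category with every other one."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Placeholder ratings for five imaginary restaurants (NOT the Chronicle data).
ratings = {
    "Overall": [3.0, 2.5, 4.0, 3.5, 2.0],
    "Food": [3.0, 2.5, 4.0, 3.0, 2.0],
    "Price": [2.0, 3.0, 4.0, 4.0, 1.0],
    "Noise": [4.0, 3.0, 2.0, 3.0, 4.0],
}

matrix = correlation_matrix(ratings)
for row_name, row in matrix.items():
    print(row_name, {name: round(value, 2) for name, value in row.items()})
```

Each category correlates perfectly with itself, so the diagonal is all ones, and the matrix is symmetric, so only half of it needs to be read.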

That’s all I’ve got to say on San Francisco restaurants for now.

The service provided by Many Eyes is an interesting one and demonstrates how plotting data in the proper way can quickly pull out relationships and interesting features of a dataset. I’m sure that Edward Tufte would approve. I look forward to IBM allowing people to embed these visualization applets on their websites, which would allow the SF Chronicle to provide this service on its own website.

He Put the i in Design, and iPod, and iMac, and iPhone…

For those who can’t get enough information about the new iPhone or those who, like me, are eager to find more information about the Apple design innovation process, here’s a BusinessWeek article from a few months ago on the designer of the iPhone. And the iMac. And the iPod. Given his name, I guess we know what the “i” stands for.

Jonathan Ive heads Apple’s design group, a team that primarily works in San Francisco. The relationship between Ive and Jobs is interesting to read about given Ive’s quiet public demeanor and Steve’s attention-grabbing one. More interesting are the details of Ive’s design process.

I won’t regurgitate details from the article, but there are two aspects of Ive’s process that are worth noting.

Ive works closely with engineers to understand what’s possible, marketers to understand usability and consumer needs, and manufacturers to understand, well, manufacturability. As BW puts it,

Ive’s team at Apple isn’t the usual design ghetto of creativity that exists inside most corporations.

People and companies look at the success of the iPod and Apple’s dominance of design and conclude that they can emulate that design genius by focusing on cool new looks or on the latest business mantra, Simplicity. Focusing on design as a creative-only process, as if the iPod dominates its market because some designer rubbed the right genie bottle one day, misses the point and the value that Ive brings to Apple. His design process is one of intensive hard work and the ability to reduce expertise from multiple disciplines into the form factor of a single product. Great design exists in the harmonious combination of function and aesthetics, and processes to achieve this do exist and are perfected at Apple.

Edison’s maxim, Genius is one percent inspiration and ninety-nine percent perspiration, applies to design as well as it does to engineering or science. All three require innovation processes that include trial and error, intuition, investigation, and hard work. The lone genius creating innovations through flashes of creativity rarely exists in these fields. A description of Ive’s career makes clear that his and Apple’s success in design is the product of an incredibly disciplined process and a daunting amount of work. And, of course, terribly brilliant people for whom their job is their passion.

Which leads to the second point worth noting from the story, which is Ive’s process of creating hundreds of prototypes in the course of investigating ideas and refining designs. Ive has invested heavily in advanced tooling capabilities that allow his team to rapidly prototype ideas and quickly determine what’s good and bad about them. While most companies examine designs by looking at 3-D CAD drawings projected onto a meeting room wall, Ive creates the designs as physical objects that he can hold and physically assess, sometimes using materials as simple as sculpted styrofoam, and figures out which aspects work and which don’t. This is also part of the IDEO way: to rapidly create prototypes so that designs can be assessed in terms of usability in a way that can never be done just by looking at a CAD design, and to iterate quickly on alternate designs, integrating the best concepts of each prototype to create a superior product.

Reduce to practice, investigate, try again, dare to create faulty designs so that they can inform the path to better ones. Apple’s success (and IDEO’s, and a few others’) has clearly proven this process as a valuable approach to successful design innovation. Not only can other companies learn by examining this approach closely, but other disciplines could probably improve their approaches to innovation as well by emulating aspects of this process. I’m sure that there are several business school dissertations developing those ideas already…

TedTalks Scoops Steve Jobs

I was going to write about something else, but I feel compelled to write a post on Apple’s iPhone that was just introduced. I have no doubt that it is being assessed, critically or not, in almost every tech blog today. So, I’m not going to go over the features or exclaim my enthusiasms for their latest innovation…make that innovations.

I will say that the iPhone appears to be a beautiful example of innovating to meet the needs of the consumer. Also, the audacity of spec’ing a cellphone to have just a single button is pretty amazing. Okay, that was a couple of enthusiastic exclamations.

FYI, the unique multi-touch user interface was previewed by a researcher from NYU in this amazing video from TedTalks:

Festivus 2.0

Massive traffic to iTunes from new iPod owners and gift card recipients swamped the iTunes servers on Christmas day, causing crashes, slow downloads, problems with purchases, and general griping. iTunes apparently received over 400% more traffic than last Christmas, and Apple was caught unprepared. Similar chaos happened with the music service Rhapsody, with users complaining of an unwelcome Christmas present when songs wouldn’t stream and faulty DRM permissions prevented users from listening to their rightfully licensed music. Meanwhile, just two days after Christmas, an earthquake knocked out nearly all internet access to Asia. The thought of a continent cut off from the internet is mind-boggling: a natural experiment indeed!

These hiccups demonstrate one difficulty with the new movement towards online services replacing traditional modes of activity. Online data applications are great until you find yourself without an internet connection or with a software service that has upgraded to a version that doesn’t work as well. Since Google acquired Writely and changed it to Google Documents, I’ve found that collaborators have more trouble figuring out how to log in to work on a document. That part should be easy. Servers going down, slow connections, and upgrades to Internet Explorer or Firefox that suddenly make your Web 2.0 application go buggy are all reasons that the typical consumer is going to reject these alternatives to current approaches to services.

Still, the expectation is that people’s reliance on internet connectivity is going to increase exponentially in the near future. The Los Angeles Times asked several technologists what trends they saw taking shape in 2007, and their answers were surprisingly consistent. Steve Ballmer sees great things in online TV and digital rights management, and is optimistic about how communications technology will improve through convergence:

2007 will be the year that unified communications technology helped us regain control of our information and our lives.

Ned Sherman says that virtual worlds, such as Second Life and Warcraft, will grow to a prominent place in our lives. Rafat Ali predicts that online personalities will take hold of our attention. Kevin Werbach predicts P2P television will take off. Chris Anderson sees online gaming enabling online video on televisions. Hank Barry sees “virtualization technologies” allowing people to become computer agnostic and transfer their data and applications with ease from one machine to another. John Brockman sees WiFi as an enabler to “continuous computing” where your data and apps can follow you wherever you go.

All of these predictions, perhaps, overestimate the eagerness of the average person to embrace new technology, learn new procedures, and generally change the way that they live their lives. Look at how long online retail really took to become a major force. Look at how TiVo is still struggling to survive.

New approaches are enthusiastically embraced by younger generations. MySpace is dominated by teenagers and twenty-somethings. There is a strong age divide, however. I am continually amazed whenever I talk to someone who is middle-aged and realize that they have no idea what YouTube is. Next time you’re talking to someone over the age of 30, ask them what online music sources they listen to, and if they prefer streaming radio or podcasts. Predictions such as those made in the LA Times need to be tempered by these generational gaps in consumer acceptance or risk being perceived as simply the wet dreams of executive technologists.

Technology will change, but it will take time for the typical consumer to accept these changes. Changes are going to have to be easy to adapt to and there must be a strong reason to motivate people to change. I sound like a luddite here even though I am thoroughly absorbed by such technologies. While driving home from work at the end of the year, I heard a music critic on NPR state that CD sales were down 5% in 2006 and then he went on to list his top 10 albums of the year. Soon after arriving home, I was listening to one of the critic’s picks—My Chemical Romance—through the online music service Rhapsody and was wondering at the vast number of people who still buy CDs and keep CD sales from plummeting at an even faster rate.

WSJ, Hearing and the Looming AAAS Conference

The Wall Street Journal today mentioned a conference session for which I am both a co-organizer and speaker. The WSJ article has an interview with Stefan Heller, a professor at Stanford University who is one of the invited speakers in the session, on the damage to hearing caused by such popular products as the iPod—a topic that I’ve posted at length on before. Dr. Heller’s research is on the use of embryonic stem cells to restore hearing to those with sensorineural hearing loss. The WSJ article simply discusses the potential for damage from current audio products and the fact that people don’t know that they are causing damage to their hearing until it’s too late:

WSJ: Can you actually kill some cells just from listening to a single CD on an iPod at top volume?
Heller: There probably are some people that can turn the volume of their iPods up to the limit and never have a problem. But other people might do it once and wipe out their high frequencies. And once that damage is done, it will get progressively worse. But you can only know which group you are in after you’ve lost your hearing.

The conference at which both Dr. Heller and I are speaking is the annual meeting of the American Association for the Advancement of Science, the organization that publishes Science Magazine, which is possibly the most cited scientific publication in the world. The meeting is in San Francisco from Feb 15–19, 2007. The theme of the conference this year is Science and Technology for Sustainable Well-Being, and the session that I am co-organizing with Dr. Steven Greenberg is titled Hearing Health: The Looming Crisis and What Can Be Done. (For you loomers out there who found this post after googling “Loom”: Welcome. Please link to me on your Looming site.) Looks like the conference will be an interesting one; see the bottom of this post for a sampling of session titles.

I believe that we’re going to be reading a lot more about the prevalence of hearing damage and attempts at hearing conservation over the next few years. A small startup is addressing these issues with their recently launched iHearSafe earbuds that have hearing protection built right into them. This accessory to the iPod and other audio products appears to be designed with a more rigorous approach to hearing conservation than the iPod firmware upgrade last year that purported to address similar concerns. As further evidence, over 150 scientists and intellectuals responded to web magazine Edge’s new year’s inquiry, “What are you optimistic about? Why?” and among such responses as Nathan Myhrvold’s “The Power of Educated People to Make Important Innovations,” Jared Diamond’s “Good Choices Sometimes Prevail,” and Steven Pinker’s “The Decline of Violence” was David Myers’s optimism about the benefits of hearing aids.

Back to the AAAS meeting: I’ll be speaking at the Hearing Health session about the application of hearing science to hearing technology. Due to an AAAS embargo on releasing presentation material before the session, I won’t be posting my talk or providing details from it until after the conference. This is done to ensure that the conference receives maximum press coverage, I suppose.

The program at the conference is extensive and incredibly diverse. As an example, below are listed the symposia that will occur on Friday at 8:30am:

  • Achieving and Sustaining a Diverse Science Work Force
  • Addiction and the Brain: Are We Hard-Wired To Abuse Drugs?
  • Research Competitiveness Strategies of Small Countries
  • Communicating Climate Change: Strategies for Effective Engagement
  • Science, Society, and Shared Cyberinfrastructure: Discovery on the Grid
  • Smart Prosthetics: Interfaces to the Nervous System Help Restore Independence
  • The New Mars: Habitability of a Neighbor World
  • Tinkerers and Tipping Points: Invention and Diffusion of Marine Conservation Technology
  • The Crime Drop and Beyond: Explaining U.S. Crime Trends
  • Dynamics of Extinction
  • Achieving Sustainable Water Supplies in the Drought-Plagued West
  • National Innovation Strategies in the East Asian Region
  • Mixed Health Messages: Observational Versus Randomized Trials
  • Education in Developing Countries and the Global Science Web
  • Food Safety and Health: Whom Can You Trust?
  • Numbers and Nerves: Affect and Meaning in Risk Information
  • Teaching Sustainable Engineering
  • Anti-Evolutionism in Europe: Be Afraid, Be Very Afraid, or Not?

See you there.