Better than Prime Time TV

I just discovered a pretty interesting video site that describes itself as a Brilliant Ideas Network for Discourse and Debate. Its videos include conference speeches and interviews, such as several from the Aspen Ideas Festival. Unfortunately, most of the ones that I saw were abbreviated versions of the full speech/interview: tantalizing tidbits instead of complete content.

The video/audio content on the site is organized by category and varies from Entrepreneurship to Science to Visual Arts & Film to (of course) Innovation.

The videos aren’t as mind-changing as the incredible TED talks, but are definitely worth spending time with.

As a sample, check out this discussion on innovation and R&D breakthroughs:


New Yorker on Speech Recognition

I always hold my breath and get a sinking feeling in my stomach whenever a field in which I have expertise takes center stage in a news story or pop-culture piece. More often than not, there are misrepresentations of both sophisticated and not-so-sophisticated aspects of the field (e.g., see Wired Magazine).

Such errors are a common occurrence in movies and television: accuracy in details plays a secondary role to the story, and the vast majority of the audience has no idea whether the details are accurate or not. Pilots may object that the location of landing gear switches is not accurately portrayed in a movie, but does anyone else really care? (I recall the howls of protest that arose as outraged chess players complained about inaccuracies in the portrayal of competitive chess in the charming and under-rated movie Searching for Bobby Fischer. Seriously, does anyone really care that the players weren’t writing down their moves, or that the games were actioned-up? Chess players worldwide should have been grateful that such a beautiful portrayal of the game was the framework for such a great family film.)

Thus, it was with surprise that I read a recently published article in the New Yorker on speech recognition by John Seabrook that provided an interesting and accurate tour of speech recognition, with brief asides on a variety of related fields (the physiology of speech production, the physiology of hearing, prosody of speech), all tied together by the promise of computer-based communication that HAL presented in the film 2001 when the article’s author was a little kid. I was also surprised to see a popular magazine reference John Pierce’s Acoustical Society letter Whither Speech Recognition, a scathing throwdown on the field of speech recognition in 1969 by the then executive director of research at Bell Laboratories (in this highly debated letter, Pierce criticized the state of speech recognition research at the time for having a “scarcity in the field of people who behave like scientists and of results that look like science”). I highly recommend this New Yorker article to anyone with an interest in the topic.

One odd aspect of the story is that it ends with a discussion of a company called Sound Intelligence, which has developed audio sensor technology that detects violent activity on city streets for use by police. The company is cited as an example of the successful application of the work that Seabrook detailed on detecting emotion in speech. An engineer at the company, whom I heard speak about their technology last year, is quoted as saying that Sound Intelligence grew out of auditory modeling research at the University of Groningen and its application to separating speech from background noise. It’s unclear to me how much the success of the technology requires complex auditory models or any of the science and technology the article had detailed up to that point. While I applaud Sound Intelligence’s success, the inclusion of their technology as the coda to an otherwise great review of the speech recognition field makes for an empty conclusion. I’m sure that the folks at Sound Intelligence, however, would disagree with me completely.

Cognition Boom

In August I spoke at the major hearing aid conference of the year, the International Symposium on Auditory and Audiological Research. What struck me at this year’s meeting was the preponderance of talks on cognitive issues. Two years ago, there were fewer than a handful of people presenting at these conferences on cognition and hearing loss or hearing aids. Now, it’s starting to become a dominant topic at conferences, and I’m more often hearing from PhD students who are basing their dissertations in this broad area.

I’ve posted before on the emergence of cognition as a major theme in many areas. Earlier this month, I was at a conference on Aging and Speech Communication, where the focus was on how changes to cognition and hearing from aging affect communication ability. Several research presentations made clear that older subjects are more distracted by irrelevant information and are less able to ignore this information than younger people. When conducting tasks on a computer screen, the older subjects were less able to do the task when there were many items on the screen, and benefited more than younger subjects did from a clean and simple graphical user interface. Similar findings occurred with other modes of information.

This kind of research has huge implications for companies producing products for the aging population of America. Several social networks targeted at the aging population have sprung up (Boomj, where customers must be too old to be worried about the “bj” favicon; Eons, which has the trademarked search engine cRANKy), and Facebook has been invaded by the post-college crowd who probably find the interface a little busy. A company that develops an understanding of how different age groups process information will have an advantage over competitors that think the only change that needs to be made to such networks is content: Taking a social network designed for younger people and adding an obituaries section and a place to post photos of grandkids isn’t going to cut it. Tools that measure visual clutter or screen complexity could likely identify sites doomed for failure among the older crowd.

Certainly, an understanding of the unique cognitive demands and capabilities of the older population will be necessary for businesses targeting that market. In any business with targeted customer types, I expect that companies will begin to hire cognitive scientists as consultants and employees as they seek to understand their customers better. While User Experience Designer is a hot role in companies today, we could see User Cognition Researcher as the hot position of the future.

The Challenges Facing the Hearing Aid Field

The latest issue of the Hearing Review features seven papers from a research summit that I organized and my company hosted in January of this year in Napa, CA. The goal of the two-day meeting, attended by many of the nation’s thought-leaders in hearing aid research and key decision-makers within my company, was to set guidelines for the next 5 years of progress in our field.

The outcome of the meeting was consensus statements on the top issues facing our field today. I’m very pleased with the quality of the papers that resulted and, more importantly, with the value of the guidance that was developed at the meeting and expressed in the papers. I believe that years from now these papers will be seen as important guideposts, if not watersheds, for the direction that our field takes over the next decade.

The following is from my overview paper and provides the motivation for organizing this meeting:

The hearing care field is at a fascinating point in its history. Technological developments are accelerating almost too quickly to follow, and paradoxically, our science has matured to the point where only now do we recognize the vast number of research questions that still remain to be answered. The size of the hearing-impaired population is about to explode, and its demographics are changing in a way that will test our current products, services, and delivery models…this turbulent sea of change in which we find ourselves will have to be navigated with the precision that comes from careful planning, analysis, and dedicated problem solving. Ideally, a course must be charted that everyone can navigate.

The charted courses are provided by the accompanying papers that represent consensus statements from some of the nation’s top researchers, each one summarizing the challenges that our field currently faces and outlining guidelines for how to address these challenges. The issues addressed in the six papers are detailed in my overview as follows:

    • [Clinical Validity] Why does hearing aid benefit measured in the clinic often differ from benefit experienced by hearing aid wearers in the real world? Can we align the two to better meet the needs of the hearing impaired?
    • [Individual Differences] How can we better comprehend individual differences in speech understanding ability and subsequently provide improved individualized hearing solutions based on measures of cochlear damage, psychoacoustic performance, and cognitive function?
    • [Evidence-based Practice] How can our field implement evidence-based practice and evidence-based design such that dispensing professionals can more effectively meet the needs of their patients?
    • [Wireless Technology] What are the challenges that must be addressed for wireless technology to reach its full potential for patient benefit?
    • [Aural Rehabilitation] How can our field optimize its use of aural rehabilitation in the hearing health care process?
    • [Future of Hearing Health Care Delivery] What challenges from changing patient demographics, changing technology, and changing market expectations are faced by the hearing health care delivery model?

This overview doesn’t begin to hint at the depth of thought provided in the papers, however, so if you are in the least bit interested in any of these topics, I recommend that you read the corresponding papers.

The key to successfully creating such an impressive collection of insight, of course, is to include an impressive collection of people who can develop and debate the ideas and then elegantly crystallize the discussion into the few critical points. I will probably post in the future on other key elements of hosting conferences such as these.

Data Sharing Article in Nature

The weekly science journal Nature just published an article on online data sharing that quotes me. My comments are from an e-mail exchange that I had with their Senior Reporter Declan Butler about the potential of new online data sharing sites such as Swivel and IBM’s Many Eyes. I’ve posted about Many Eyes before.

According to Declan’s e-mail to me, some scientists are already using these new tools to share sequence and microarray data. The potential value from scientists openly sharing their data is huge, possibly akin to the value provided by open-source software development. More people exploring data is always a good thing, and someone could discover meaningful information in data that the original owner/researcher missed. Or one’s interests might be different from those of the original owner/researcher, and thus one could analyze the data in a different way that is meaningful to questions not investigated by the original researcher. In a scientific publication, the author can’t produce every possible permutation of the data that the readers might want, so letting the “reader” explore the data themselves through online accessibility has value. As Edward Tufte says in his book Visual Explanations,

When assessing evidence, it is helpful to see a full data matrix, all observations for all variables, those private numbers from which the public displays are constructed. No telling what will turn up.

(Thanks to Squaring the Globe blog for providing this quote.)

Anyone who has tried to obtain the raw data behind published research, however, knows that it can be difficult to get for many reasons: researchers have difficulty retrieving the data from media that are no longer used, researchers don’t have the time to search for and provide the data in an understandable format, or researchers simply don’t want to lose any perceived advantage in pursuing future funding.

I’ve thought that a way around this is for NIH (or whatever the funding organization is) to require that all data from NIH-funded research be submitted to the NIH and be made publicly available. There are many difficulties with this proposal, of course, not the least of which is ensuring that others know how to read and interpret the data. The potential for misinterpretation would be huge. One possible solution to this would be to make available only data associated with a publication that details the methods and procedures of the data collection. This could become a policy that the publishing journal mandates rather than the funding organization.

I’ve been told that a proposal was made within the NIH to do just this several years ago for a discipline that is data-heavy, but the scientists in that field shot down the idea for several reasons, one of which was that they didn’t want any errors in their own data analysis discovered. Whatever the reasons, published figures and tables have been the primary form of information transmission of data for hundreds of years. With today’s electronic tools, there is no reason to limit our data sharing ability to techniques developed centuries ago.

WSJ, Hearing and the Looming AAAS Conference

The Wall Street Journal today mentioned a conference session for which I am both a co-organizer and speaker. The WSJ article has an interview with Stefan Heller, a professor at Stanford University who is one of the invited speakers in the session, on the damage to hearing caused by such popular products as the iPod—a topic that I’ve posted at length on before. Dr. Heller’s research is on the use of embryonic stem cells to restore hearing to those with sensorineural hearing loss. The WSJ article simply discusses the potential for damage from current audio products and the fact that people don’t know that they are causing damage to their hearing until it’s too late:

WSJ: Can you actually kill some cells just from listening to a single CD on an iPod at top volume?
Heller: There probably are some people that can turn the volume of their iPods up to the limit and never have a problem. But other people might do it once and wipe out their high frequencies. And once that damage is done, it will get progressively worse. But you can only know which group you are in after you’ve lost your hearing.

The conference at which both Dr. Heller and I are speaking is the annual meeting of the American Association for the Advancement of Science, the organization that publishes Science Magazine, which is possibly the most cited scientific publication in the world. The meeting is in San Francisco from Feb 15–19, 2007. The theme of the conference this year is Science and Technology for Sustainable Well-Being, and the session that I am co-organizing with Dr. Steven Greenberg is titled Hearing Health: The Looming Crisis and What Can Be Done. (For you loomers out there who found this post after googling “Loom”: Welcome. Please link to me on your Looming site.) It looks like the conference will be an interesting one; see the bottom of this post for a sampling of session titles.

I believe that we’re going to be reading a lot more about the prevalence of hearing damage and attempts at hearing conservation over the next few years. A small startup is addressing these issues with their recently launched iHearSafe earbuds that have hearing protection built right into them. This accessory to the iPod and other audio products appears to be designed with a more rigorous approach to hearing conservation than the iPod firmware upgrade last year that purported to address similar concerns about hearing conservation. As further evidence, over 150 scientists and intellectuals responded to web magazine Edge’s new year’s inquiry, “What are you optimistic about? Why?” and among such responses as Nathan Myhrvold’s “The Power of Educated People to Make Important Innovations,” Jared Diamond’s “Good Choices Sometimes Prevail,” and Steven Pinker’s “The Decline of Violence” was David Myers’s optimism towards benefit from hearing aids.

Back to the AAAS meeting: I’ll be speaking at the Hearing Health session about the application of hearing science to hearing technology. Due to an AAAS embargo on releasing presentation material before the session, I won’t be posting my talk or providing details from it until after the conference. This is done to ensure that the conference receives maximum press coverage, I suppose.

The program at the conference is extensive and incredibly diverse. As an example, below are listed the symposia that will occur on Friday at 8:30am:

  • Achieving and Sustaining a Diverse Science Work Force
  • Addiction and the Brain: Are We Hard-Wired To Abuse Drugs?
  • Research Competitiveness Strategies of Small Countries
  • Communicating Climate Change: Strategies for Effective Engagement
  • Science, Society, and Shared Cyberinfrastructure: Discovery on the Grid
  • Smart Prosthetics: Interfaces to the Nervous System Help Restore Independence
  • The New Mars: Habitability of a Neighbor World
  • Tinkerers and Tipping Points: Invention and Diffusion of Marine Conservation Technology
  • The Crime Drop and Beyond: Explaining U.S. Crime Trends
  • Dynamics of Extinction
  • Achieving Sustainable Water Supplies in the Drought-Plagued West
  • National Innovation Strategies in the East Asian Region
  • Mixed Health Messages: Observational Versus Randomized Trials
  • Education in Developing Countries and the Global Science Web
  • Food Safety and Health: Whom Can You Trust?
  • Numbers and Nerves: Affect and Meaning in Risk Information
  • Teaching Sustainable Engineering
  • Anti-Evolutionism in Europe: Be Afraid, Be Very Afraid, or Not?

See you there.

Kevin Kelly on Trends in Science

Kevin Kelly, co-founder of Wired magazine and The WELL and author of the blog Cool Tools, gave a talk called "The Next 100 Years of Science: Long-term Trends in the Scientific Method" at the Long Now Foundation Lecture Series. I was not really familiar with Kelly’s writings, but I attended the talk because of my interest in the topic and because I was familiar with Kelly’s reputation as a respected commentator. Needless to say, his Cool Tools blog did not prepare me for what to expect from his talk, nor does the blog do his talents justice.

Kelly is a self-proclaimed scientist groupie, being a college drop-out and having never participated in technology as a scientist or engineer. He contributes as a cultural commentator, which is how he approached his lecture. Kelly said that he is more interested in the process of science than in science itself and noted that most scientists are “clueless” about the topic. His interest in talking about the future of science is in how the process will evolve, rather than what actual breakthroughs will be made. So, there was no speculation on the forthcoming prevalence of jetpacks, flying cars or replicants (those would be technological advances rather than scientific advances, anyway).

Despite the forward-looking title, Kelly spent much of his talk detailing key developments in the history of science. To predict future developments in the scientific method, he looked for patterns in the scientific process over the past 2000 years.

Kelly’s abbreviated timeline of the scientific process went like this:

2000 BC: First bibliography
250 BC: First catalog
200 BC: First library with an index
1000 AD: First collaborative encyclopedia
1590: First controlled experiment
1600: Introduction of laboratories
1609: Introduction of scopes
1650: Society of Experts created
1665: The concept of necessary repeatability introduced
1665: First scholarly journal published
1675: Peer review introduced
1687: The concept of hypothesis/prediction introduced
1920: Falsifiability introduced
1926: Randomized design created
1937: Controlled placebo approach developed
1946: First computer simulation
1950: Double-blind refinement
1962: Kuhn’s Study of the Scientific Method

All of these are changes to the process of how we know something. The introduction of Falsifiability, for example, affected what we would consider a scientific theory: if a theory could not be proven wrong, then it wasn’t a theory at all (and could more likely be categorized as a belief).

After detailing his view of how the scientific method has evolved up until now, Kelly then went on to present five predictions of how science and the scientific method would change over the next century:

  1. Science will change in the next 50 years as much as it changed in the last 400. No doubt. Everything is accelerating, although we are highly unlikely to achieve a singularity as Ray Kurzweil suggests.
  2. It will be a Bio century. Kelly provided data that demonstrates how biology is already the biggest scientific field today and suggested that the amount that we have to learn over the next several decades will overshadow developments in every other field.
  3. Computers will lead the Third Way of Science. Kelly suggested that the general methods for making scientific progress have so far been Measurement and Hypothesis. He suggests that Computer Simulations will become just as important a tool in the scientist’s arsenal for advancing our knowledge and understanding. Don’t know how something works? Run simulations of every possible parameter set and permutation until you accurately model the behavior of the process that you are observing. I see this already in my field, and certainly simulations play a significant role in our understanding of many different systems today, from economics to physiology.
  4. Science will create new ways of knowing. Kelly (I think) is talking about tools here. He mentioned wikis, distributed computing, journals of negative results, and triple-blind experiments as examples of recent changes to the process of developing and sharing information. Distributed computing is the distribution of a parallel-processed problem to be solved across many connected computers, as is already being done by SETI and for conducting cancer research. Triple-blind experiments refer to the gathering of massive amounts of data and storing it for future experiments that haven’t been specified yet, with such a broad swath obtained that the control data can also be extracted from the database.
  5. Science will create a new level of meaning. Here Kelly extrapolated the concept of distributed computing by speculating on the power of all the computers on the internet as a single computing machine. He created analogies between this massive system and the structure of the brain. I have to admit, my notes are sketchy on this section, but they include discussion of both science and religion as consisting of infinite games and recursive loops, and proclamations that Science Is Holy and that the long-term future of science is a divine trip. I guess you’ll have to wait for his book for an explanation of these concepts.
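
The simulation-driven approach in prediction 3 (run a model over many parameter sets until it reproduces what you observe) can be sketched in a few lines. This is only an illustration and not anything Kelly presented: the exponential-decay model, the parameter grid, and the "observed" values are all made up for the example.

```python
import itertools
import math

def simulate(rate, scale, times):
    """Toy model (an assumption for this sketch): exponential decay."""
    return [scale * math.exp(-rate * t) for t in times]

def squared_error(predicted, observed):
    """Sum of squared differences between model output and observations."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def sweep(times, observed, rates, scales):
    """Try every (rate, scale) pair and keep the one that fits best."""
    best_params, best_error = None, float("inf")
    for rate, scale in itertools.product(rates, scales):
        error = squared_error(simulate(rate, scale, times), observed)
        if error < best_error:
            best_params, best_error = (rate, scale), error
    return best_params, best_error

# Made-up "observations", generated here from rate=0.5 and scale=2.0.
times = [0, 1, 2, 3, 4]
observed = simulate(0.5, 2.0, times)

# Hypothetical grid of candidate parameter values.
rates = [0.1, 0.25, 0.5, 0.75, 1.0]
scales = [1.0, 2.0, 3.0]

best_params, best_error = sweep(times, observed, rates, scales)
print(best_params, best_error)
```

An exhaustive sweep like this only works for small grids. Real simulations usually need smarter search, but the idea of letting the model's fit to observation pick the parameters is the same.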

The Q&A section after his talk was perhaps the most interesting part of the seminar. Kelly has clearly spent a lot of time thinking about these issues, and his thoughts are both entertaining and intellectually interesting even if you think that he has completely missed the boat and take issue with his non-scholarly approach.

Kevin Kelly seems like he would be an interesting guy to meet at a party for a memorable night of discussion.

Linguistics Explains Elmo’s Death Threat

I’ve seen the stories (on boingboing and in the news) about the Elmo kids’ book that has interactive audio and has been telling kids, "Who wants to die!" in an apparent prank by someone involved with making the book. I’ve also read the press release today by the publisher:

the track was recorded as ‘Uh oh, who has to go’ and due to compression of the digital audio file, some consumers hear a different phrase… We are absolutely certain that the audio file was not tampered with.

Covering their ass, I thought, until I heard the audio sample in this news video from KNDU and now I believe that the publisher is correct. The Elmo sentence in question is an excellent example of the psychological principle of Priming, whereby what you perceive can be affected by your expectations. Listen to the sample expecting to hear "Who wants to die," and that is exactly what you hear. However, listen expecting to hear "Who has to go," and then the correct phrase becomes what you hear. Listen to the video a couple of times and force yourself to "expect" the two different phrases, and many of you will in fact switch what you hear depending on your expectation.

Of course, the first person who misidentified the sample as "Who wants to die" wasn’t expecting to hear this frightening threat, so priming wasn’t the reason they had their misunderstanding (even though the sentence demonstrates priming very well). How did consumers hear this unintended death threat, then, if priming wasn’t the reason?

There are two main confusions with the sentence in question: "has" is confused with "wants", and "go" is confused with "die". The Elmo book certainly uses some severe compression scheme to reduce the bit rate necessary to store the speech in the book, as the publisher stated; that’s obvious just by listening to it. This compression scheme distorts the speech (in addition to the speech distortion that occurs from the annoying Elmo voice), adds a certain amount of noise, and reduces the speech bandwidth. All of these could lead to confusions in consonants and vowels perceived in the sentence. I decided to pull out some research papers on speech confusion and see if there’s an explanation for this mix-up.

Classic research on consonant confusion by Miller and Nicely in 1955 looked at the impact of noise and bandwidth on consonant confusions. According to their research, for speech at a +12 dB signal-to-noise ratio and a bandwidth of 200-1200 Hz (probably not a bad approximation to the severe compression applied to the Elmo speech), the phoneme /g/ will be incorrectly identified as a /d/ as often as it is correctly identified as a /g/ (click on the figure to the right to see the full-sized confusion matrix; the data of relevance is highlighted in yellow). This begins to explain confusing "go" with "die": the word sounds like it starts with a /d/ instead of a /g/ due to the crappy compression system.
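
For anyone who hasn't worked with a confusion matrix, here is a minimal sketch of how one is read. Each row is the consonant that was spoken, each column is what listeners reported, and dividing a row by its total gives the probability of each response. The counts below are invented for illustration and are not Miller and Nicely's data; I only set the /g/ row so that /g/ and /d/ responses are equally likely, as in the condition described above.

```python
# Rows: consonant spoken. Columns: consonant the listener reported.
# These counts are invented for illustration only.
confusions = {
    "g": {"g": 40, "d": 40, "b": 20},
    "d": {"g": 25, "d": 60, "b": 15},
    "b": {"g": 10, "d": 15, "b": 75},
}

def response_probabilities(matrix, spoken):
    """Turn one row of counts into the probability of each response."""
    row = matrix[spoken]
    total = sum(row.values())
    return {heard: count / total for heard, count in row.items()}

def most_likely_error(matrix, spoken):
    """Return the incorrect response that listeners gave most often."""
    errors = {h: c for h, c in matrix[spoken].items() if h != spoken}
    return max(errors, key=errors.get)

probs = response_probabilities(confusions, "g")
print(probs)                               # chance of each response to /g/
print(most_likely_error(confusions, "g"))  # the most common mistake for /g/
```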

The vowel confusion is a little more difficult to explain, but I’ll try, assuming that the vowels are represented by the diphthongs /OW/ and /AY/. The vowel sound in "go" has a similar first formant time-course to the vowel in "die" (according to Rabiner and Juang), so again a compression system that limits the bandwidth of speech might make the two vowel sounds more alike.

So now I’ve explained from a scientific basis how Elmo’s "go" could be misinterpreted as "die".

A similar explanation can be made for the vowels in the confusion of "has" with "wants": both words have similar first formants. The consonant confusion with these words is more difficult to explain. Confusing /h/ with /w/ isn’t common according to research by Wang and Bilger in 1973 (Miller and Nicely’s paper did not look at these consonants). The /h/ is a fricative and the /w/ is voiced, so the two are rarely confused. I suspect that the compression distortion obliterated the soft consonant /h/ and allowed the listener to imagine whatever consonant they wanted.

This opens a whole new line of work for linguists–alerting companies when their crappy compression systems may cause customers mental anguish (or worse if it’s in a car’s GPS system). You don’t need to mind your p’s and q’s but be careful because, according to Miller and Nicely under the noisy conditions I considered above, the phoneme /t/ is more likely to be heard incorrectly as a /p/ than correctly as a /t/. So, if you get your face slapped at a noisy bar asking a woman if she wants to see your cool trick, at least now you know why.

Cognition and Speech

Went to an excellent conference in Indiana on cognition and speech communication. Discussion was on how our cognitive ability changes as we age and what the impact is on speech understanding. Some interesting non-scientific facts (and not necessarily new):

  • Standard measures of our cognitive function decline approximately 1% per year starting at age 20.
  • Older people are generally poorer at understanding speech in noise than younger people even if their hearing ability as measured by the audiogram is the same as the younger group.
  • Multiple measures of cognition and the auditory system demonstrate age-related slowing of behavior and neural signaling.
  • Neural inhibition degenerates with age and this may affect our ability to inhibit or ignore unwanted sounds, such as that annoying person sitting at the table beside us while we are concentrating on the person speaking at our own table.
  • As we get older, our brain works harder to compensate for our aging ears and to understand speech in noisy situations, but those extra cognitive resources added take away from other abilities like remembering what we are hearing.
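
To get a feel for the 1% per year figure in the first bullet, here is a quick back-of-the-envelope calculation. The talks didn't specify (at least in my notes) whether the 1% is of the age-20 score or of the previous year's score, so both readings below are my assumptions, and the function names are made up for the example.

```python
def remaining_linear(age, start_age=20, rate=0.01):
    """Fraction left if the score drops 1% of the age-20 value each year."""
    years = max(0, age - start_age)
    return max(0.0, 1.0 - rate * years)

def remaining_compound(age, start_age=20, rate=0.01):
    """Fraction left if the score drops 1% of the previous year's value."""
    years = max(0, age - start_age)
    return (1.0 - rate) ** years

for age in (20, 45, 70):
    print(age, round(remaining_linear(age), 3), round(remaining_compound(age), 3))
```

Under the first reading, a 70-year-old scores about half of what they did at 20. Under the second, it's closer to 60%.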

Many of the talks were right in line with either experiments or general guiding concepts at the Starkey Hearing Research Center (SHRC), so I’m very glad that I went. See my recent interview on cognition and the SHRC.