NPR Optimizes Audio Files For Search Engine Purposes
National Public Radio, in an effort to make its audio content crawlable, has begun transcribing audio news streams into text files. These efforts appear to be paying off for NPR: the transcriptions have begun appearing on Google and Yahoo News, complete with links to the source audio files.
In an extensive article by News.com, NPR online director Maria Thomas stated, "Our site is primarily full of rich audio, and we want people to find it when it's relevant. The big search engines' technologies don't have the ability to get inside the audio or video. With the little bit of text we have on NPR, it's not always good enough to find our content, and reference the page."
The shortcomings of search engines are what led NPR to consider audio transcription. Currently, search engine technology is geared towards finding contextual, keyword-related content. Because of this, major search engines cannot crawl multimedia content unless a textual representation is available.
Yahoo-owned AltaVista is a search engine that offers audio and video searches. However, it too crawls the text associated with these multimedia files. According to CNet, a handful of companies are attempting to create software that actually extracts portions of audio and video files in order to determine relevance.
Since NPR began transcribing its audio content, the site has seen what are being referred to as "record spikes" in visitors. Thomas did not release specific traffic figures, however.
To accomplish the transcription, NPR is using StreamSage, speech recognition software that was introduced last year. StreamSage also uses a contextual analyzer that parses the language into themes. It then generates a text file, similar to a table of contents, that can be spidered and indexed by search engines.
For accuracy, NPR then replaces StreamSage's transcriptions, which can be inaccurate and garbled, with human-produced versions.
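As a rough illustration of the kind of crawlable, table-of-contents-style text file described above, here is a minimal Python sketch. It is not StreamSage's actual output format or API, which the article does not describe; the `Segment` fields, the `build_index` function, and the example URL are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One themed stretch of an audio story (hypothetical structure)."""
    start_seconds: int  # offset into the audio where the theme begins
    theme: str          # label a contextual analyzer might assign
    text: str           # transcribed speech for this stretch


def format_offset(seconds: int) -> str:
    """Render an offset in seconds as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def build_index(title: str, audio_url: str, segments: List[Segment]) -> str:
    """Build a plain-text index that a search engine spider could read.

    The layout (title, link to the source audio, then one line per
    theme in playback order) is an assumption, not a documented format.
    """
    lines = [title, f"Audio: {audio_url}", ""]
    for seg in sorted(segments, key=lambda s: s.start_seconds):
        lines.append(f"[{format_offset(seg.start_seconds)}] {seg.theme}: {seg.text}")
    return "\n".join(lines)


if __name__ == "__main__":
    story = [
        Segment(75, "economy", "Interest rates are expected to rise."),
        Segment(0, "introduction", "Good morning, here is the news."),
    ]
    print(build_index("Morning News", "http://example.org/news.mp3", story))
```

Because the result is ordinary text that links back to the source audio, a keyword-based crawler can index it like any other page, which is the effect the article attributes to NPR's transcripts.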
The obvious goal would be the ability to search the audio files directly, without having to wait for the text version. In News.com's article, Jay Webster, chief technology officer of interactive agency Fathom Online, said, "Where it gets cool is if you could search on any keyword and find it within audio and that audio would come up in search results. But I don't think we're there yet."
About the Author:
WebProNews | Breaking eBusiness News