Government Technology


First Low Cost, Efficient Search Engine for Video Content



July 8, 2009

With video content ever more plentiful on the Web and in police and security operations, finding a specific segment that one remembers is increasingly difficult.

It's a problem that TV journalists have long coped with, usually by keeping detailed notes and time codes for the video tracks. As speech recognition software developed, journalists and other media archivists no longer had to search painstakingly for a specific video section. But they had to invest serious money in speech recognition software whose vocabulary still had to be updated regularly by specialists. (These systems are based on a kind of thesaurus containing all the words they can recognize. New topics and personalities bring new terms such as "financial crisis" or names such as "Obama," which must be added to the thesaurus before they can be found.)

According to a news release, researchers at the Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS) in Sankt Augustin, Germany, have developed a speech recognition system that does not require expensive updates.

"Our system is based on a syllable thesaurus instead of a word thesaurus. Conventional speech recognizers can only discern a limited number of words, while the total number of words in existence is too vast to handle. The number of existing syllables, on the other hand, is manageable. With about 10,000 stored syllables we can make up any word," says IAIS scientist Daniel Schneider in the release. The program can even acquire new words independently by composing them from the stored syllables: fi-nan-cial cri-sis. It does not need to be updated and so does not entail any running costs.
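The idea of composing unseen words from a fixed stock of syllables can be illustrated with a minimal Python sketch. This is not the IAIS implementation: the tiny inventory, the function name and the use of written rather than phonetic syllables are all assumptions made for illustration.

```python
from typing import List, Optional, Set

# Hypothetical toy inventory; the article mentions about 10,000 stored syllables.
# Real recognizers work on sounds, not spelling; spelling is used here for clarity.
SYLLABLES: Set[str] = {"fi", "nan", "cial", "cri", "sis", "o", "ba", "ma"}


def split_into_syllables(word: str, inventory: Set[str]) -> Optional[List[str]]:
    """Return one way to write `word` as a sequence of inventory syllables, or None."""
    word = word.lower()
    # best[i] holds a decomposition of word[:i], or None if none was found.
    best: List[Optional[List[str]]] = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            prefix = best[start]
            if prefix is not None and word[start:end] in inventory:
                best[end] = prefix + [word[start:end]]
                break
    return best[len(word)]


for term in ["financial", "crisis", "Obama", "video"]:
    print(term, split_into_syllables(term, SYLLABLES))
```

A term such as "video" returns `None` here only because the toy inventory lacks its syllables; with a large enough inventory, a new word needs no separate vocabulary entry.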

Before a broadcast can be searched, it is first split into segments. Whenever a new speaker starts to talk or a film clip begins (in which case the content of the audio track changes), the software saves the following scene as a new segment. The user can then navigate from speaker to speaker and can choose to watch only the contributions of one particular interviewee. In a second step, the individual words are analyzed by speech algorithms.

Users can apply the software just like a conventional search engine. They simply enter the search term, and a few milliseconds later the software has scanned 10,000 hours of processed data. Just like an Internet search engine, it displays the results in context in their given sentences. The user then clicks on a word to play back the relevant section of film material.

The system can find over 85 percent of the spoken words in a broadcast, and 99 out of 100 located contributions are correct. The software is already available for licensing.
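The two steps described above, segmenting by speaker and then searching the recognized words, can be sketched roughly as follows in Python. The transcript data, the speaker labels and all function names are hypothetical. The actual system derives segment boundaries from changes in the audio track, which this sketch simply assumes as given.

```python
from collections import defaultdict

# Hypothetical recognizer output: (time in seconds, speaker label, word).
TRANSCRIPT = [
    (0.0, "anchor", "the"),
    (0.4, "anchor", "financial"),
    (0.9, "anchor", "crisis"),
    (1.5, "guest", "the"),
    (1.8, "guest", "crisis"),
    (2.3, "guest", "deepens"),
]


def split_segments(words):
    """Start a new segment whenever the speaker label changes."""
    segments = []
    for time, speaker, word in words:
        if not segments or segments[-1]["speaker"] != speaker:
            segments.append({"speaker": speaker, "start": time, "words": []})
        segments[-1]["words"].append((time, word))
    return segments


def build_index(segments):
    """Map each word to the (segment number, position) pairs where it occurs."""
    index = defaultdict(list)
    for seg_no, segment in enumerate(segments):
        for pos, (_, word) in enumerate(segment["words"]):
            index[word.lower()].append((seg_no, pos))
    return index


def search(term, segments, index):
    """Return (speaker, playback time, segment text) for each hit."""
    hits = []
    for seg_no, pos in index.get(term.lower(), []):
        segment = segments[seg_no]
        time, _ = segment["words"][pos]
        context = " ".join(w for _, w in segment["words"])
        hits.append((segment["speaker"], time, context))
    return hits


segments = split_segments(TRANSCRIPT)
index = build_index(segments)
for hit in search("crisis", segments, index):
    print(hit)
```

Because the index is built once when the material is processed, a lookup touches only the entries for the search term. This is one plausible reason a query over many hours of material can return in milliseconds, though the release does not describe the actual data structure.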

Photo by Chris Radcliff. CC Attribution-Share Alike 2.0 Generic

 

