Too many photos, too little time...
THERE'S nothing quite like downloading your digital photos to your PC and showing them off to your friends. But how much more useful it would be if your PC could listen to the banter and use it to caption and index the pictures. That's exactly what researchers at Hewlett-Packard in California are trying to do.

Digital photography is booming, and people are storing ever greater volumes of photos on their hard drives. The trouble is that people rarely label their photos. "This is the weak link for digital photo collections," says Margaret Fleck at HP's lab in Palo Alto. "In 10 years' time, finding something amongst them will be very difficult."

Fleck's answer is to tap into the wealth of information in the conversations we have when we talk about our photos with friends. She says the stories we tell don't merely describe the photo, but also talk about the events that happened before and after the picture was taken. To harness this information, Fleck has developed software that records these conversations to hard disc, converts the speech to text using a speech-recognition program, and then extracts keywords with which the photos are captioned and indexed.

Her current prototype runs on a PC equipped with a microphone, and automatically starts recording when you open a digital photo album and begin a commentary on the pictures. It stops after 30 seconds if you are not talking. The system converts the soundtrack into text in real time, and identifies keywords such as "Venice", "honeymoon" or "Christmas" that can be used to index the photograph. To find an image later, you simply type keywords into a search box.

As speech-recognition software becomes more accurate, the system should be able to generate lengthy captions describing each scene. Today's commercially available speech recognition systems are nearly 99 per cent accurate, but they have to be trained for a particular user, and the user has to speak directly into a microphone.
However, Fleck wanted to build a system that wasn't limited to one individual, and could catch "open-air" conversations (not spoken directly into a mike) between people crowded around a monitor looking at photos. Although such systems are far less accurate, the conversations they capture are richer. She used some of HP's own speech-recognition software, originally intended for transcribing webcasts of TV and radio news bulletins spoken by multiple presenters. It gathered enough keywords to index the photographs.

"It's a really clever way of annotating pictures," says Mor Naaman of Stanford University, also in Palo Alto, who is working on using GPS devices built into digital cameras to annotate photographs with details about the picture's location (see "Wish you were here").

Fleck believes her method will be just one of many that we will rely on to help us organise digital files, as hard drives approach terabyte levels over the next few years. "Probably any good solution is going to use several different approaches," she says, pointing to work at the University of California in Berkeley. Researchers there have developed software that can identify key elements in photos, such as types of animal, flowers and geographic features like rivers and mountains, and use them to index pictures.
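
For readers curious how the keyword step might look in practice, here is a minimal Python sketch of indexing photos by words pulled from a transcript and then searching by keyword. It is not HP's software: it assumes the speech has already been converted to text, and the stop-word list, the `extract_keywords` function, the `PhotoIndex` class and the file names are all hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

# Small illustrative stop-word list (an assumption, not taken from the HP system).
STOP_WORDS = {
    "a", "an", "and", "the", "this", "that", "is", "was", "were",
    "we", "our", "us", "on", "in", "at", "of", "to", "it", "for",
}


def extract_keywords(transcript):
    """Return the set of lower-cased words that are not stop words."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return {w for w in words if w not in STOP_WORDS and len(w) > 2}


class PhotoIndex:
    """Hypothetical inverted index: keyword -> set of photo identifiers."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, photo_id, transcript):
        """Index one photo under every keyword found in its transcript."""
        for keyword in extract_keywords(transcript):
            self._index[keyword].add(photo_id)

    def search(self, query):
        """Return the photos whose transcripts contain every query keyword."""
        keywords = extract_keywords(query)
        if not keywords:
            return set()
        matches = [self._index.get(k, set()) for k in keywords]
        return set.intersection(*matches)


# Example: transcripts stand in for the output of a speech recogniser.
index = PhotoIndex()
index.add("img_001.jpg", "This was our honeymoon in Venice")
index.add("img_002.jpg", "Christmas at home with the family")
print(sorted(index.search("Venice honeymoon")))  # ['img_001.jpg']
```

A real system would also have to cope with recognition errors and decide which part of a conversation belongs to which photo, neither of which this sketch attempts.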
Source: Eurekalert & others
Last reviewed: By John M. Grohol, Psy.D. on 21 Feb 2009
Published on PsychCentral.com. All rights reserved.