Tuesday, November 16, 2010

Can be likened to the search

At some professional picture companies, many employees spend every day categorizing pictures, annotating them, and writing descriptive information.

When will a computer be able to read a picture the way a human does? Internet Weekly reporter Yang visited Microsoft Research Asia to explore this question, and shares a "can be likened to the" search technology.

Text: Yang, Internet Weekly

The current state of the market is that search engines can read text but cannot read pictures.

Whether it is desktop picture management software or Web-based image search, today's tools are still at a simple level: they depend mainly on a picture's file name and accompanying description to work out what the picture means. Give a search engine a picture, and it is hard for it to find related pictures the way it would for a search keyword.

Can image search become more intelligent? Dr. Zhang Lei (张磊), a researcher at Microsoft Research Asia, gave Internet Weekly a definite answer.

Next-generation image search technology can already, like a person, find the center of interest in a picture and identify whether it is a portrait or a landscape photograph, and whether it was taken indoors or outdoors. With some help from a person, the computer can even find the same face across many photos. These incredible-sounding things have already become reality in Microsoft's lab, and some of the technology has already been applied in parts of its products.

Digital camera photos usually carry a timestamp, so grouping photos by time is relatively easy for the system to implement.

On a timeline, photo timestamps are unevenly distributed, and the system can use the density of the intervals between them to automatically group the photographs taken within a given period. Users can then easily pick out the photos they need by following the chronological sequence of events.
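The time-based grouping described above can be sketched as follows. This is a minimal illustration, not the lab's actual method: it assumes a fixed gap threshold (the hypothetical `max_gap` parameter) as a stand-in for the interval-density analysis the article mentions, and the function name and sample timestamps are invented for the example.

```python
from datetime import datetime, timedelta

def group_by_time(timestamps, max_gap=timedelta(hours=2)):
    """Start a new group whenever the gap to the previous photo exceeds max_gap."""
    groups = []
    for ts in sorted(timestamps):
        if groups and ts - groups[-1][-1] <= max_gap:
            groups[-1].append(ts)   # close in time: same event
        else:
            groups.append([ts])     # long pause: a new event begins
    return groups

# Hypothetical timestamps: a morning outing, an evening dinner, the next morning.
photos = [
    datetime(2010, 11, 13, 9, 0),
    datetime(2010, 11, 13, 9, 20),
    datetime(2010, 11, 13, 10, 5),
    datetime(2010, 11, 13, 19, 30),
    datetime(2010, 11, 13, 19, 45),
    datetime(2010, 11, 14, 8, 15),
]
events = group_by_time(photos)
print([len(g) for g in events])  # [3, 2, 1]
```

A density-based approach would adapt the threshold to how frequently the photographer shoots, rather than using one fixed gap for every collection.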

Grouping by indoor/outdoor or city/scenery, on the other hand, is somewhat harder to implement.

The system tells these apart by the image's color and texture. Generally, indoor photos have warm tones, and their backgrounds and textures vary little; outdoor photos are the opposite. City architecture and landscape pictures are analyzed in a similar way. The system divides each picture into a 5 × 5 grid of sub-images and, from each sub-image, extracts the corresponding color moment information and whether the edge texture is oriented vertically or horizontally.
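As a rough illustration of the grid-based color moments mentioned above, the sketch below splits one color channel into 5 × 5 blocks and computes the first three moments (mean, standard deviation, skewness) per block. The function names are hypothetical, the image is assumed to be a plain 2D list of pixel values at least 5 pixels on each side, and the edge-orientation feature is left out.

```python
import math

def color_moments(values):
    """Return (mean, standard deviation, skewness) for a flat list of pixel values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    # Signed cube root keeps the skewness term in the same units as the pixels.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, math.sqrt(var), skew

def grid_features(channel, grid=5):
    """Split a 2D list of pixel values into grid x grid blocks; collect moments per block."""
    height, width = len(channel), len(channel[0])
    features = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [channel[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            features.extend(color_moments(block))
    return features

# A tiny synthetic 10 x 10 channel whose brightness increases from left to right.
image = [[x for x in range(10)] for _ in range(10)]
feats = grid_features(image)
print(len(feats))  # 75 values: 25 blocks x 3 moments
```

Running this once per color channel and concatenating the results would give one feature vector per picture, which is the kind of input a classifier can be trained on.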

Landscape backgrounds are usually more muted, with more elements such as trees, blue sky, and the horizon; urban buildings have lines with sharp edges and corners, and their backgrounds vary more.

Classifiers are built on these features, with a model trained on tens of thousands of pictures; whenever a new photo arrives, the system classifies it automatically. Because some photos cannot be accurately categorized as indoor/outdoor or city/scenery, the system automatically rejects pictures that are likely to be misclassified. Accuracy is roughly 90 percent for the former classification, while the latter can currently reach 80 percent or more.
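The idea of classifying with automatic rejection can be sketched as below. The article does not say which classifier is used, so this example swaps in a simple nearest-centroid classifier; the two-number feature vectors, the `min_margin` threshold, and all function names are assumptions made for illustration.

```python
import math

def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(centroids, vec, min_margin=0.2):
    """Return the nearest label, or None when the two best classes are too close to call."""
    ranked = sorted((math.dist(vec, c), label) for label, c in centroids.items())
    (d1, best), (d2, _) = ranked[0], ranked[1]
    # Relative margin between best and second-best distance; small means ambiguous.
    margin = (d2 - d1) / d2 if d2 > 0 else 0.0
    return best if margin >= min_margin else None

# Hypothetical features: (color warmth, texture variation), both scaled to 0..1.
training = [
    ([0.9, 0.1], "indoor"), ([0.8, 0.2], "indoor"),
    ([0.1, 0.9], "outdoor"), ([0.2, 0.8], "outdoor"),
]
model = train_centroids(training)
print(classify(model, [0.9, 0.2]))  # indoor
print(classify(model, [0.5, 0.5]))  # None (rejected as ambiguous)
```

Rejecting ambiguous pictures trades coverage for accuracy: fewer photos get a label, but the labels that are assigned are more likely to be right.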

The most interesting feature is that the system can mark the faces in an imported picture and query the picture library for similar pictures.

For example, suppose you went on an outing with friends, took a lot of photos, and now want to find the ones of yourself; this technique can save you a lot of trouble. First, you tag a person so that the computer knows whose face it is. Then, after you confirm its guesses a few times, the computer is trained and will recognize that face automatically.

This face tagging technology is difficult to realize, because face recognition still faces many unresolved problems across the whole computing field. The system therefore also considers other contextual factors, such as what the people are wearing and the background environment; within the same group of photographs, these factors can clearly play a key supporting role.

At the same time, human involvement can significantly increase recognition accuracy.

When people look at an image, their gaze usually rests on a certain point, which photographers call "the center of interest."

Today, computers can do this too. The technology builds a user attention model that draws on vision, psychology, color contrast, and many other angles to determine which part of a picture will attract users most. Zhang Lei showed the reporter a screensaver that automatically finds the center of interest in a picture. Select any few pictures, and the system generates a browsing path that follows the way a person would look at them, always zooming and panning around each picture's center of interest. The automatically generated screensaver looks like a fine sequence of shots cut by a TV director.

At present, some commercial picture collections and pictures on the Internet come with contextual information that helps with this work, but the annotations used for training are still very noisy.

With a library of several million tagged pictures, the system can automatically annotate a new picture. For a new picture, it finds other similar pictures in the library, analyzes and clusters the results, and from them derives an annotation for the new picture. This is the ability to search inside the picture. It allows the computer not only to understand sunrise and sunset but also to recognize people and animals, though clearly many issues remain to be explored and resolved.
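The annotate-by-search idea above can be sketched in a few lines. This is a simplified stand-in, not the system described in the article: it replaces the analysis and clustering step with a plain vote among the nearest neighbours, and the feature vectors, tags, and the `k` and `min_votes` parameters are all hypothetical.

```python
import math
from collections import Counter

def annotate(library, query, k=3, min_votes=2):
    """library: list of (feature_vector, tags). Returns tags shared by the k most similar pictures."""
    neighbours = sorted(library, key=lambda item: math.dist(item[0], query))[:k]
    # Count each tag once per neighbouring picture.
    votes = Counter(tag for _, tags in neighbours for tag in set(tags))
    return sorted(tag for tag, count in votes.items() if count >= min_votes)

# A toy tagged library; a real one would hold millions of pictures.
library = [
    ([0.90, 0.10], ["sunset", "sea"]),
    ([0.85, 0.20], ["sunset", "sky"]),
    ([0.80, 0.15], ["sunset", "sea"]),
    ([0.10, 0.90], ["dog", "grass"]),
    ([0.20, 0.80], ["dog"]),
]
print(annotate(library, [0.88, 0.12]))  # ['sea', 'sunset']
```

Requiring at least two votes is one simple way to dampen noisy tags: a label that appears on only one similar picture is dropped.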
