Point of View Science: The Research Trilogy of Analysis, Method, and Experiment

– "Plain as Water": reflections on the CIKM 2009 Best Student Paper

Author: Geng Bo, Peking University

The Conference on Information and Knowledge Management (CIKM) is a top-tier international conference organized by the Association for Computing Machinery (ACM).

Held successfully since 1992, it has become a melting pot for presenting and sharing knowledge, attracting the world's leading experts in databases, data mining, and knowledge management to contribute their talents.

In 2009, Geng Bo, an intern at Microsoft Research Asia, won the conference's Best Student Paper Award.

Let's see how he recounts the award-winning experience in this article, "Plain as Water". The "plainness" of the title is both the calm that follows the award and a way of explaining advanced work in language simple enough for anyone to follow.

On October 28, 2009, I received an email from the CIKM 2009 organizing committee telling me that I had won the Best Student Paper Award (for "Ranking Model Adaptation for Domain-Specific Search").

The news was a pleasant surprise. After rereading the email several times, and as more and more friends gathered around to congratulate me, I convinced myself it was real. CIKM 2009 received 847 full-paper submissions, of which 123 were accepted for oral presentation, an acceptance rate of only 14.5%. The Best Student Paper is selected from all papers across the three research tracks: information retrieval, knowledge management, and databases. I wrote this paper during my internship in the Multimedia Computing Group at Microsoft Research Asia, and looking back on that period brings back many experiences and feelings.

Studying and Analyzing Traditional Methods

— — Rounded square inch deceit, Big Sky; homes lifted up asks, is Heaven Chi

Geng Bo, winner of the CIKM 2009 Best Student Paper Award

My current research interest is mainly ranking for image search: how to improve the ranking algorithms of image and video search engines so that the images users most want appear at the front of the returned results and irrelevant ones at the back.

However, given the limits of current computer vision, it is still very difficult to analyze and understand the content of an arbitrary image effectively. Put simply, it is hard for a computer to judge automatically and accurately whether a picture contains the information we need: whether it shows a car or an aircraft, whether someone is running or playing, whether it contains Obama or Jordan, and so on.

One current approach in computer vision research is to annotate part of the data first, then use machine learning to learn the relationship between an image's low-level features and high-level concepts, then use this relationship to predict the probability that a picture contains a given high-level concept, and finally rank the pictures by that probability.

For example, to search for a running boy, we first manually annotate a number of pictures that contain a running boy and a number that do not. We then train a detector to predict the probability that any given image contains "a running boy", which in turn helps us find images containing the search target.
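The annotate, train, predict, and rank pipeline described above can be sketched in a few lines. This is only an illustration under stated assumptions: the two-dimensional feature vectors, the image names, and the choice of logistic regression as the detector are all hypothetical, not details from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_detector(features, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression concept detector by batch gradient descent."""
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def concept_probability(model, x):
    """Predicted probability that an image with features x contains the concept."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def rank_images(model, images):
    """Return image names sorted by descending concept probability."""
    return sorted(images,
                  key=lambda name: concept_probability(model, images[name]),
                  reverse=True)

# Hypothetical low-level features for manually annotated pictures:
# label 1 = contains "a running boy", label 0 = does not.
labeled_features = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7],
                    [0.1, 0.2], [0.2, 0.1], [0.3, 0.3]]
labels = [1, 1, 1, 0, 0, 0]
detector = train_detector(labeled_features, labels)

# Unlabeled images to be ranked for the query.
candidates = {"img_a": [0.85, 0.9], "img_b": [0.15, 0.1], "img_c": [0.5, 0.6]}
ranking = rank_images(detector, candidates)
```

Note that one such detector covers exactly one concept, so every new query would need its own annotated set and its own training run.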

However, this approach is severely limited by the "semantic gap" between an image's low-level features (such as color, texture, and shape) and high-level concepts, so accurate detection remains very difficult.

Moreover, image search involves many thousands of query keywords, and users can combine keywords in different ways. It is not feasible to learn a high-level concept detector for every query, because that requires a great deal of manual labor to annotate samples as well as large amounts of computing resources and time to train the models. Worse, queries change constantly with time and context, so if every new query required annotating large amounts of data and building a corresponding detector, the annotation and training costs would be enormous. The scalability of this method is therefore very poor.

In light of these issues, today's practical web image search engines are based on the text associated with images.

In other words, they index and retrieve images using the image's title, the text surrounding the image, and the text of the web page, without considering the image content. Under this approach an image can be treated as a document composed of that text, so classic information retrieval methods can search images effectively. The approach is simple, efficient, and scalable, and the results it returns in practice are also more acceptable. Therefore, although much research attention is still focused on applying image content analysis to image search ranking, from a practical point of view we temporarily set aside image content information and turn to text information to improve the performance of image search ranking models.
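To make the "image as a text document" idea concrete, here is a minimal sketch. The file names, the example text, and the use of a plain TF-IDF score are assumptions chosen for illustration; production engines use far richer ranking functions.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def image_as_document(title, surrounding_text, page_text):
    """Represent an image by the text attached to it, ignoring its pixels."""
    return tokenize(title) + tokenize(surrounding_text) + tokenize(page_text)

def tfidf_rank(query, docs):
    """Return (name, score) pairs sorted by descending TF-IDF score."""
    n = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))
    scored = []
    for name, tokens in docs.items():
        term_freq = Counter(tokens)
        score = 0.0
        if tokens:
            for term in tokenize(query):
                if term in term_freq:
                    # Smoothed inverse document frequency.
                    idf = math.log((n + 1) / (doc_freq[term] + 1)) + 1.0
                    score += (term_freq[term] / len(tokens)) * idf
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical images, each described only by its associated text.
images = {
    "boy.jpg": image_as_document("boy running in park",
                                 "a little boy running on the grass",
                                 "family photos from the weekend"),
    "car.jpg": image_as_document("red sports car",
                                 "a car parked on the street",
                                 "auto show gallery"),
    "dog.jpg": image_as_document("dog in park",
                                 "a dog running after a ball",
                                 "pets and animals"),
}
results = tfidf_rank("boy running", images)
```

No annotation or per-query training is needed here, which is why this style of retrieval scales to arbitrary queries.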

Abstracting and Thinking About the Problem

– If you say the sound lies in the zither, why does it not sing when left in its case? If you say the sound lies in the fingers, why not listen for it at your fingertips?

The inspiration for this work came from my earlier study of "transfer learning". "Transfer" here means using data or models from other domains, whose data follow different distributions, together with data from the target domain to train a more robust model for the target domain. The problem this paper solves falls within the scope of transfer learning: when model parameters are learned, the training and test data are not distributed identically, so the model that is best on the training set is not optimal on the test set. Web page and document search data are distributed differently from image search data, so applying a web search model directly to image search does not give satisfactory results. Specifically, in page search the title is very important, whereas for images the surrounding text or tags may matter even more. Yet data from these different domains are also related. Web page search and image search today both rely on text, so they are similar but differ in emphasis. Using the information in an existing model, we can train a model for a new domain by combining that model with annotated data from the new domain. Because the existing ranking model supplies prior information, only a small amount of annotation is needed: once the existing model has been adapted to the new domain, a well-performing model can be built with minimal labeling in that domain.

From a statistical standpoint, the more training data there is, the better the model will be.

However, more training data requires more manual annotation and more time. By reusing an existing model, we reduce both the training time and the amount of annotation. The method has the following characteristics:

(1) Model-based adaptation: we need only the auxiliary-domain model, not its associated data.

We need only the existing ranking model itself, not any of the data it was trained on;

(2) Black-box adaptation: we need to know only the model's output, not how it works internally. Treating it as a "black box" reduces the dependence on any particular model;

(3) Less data is needed in the target domain, which lowers the cost of manual annotation.

That is, with the same amount of annotation, our algorithm achieves better ranking performance;

(4) Lower model training cost: training time decreases, because less data takes less time.

That is, the training time complexity of our algorithm depends only on the amount of annotated data in the target domain, which in domain adaptation is relatively small. These properties mean that, on the one hand, our algorithm places no demanding requirements on the existing ranking model's training data or on the internal details of the model itself, and on the other hand it reduces the annotation and training costs for the target domain, making our approach simpler and more robust.
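The four characteristics above can be illustrated with a small sketch. It is not the algorithm from the paper: it assumes a hypothetical adapted ranker of the form `delta * aux_model(x) + w·x`, where `aux_model` is called only as a black box and the correction weights `w` are fitted to a handful of target-domain preference pairs with a pairwise hinge loss. The feature layout (title match, surrounding-text match) and all numbers are invented for the example.

```python
def make_adapted_ranker(aux_model, pairs, dim,
                        delta=1.0, lam=0.01, lr=0.1, epochs=200):
    """Adapt a black-box ranker using a few (better, worse) feature pairs.

    aux_model: callable mapping a feature vector to a score; its training
               data and internals are never accessed.
    pairs:     target-domain annotations, each a (better_x, worse_x) tuple.
    """
    w = [0.0] * dim  # correction learned on the target domain only

    def score(x):
        return delta * aux_model(x) + sum(wi * xi for wi, xi in zip(w, x))

    for _ in range(epochs):
        for better, worse in pairs:
            margin = score(better) - score(worse)
            for i in range(dim):
                # L2 shrinkage keeps the adapted ranker close to the prior model.
                w[i] -= lr * lam * w[i]
                # Hinge-loss step when the pair is not separated by margin 1.
                if margin < 1.0:
                    w[i] += lr * (better[i] - worse[i])
    return score

# Hypothetical web-search model: weights the title match heavily.
def aux_web_model(x):
    title_match, surrounding_match = x
    return 2.0 * title_match + 0.2 * surrounding_match

# Three annotated image-search pairs where surrounding text matters more.
target_pairs = [
    ([0.2, 0.9], [0.8, 0.1]),
    ([0.1, 0.8], [0.9, 0.3]),
    ([0.3, 0.7], [0.6, 0.2]),
]
adapted = make_adapted_ranker(aux_web_model, target_pairs, dim=2)

# Unseen image-search candidates.
image_a = [0.2, 0.8]   # strong surrounding-text match
image_b = [0.9, 0.2]   # strong title match
```

In this toy setup the web model alone ranks `image_b` first, while the adapted ranker prefers `image_a`. The training loop touches only the annotated target pairs, so its cost grows with the number of annotations rather than with the size of the auxiliary domain.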

Experimental Results and Reflections Plain as Water

To test these ideas, we crawled close to a million web pages and images from the Internet and extracted many different features for in-depth analysis.

We adapted a well-trained web search ranking model to the image domain for image search ranking, and found that our algorithm can considerably improve image search ranking performance using only a small amount of annotated image data. In addition, we proposed a transferability measure to estimate how useful different existing ranking models are, and found that ranking models with greater transferability produce stronger ranking results after adaptation. This criterion can be used to choose effectively among existing models.
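The article does not say how the transferability measure is defined. As a stand-in, the sketch below uses pairwise agreement: the fraction of differently labeled target-domain pairs that a candidate model already orders correctly. The two candidate models and the small labeled set are hypothetical.

```python
from itertools import combinations

def pairwise_agreement(model, labeled_items):
    """Fraction of differently labeled pairs that `model` orders correctly.

    labeled_items: list of (features, relevance) from the target domain.
    """
    agree = 0
    total = 0
    for (xa, ra), (xb, rb) in combinations(labeled_items, 2):
        if ra == rb:
            continue  # equally relevant pairs carry no ordering information
        total += 1
        if (model(xa) - model(xb)) * (ra - rb) > 0:
            agree += 1
    return agree / total if total else 0.0

def pick_most_transferable(models, labeled_items):
    """Choose the candidate model with the highest agreement score."""
    return max(models,
               key=lambda name: pairwise_agreement(models[name], labeled_items))

# Hypothetical existing ranking models over [title_match, surrounding_match].
candidate_models = {
    "title_heavy": lambda x: 2.0 * x[0] + 0.2 * x[1],
    "text_heavy": lambda x: 0.2 * x[0] + 2.0 * x[1],
}

# A few annotated image-search results with graded relevance (2 = best).
labeled_images = [
    ([0.2, 0.9], 2),
    ([0.8, 0.1], 0),
    ([0.5, 0.6], 1),
    ([0.9, 0.3], 0),
]
best_model = pick_most_transferable(candidate_models, labeled_images)
```

A score like this can be computed from a small annotated sample before any adaptation is run, which is what makes it usable as a selection criterion.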

Thanks to sound thinking and a solid experimental procedure, the CIKM reviewers rated our method highly during the review process.

Two of the three reviewers gave it high marks, judging our ideas innovative and effective and our experiments thorough and informative. As the reviewers said, the success of this paper lies mainly in its abstraction and in-depth analysis of the background, the goal, and the inherent contradictions of the problem, in proposing a simple and proven method, and in detailed, informative experiments. These reasons may sound as plain as water, familiar to everyone and with nothing mysterious about them, yet success comes from exactly this.

Looking back on the process of writing the paper, my mentors Yang Jun and Hua Xiansheng accompanied me all the way with careful guidance, and I truly gained a great deal.

From the first idea in December 2008 until April 2009, we met regularly to discuss it. After nearly half a year of working day and night, the paper finally took shape and was submitted to CIKM.

If you ask about other feelings: after finishing a paper, some people catch up on sleep and some celebrate.

For me, finishing a paper brings no special feelings. Slightly grateful and fulfilled, I laugh, then wash up and go to sleep. Life is just taking the next step, just doing the things at hand, just solving new problems and writing new papers after each one. Scientific research is simply "plain as water".

———————————————————————————————————————————————————

About the author:

Geng Bo is an intern in the Multimedia Computing Group at Microsoft Research Asia. He is currently a doctoral student in the vision group of the Department of Intelligence Science at Peking University, and he graduated from the Department of Computer Science and Technology at Fudan University.

He won the CIKM 2009 Best Student Paper Award, as well as the 2010 Academician Shi Qingyun Excellent Paper Award in Beijing.

Point of View Science

Friday, November 19, 2010
