Precisely because people have gradually realized how much ranking matters to the effectiveness of a search engine, the academic community's appreciation of ranking has risen quickly over the last few years. Researchers want to study ranking formally as an academic problem and eventually build a complete theoretical framework for it.
It is fair to say that ranking has become a new branch of machine learning. Classification, regression, and clustering have each been studied very thoroughly, and the same questions now arise for ranking. How exactly should items be ranked? Which features are available, and is there a body of theory behind them? How can a ranking model be built automatically through machine learning?
Microsoft Research Asia (MSRA) began its research on learning to rank (Learning to Rank) about three years ago.
In 2007, an MSRA paper titled "Learning to Rank: From Pairwise Approach to Listwise Approach" drew a strong response across the academic community. Dr. Hang Li, a researcher dedicated to this topic, sees a connection with the era of information explosion. The amount of information in front of users is too large, so "for many things people hope to have an ordering. Search is the most typical example: it helps them reach the information they want most."
"In fact, ranking is a representation of relationships, unlike classification or regression, which deal with an object or an object's properties," Dr. Tie-Yan Liu, who works on the topic together with Dr. Hang Li, told reporters. "Before, given a web page, the question was whether it is about news or about sports. That is an absolute matter: once you have the page, everything is known, because it is a property of the page itself. But ranking is a relationship that comes from comparing one web page with another. If the earlier setting is called unary learning, then what we have now is learning of a higher order."
(Dr. Hang Li, who leads the ranking research)
Dr. Li believes this poses a very big challenge to traditional machine learning as a whole.
In the traditional view there are some basic assumptions, for example that a single underlying rule governs every sample. But for ranking, "what we actually want to mine is the relationship between objects. It cannot be treated under the earlier assumptions, which at least in some cases do not fully hold, so new theory and new practice will emerge."
In contrast to the earlier pairwise approach, the new listwise method from Dr. Li and Dr. Liu learns from lists, taking a whole list as the basic learning unit. "Because a list itself contains documents arranged in some order, certain relationships are already embedded in that representation," Dr. Liu said. "So we do not need to assume, as previous studies did, relative-order relations between documents. Those are already inside our learning unit. This makes the theory and practice built on it go more smoothly, and it is quite different from what came before."
The listwise approach has drawn attention for several reasons. When judging whether a ranking is good or bad, it considers all the documents for a query as a whole, whereas earlier work looked only at a single document or a pair of documents. It can also model relationships between documents, such as similarity, so the ranking function can be defined more effectively. In addition, because it works at the list level, it can make full use of each document's position in the list and place more emphasis on the documents at the top, which is more consistent with the user experience.
Returning to search, Dr. Li said that research on ranking is mostly about learning algorithms and ranking models.
In web search, for example, the importance of a page is an important feature, but relevance must be considered as well. Importance and relevance alone may still not be enough, so other factors also come into play. Web search has reached the point where people realize that a great many factors affect ranking. Treating these factors as features and using some method to combine them into the most reasonable ranking is the problem Learning to Rank sets out to solve.
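As a rough illustration of what "combining factors as features" means, the sketch below scores each page with a weighted sum of its features and sorts by that score. Everything in it is hypothetical: the feature names, the feature values, and the hand-set weights are invented for the example. In learning to rank, the weights (or a more complex model) would be learned from data rather than chosen by hand.

```python
def score(features, weights):
    """Weighted sum of feature values; features without a weight count as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rank(documents, weights):
    """Return documents sorted from highest to lowest score."""
    return sorted(
        documents,
        key=lambda doc: score(doc["features"], weights),
        reverse=True,
    )


# Hypothetical pages with two made-up features in [0, 1].
documents = [
    {"id": "a", "features": {"importance": 0.9, "relevance": 0.2}},
    {"id": "b", "features": {"importance": 0.4, "relevance": 0.8}},
    {"id": "c", "features": {"importance": 0.6, "relevance": 0.5}},
]

# Hand-set weights, standing in for what a learning algorithm would fit.
weights = {"importance": 0.3, "relevance": 0.7}

ranked_ids = [doc["id"] for doc in rank(documents, weights)]
```

With these weights, page `b` ranks first even though page `a` has the highest importance, which shows why a single feature is not enough on its own.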