Random walk term weighting for information retrieval

Christina Lioma, Roi Blanco

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › Peer-reviewed

22 Citations (Scopus)

Abstract

We present a way of estimating term weights for Information Retrieval (IR), using term co-occurrence as a measure of dependency between terms. We use the random walk graph-based ranking algorithm on a graph that encodes terms and co-occurrence dependencies in text, from which we derive term weights that quantify how a term contributes to its context. Evaluation on two TREC collections and 350 topics shows that the random walk-based term weights perform at least comparably to the traditional tf-idf term weighting, and outperform it when the distance between co-occurring terms is between 6 and 30 terms.
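
The abstract does not give the graph construction or the update formula, so the following is only a minimal sketch of the general idea: link terms that co-occur within a fixed distance, then run a PageRank-style random walk over that graph and read the stationary scores as term weights. The function names, the undirected unweighted edges, the damping factor of 0.85, and the fixed iteration count are all assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=6):
    """Link terms that occur at most `window` positions apart (assumed undirected)."""
    neighbours = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:  # ignore self-loops from repeated terms
                neighbours[term].add(other)
                neighbours[other].add(term)
    return neighbours


def random_walk_weights(neighbours, damping=0.85, iterations=50):
    """PageRank-style scores on the term graph; damping value is an assumption."""
    terms = list(neighbours)
    score = {t: 1.0 for t in terms}
    for _ in range(iterations):
        updated = {}
        for t in terms:
            # Each neighbour passes on its score, split evenly over its own edges.
            incoming = sum(score[n] / len(neighbours[n]) for n in neighbours[t])
            updated[t] = (1 - damping) + damping * incoming
        score = updated
    return score


if __name__ == "__main__":
    text = "random walk term weights use term co-occurrence in text".split()
    weights = random_walk_weights(cooccurrence_graph(text, window=6))
    for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        print(f"{term}\t{weight:.3f}")
```

In this sketch the `window` argument plays the role of the co-occurrence distance mentioned in the abstract; terms with no co-occurring neighbour never enter the graph and so receive no weight. How such weights would be combined with a retrieval model is left open here.
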
Original language: English
Title: SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Publisher: Association for Computing Machinery
Publication date: 2007
Pages: 829-830
Status: Published - 2007
Published externally: Yes

Bibliographical note

Copyright is held by the author/owner(s).
SIGIR’07, July 23–27, 2007, Amsterdam, The Netherlands.
ACM 978-1-59593-597-7/07/0007.