Abstract
Influenza-like illness (ILI) estimation from web search data is an important
web analytics task. The basic idea is to use the frequencies of
queries in web search logs that are correlated with past ILI activity
as features when estimating current ILI activity. It has been noted
that since influenza is seasonal, this approach can lead to spurious
correlations with features/queries that also exhibit seasonality, but
have no relationship with ILI. Spurious correlations can, in turn, degrade
performance. To address this issue, we propose modeling the
seasonal variation in ILI activity and selecting queries that are correlated
with the residual of the seasonal model and the observed ILI
signal. Experimental results show that re-ranking queries obtained
by Google Correlate based on their correlation with the residual
strongly favours ILI-related queries.
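
The re-ranking idea described above can be illustrated with a minimal sketch. The abstract does not say which seasonal model is used, so this example assumes a simple harmonic regression (a constant plus one annual sine/cosine pair) fitted by least squares. The function names `fit_seasonal` and `rank_by_residual`, the 52-week period, and the synthetic series are all hypothetical and serve only as illustration; they are not the authors' implementation or data.

```python
import numpy as np

def fit_seasonal(ili, period=52.0):
    """Least-squares fit of a constant plus one sine/cosine pair.

    Assumption: a single annual harmonic stands in for the seasonal model.
    """
    t = np.arange(len(ili))
    design = np.column_stack([
        np.ones(len(ili)),
        np.sin(2 * np.pi * t / period),
        np.cos(2 * np.pi * t / period),
    ])
    coef, *_ = np.linalg.lstsq(design, ili, rcond=None)
    return design @ coef

def rank_by_residual(ili, query_freqs, period=52.0):
    """Rank queries by Pearson correlation with the seasonal-model residual."""
    residual = ili - fit_seasonal(ili, period)
    scores = {
        name: float(np.corrcoef(freq, residual)[0, 1])
        for name, freq in query_freqs.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked, scores

# Synthetic weekly data (hypothetical): three years of a seasonal baseline
# plus one off-season outbreak that only the ILI-related query tracks.
rng = np.random.default_rng(0)
t = np.arange(156)
seasonal = 2.0 + np.cos(2 * np.pi * t / 52.0)
outbreak = 1.5 * np.exp(-0.5 * ((t - 60) / 3.0) ** 2)
ili = seasonal + outbreak

query_freqs = {
    "flu_query": ili + rng.normal(0.0, 0.05, size=t.size),
    "seasonal_query": seasonal + rng.normal(0.0, 0.05, size=t.size),
}

ranked, scores = rank_by_residual(ili, query_freqs)
print(ranked)
print(scores)
```

In this toy setting both queries correlate strongly with the raw ILI signal because both share the annual cycle. Only the query that follows the outbreak retains a clear correlation with the residual, which is the behaviour the re-ranking relies on.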
Original language | English |
---|---|
Title | SIGIR '17 Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval |
Publisher | Association for Computing Machinery |
Publication date | 2017 |
Pages | 1197-1200 |
ISBN (Electronic) | 978-1-4503-5022 |
DOI | |
Status | Published - 2017 |
Event | 40th International ACM SIGIR Conference on Research and Development in Information Retrieval: SIGIR '17 - Shinjuku, Tokyo, Japan Duration: 7 Aug 2017 → 11 Aug 2017 |
Conference
Conference | 40th International ACM SIGIR Conference on Research and Development in Information Retrieval |
---|---|
Country/Territory | Japan |
City | Shinjuku, Tokyo |
Period | 07/08/2017 → 11/08/2017 |