
dc.contributor.author: Dincer, B. Taner
dc.contributor.author: Macdonald, Craig
dc.contributor.author: Ounis, Iadh
dc.date.accessioned: 2020-11-20T16:19:00Z
dc.date.available: 2020-11-20T16:19:00Z
dc.date.issued: 2014
dc.identifier.isbn: 978-1-4503-2259-1
dc.identifier.uri: https://doi.org/10.1145/2600428.2609625
dc.identifier.uri: https://hdl.handle.net/20.500.12809/3678
dc.description: 37th Annual International ACM Special Interest Group on Information Retrieval Conference on Research and Development in Information Retrieval - JUL 06-11, 2014 - Gold Coast, AUSTRALIA (en_US)
dc.description: Dincer, Bekir Taner/0000-0002-0660-7239; Ounis, Iadh/0000-0003-4701-3223 (en_US)
dc.description: WOS: 000450717600003 (en_US)
dc.description.abstract: The aim of risk-sensitive evaluation is to measure when a given information retrieval (IR) system does not perform worse than a corresponding baseline system for any topic. This paper argues that risk-sensitive evaluation is akin to the underlying methodology of the Student's t test for matched pairs. Hence, we introduce a risk-reward tradeoff measure, T-Risk, that generalises the existing U-Risk measure (as used in the TREC 2013 Web track's risk-sensitive task) while being theoretically grounded in statistical hypothesis testing and easily interpretable. In particular, we show that T-Risk is a linear transformation of the t statistic, which is the test statistic used in the Student's t test. This inherent relationship between T-Risk and the t statistic turns risk-sensitive evaluation from a descriptive analysis into a fully-fledged inferential analysis. Specifically, we demonstrate, using past TREC data, that by using the inferential analysis techniques introduced in this paper, we can (1) decide whether an observed level of risk for an IR system is statistically significant, and thereby infer whether the system exhibits a real risk, and (2) determine the topics that individually lead to a significant level of risk. Indeed, we show that the latter permits a state-of-the-art learning to rank algorithm (LambdaMART) to focus on those topics in order to learn effective yet risk-averse ranking systems. (en_US)
dc.description.sponsorship: ACM Special Interest Grp Informat Retrieval, Baidu, Google, Microsoft Res, Tourism & Events Queensland, eBay, Huawei, Seznam cz, Facebook, IBM, Pivotal, Yahoo, Labs, Yandex, Queensland Univ Technol, RMIT Univ, Univ Melbourne, Univ Otago (en_US)
dc.item-language.iso: eng (en_US)
dc.publisher: Assoc Computing Machinery (en_US)
dc.item-rights: info:eu-repo/semantics/openAccess (en_US)
dc.subject: Risk-Sensitive Evaluation (en_US)
dc.subject: Student's T Test (en_US)
dc.title: Hypothesis Testing for the Risk-Sensitive Evaluation of Retrieval Systems (en_US)
dc.item-type: conferenceObject (en_US)
dc.contributor.department: (en_US)
dc.contributor.departmentTemp: [Dincer, B. Taner] Mugla Univ, Dept Stat & Comp Engn, Mugla, Turkey -- [Macdonald, Craig; Ounis, Iadh] Univ Glasgow, Sch Comp Sci, Glasgow, Lanark, Scotland (en_US)
dc.identifier.doi: 10.1145/2600428.2609625
dc.identifier.startpage: 23 (en_US)
dc.identifier.endpage: 32 (en_US)
dc.relation.journal: SIGIR'14: Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval (en_US)
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member (en_US)


Files in this item:

There are no files associated with this item.
