Published October 21, 2025 | Version v3
Dataset Open

LongEval 2024 Test Collection

Description

The collection consists of queries and documents provided by the Qwant search engine (https://www.qwant.com). The queries, which were issued by Qwant users, are based on selected trending topics. The documents in the collection were selected with respect to these queries using the Qwant click model. In addition to the documents selected by this model, the collection contains documents selected at random from the Qwant index. All data was collected in June 2023 and August 2023. In total, the collection contains 1,925 test queries. The document set consists of 4,321,642 downloaded, cleaned and filtered web pages. Alongside the original French versions, the collection contains English translations of the web pages and queries. The collection serves as the official test collection for the 2024 LongEval Information Retrieval Lab (https://clef-longeval.github.io/) organised at CLEF.

The data is released under the Qwant LongEval Attribution-NonCommercial-ShareAlike License.

This version includes the topics (questions) used in the LongEval 2024 Lab, together with their relevance judgments (qrels).
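As a rough illustration of how the qrels might be consumed, the sketch below parses relevance judgments into a nested dictionary. It assumes the common TREC-style layout of four whitespace-separated fields per line (query ID, iteration, document ID, relevance); this record does not state the file format, so check it against the downloaded files first. The function name and the sample IDs are hypothetical.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Parse TREC-style qrels lines into {query_id: {doc_id: relevance}}.

    Assumed line layout (not confirmed by this record):
        <query_id> <iteration> <doc_id> <relevance>
    Blank lines are skipped; malformed lines raise ValueError.
    """
    qrels = defaultdict(dict)
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"line {line_number}: expected 4 fields, got {len(fields)}")
        query_id, _iteration, doc_id, relevance = fields
        qrels[query_id][doc_id] = int(relevance)
    return dict(qrels)


# Hypothetical sample lines, for illustration only.
sample = [
    "q1 0 doc10 1",
    "q1 0 doc11 0",
    "q2 0 doc20 2",
]
judgments = parse_qrels(sample)
```

With a real file, the same function can be fed an open file handle, since it only iterates over lines.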

Files

_Qwant_License.txt

Files (21.3 GiB)

Size        MD5 checksum
16.6 KiB    d3585cb497e1bb57841fa1e940f5b49a
41.5 KiB    be883184f9373264648b4184afe9a6de
21.3 GiB    ec95b30ba6bc11fae67a03e8f63c7644
1.5 MiB     46fdb7a0d18b9c70e8db07219be7239d
88.8 KiB    fb9fd5aa3681f6fb1e6a1234a98a35dc
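The MD5 checksums listed above can be used to confirm that a download completed without corruption, which matters most for the 21.3 GiB archive. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and the file is read in chunks so that large files do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with a listed checksum.

    Accepts the checksum with or without a leading "md5:" prefix,
    in upper or lower case.
    """
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_checksum(local_path, "md5:ec95b30ba6bc11fae67a03e8f63c7644")` should return True for an intact copy of the 21.3 GiB file, where `local_path` is wherever that file was saved.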

Additional details

Related works

Is required by
Book: 10.1007/978-3-031-71908-0_10 (DOI)
Is version of
Dataset: 10.48436/wm79f-88x06 (DOI)

Funding

FWF Austrian Science Fund
Kodicare I4471-N