Published July 21, 2025 | Version v1
Dataset Open

The Virtual Patent (VP-WPI) Test Collection

  • 1. International Hellenic University
  • 2. TU Wien

Description

The VP-WPI Test Collection is a novel dataset that implements the Virtual Patent (VP) concept. A Virtual Patent is a synthesized document that represents a single patent, created by merging the most up-to-date information from its various publication stages (e.g., kind codes A1, A2, B1, B2). 

Specifically, VP-WPI is a specialized vertical of the WPI+ resource. It offers a unified, non-redundant view of patents by aggregating all relevant documents from the WPI test collection at the kind-code level into unified VP documents.

This collection serves as an abstraction layer over WPI, designed to:

  • Simplify analysis by reducing document redundancy.
  • Enhance data consistency by providing a single source of truth.
  • Preserve traceability with links back to all original source documents.
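
The merging idea described above can be illustrated with a minimal Python sketch. It is not the procedure used to build VP-WPI (the technical note documents that); the stage ordering in KIND_ORDER, the identifier format, the section names, and the rule "a later stage overwrites an earlier one unless its section is empty" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# ASSUMPTION: a simple earliest-to-latest ranking of kind codes.
# The ordering actually used for VP-WPI may differ.
KIND_ORDER = ["A1", "A2", "B1", "B2"]


@dataclass
class Publication:
    patent_id: str   # hypothetical identifier, e.g. "EP-0000001"
    kind_code: str   # publication stage, e.g. "A1" or "B1"
    sections: dict   # section name -> text (hypothetical section names)


@dataclass
class VirtualPatent:
    patent_id: str
    sections: dict = field(default_factory=dict)
    provenance: dict = field(default_factory=dict)  # section -> kind code it came from
    sources: list = field(default_factory=list)     # links back to every source document


def build_virtual_patents(publications):
    """Group publications by patent and keep the latest non-empty text per section."""
    virtual = {}
    # Visit earlier stages first so that later stages overwrite them.
    for pub in sorted(publications, key=lambda p: KIND_ORDER.index(p.kind_code)):
        vp = virtual.setdefault(pub.patent_id, VirtualPatent(pub.patent_id))
        vp.sources.append(f"{pub.patent_id}-{pub.kind_code}")
        for name, text in pub.sections.items():
            if text:  # an empty section in a later stage keeps the older text
                vp.sections[name] = text
                vp.provenance[name] = pub.kind_code
    return virtual


publications = [
    Publication("EP-0000001", "B1", {"claims": "granted claims"}),
    Publication("EP-0000001", "A1", {"abstract": "an abstract", "claims": "filed claims"}),
]
vp = build_virtual_patents(publications)["EP-0000001"]
print(vp.sections, vp.provenance, vp.sources)
```

In this sketch the two source documents collapse into one record (less redundancy), each section has exactly one value (a single source of truth), and the provenance and sources attributes keep the link back to the original publications (traceability).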

Further Information

For full technical details, including collection statistics, data specifications, and the creation process, please refer to:

Resources: 

Files

Technical Note_ The Virtual Patent (VP-WPI) Test Collection.pdf

Files (44.8 GiB)

Name                                    Size
md5:c38929d671eb92273a45938d689a3f76    4.7 KiB
md5:25fe6ad2e2386b30a959095480999a1e    181.2 KiB
md5:975ecf8c84cb5e8a754e8805d74aa20c    8.9 GiB
md5:5766b1e7e3cfcb0ddab8dee55ccc9ca1    4.5 GiB
md5:c7dc2082ca455c554aef412f50ee0190    8.7 GiB
md5:838d48d37e8df19ea1b4e8fac10a6dda    3.7 GiB
md5:532572fb1dcae1734fed6d54cf57e500    14.8 GiB
md5:d69f4febcb7f61bf77f47e6a3a63f7d7    4.2 GiB
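
The md5 values in the file listing can be used to check that a download completed intact. The following is a minimal sketch using only Python's standard hashlib module; the function names and the idea of passing the listed value as a string are assumptions for illustration, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading keeps memory use flat even for multi-GiB archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a local file against a listed value such as 'md5:c389...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listing(local_path, "md5:975ecf8c84cb5e8a754e8805d74aa20c") would return True only if the file at local_path (a hypothetical path) hashes to the value listed for the 8.9 GiB file.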

Additional details

Related works

Has part
Software: https://github.com/cs1msa/WPIplus/ (URL)
Is derived from
Thesis: https://repository.ihu.gr/handle/11544/47881 (Handle)
Is described by
Journal Article: 10.1016/j.wpi.2025.102389 (DOI)
Is supplement to
Dataset: 10.5281/zenodo.1489994 (DOI)
Journal Article: 10.1016/j.wpi.2019.02.002 (DOI)