CLEF HIPE-2026: Evaluating Accurate and Efficient Person-Place Relation Extraction from Multilingual Historical Texts

Brave API

CLEF HIPE-2026 is an evaluation lab dedicated to advancing the extraction of person–place relations from noisy, multilingual historical texts, focusing on two relation types: at ("Has the person ever been at this place?") and isAt ("Is the person located at this place around publication time?"). The task requires systems to perform semantic relation extraction by leveraging temporal reasoning, geographical inference, and interpretation of sparse or indirect contextual cues in historical documents. This initiative extends the HIPE series from prior work on named entity recognition and linking (HIPE-2020 and HIPE-2022) into the domain of relation extraction, supporting applications in knowledge-graph construction, historical biography reconstruction, and spatial analysis in digital humanities.

The evaluation framework comprises three profiles that assess accuracy, computational efficiency, and domain generalization. The accuracy profile measures system performance using macro-averaged recall across relation labels, so that rare labels count as much as frequent ones despite class imbalance. The efficiency profile rewards lightweight and scalable models by taking model size and compute cost into account, in line with the growing interest in sustainable NLP. The generalization profile evaluates model robustness on unseen data from a different domain, testing adaptability beyond historical newspapers.
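To make the accuracy metric concrete, the following is a minimal sketch of macro-averaged recall, not the official HIPE-2026 scorer. The label names `"yes"` and `"no"` and the function name `macro_recall` are placeholders chosen for illustration; the task's actual label inventory and scoring script may differ.

```python
from collections import defaultdict


def macro_recall(gold, pred):
    """Unweighted mean of per-label recall over the labels that occur in gold."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    total = defaultdict(int)    # number of gold instances per label
    correct = defaultdict(int)  # number of correctly predicted instances per label
    for g, p in zip(gold, pred):
        total[g] += 1
        if g == p:
            correct[g] += 1
    if not total:
        raise ValueError("no gold labels given")
    return sum(correct[label] / total[label] for label in total) / len(total)


# Hypothetical, imbalanced toy data: four negative pairs and one positive pair.
gold = ["no", "no", "no", "no", "yes"]
always_no = ["no"] * 5

# A system that always answers "no" is right on 4 of 5 pairs (accuracy 0.8),
# but its recall is 1.0 on "no" and 0.0 on "yes", so macro recall is 0.5.
print(macro_recall(gold, always_no))
```

The toy data shows why a macro average suits imbalanced labels: a majority-class baseline cannot score well without also recovering the rare label.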

Training and test data are drawn from historical newspapers in four languages (English, German, French, and Luxembourgish), enabling multilingual evaluation. Entity pairs are provided, and systems must classify each pair within its document context, a setup designed to be accessible to both large language models (LLMs) and traditional classification approaches. Early results indicate that models such as GPT-4o agree reasonably well with human judgments (Cohen’s kappa of up to 0.8 for at), though performance on isAt is more variable and inference costs are high, underscoring the need for efficient methods.
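For readers unfamiliar with the agreement statistic quoted above, here is a small sketch of Cohen's kappa for two annotators, such as a human and a model, labeling the same entity pairs. The labels and the toy annotations are invented for illustration and are not taken from the HIPE-2026 data.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability that both annotators pick the same label
    # if each labels independently according to their own label frequencies.
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[l] * counts_b[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical judgments on four person-place pairs.
human = ["yes", "yes", "no", "no"]
model = ["yes", "no", "no", "no"]

# Observed agreement is 0.75 and chance agreement is 0.5, giving kappa = 0.5.
print(cohen_kappa(human, model))
```

Kappa is lower than raw agreement here because half of the matching labels would be expected by chance alone.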

The evaluation period is scheduled for May 5–7, 2026, with results and system descriptions to be presented at the CLEF 2026 conference in Jena, Germany, on September 21–24, 2026. Registration remains open until April 23, 2026, and the full training data was released on January 19, 2026. By addressing multilinguality, noise, and domain shift, HIPE-2026 establishes a new benchmark for robust and sustainable NLP in low-resource historical contexts.

Generated Feb 22, 2026

Cerebras Thinking

This material introduces the HIPE-2026 shared task, a CLEF initiative designed to benchmark the performance of Natural Language Processing (NLP) systems in extracting semantic relations between persons and places from historical texts. Specifically, the task targets the identification of at and isAt relations, which determine whether a person is located at or associated with a specific geographic entity. The evaluation is conducted on noisy, multilingual datasets derived from historical newspapers and archives, presenting significant technical challenges due to Optical Character Recognition (OCR) errors, archaic vocabulary, and diverse linguistic structures.

The key contribution of HIPE-2026 is a standardized framework for assessing both the accuracy and the computational efficiency of relation extraction models in a low-resource, high-noise domain. By moving beyond Named Entity Recognition (NER) to relation extraction, the task provides insight into how well modern architectures, such as Large Language Models (LLMs) and specialized transformers, can generate structured knowledge from unstructured digitized heritage material. This work matters to the digital humanities community because it enables the construction of richer knowledge graphs and improves semantic search within large historical collections.

Generated Mar 12, 2026

Open-Weights Reasoning

The HIPE-2026 shared task, part of the CLEF Conference, focuses on evaluating person-place relation extraction (specifically the at/isAt relations) from noisy multilingual historical texts. This task addresses the challenge of extracting structured knowledge from unstructured historical documents, which often suffer from OCR errors, archaic language, and inconsistent formatting. The evaluation framework assesses both accuracy (precision/recall) and efficiency (computational cost), making it relevant for researchers working on information extraction from heritage collections.

Key contributions of this work include:

- A benchmark dataset of multilingual historical texts annotated with person-place relations, enabling comparative evaluation of state-of-the-art models.
- A holistic evaluation metric that balances performance (relation extraction quality) and efficiency (inference speed), addressing practical deployment constraints.
- Insights into model robustness against noise, including OCR artifacts and linguistic variability across languages.
- Encouragement of low-resource and multilingual NLP approaches, aligning with the growing need for accessible digital humanities tools.

This task matters because historical text analysis is a critical application for digital libraries, genealogy research, and cultural heritage preservation. Traditional NLP models often fail on historical data due to domain shift, making this benchmark essential for advancing domain-adaptive relation extraction. The focus on efficiency also ensures scalability for large-scale digitized archives, bridging the gap between research and real-world deployment.

For more details, see the [arXiv paper](https://arxiv.org/abs/2602.17663).

Generated Mar 12, 2026