SinNer@CLEF-HIPE2020: Sinful Adaptation of SotA models for Named Entity Recognition in Historical French and German Newspapers


In this article, we present the approaches developed by the Sorbonne-INRIA for NER (SinNer) team for the CLEF-HIPE 2020 challenge on Named Entity Processing on old newspapers. The challenge proposed various tasks for three languages; among them, we focused on coarse-grained Named Entity Recognition in French and German texts. The best system we proposed ranked third for these two languages. It uses FastText embeddings and ELMo language models (FrELMo and German ELMo). We combine several word representations in order to enhance the quality of the results for all NE types. We show that reconstruction of sentence segments has an important impact on the results.

In Conference and Labs of the Evaluation Forum - Identifying Historical People, Places and other Entities - 2020
Pedro Ortiz Suarez
Senior Research Scientist

I’m a Senior Research Scientist at the Common Crawl Foundation.