Pedro Ortiz Suarez
Recent & Upcoming Talks
Des Méthodes de TAL modernes pour l'Enrichissement de Documents (Modern NLP Methods for Document Enrichment)
We present a pipeline for processing and enriching documents, based on the latest neural learning methods.
Pedro Ortiz Suarez
Sep 22, 2020
Slides
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
We explore the impact of the training corpus on contextualized word embeddings in five mid-resource languages.
Pedro Ortiz Suarez, Laurent Romary, Benoît Sagot
Jul 6, 2020
Slides
Video
Asynchronous Pipeline for Processing Huge Corpora on Medium to Low Resource Infrastructures
We propose a new pipeline to filter, clean and classify Common Crawl by language. We publish the final corpus under the name OSCAR.
Pedro Ortiz Suarez, Benoît Sagot, Laurent Romary
Jul 22, 2019
PDF
Code
Slides
Preparing the Dictionnaire Universel for Automatic Enrichment
A talk about automatic enrichment of dictionaries.
Pedro Ortiz Suarez, Laurent Romary, Benoît Sagot
Jun 13, 2019
Slides
Reducing computation time by months by rewriting Bash scripts in Go
Pedro Ortiz Suarez
Mar 24, 2019
Code
Slides