Deep Learning

Lessons from the Trenches on Reproducible Evaluation of Language Models

Effective evaluation of language models remains an open challenge in NLP. Researchers and engineers face methodological issues such as the sensitivity of models to evaluation …

ArXiv
Stella Biderman, Hailey Schoelkopf, Lintang Sutawika, Leo Gao, Jonathan Tow, Baber Abbasi, Alham Fikri Aji, Pawan Sasanka Ammanamanchi, Sidney Black, Jordan Clive, Anthony DiPofi, Julen Etxaniz, Benjamin Fattori, Jessica Zosa Forde, Charles Foster, Jeffrey Hsu, Mimansa Jaiswal, Wilson Y. Lee, Haonan Li, Charles Lovering, Niklas Muennighoff, Ellie Pavlick, Jason Phang, Aviya Skowron, Samson Tan, Xiangru Tang, Kevin A. Wang, Genta Indra Winata, François Yvon, Andy Zou

IKER-GAITU: research on language technology for Basque and other low-resource languages

The general objective of the IKER-GAITU project is to conduct research on language technology to increase the presence of Basque in the digital environment. It will be carried out between …

PROJECTS & DEMOS SEPLN - CEDI 2024
Eneko Agirre, Itziar Aldabe, Xabier Arregi, Mikel Artetxe, Unai Atutxa, Ekhi Azurmendi, Iker De la Iglesia, Julen Etxaniz, Victor García-Romillo, Inma Hernaez-Rioja, and others

XNLIeu: a dataset for cross-lingual NLI in Basque

XNLI is a popular Natural Language Inference (NLI) benchmark widely used to evaluate cross-lingual Natural Language Understanding (NLU) capabilities across languages. In this …

NAACL 2024
Maite Heredia, Julen Etxaniz, Muitze Zulaika, Xabier Saralegi, Jeremy Barnes, Aitor Soroa

Latxa: An Open Language Model and Evaluation Suite for Basque

We introduce Latxa, a family of large language models for Basque ranging from 7 to 70 billion parameters. Latxa is based on Llama 2, which we continue pretraining on a new Basque …

ACL 2024
Julen Etxaniz, Oscar Sainz, Naiara Perez, Itziar Aldabe, German Rigau, Eneko Agirre, Aitor Ormazabal, Mikel Artetxe, Aitor Soroa

NLP Evaluation in trouble: On the Need to Measure LLM Data Contamination for each Benchmark

In this position paper, we argue that the classical evaluation on Natural Language Processing (NLP) tasks using annotated benchmarks is in trouble. The worst kind of data …

EMNLP 2023 Findings
Oscar Sainz, Jon Ander Campos, Iker García-Ferrero, Julen Etxaniz, Oier Lopez de Lacalle, Eneko Agirre

Do Multilingual Language Models Think Better in English?

Translate-test is a popular technique to improve the performance of multilingual language models. This approach works by translating the input into English using an external …

NAACL 2024
Julen Etxaniz, Gorka Azkune, Aitor Soroa, Oier Lopez de Lacalle, Mikel Artetxe

Grounding Language Models for Compositional and Spatial Reasoning

Humans can learn to understand and process the distribution of space, and one of the initial tasks of Artificial Intelligence has been to show machines the relationships between …

ADDI
Julen Etxaniz, Oier Lopez de Lacalle, Aitor Soroa

Image Caption Generation

An automatic image caption generation model that uses a CNN to condition an LSTM-based language model.