Natural Language Processing
Large Language Models
Deep Learning
Evaluation
Commonsense Reasoning
Italian
GITA4CALAMITA - Evaluating the Physical Commonsense Understanding of Italian LLMs in a Multi-layered Approach: A CALAMITA Challenge
In the context of the CALAMITA Challenge, we investigate the physical commonsense reasoning capabilities of large language models (LLMs) and introduce a methodology to assess their …
Giulia Pensa
Ekhi Azurmendi
Julen Etxaniz
Begoña Altuna
Itziar Gonzalez-Dios