CWALM @ Study Day on the Management of Noise in Computer Processing of Linguistic Corpora

The PRIN project CWALM – A lexical corpus-based model of Contemporary Written Arabic was introduced at the Study Day “Background Noise or Added Value? Managing Noise in Computer Processing of Linguistic Corpora” (Grenoble, 28 April 2023). It was presented in an invited talk entitled “Producing Tools for Studying Under-researched Languages: the Case of Contemporary Written Arabic”, given by Ouafae Nahli (CNR-ILC CoPhiLab | CWALM Research Unit 2 Leader).

CWALM, co-funded by the Italian Ministry of University and Research within the PRIN 2020 Funding Programme, brings together Roma Tre University, the Institute for Computational Linguistics “Antonio Zampolli” of the National Research Council of Italy and the Free University of Languages and Communication IULM.

The Study Day, organized by the Université Grenoble Alpes and the University of Rome “La Sapienza”, was aimed at PhD students, young researchers and post-docs as well as experienced researchers. Its main goal was to understand to what extent noise can be a source of information, particularly during the corpus annotation stage. It was an occasion to reflect on noise management methods in Natural Language Processing and Corpus Linguistics, on the effect of noise on the quality of linguistic data, and on its potential impact in the phases of data collection, recording and annotation.