A metric for robust model evaluation through text perturbations
Smirnov, Sergei (2024)
Master's thesis
School of Engineering Science, Industrial Engineering and Management
All rights reserved.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi-fe2024052737139
Abstract
This thesis introduces a new approach to improving the robustness of generative text models by introducing noise, or text perturbations, into prompts. A metric is developed to evaluate how well large language models perform when faced with noisy data. The effectiveness of the approach is demonstrated experimentally by measuring model accuracy under controlled perturbations of 1200 questions from the CommonsenseQA dataset. The result opens a promising new direction for evaluating and improving the quality of AI-based solutions.
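The evaluation idea described in the abstract, perturbing prompts and measuring how accuracy degrades, can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual method: the specific perturbation types (here, character deletions and adjacent swaps), the `rate` parameter, and the function names are all assumptions made for the example.

```python
import random

def perturb(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Apply character-level noise to a prompt.

    Illustrative perturbations only: each character is deleted or swapped
    with its neighbour with probability `rate`; the thesis may use
    different perturbation types.
    """
    rng = random.Random(seed)
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        r = rng.random()
        if r < rate / 2:                         # delete this character
            i += 1
        elif r < rate and i + 1 < len(chars):    # swap with the next one
            out.extend([chars[i + 1], chars[i]])
            i += 2
        else:                                    # keep the character as-is
            out.append(chars[i])
            i += 1
    return "".join(out)

def accuracy_under_noise(model, questions, answers, rate=0.1):
    """Fraction of questions answered correctly after perturbing each prompt.

    `model` is any callable mapping a prompt string to an answer string.
    """
    correct = sum(
        model(perturb(q, rate, seed=k)) == a
        for k, (q, a) in enumerate(zip(questions, answers))
    )
    return correct / len(questions)
```

Comparing `accuracy_under_noise(...)` at several noise rates against the clean-prompt accuracy gives a simple robustness profile of the model.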
