Evaluating LLMs' Performance At Automatic Short-Answer Grading
Type
conference paper
Date Issued
2024-07-08
Author(s)
Abstract
In recent years, the use of Large Language Models (LLMs) has become more accessible and widespread. With free-of-charge access options, people have begun applying the models to tasks beyond next-word prediction. In an exploratory study, we take a closer look at the use of LLMs for Automatic Short Answer Grading. We compare the grading of short-answer tasks by two human graders to that of an LLM. We discuss the results and present examples of observed shortcomings in the annotation and grading.
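The abstract describes comparing grades assigned by two human graders with grades produced by an LLM. As a minimal, hypothetical sketch of how such agreement might be quantified (not the authors' actual procedure), one could compute Cohen's kappa between a human grader's scores and the LLM's scores; the grade scale, the `human_grades`/`llm_grades` lists, and the use of scikit-learn are assumptions for illustration only.

```python
# Hypothetical sketch: agreement between a human grader and an LLM grader.
# The grade values below are illustrative assumptions, not data from the paper.
from sklearn.metrics import cohen_kappa_score

# Assumed ordinal grade labels per student answer
# (e.g. 0 = incorrect, 1 = partially correct, 2 = correct).
human_grades = [2, 1, 0, 2, 1, 2, 0, 1]   # scores from one human grader
llm_grades   = [2, 1, 1, 2, 0, 2, 0, 1]   # scores produced by an LLM

# Cohen's kappa measures chance-corrected agreement between two graders.
kappa = cohen_kappa_score(human_grades, llm_grades)
print(f"Human-LLM agreement (Cohen's kappa): {kappa:.2f}")
```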
Keywords
automatic short-answer grading
large language models
automated scoring
Event Title
Workshop on Automated Evaluation of Learning and Assessment Content (EvalLAC 2024)
File(s)
Open Access
Name
paper_published.pdf
Size
954.13 KB
Format
Adobe PDF
Checksum (MD5)
bf908378f517581aaa18175ea24db18f