
Artificial Intelligence Improves in Essay Scoring: Is It Appropriate for Educators to Adopt This Practice?

Artificial Intelligence assessment tools now match human grading in accuracy and consistency for certain subjects and scenarios, sparking debate about the ethics of machine marking.


In the ever-evolving landscape of education, artificial intelligence (AI) is making a significant impact, particularly in the realm of grading and assessment. Deirdre Quarnstrom, Vice President of Education at Microsoft, has highlighted the growing interest in using AI to streamline tasks traditionally performed by educators, such as grading and assessment [1].

However, concerns have been raised about the potential loss of human interaction in writing, a fundamental aspect of education. Kwame Anthony Appiah, a philosophy professor at NYU, weighed in on this question, arguing that using AI to grade student work, while prohibiting students from using AI to produce their own work, is ethically acceptable [2].

AI is indeed capable of performing initial evaluations based on a set of instructions, prompts, or criteria in the grading and assessment process. For instance, Michael Klymkowsky, a biology professor at the University of Colorado Boulder, is developing an AI tool to help assess biology students' progress rather than to grade them [3]. Similarly, tools like Turnitin's Gradescope and Draft Coach use generative AI to provide formative feedback on essays and projects, checking grammar, coherence, and organization and offering real-time suggestions for improvement [4].
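To make the idea of a criteria-driven first pass concrete, here is a minimal, purely illustrative sketch in Python. The rubric checks (word count, required terms) are invented for this example and are far simpler than what commercial tools like Gradescope actually apply; the point is only that a machine can flag rubric-level issues before a human grader reviews the work.

```python
# Toy sketch of a rubric-based first-pass evaluation (illustrative only).
# The criteria below are hypothetical, not the rubric of any real product.

def initial_evaluation(essay, min_words=150, required_terms=("thesis", "evidence")):
    """Return a word count plus rubric feedback for a draft essay."""
    words = essay.split()
    feedback = []

    # Criterion 1: minimum length.
    if len(words) < min_words:
        feedback.append(
            f"Essay is short: {len(words)} words (expected at least {min_words})."
        )

    # Criterion 2: key rubric terms should appear somewhere in the draft.
    for term in required_terms:
        if term not in essay.lower():
            feedback.append(f"Consider addressing '{term}' explicitly.")

    # Nothing flagged: pass the draft along for human review.
    if not feedback:
        feedback.append("Meets the basic rubric; ready for human review.")

    return {"word_count": len(words), "feedback": feedback}
```

A teacher-facing tool would layer richer models on top, but even this shape shows why such systems work best as assistants: the machine handles the mechanical checks, while judgment calls stay with the instructor.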

Research has shown that AI grading tools can improve efficiency and consistency while providing useful, personalized feedback that helps students improve their skills on subsequent assignments [5]. Studies have demonstrated that AI-powered grading systems like CoGrader significantly enhance grading efficiency and consistency while providing peer-comparative, benchmark-driven feedback that instructors trust [1]. These systems have been found to produce meaningful feedback that students perceive as helpful and to reliably distinguish grades by mastery level [2].

In fact, AI grading combined with detailed, customized, data-driven feedback has been reported to increase student performance by up to 40%, while professors saved up to 70% of their grading time, freeing them to focus on teaching and mentorship [3].

Despite these advancements, challenges and concerns remain. AI systems struggle to grade nuance, creativity, and complex, open-ended tasks that require subjective judgment, areas in which human graders still perform better [5]. Trust and acceptance depend on transparency, editable benchmarks, and instructor control over AI recommendations to avoid overreliance and potential biases [1][5]. Concerns also persist around fairness, bias in AI models, and the need for human review to ensure quality and equity.

Steve Graham, a co-author of the studies and a professor at Arizona State University, emphasizes the importance of building trust in AI-generated feedback through further research [5]. He underscores that while AI grading and assessment can help students learn and ease time constraints on teachers, the human element should always be preserved.

In conclusion, AI grading tools are effective and generally accepted for structured assignments and for formative feedback that aids skill development. They are best used as assistants to human graders, given the open challenges of complex judgment, trust, and fairness [1][2][3][4][5]. The ethical question remains whether AI grading tools can grade students fairly and help them improve on the next assignment, as a skilled teacher would.
