INFEDU

Informatics in Education

ISSN 2335-8971, 1648-5831

INFEDU_2024_1_10

10.15388/infedu.2024.10

Article

Reliability and Validity of an Automated Model for Assessing the Learning of Machine Learning in Middle and High School: Experiences from the “ML for All!” course

Marcelo Fernando Rauber

marcelo.rauber@ifc.edu.br Graduate Program in Computer Science, Department of Informatics and Statistics, Federal University of Santa Catarina, Florianópolis/SC, Brazil. Federal Institute Catarinense (IFC), Camboriú/SC, Brazil.

Christiane Gresse von Wangenheim

c.wangenheim@ufsc.br Graduate Program in Computer Science, Department of Informatics and Statistics, Federal University of Santa Catarina, Florianópolis/SC, Brazil.

Pedro Alberto Barbetta

pedro.barbetta@ufsc.br Graduate Program in Methods and Management in Evaluation, Federal University of Santa Catarina, Florianópolis/SC, Brazil.

Adriano Ferreti Borgatto

adriano.borgatto@ufsc.br Graduate Program in Methods and Management in Evaluation, Federal University of Santa Catarina, Florianópolis/SC, Brazil.

Ramon Mayor Martins

ramon.mayor@posgrad.ufsc.br Graduate Program in Computer Science, Department of Informatics and Statistics, Federal University of Santa Catarina, Florianópolis/SC, Brazil.

Jean Carlo Rossa Hauck

jean.hauck@ufsc.br Graduate Program in Computer Science, Department of Informatics and Statistics, Federal University of Santa Catarina, Florianópolis/SC, Brazil.

Volume 23, Issue 2, pp. 409-437

2023

Vilnius University, ETH Zürich

Open access article under the CC BY license.

The growing presence of Machine Learning (ML) in everyday life makes it important to popularize an understanding of ML already in school. With this trend comes the need to assess students’ learning. Yet, so far, few assessments have been proposed, and most lack an evaluation. Therefore, we evaluate the reliability and validity of an automated assessment of students’ learning based on the image classification model they create as a learning outcome of the “ML for All!” course. Results based on data collected from 240 students indicate that the assessment can be considered reliable (coefficient omega ω = 0.834; Cronbach's alpha α = 0.83). We also identified moderate to strong convergent and discriminant validity based on the polychoric correlation matrix. Factor analyses indicate two underlying factors, “Data Management and Model Training” and “Performance Interpretation”, which complement each other. These results can guide the improvement of assessments and inform decisions on applying this model to support ML education as part of a comprehensive assessment.

Keywords: K-12, Middle and high school, Machine Learning, Artificial Intelligence, Neural network, Image Classification, Assessment, Evaluation