Machine learning can be used to mine the ever-increasing data from online classes, helping teachers deliver personalized education
When I started teaching data science and AI in Duke University’s Pratt School of Engineering, one thing that frustrated me was how little insight I had into how effective my teaching was until end-of-semester exam grades and student assessments came in. Being new to teaching, I read up on pedagogical best practices and on how methods like mastery learning and 1:1 personalized guidance can drastically improve student outcomes. Yet even with my relatively small class sizes, I did not feel I had enough insight into each individual student’s learning to provide useful personalized guidance. In the middle of the semester, if you had asked me exactly what a specific student had mastered from the class to date and where they were struggling, I could not have given you a good answer. When students came to me for 1:1 coaching, I had to ask them where they needed help and hope that they were self-aware enough to know.
Knowing that my colleagues in other programs and universities teach much larger classes than mine, I asked them how aware they felt they were of each of their students’ level of mastery at any point in time. For the most part, they admitted they were also largely “flying blind” until final assessment results came in. The tradeoff between scale and achievable quality is historically one of the most vexing problems in education: as class sizes grow, a teacher’s ability to provide the kind of personalized guidance that learning science research shows to be most effective diminishes.
Yet as instructors in the new world of online education, we have ever-increasing amounts of data at our fingertips that may give us insight into individual student learning — from recorded lecture videos, electronically submitted homework, discussion forums, and online quizzes and assessments. In summer 2020 I began a research project to explore how we could use this data to help us as instructors do our job better. The specific question I set out to answer was: “as an instructor, how can I use the data available to me to support my ability to provide effective personalized guidance to my students?”
What I wanted to know was, for any given student in my class at any point in time during a semester, what they understood and what they did not understand from the material presented in the course so far. In the academic literature this is often referred to as a student’s “knowledge state”. Knowledge Space Theory, introduced by Doignon and Falmagne in 1985 and significantly expanded on since, posits that a given “domain” of knowledge (such as the subject of a course) consists of a discrete set of topics (or “items”), which often have interdependencies. The set of topics a student has mastered to date is called their “knowledge state”. To provide effective instruction for the whole class and personalized guidance for individual students, understanding each student’s knowledge state at any point in time is critical.
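To make the idea concrete, here is a minimal, purely illustrative sketch (the topic names are hypothetical and not from any real course): a domain is a set of topics with prerequisite dependencies, and a knowledge state is the subset of topics a student has mastered. In Knowledge Space Theory, a state is feasible only if it respects those dependencies.

```python
# Hypothetical topics; each maps to the topics that should be mastered first.
prerequisites = {
    "variables": [],
    "loops": ["variables"],
    "functions": ["variables"],
    "recursion": ["functions", "loops"],
}

def is_feasible(state, prereqs):
    """A knowledge state (set of mastered topics) is feasible if every
    mastered topic's prerequisites are also in the state."""
    return all(p in state for topic in state for p in prereqs[topic])

print(is_feasible({"variables", "loops"}, prerequisites))  # True
print(is_feasible({"recursion"}, prerequisites))           # False: missing prereqs
```

A real course has many more topics and far messier dependencies, but the data structure is the same: the tool’s job is essentially to infer which such state each student is in.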
So how does one identify a student’s knowledge state? The most common method is through assessment — via homework, quizzes, and tests. For my classes, I use low-stakes formative quiz assessments each week. Each quiz contains around 10 questions, with roughly half evaluating students’ knowledge of topics covered in the previous week’s lecture and the remaining half covering topics from earlier in the course. In this way, I continue to evaluate students’ mastery of topics from the whole course each week. In addition, we have weekly homework that tests a variety of topics covered to date.
But digging through dozens or hundreds of quiz/homework question results for tens or hundreds of students to identify patterns that reveal students’ knowledge states is no easy task. Effective teachers need to be good at a lot of things — delivering compelling lectures, creating and grading homework and assessments, etc. — but most teachers are not also trained data scientists, nor should they have to be to do their jobs. This is where machine learning comes in. Fundamentally, machine learning recognizes patterns in data, and in this case it can be used to identify students’ knowledge states from their performance patterns across quizzes and homework.
To help improve my own teaching, and that of my fellow faculty members in Duke’s AI for Product Innovation masters program, I put my own AI expertise to work and began a project with the goal of developing a system that could, given a set of class quiz/homework results to date and a set of learning topics, identify each student’s knowledge state at any point in time and present that information to both instructor and learner. This would facilitate more effective personalized guidance by the instructor, and better awareness on the part of the student as to where they need to focus additional study. Additionally, by aggregating this information across the class, an instructor could see where the class was successfully learning the material and where certain topics needed reinforcement.
The project culminated in the creation of a prototype tool called the Intelligent Classroom Assistant. The tool reads in instructor-provided class quiz and/or homework results to date and the set of learning topics covered in the course to date. It then analyzes the data using a machine learning algorithm and provides the instructor with three automated analyses:
- Quiz/homework question analyzer: identifies quiz or homework questions on which the class has struggled, enabling the instructor to focus additional review time on those questions and answers.
- Learning topic analyzer: identifies which course topics the class has largely mastered, and which topics need additional reinforcement to help the majority of students achieve mastery.
- Personalized student analysis: for a given student in the class, identifies which topics they have mastered and which topics they are struggling with, facilitating more effective instructor intervention. The tool also visualizes each student’s performance over time for each topic, enabling instructors to see whether a student’s mastery of a topic has improved.
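The three analyses above can be sketched in a few lines of plain Python. This is a hedged illustration, not the actual tool: the student names, question IDs, and topic labels are invented, and the real system presumably works on much larger result sets.

```python
from collections import defaultdict

# Hypothetical per-student, per-question results: 1 = correct, 0 = incorrect.
results = {
    "alice": {"q1": 1, "q2": 0, "q3": 1},
    "bob":   {"q1": 1, "q2": 0, "q3": 0},
}
# Hypothetical mapping of each question to its primary learning topic.
question_topic = {"q1": "pandas", "q2": "sql", "q3": "sql"}

# Question analyzer: fraction of the class answering each question correctly.
question_rate = {
    q: sum(r[q] for r in results.values()) / len(results)
    for q in question_topic
}

# Topic analyzer: class-wide mastery rate per topic.
topic_scores = defaultdict(list)
for r in results.values():
    for q, correct in r.items():
        topic_scores[question_topic[q]].append(correct)
topic_rate = {t: sum(v) / len(v) for t, v in topic_scores.items()}

# Personalized analysis: per-student mastery rate per topic.
def student_topic_rate(name):
    per_topic = defaultdict(list)
    for q, correct in results[name].items():
        per_topic[question_topic[q]].append(correct)
    return {t: sum(v) / len(v) for t, v in per_topic.items()}

print(question_rate)              # {'q1': 1.0, 'q2': 0.0, 'q3': 0.5}
print(topic_rate)                 # {'pandas': 1.0, 'sql': 0.25}
print(student_topic_rate("bob"))  # {'pandas': 1.0, 'sql': 0.0}
```

Here "q2" would be flagged for class review, "sql" would surface as a topic needing reinforcement, and bob’s 1:1 session would focus on SQL rather than pandas.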
One of the key challenges in developing the tool was the mapping of quiz/homework questions to the most relevant learning topic. To accomplish this, a novel, custom natural language processing algorithm was developed to understand the context of each question and map it to the primary learning topic it was designed to evaluate.
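The custom algorithm itself is not described here, but the general task can be illustrated with a much simpler stand-in: score each question against keyword lists for each topic by word overlap and pick the best match. Everything below — the topic labels, keywords, and sample question — is an invented example, not the project’s actual method.

```python
import re

# Hypothetical topics, each with a hand-picked keyword set.
topic_keywords = {
    "sql": {"select", "join", "table", "query", "sql"},
    "web scraping": {"html", "scrape", "request", "parse", "url"},
}

def map_question_to_topic(question):
    """Map a question to the topic whose keywords overlap it most."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    return max(topic_keywords, key=lambda t: len(words & topic_keywords[t]))

print(map_question_to_topic("Write a SQL query to join two tables"))  # sql
```

A production version would need to go well beyond keyword overlap — e.g., TF-IDF weighting or learned text embeddings — to reach useful accuracy across 20 topics.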
The Intelligent Classroom Assistant tool was built concurrently with my teaching of the Sourcing Data for Analytics course at Duke, an introductory-level data science course for graduate engineering students covering technical as well as regulatory and ethical topics in data analytics. This gave me an opportunity to try out the tool on my class as the semester progressed. One of the key things I wanted to evaluate was how well the algorithm under the hood could classify each quiz/homework question into its most relevant learning topic. On the full set of 85 quiz questions I used during the semester, the algorithm achieved an accuracy upwards of 82% in identifying the relevant learning topic for each question out of the 20 topics covered in the course. While not perfect, this was good enough to make the tool’s analyses useful to me as the course instructor.
During the course I used the in-development tool in two main ways to inform my teaching:
- I spent extra time in lecture sessions covering the learning topics and specific quiz questions that the tool flagged for low student performance.
- During 1:1 help sessions with students, I was able to use the personalized student analysis module of the tool to give me an understanding of where the student needed extra reinforcement. This helped make the tutoring sessions more focused.
While the tool provided value for me and my class of 16 students, I was curious to see how well it could be adapted to courses in other subjects and of various sizes. We are in the process of trying out the tool on another engineering class of 25 students and an undergraduate finance class of over 200. I also plan to use the prototype in my spring machine learning class to guide my teaching through the semester. Since students can benefit from the tool’s analysis as much as instructors, for spring we hope to add a student portal that lets students see their own results and offers personalized study recommendations based on their identified knowledge state.
Great teachers must be good at a lot of things — delivering interesting and compelling lectures that retain students’ attention, planning learning experiences in class time and homework that help students truly learn to apply the material, creating fair assessments that actually evaluate what students are expected to know, and providing useful, personalized guidance and feedback to students. The amount of electronic data now available to instructors can support all of these aspects of their teaching. But teachers are not (usually) data scientists themselves, and need analytics tools to help them extract value from that data. The Intelligent Classroom Assistant research project showed one example of how machine learning can distill data into insights that help teachers do their job better. Such tools can not only help teachers improve the quality of their classes (as measured by student learning outcomes), but also do so at scale, offering the promise of personalized teaching even in large classes.
Based on the initial results from the fall semester, we plan to continue and expand the pilot use of the Intelligent Classroom Assistant in our AI masters program at Duke. We also welcome feedback from other educators on ways to continue to improve the tool. When teachers can teach more effectively, learners can learn more, and as a society we all reap the benefits.