KDD Papers

Semi-Supervised Techniques for Mining Learning Outcomes and Prerequisites

Igor Labutov (Carnegie Mellon University); Yun Huang (University of Pittsburgh); Peter Brusilovsky (University of Pittsburgh); Daqing He (University of Pittsburgh)


Educational content today no longer resides only in textbooks and classrooms; more and more learning material is found in a free, accessible form on the Internet. Our long-standing vision is to transform this web of educational content into an adaptive, web-scale "textbook" that can guide its readers to the most relevant "pages" according to their learning goal and current knowledge. In this paper, we address one core, long-standing problem towards this goal: identifying outcome and prerequisite concepts within a piece of educational content (e.g., a tutorial). Specifically, we propose a novel approach that leverages textbooks as a source of distant supervision, but learns a model that can generalize to arbitrary documents (such as those on the web). As such, our model can take advantage of any existing textbook, without requiring expert annotation. On the task of predicting outcome and prerequisite concepts, we demonstrate improvements over a number of baselines on six textbooks, especially in the regime where few or no ground-truth labels are available. Finally, we demonstrate the utility of a model learned using our approach on the task of identifying prerequisite documents for adaptive content recommendation, an important step towards our vision of the "web as a textbook".