
The data and source code for the paper "MoocRadar: A Fine-grained and Multi-aspect Knowledge Repository for Improving Cognitive Student Modeling in MOOCs"


THU-KEG/MOOC-Radar


MoocRadar

MoocRadar is maintained by the Knowledge Engineering Group of Tsinghua University with the assistance of the Institute of Education, Tsinghua University. This repository consists of 2,513 exercises, 14,226 students, over 12 million behavioral records, and 5,600 fine-grained concepts, supporting the development of cognitive student modeling in MOOCs. The raw data is from XuetangX (https://www.xuetangx.com/).

We summarize the features of MoocRadar as:

  • Abundant Learning Context: MoocRadar provides the learning resources, structures, and contents related to the students' exercise behaviors, which enriches the candidate inputs available to modeling methods.
  • Fine-grained Knowledge Concepts: All the fine-grained concepts have been manually annotated and checked by experts, which ensures the quality of this specific knowledge.
  • Cognitive Level Labels: We adopt Bloom's Cognitive Taxonomy to construct "Cognitive Level" tags for the exercises, which can be further explored in subsequent research.

We are still extending and annotating this repository.

Based on MoocRadar, developers can attempt to build a more informative profile for each student, as introduced in our paper.


News !!

  • The number of exercises has been extended to 9,384 !!

  • Our paper has been submitted to the SIGIR resource track !!

  • Updated the annotation guidance for fine-grained concepts and cognitive labels.

Data Access

Multi-level data is available, including:

Dataset | Description | Download Link
MoocRadar_Raw | All data of MOOC-radar. | Raw link

Reproduction Model

Researchers can set up the presented models with EduKTM and EduCDM.

We provide demos of several basic models, including:

We also report the performance improvement of DKVMN and NCDM with side information (i.e., cognitive and video).

Data for baselines reproduction:

  1. --mode (options: Coarse/Middle/Fine) selects the granularity setting.

  2. --data_dir points to the data of the corresponding granularity from the table above.

    For example, for the --mode Middle setting, prepare the following files:

    • ./data/student-problem-middle.json
    • ./data/problem.json
  3. Then generate the train/test datasets by setting --data_process in the scripts.
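
As a quick pre-flight check before running the scripts, the file layout described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the repository's own code: only the Middle file names are given in this README, so the Coarse and Fine file names are assumed to follow the same pattern, treating --data_process as a boolean switch is also an assumption, and the helper names `expected_files`, `missing_files`, and `build_parser` are hypothetical.

```python
import argparse
from pathlib import Path

# Granularity options listed for --mode in this README.
MODES = ("Coarse", "Middle", "Fine")


def expected_files(mode: str, data_dir: str = "./data") -> list[Path]:
    """Return the input files assumed for a given granularity.

    Only the Middle names are confirmed by the README; the Coarse and Fine
    names are an ASSUMPTION that follows the same naming pattern.
    """
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    root = Path(data_dir)
    return [root / f"student-problem-{mode.lower()}.json", root / "problem.json"]


def missing_files(mode: str, data_dir: str = "./data") -> list[Path]:
    """List the expected input files that are not present on disk."""
    return [p for p in expected_files(mode, data_dir) if not p.exists()]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three flags named above."""
    parser = argparse.ArgumentParser(description="Check MoocRadar baseline inputs")
    parser.add_argument("--mode", choices=MODES, default="Middle")
    parser.add_argument("--data_dir", default="./data")
    # ASSUMPTION: --data_process is treated here as an on/off switch.
    parser.add_argument("--data_process", action="store_true")
    return parser
```

For instance, `missing_files("Middle")` returns an empty list once both files from step 2 are in ./data, and otherwise names the ones still to be downloaded.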

Data for improvement reproduction with cognitive and video side information:

Option 1: generate it by setting --data_process in the scripts

Option 2: download it from here

Toolkit & Guidance

There are also several tools and guides for extending and using the data.

For extending the data from the MOOCCubeX knowledge base:

For further data annotation:

For more information:

Feature

The distribution of students' exercise behaviors, accuracy rates, and concept-linked exercises.

Reference

 @article{MOOCRadar,
  title={MoocRadar: A Fine-grained and Multi-aspect Knowledge Repository for Improving Cognitive Student Modeling in MOOCs},
  author={Jifan Yu and Mengying Lu and Qingyang Zhong and Zijun Yao and Shangqing Tu and Zhengshan Liao and Xiaoya Li and Manli Li and Lei Hou and Haitao Zheng and Juanzi Li and Jie Tang},
  year={2023}
 }
