jhljx / gkt

Graph-based Knowledge Tracing: Modeling Student Proficiency Using Graph Neural Network

License: MIT License

Python 99.53% Shell 0.47%

knowledge-tracing knowledge-tracing-models graph-based-learning graph-based-model edge-inference time-series educational-data-mining

gkt's Introduction

GKT

An implementation of the paper "Graph-based Knowledge Tracing: Modeling Student Proficiency Using Graph Neural Network".

The architecture of the GKT is as follows:

Setup

To run this code you need the following:

a machine with GPUs
python3
numpy, pandas, scipy, scikit-learn and torch packages:

pip3 install numpy==1.17.4 pandas==1.1.2 scipy==1.5.2 scikit-learn==0.23.2 torch==1.4.0

Note: do not use pandas 0.23.4, because it causes a bug when the following command in the processing.py file is executed.

df.groupby('user_id', axis=0).apply(get_data)

If you test with the 'assistment_test15.csv' file, grouping by user returns 16 students under pandas 0.23.4, but 15 students under pandas 1.x. (This bug was found by vinnnan.)
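
To check the student count on your own pandas version, you can compare the number of per-user sequences with the number of distinct user_id values. The following is only a minimal sketch, not the code from processing.py: it iterates over the groups explicitly instead of calling apply, and the helper name sequences_per_user, the column names skill_id and correct, and the toy data are all assumptions made for illustration.

```python
import pandas as pd


def sequences_per_user(df):
    """Return one (skills, answers) pair per distinct user_id.

    Hypothetical helper: the column names 'skill_id' and 'correct'
    are assumed here and may differ from the ones used in processing.py.
    """
    sequences = []
    # sort=False keeps users in order of first appearance in the file.
    for _, group in df.groupby("user_id", sort=False):
        sequences.append((group["skill_id"].tolist(), group["correct"].tolist()))
    return sequences


# Toy data standing in for a CSV file: three students, six interactions.
df = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 3],
    "skill_id": [10, 11, 10, 12, 12, 13],
    "correct": [1, 0, 1, 0, 1, 1],
})

seqs = sequences_per_user(df)

# The number of sequences should equal the number of distinct students.
assert len(seqs) == df["user_id"].nunique()
```

If the two numbers differ on your installation, the grouping step is producing an extra or missing student and the pandas version is worth checking first.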

Training the model

Use the train.py script to train the model. To train the GKT model on the ASSISTments 2009-2010 skill-builder dataset, run:

python3 train.py --data-file=skill_builder_data.csv --model=GKT --graph-type=Dense

We also provide a baseline, Deep Knowledge Tracing (DKT), for performance comparison. To train the DKT model on the ASSISTments 2009-2010 skill-builder dataset, run:

python3 train.py --data-file=skill_builder_data.csv --model=DKT

You will probably want to change at least --data_dir and --save_dir, which point to the paths on your system where the knowledge tracing data is stored and where the checkpoints are saved.

gkt's Issues

RuntimeError: CUDA out of memory.

I really want to run this program on large-scale datasets, but I don't know how to optimize the code for the GPU. Please give me some guidance.

I tried setting "--epochs=5 --batch-size=32" when training the model, but still got the following error:

RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 10.76 GiB total capacity; 6.91 GiB already allocated; 5.81 MiB free; 6.93 GiB reserved in total by PyTorch)

I'm just a beginner and would appreciate your help.

Answers from raw data are not shifted by 1

First of all, I would like to thank you for implementing the codebase in such an efficient and comprehensive way!

I have noticed one minor issue. In processing.py, where the raw data from the CSV files is converted to answers, questions, and features, the comment for Step 4 says that the answers need to be shifted by 1, which I think is in line with the problem definition and the paper. However, I don't think any shift operation is performed when preparing the raw data, nor when iterating over the dataloader.

So I would like to know whether I have overlooked something, or whether this is indeed a mistake in the codebase.

Thank you!

Below is the code snippet for Step 4 from the file.

    # Step 4 - Convert to a sequence per user id and shift features 1 timestep
    feature_list = []
    question_list = []
    answer_list = []
    seq_len_list = []
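
For readers following this thread, here is a minimal sketch of what a one-step shift would mean for a single student's sequence: the interaction at step t becomes the input, and the question and answer at step t + 1 become the prediction target. This is not the code from processing.py; the function name shift_by_one and the toy sequences are hypothetical and only illustrate the idea.

```python
def shift_by_one(questions, answers):
    """Pair each interaction at step t with the target at step t + 1.

    Hypothetical helper, not part of the repository.
    """
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    # Inputs: every interaction except the last one.
    inputs = list(zip(questions[:-1], answers[:-1]))
    # Targets: every question and answer except the first one.
    target_questions = questions[1:]
    target_answers = answers[1:]
    return inputs, target_questions, target_answers


# Toy sequence: skill ids and correctness (1 = correct, 0 = incorrect).
inputs, next_q, next_a = shift_by_one([3, 3, 7, 5], [1, 0, 1, 1])
# inputs == [(3, 1), (3, 0), (7, 1)]
# next_q == [3, 7, 5]
# next_a == [0, 1, 1]
```

A shift like this can be applied either during preprocessing or later, when the model output is aligned with the targets in the loss computation, so the absence of a shift in Step 4 does not by itself show where (or whether) the alignment happens.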

what kind of GPU do you use to run this model

Hi, this code is beautiful, but it runs very slowly on my Nvidia RTX 2080 Ti, taking 248 seconds for one batch (on assist2009, batch size = 128). When I use a dataset with a larger number of skills, the program crashes because it runs out of GPU memory. What kind of GPU do you use to run this model, and how long does training take?

extract the learned graph structure from the trained GKT model

Hello, I don't understand how to extract the learned graph structure. I haven't found any relevant information in the code.

Gradient explosion ?

Hi, jhljx. What a concise and beautiful implementation! However, when I run the code on my machine, the model's predicted values become NaN after a few epochs, and I found that the model parameters are being updated with NaN values during backpropagation. I am not sure whether this is caused by gradient explosion. If it is, how can the problem be solved? I have tried decreasing the learning rate and the batch size, but it does not seem to help.
