Comments (8)
You can use a smaller batch size to train the model; maybe batch-size=32 is too big.
from gkt.
I set batch-size=16, but I got this log starting from the 59th batch:
...
batch idx: 57 loss kt: 0.6516776084899902 auc: 0.5851154181184669 acc: 0.6264667535853976 cost time: 6.809619903564453
batch idx: 58 loss kt: 0.6162782907485962 auc: 0.5378101714226197 acc: 0.7085820127598271 cost time: 11.871150255203247
batch idx: 59 loss kt: nan auc: -1 acc: -1 cost time: 5.788291692733765
batch idx: 60 loss kt: nan auc: -1 acc: -1 cost time: 4.6798787117004395
...
batch idx: 112 loss kt: nan auc: -1 acc: -1 cost time: 1.3512508869171143
batch idx: 113 loss kt: nan auc: -1 acc: -1 cost time: 6.595600843429565
(Looks like it's never going to end.)
Is there something wrong with the code? I am still working on this.
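For anyone debugging the same thing: once the loss is nan it stays nan for every later batch, and auc/acc fall back to the -1 sentinel, so the useful information is the index of the *first* bad batch. A minimal sketch (not part of the repo) for locating it from the logged losses:

```python
import math

def first_nan_batch(losses):
    """Return the index of the first nan loss in a sequence, or -1 if none.

    `losses` stands in for the per-batch `loss kt` values in the log above.
    """
    for idx, loss in enumerate(losses):
        if math.isnan(loss):
            return idx
    return -1

# Mirroring the log: finite losses for batches 57-58, nan from batch 59 on.
log_losses = [0.6516776, 0.6162783, float("nan"), float("nan")]
print(first_nan_batch(log_losses))  # prints 2
```

Inspecting the inputs and gradients of that one batch (e.g. under `torch.autograd.set_detect_anomaly(True)`) is usually much faster than staring at the full run.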
I want to add your WeChat, because I also encountered this problem.
Great! It's "waves99".
The kt loss value usually becomes nan because:
- the learning rate is too big, causing the gradients to explode
- there is something wrong with the training data
Did you run this code on the dataset we provide?
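To make the learning-rate point concrete: with plain gradient descent, a step size that is too large makes every update overshoot the minimum, so the iterate grows geometrically until it overflows to inf and then nan. A toy illustration on f(x) = x², not the GKT code:

```python
def gradient_descent(lr, steps=100, x0=1.0):
    """Minimize f(x) = x**2 with plain gradient descent; the gradient is 2*x."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # each step multiplies x by (1 - 2*lr)
    return x

print(abs(gradient_descent(lr=0.1)))  # shrinks toward 0
print(abs(gradient_descent(lr=1.5)))  # |x| doubles every step and blows up
```

The same mechanism in a deep model is what gradient clipping (or simply lowering `lr`) is meant to prevent.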
Yes, I did use the dataset you provided.
And these are my hyper-parameters (the learning rate is 0.01):
nohup: ignoring input
Namespace(attn_dim=32, batch_size=16, bias=True, binary=True, cuda=True, data_dir='data', data_file='skill_builder_data.csv', dkt_graph='dkt_graph.txt', dkt_graph_dir='dkt-graph', dropout=0, edge_types=2, emb_dim=32, epochs=50, factor=True, gamma=0.5, graph_save_dir='graphs', graph_type='Dense', hard=False, hid_dim=32, load_dir='', lr=0.01, lr_decay=200, model='GKT', no_cuda=False, no_factor=False, prior=False, result_type=12, save_dir='logs', seed=42, shuffle=True, temp=0.5, test=False, test_model_dir='logs/expDKT', train_ratio=0.6, vae_decoder_dim=32, vae_encoder_dim=32, val_ratio=0.2, var=1)
max seq_len: 6157
student num: 4047
feature_dim: 246
question_dim: 123
train_size: 2428 val_size: 809 test_size: 810
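On the "something wrong in the training data" hypothesis: it may be worth scanning the CSV for rows that could produce nan, e.g. missing skill ids or answers outside {0, 1}. A rough sketch with pandas; the column names `skill_id` and `correct` are guesses for the ASSISTments skill-builder file and may need adjusting:

```python
import pandas as pd

def dataset_problems(df, skill_col="skill_id", answer_col="correct"):
    """Count rows that could poison training: missing skill ids, missing
    answers, and answers that are not 0/1. Column names are assumptions."""
    return {
        "missing_skill": int(df[skill_col].isna().sum()),
        "missing_answer": int(df[answer_col].isna().sum()),
        "non_binary_answer": int((~df[answer_col].dropna().isin([0, 1])).sum()),
    }

# Tiny demo frame; for the real data, load it with pd.read_csv(...) instead.
demo = pd.DataFrame({"skill_id": [1, None, 2], "correct": [0, 1, 2]})
print(dataset_problems(demo))  # {'missing_skill': 1, 'missing_answer': 0, 'non_binary_answer': 1}
```

If any count is nonzero, dropping or fixing those rows before training would rule the data out as the cause.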
That's strange. I've run the code with the provided dataset, and it ran smoothly. Maybe you can check the versions of your Python dependencies?
Here are my dependencies:
torch 1.7.0+cu110
scikit-learn 1.0.1
scipy 1.7.3
pandas 1.2.2
numpy 1.18.5
And yours:
pip3 install numpy==1.17.4 pandas==1.1.2 scipy==1.5.2 scikit-learn==0.23.2 torch==1.4.0
By the way, GKT works fine with the dataset "assistment_test15":
Namespace(attn_dim=32, batch_size=64, bias=True, binary=True, cuda=True, data_dir='data', data_file='assistment_test15.csv', dkt_graph='dkt_graph.txt', dkt_graph_dir='dkt-graph', dropout=0, edge_types=2, emb_dim=32, epochs=50, factor=True, gamma=0.5, graph_save_dir='graphs', graph_type='Dense', hard=False, hid_dim=32, load_dir='', lr=0.001, lr_decay=200, model='GKT', no_cuda=False, no_factor=False, prior=False, result_type=12, save_dir='logs', seed=42, shuffle=True, temp=0.5, test=False, test_model_dir='logs/expDKT', train_ratio=0.6, vae_decoder_dim=32, vae_encoder_dim=32, val_ratio=0.2, var=1)
max seq_len: 368
student num: 15
feature_dim: 148
question_dim: 74
train_size: 9 val_size: 3 test_size: 3
……
……
Best Epoch: 0047
--------------------------------
--------Testing-----------------
--------------------------------
loss_test: 0.6181263328 auc_test: 0.5657202216 acc_test: 0.6813819578
Looks like there's something wrong with the dataset "skill_builder".
Maybe you can use my Python library versions, especially numpy, pandas and scipy.
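To compare environments quickly, something like this (plain stdlib, Python ≥ 3.8) reports every installed version that differs from the pins given in the earlier reply:

```python
from importlib.metadata import version, PackageNotFoundError

# The versions reported as working in the maintainer's pip3 install line above.
PINNED = {
    "numpy": "1.17.4",
    "pandas": "1.1.2",
    "scipy": "1.5.2",
    "scikit-learn": "0.23.2",
    "torch": "1.4.0",
}

def mismatches(pinned):
    """Return {package: (installed, expected)} for every package that is
    missing or whose installed version differs from the pin."""
    out = {}
    for pkg, want in pinned.items():
        try:
            have = version(pkg)
        except PackageNotFoundError:
            have = None
        if have != want:
            out[pkg] = (have, want)
    return out

print(mismatches(PINNED))  # empty dict means the environment matches the pins
```

Given the numpy 1.18.5 vs 1.17.4 and torch 1.7.0 vs 1.4.0 gaps above, this would flag several packages in the reporter's environment.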