Comments (10)
Same question.
from chinese_text_cnn.
Same question.
from chinese_text_cnn.
The vocabulary has one extra entry: unk.
from chinese_text_cnn.
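In the code, you can change args.class_num = len(label_field.vocab) to args.class_num = len(label_field.vocab) - 1. The code calls label_field.build_vocab(train_dataset, dev_dataset), which is the vocabulary-building routine, and a vocabulary built this way contains an unk entry, the placeholder for any word that does not appear in the vocabulary. That is where the extra unk comes from. label_field only affects the number of labels, so setting the label count back to its original value is enough.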
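To illustrate where the off-by-one comes from, here is a minimal sketch in plain Python. It does not call torchtext; build_label_vocab is a hypothetical stand-in that only mimics the idea of putting an unk entry in front of the observed labels, and the label names are made up.

```python
from collections import Counter


def build_label_vocab(labels, unk_token="<unk>"):
    """Hypothetical stand-in for a vocabulary builder (not torchtext itself).

    The special unk token, if any, comes first, followed by the observed
    labels ordered by descending frequency (ties broken alphabetically).
    """
    counts = Counter(labels)
    itos = [unk_token] if unk_token is not None else []
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    itos += [label for label, _ in ordered]
    return itos


# Made-up labels for a two-class dataset.
labels = ["pos", "neg", "pos", "neg", "pos"]

vocab = build_label_vocab(labels)
num_labels = len(set(labels))   # real number of classes in the data
class_num = len(vocab) - 1      # subtract the unk entry, as suggested above

print(vocab, num_labels, class_num)
```

With the unk entry counted, the vocabulary is one larger than the number of real labels, which is why subtracting 1 restores the intended class count.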
from chinese_text_cnn.
Batch[1800] - loss: 0.009499 acc: 100.0000%(128/128)
Evaluation - loss: 0.000026 acc: 94.0000%(6616/7000)
early stop by 1000 steps, acc: 94.0000% (this is the result the author got)
Batch[2200] - loss: 0.008443 acc: 100.0000%(128/128)
Evaluation - loss: 0.000025 acc: 94.7429%(6632/7000)
Saving best model, acc: 94.7429%
This is the result I got.
from chinese_text_cnn.
Did you get the result below just by changing args.class_num = len(label_field.vocab) to args.class_num = len(label_field.vocab) - 1?
Batch[2200] - loss: 0.008443 acc: 100.0000%(128/128)
Evaluation - loss: 0.000025 acc: 94.7429%(6632/7000)
Saving best model, acc: 94.7429%
from chinese_text_cnn.
Yes. As far as I remember I did not change anything else; I only set up the environment!
from chinese_text_cnn.
@caoxiaopeng123 Thank you very much.
from chinese_text_cnn.
Alternatively, you can try turning the unk token off when configuring label_field:
label_field = data.Field(sequential=False, unk_token=None)
When I wrote my own code, I found that with this setting len(label_field.vocab) gives the expected value, which is 2.
from chinese_text_cnn.
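A question for the experts: I only have 1000+ samples, so why is args.vocabulary_size = len(text_field.vocab) more than 4210? Is it because my 1000+ samples form something like a dictionary, and that dictionary holds about 4210 words?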
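A rough sketch of why the two numbers differ, assuming the vocabulary is simply the set of distinct tokens seen across all samples plus a few special tokens. The function name vocabulary_size, the whitespace tokenizer, and the specials tuple are assumptions for illustration; the actual project may tokenize Chinese text differently.

```python
def vocabulary_size(samples, specials=("<unk>", "<pad>")):
    """Count distinct tokens over all samples, plus assumed special tokens.

    Whitespace splitting is only a stand-in for a real tokenizer.
    """
    tokens = set()
    for text in samples:
        tokens.update(text.split())
    return len(tokens) + len(specials)


# Three made-up samples, each containing several tokens.
samples = [
    "the movie was great",
    "the movie was bad",
    "great acting",
]

size = vocabulary_size(samples)
print(len(samples), size)
```

The vocabulary size depends on how many different tokens the samples contain, not on how many samples there are, so a thousand or so texts can easily yield several thousand entries.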
from chinese_text_cnn.