
Comments (8)

autoliuweijie commented on September 13, 2024

> Hello, while reproducing your experiment (with no modifications at all), accuracy improves steadily during backbone fine-tuning, but in the distillation stage the acc on both the dev and test sets is identical, at every epoch, to the backbone's final epoch. Did I make a mistake somewhere?

What did you set speed to during distillation? This looks like speed=0.0, which sends every sample all the way to the backbone's final layer.
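
For context, speed acts as an uncertainty threshold for early exit: after each layer's student classifier, a sample leaves the network only when the normalized entropy of its predicted distribution falls below speed, so speed=0.0 can never trigger an exit. Below is a minimal sketch of that routing logic, assuming the entropy-based uncertainty from the FastBERT paper; the function names are illustrative, not the repo's actual code:

```python
import torch

def normalized_entropy(probs: torch.Tensor) -> torch.Tensor:
    # Uncertainty in [0, 1]: entropy divided by its maximum, log(num_classes).
    num_classes = probs.size(-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    return entropy / torch.log(torch.tensor(float(num_classes)))

def early_exit_forward(layers, students, hidden, speed: float):
    # Route one sample through the backbone, exiting at the first layer
    # whose student classifier is confident enough.
    for layer, student in zip(layers, students):
        hidden = layer(hidden)
        probs = torch.softmax(student(hidden), dim=-1)
        # With speed=0.0 this test never passes (uncertainty >= 0),
        # so every sample reaches the final layer.
        if normalized_entropy(probs).item() < speed:
            return probs
    return probs  # fell through: the final classifier's prediction
```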


1125690278 commented on September 13, 2024

> What did you set speed to during distillation? This looks like speed=0.0, which sends every sample all the way to the backbone's final layer.

speed was 0.5, and I used exactly the script you provided.


autoliuweijie commented on September 13, 2024

> speed was 0.5, and I used exactly the script you provided.

Please paste the command you ran and the output printed to the terminal so I can take a look.


1125690278 commented on September 13, 2024

> Please paste the command you ran and the output printed to the terminal so I can take a look.

Script:
CUDA_VISIBLE_DEVICES="0" python -u run_fastbert.py \
    --pretrained_model_path ./models/chinese_bert_base.bin \
    --vocab_path ./models/google_zh_vocab.txt \
    --train_path ./datasets/douban_book_review/train.tsv \
    --dev_path ./datasets/douban_book_review/dev.tsv \
    --epochs_num 3 --batch_size 32 --distill_epochs_num 5 \
    --encoder bert --fast_mode --speed 0.5 \
    --output_model_path ./models/douban_book_review_fastbert.bin
Results:
Epoch id: 3, backbone fine-tuning steps: 100, Avg loss: 0.593
Epoch id: 3, backbone fine-tuning steps: 200, Avg loss: 0.462
Epoch id: 3, backbone fine-tuning steps: 300, Avg loss: 0.493
Epoch id: 3, backbone fine-tuning steps: 400, Avg loss: 0.451
Epoch id: 3, backbone fine-tuning steps: 500, Avg loss: 0.452
Epoch id: 3, backbone fine-tuning steps: 600, Avg loss: 0.449
The number of evaluation instances: 9811
Fast mode: False
Number of model parameters: 85198850.0
FLOPs per sample in average: 10892624128.0
Acc. (Correct/Total): 0.7755 (7608/9811)
Start self-distillation for student-classifiers.
Epoch id: 1, self-distillation steps: 100, Avg loss: 0.532
Epoch id: 1, self-distillation steps: 200, Avg loss: 0.058
Epoch id: 1, self-distillation steps: 300, Avg loss: 0.040
Epoch id: 1, self-distillation steps: 400, Avg loss: 0.033
Epoch id: 1, self-distillation steps: 500, Avg loss: 0.029
Epoch id: 1, self-distillation steps: 600, Avg loss: 0.028
The number of evaluation instances: 9811
Fast mode: True
Number of model parameters: 87192600.0
FLOPs per sample in average: 7352265517.297727
Acc. (Correct/Total): 0.7755 (7608/9811)
Epoch id: 2, self-distillation steps: 100, Avg loss: 0.031
Epoch id: 2, self-distillation steps: 200, Avg loss: 0.023
Epoch id: 2, self-distillation steps: 300, Avg loss: 0.021
Epoch id: 2, self-distillation steps: 400, Avg loss: 0.022
Epoch id: 2, self-distillation steps: 500, Avg loss: 0.022
Epoch id: 2, self-distillation steps: 600, Avg loss: 0.022
The number of evaluation instances: 9811
Fast mode: True
Number of model parameters: 87192600.0
FLOPs per sample in average: 7641473334.97829
Acc. (Correct/Total): 0.7755 (7608/9811)
Epoch id: 3, self-distillation steps: 100, Avg loss: 0.025
Epoch id: 3, self-distillation steps: 200, Avg loss: 0.019
Epoch id: 3, self-distillation steps: 300, Avg loss: 0.019
Epoch id: 3, self-distillation steps: 400, Avg loss: 0.017
Epoch id: 3, self-distillation steps: 500, Avg loss: 0.018
Epoch id: 3, self-distillation steps: 600, Avg loss: 0.019
The number of evaluation instances: 9811
Fast mode: True
Number of model parameters: 87192600.0
FLOPs per sample in average: 7627017668.168383
Acc. (Correct/Total): 0.7755 (7608/9811)
Epoch id: 4, self-distillation steps: 100, Avg loss: 0.023
Epoch id: 4, self-distillation steps: 200, Avg loss: 0.019
Epoch id: 4, self-distillation steps: 300, Avg loss: 0.018
Epoch id: 4, self-distillation steps: 400, Avg loss: 0.018
Epoch id: 4, self-distillation steps: 500, Avg loss: 0.017
Epoch id: 4, self-distillation steps: 600, Avg loss: 0.017
The number of evaluation instances: 9811
Fast mode: True
Number of model parameters: 87192600.0
FLOPs per sample in average: 7627017668.168383
Acc. (Correct/Total): 0.7755 (7608/9811)
Epoch id: 5, self-distillation steps: 100, Avg loss: 0.023
Epoch id: 5, self-distillation steps: 200, Avg loss: 0.018
Epoch id: 5, self-distillation steps: 300, Avg loss: 0.018
Epoch id: 5, self-distillation steps: 400, Avg loss: 0.018
Epoch id: 5, self-distillation steps: 500, Avg loss: 0.018
Epoch id: 5, self-distillation steps: 600, Avg loss: 0.018
The number of evaluation instances: 9811
Fast mode: True
Number of model parameters: 87192600.0
FLOPs per sample in average: 7627017668.168383
Acc. (Correct/Total): 0.7755 (7608/9811)
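
The pattern in this log, falling FLOPs with a pinned Acc, is what self-distillation is designed to produce: the backbone and its final (teacher) classifier are frozen, and only the per-layer student classifiers are fitted to the teacher's soft labels. A sketch of the per-layer distillation loss described in the FastBERT paper, with helper names of my own:

```python
import torch.nn.functional as F

def self_distillation_loss(student_logits_per_layer, teacher_logits):
    # KL(teacher || student) for every intermediate classifier, summed
    # over layers. detach() keeps gradients out of the frozen teacher.
    teacher_probs = F.softmax(teacher_logits.detach(), dim=-1)
    loss = 0.0
    for student_logits in student_logits_per_layer:
        log_student = F.log_softmax(student_logits, dim=-1)
        loss = loss + F.kl_div(log_student, teacher_probs, reduction="batchmean")
    return loss
```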


autoliuweijie commented on September 13, 2024

Judging from the self-distillation results, it does look right: FLOPs go down while Acc stays unchanged.

But that Acc is far below what we see on the Book review dataset. Please double-check that ./models/chinese_bert_base.bin is the correct file, and confirm you are running Python 3.


1125690278 commented on September 13, 2024

> But that Acc is far below what we see on the Book review dataset. Please double-check that ./models/chinese_bert_base.bin is the correct file, and confirm you are running Python 3.

I've confirmed they are correct; everything was downloaded from your links.


autoliuweijie commented on September 13, 2024

> I've confirmed they are correct; everything was downloaded from your links.

You can try the PyPI version: https://github.com/autoliuweijie/FastBERT/tree/master/pypi
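
For anyone hitting the same issue, here is a rough usage sketch of that PyPI package based on its README at the link above; treat the exact argument names and return values as approximate and defer to the README:

```python
from fastbert import FastBERT

# Toy data; replace with a real training set.
labels = ["negative", "positive"]
sents_train = ["这本书太无聊了", "非常精彩的一本书"]
labels_train = ["negative", "positive"]

# Build the model on a Chinese BERT-base kernel, then fine-tune the
# backbone and run self-distillation in one call.
model = FastBERT(kernel_name="google_bert_base_zh", labels=labels)
model.fit(sents_train, labels_train, model_saving_path="./fastbert.bin")

# Inference: speed is the same early-exit threshold discussed above.
label, exec_layer_num = model("这本书写得真好", speed=0.5)
print(label, exec_layer_num)
```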


NovemberSun commented on September 13, 2024

Was this problem ever resolved? In my experiment the self-distillation results also match the backbone's final epoch, and from the 5th to the 10th self-distillation epoch the accuracy does not change at all.

