gpt2-ml's People

Contributors

dependabot[bot], erjanmx, imcaspar, mymusise

gpt2-ml's Issues

[Bug] name your bug

Environment

  • Python version: Python 3.7.5 x64
  • OS: Win10 18363.476
  • (Optional) Other libraries and their versions:
    pandas==0.24.2
    regex==2019.4.14
    h5py==2.9.0
    numpy==1.16.2
    tensorboard==1.13.1
    tensorflow==1.13.1
    tensorflow-estimator==1.13.0
    tqdm==4.31.1
    requests==2.22.0

Error messages, stack traces, or logs

D:\gpt2-ml-master>python scripts/interactive_conditional_samples.py -model_config_fn configs/mega.json -model_ckpt models/mega/model.ckpt-100000 -eos_token 511 -min_len 200 -samples 10
Traceback (most recent call last):
File "scripts/interactive_conditional_samples.py", line 10, in <module>
from train.modeling import GroverModel, GroverConfig, sample
ModuleNotFoundError: No module named 'train'
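The script imports the local train package, which Python can only find if the repo root is on sys.path. A minimal workaround, assuming the standard repo layout (gpt2-ml/scripts/ and gpt2-ml/train/ side by side), is to add the root before the import, or simply run the script from the repo root with PYTHONPATH set to ".":

```python
import os
import sys

def add_repo_root(script_path):
    """Prepend the repo root (the parent of scripts/) to sys.path so that
    'from train.modeling import ...' resolves.  Assumes the standard layout:
    gpt2-ml/scripts/<script>.py next to gpt2-ml/train/."""
    root = os.path.dirname(os.path.dirname(os.path.abspath(script_path)))
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

Calling add_repo_root(__file__) at the top of the script (before the train import) should make the module resolvable regardless of the working directory.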

Additional context (optional)

This is the result when I run it locally. Why? Is it just because I don't have a dedicated GPU? Thanks.

Vocabulary question

Hello, where is the vocabulary file corresponding to the model you released? The word counts of the two vocabulary (tokenization) files provided in the repository do not match the vocabulary the model was trained on at all.
Could you let me know? Thanks!

[Discussion] Vocabulary file for fine-tuning

If I want to fine-tune the version trained on the 30 GB corpus, the vocabulary used by the current prepare_data.py does not seem to match. What adjustments does prepare_data.py need? Is it enough to swap the vocabulary file from BERT's to CLUE's, or does anything else need to change?
@imcaspar

[Discussion] GPT-3

Thank you for the great work. Appendix B of the GPT-3 paper mentions the following. I'm wondering whether the idea has been implemented in gpt2-ml. If not yet, what would you advise regarding how to implement it?

Appendix B.

....
During training we always train on sequences of the full nctx = 2048 token context window, packing multiple documents into a single sequence when documents are shorter than 2048, in order to increase computational efficiency. Sequences with multiple documents are not masked in any special way but instead documents within a sequence are delimited with a special end of text token, giving the language model the information necessary to infer that context separated by the end of text token is unrelated. This allows for efficient training without need for any special sequence-specific masking.
....
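The packing scheme described in the quoted appendix can be sketched as follows. This is an illustration of the idea, not this repo's implementation; eot_id and the context length are parameters:

```python
def pack_documents(docs, eot_id, ctx_len):
    """Concatenate tokenized documents into one stream, delimiting them
    with an end-of-text token, then slice the stream into full ctx_len
    windows (GPT-3 Appendix B style).  No special cross-document masking
    is applied; the delimiter alone marks the document boundary."""
    stream = []
    for doc in docs:
        stream.extend(doc)
        stream.append(eot_id)
    # Keep only complete windows; a partial tail is dropped here for brevity.
    return [stream[i:i + ctx_len]
            for i in range(0, len(stream) - ctx_len + 1, ctx_len)]
```

With ctx_len = 2048 and the tokenizer's end-of-text id, short documents are packed together instead of padded, which is the computational-efficiency point the appendix makes.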

Has the author trained the model into something close to an identity function?

Looking at the generated results, here is a quote from question 1:
【想要学习更多内容,请关注微信号:b##ms##h##200##1】【文章来源:艾锐文化】点击下
方阅读原文查看更多内容。↓↓↓##↓##↓##↓##↓##↓##↓##↓##↓##↓##↓##↓##↓##↓##↓##↓##↓##↓点击"阅读原文"【查看更多内容】

I went back and checked the training parameters and the vocabulary: it is not word-segmented, so reproducing "文章来源" (article source) in the output is acceptable, but producing "艾锐文化" verbatim suggests identity-like output; it is a proper name and should not have such a strong association.
A ** user of Morizeyao/GPT2-Chinese ran into the same thing: the trained model always tends to reproduce the original text...

So my real question is: is this brute-force, OpenAI-scale giant model actually generating, or merely reproducing?

[Bug] Cannot download the file


fatal: destination path 'gpt2-ml' already exists and is not an empty directory.
/content/gpt2-ml
models/mega/model.c [ <=> ] 3.02K --.-KB/s in 0s
Couldn't download the file :-(

Expected: the file downloads successfully.

Environment

Colab

  • Python version:
  • OS:
  • (Optional) Other libraries and their versions:



[Discussion] GPU inference speed

Running inference on a V100 GPU directly with demo.py: with min_len set to 150, each sample takes 80-90 s to generate, and even with min_len reduced to 10 each sample still takes about 30 s. Is this normal? If so, are there settings or methods to speed up inference?

Thanks.

Generation quality

Hello, what is the generation quality of this pretrained model supposed to look like? After converting it to PyTorch, the generated content is very poor, and I'd like to know whether the conversion is the cause.
Thanks!

vocab_size 21130 vs 21128

In mega.json, vocab_size is 21130, but bert-base-chinese-vocab.txt has 21128 lines, which does not match the size in the model. Was the pretrained model trained with this vocab? Thanks.

Question about training

Hello! I'd like to ask:
Is pre-training on a huge corpus and then adapting on a small domain really better than first classifying and cleaning the data and training separate models?

  • A model extracts features from its input, but the features it can represent are still limited, especially for NLP problems. Can a single model really cover everything? As the data volume keeps growing, feature combinations will eventually conflict, reducing the model's accuracy.
  • Is endlessly adding layers, adding data, and growing the model really better than models trained specifically on higher-quality data? By analogy with people, practice makes a specialist: a person can master several skills but is most fluent in the one currently in use, and switching to another still takes some review.
  • In a model this shows up as a tendency to output whatever was trained most recently. So could the data instead be classified (news, translation, finance, and so on), each class trained into its own pretrained model and then fine-tuned, with an adapter layer in front that accepts all inputs and routes each to the appropriate back-end model?
  • Rather than keep increasing the data volume and layer count until the model is broad but imprecise: model parameters are finite, and the combinations obtained by weighted sums still cannot represent every possibility; the more data and layers, the more likely conflicts become, and the model's applicability drops.
    This is just my personal view; comments welcome!

Using GPT-2 for couplets, the results are impressive

The first lines (上联) below were written by people (you can look them up on Baidu); the second lines (下联) were all produced by the machine. The results are impressive; you have probably even forgotten the original matching lines:
【上联】悠悠柳岸落红霞 【下联】寂寂春窗落絮声
【上联】悠悠柳岸落红霞 【下联】袅潺松阴披晚节
【上联】悠悠柳岸落红霞 【下联】郁郁荷塘映彩虹
【上联】悠悠柳岸落红霞 【下联】冉冉荷塘生绿烟
【上联】悠悠柳岸落红霞 【下联】款款江堤荡碧波
【上联】漠漠水田飞白鹭 【下联】茫茫云路隐青山
【上联】悠悠柳岸落红霞 【下联】款款荷塘荡碧波
【上联】漠漠水田飞白鹭 【下联】滔滔波浪涌黄河
【上联】漠漠水田飞白鹭 【下联】悠悠柳岸荡轻舟
【上联】帝道真知,如今都成过去事 【下联】民心所向,自古都是往来人
【上联】旷古圣人才,能以逍遥通世法 【下联】平生名利事,自然淡泊得真如
【上联】公谊不妨私,平日政见分弛,肝胆至今推挚友 【下联】人生何足论?此时心怀坦荡,襟怀自古仰高风
【上联】公谊不妨私,平日政见分弛,肝胆至今推挚友 【下联】人间原是纸,吾意情犹激励,江山从古胜文章
【上联】公谊不妨私,平日政见分弛,肝胆至今推挚友 【下联】子孙何必羡?此时贤亲义士,风霜从古更哀师
【上联】英雄作事无它,只坚忍一心,能成世界能成我 【下联】壮志凌云有志,纵风流万里,不负春秋不负人
【上联】英雄作事无它,只坚忍一心,能成世界能成我 【下联】人间多情至此,在消磨半句,可笑天伦最是家
【上联】共和误民国?民国误共和?百世而后,再平是狱 【下联】以人为天下,天下为己任:万物之中,一物之间
【上联】共和误民国?民国误共和?百世而后,再平是狱 【下联】同心作主人,天地为公道:一尘不染,无欲则刚
【上联】共和误民国?民国误共和?百世而后,再平是狱 【下联】一语关己身,子孙成大孝!四德可先,既往以终
【上联】我以我血荐轩辕 【下联】自有英名垂宇宙
【上联】漠漠水田飞白鹭 【下联】盈盈竹坞醉秋风
PS: Based on Uber's PPLM (Plug and Play Language Models: a Simple Approach to Controlled Text Generation), a bag of words keeps the generated topic from drifting. For example, in the couplets below I supply the first line and the generated second line sticks closely to the theme of "spring"; the results look pretty good:

山 抹 微 云 , 天 粘 衰 草 , 画 角 声 断 谯 门 |水 流 明 月 , 风 送 残 花 , 诗 心 韵 动 江 楼
桃 花 也 解 愁 , 点 点 飘 红 玉 | 柳 絮 才 知 春 , 丝 丝 缠 绿 烟
新 年 都 未 有 芳 华 , 二 月 初 惊 见 草 芽 |旧 友 已 然 成 故 事 , 三 更 又 起 闻 花 香
国 破 山 河 在 , 城 春 草 木 深 | 人 逢 盛 世 来 , 民 乐 天 地 新
红豆生南国,春来发几枝 |青山在北疆,秋去又一年
我看青山多妩媚 | 谁知碧水不妖娆
天 阶 夜 色 凉 如 水 | 人 面 春 风 醉 若 泥

【上联】春山暖日和风,阑干楼阁帘栊 【下联】暮水朝云细雨,别院花木兰芳
【上联】京口瓜洲一水间,钟山只隔数重山 【下联】巴人竹叶千杯里,花雨不沾半缕尘
【上联】春风又绿江南岸,明月何时照我还 【下联】柳岸常依燕子楼,繁花不处有谁家
【上联】碧玉妆成一树高,万条垂下绿丝绦 【下联】红日照亮千畴艳,百鸟争鸣金缕机
【上联】碧玉妆成一树高,万条垂下绿丝绦 【下联】红云照亮千峰翠,百鸟唤来金雀巢
【上联】天阶夜色凉如水 【下联】曲径花声艳若霞
【上联】春风又绿江南岸 【下联】细雨还红陌上桃
【上联】春风又绿江南岸 【下联】旭日重临岭上林
【上联】春风又绿江南岸 【下联】细雨还红陌上桃
【上联】我看青山多妩媚 【下联】谁言梅水不芬芳
Rewritten second lines for the famous poem 《春江花月夜》; it has probably even forgotten the actual next lines, haha:
【上联】春江潮水连海平,海上明月共潮生 【下联】古塔风云绕山青,峰顶彩霞共霭晖
【上联】滟滟随波千万里,何处春江无月明 【下联】幽幽如幻五十年,此间尘世有风流
【上联】江流宛转绕芳甸,月照花林皆似霰 【下联】山势氤氲浮翠霭,风梳杨柳欲飞烟
【上联】斜月沉沉藏海雾,碣石潇湘无限路 【下联】春江漫卷漾烟雨,长河日夜有缘人
【上联】不知江月待何人,但见长江送流水 【下联】只向渔家寻旧迹,试问何处答耕桑
【上联】白云一片去悠悠,青枫浦上不胜愁 【下联】碧水千层横漠漠,赤雁南来正飞歌
【上联】鸿雁长飞光不度,鱼龙潜跃水成文 【下联】渔家小住月相随,鸥鹭往来船载歌
【上联】此时相望不相闻,愿逐月华流照君 【下联】何处觅人犹有约?欲邀仙子去还家
【上联】江畔何人初见月?江月何年初照人 【下联】天下有地可藏龙,海风哪处可观心
【上联】江天一色无纤尘,皎皎空中孤月轮 【下联】杨柳千丝犹有梦,蒙蒙地上满船灯

【上联】转朱阁,低绮户,照无眠,不应有恨 【下联】举大樽,邀清风,斟古月,何必无言
【上联】举杯邀明月,对影成三人 【下联】临屏赏白梅,随心咏一声
【上联】胸无点墨 【下联】口有余香
【上联】十口心思,思君思国思社稷 【下联】一腔热泪,恨己恨他恨人民
【上联】破釜沉舟,三千越甲可吞吴 【下联】乘风破浪,六十回头能啸秦
【上联】卧薪尝胆,三千越甲可吞吴 【下联】伏釜沉舟,四海扬波能缚蛟
【上联】三杯两盏淡酒,怎敌他晚来风急 【下联】一醉几巡闲言,便知是梦了诗成
【上联】燕草如碧丝,秦桑低绿枝 【下联】夕阳似金锦,柳枝悬红线
【上联】庭院深深深几许 【下联】春山隐隐隐孤身
【上联】庭院深深深几许 【下联】梅花点点点春香

【上联】漠漠水田飞白鹭,阴阴夏木啭黄鹂 【下联】绵绵秋雨润禾苗,处处山林映玉蝉
【上联】漠漠水田飞白鹭,阴阴夏木啭黄鹂 【下联】依依山径落斜阳,袅袅炊烟绕翠竹

【上联】五岭横云一片青,衡山积雪终年白 【下联】九州铺锦八方绿,林海堆红满眼春
【上联】五岭横云一片青,衡山积雪终年白 【下联】双峰映日十分碧,函谷春深几度红

【上联】洞庭浩渺通长江,春来水涨连天碧 【下联】华阁崔巍迎盛会,时过车奔遍地红
【上联】慷慨歌燕市,从容作楚囚 【下联】清音醉杏坛,寂寞成唐诗
【上联】赤橙黄绿青蓝紫,谁持彩练当空舞 【下联】柴米油盐酱醋茶,客为力军作秀言

【上联】茫茫九派流**,沉沉一线穿南北 【下联】滔滔三江溯太行,滚浩万年贯古今
【上联】茫茫九派流**,沉沉一线穿南北 【下联】浩浩千春耀上京,滚滚三江舞东西
【上联】茫茫九派流**,沉沉一线穿南北 【下联】荡荡东洋过大江,滚滚千流过古今
【上联】茫茫九派流**,沉沉一线穿南北 【下联】悠悠三湾锁西洋,滚滚千商过东西

【上联】横空出世,莽昆仑,阅尽人间春色 【下联】下笔挥毫,成巨匠,绘出生态蓝图
【上联】飞起玉龙三百万,搅得周天寒彻 【下联】勾起渭河数千年,汇来秦土风情
【上联】飞起玉龙三百万,搅得周天寒彻 【下联】携来佛法五千言,引来妙谛玄机
【上联】我失骄杨君失柳,杨柳轻扬直上重霄九 【下联】春回大地雪迎雷,地冰乍起先苏万物锦
【上联】夜长天色总难明,寂寞披衣起坐数寒星 【下联】人静心灰都是幻,相思枕席轻敲几尺丝
【上联】夜长天色总难明,寂寞披衣起坐数寒星 【下联】人在画图空有趣,从容拄杖谈玄一醉开
【上联】夜长天色总难明,寂寞披衣起坐数寒星 【下联】金风玉露悄然冷,相思倚枕听啼无归人
【上联】六月天兵征腐恶,万丈长缨要把鲲鹏缚 【下联】一支玉笔写春秋,千年遗恨永铭将士碑

【上联】百万工农齐踊跃,席卷江西直捣湘和鄂 【下联】五千兵戈俱喧阗,水归淮北空余屈子魂
【上联】百万工农齐踊跃,席卷江西直捣湘和鄂 【下联】千余老幼尽欢颜,联惊海外长怀赵与韩
【上联】百万工农齐踊跃,席卷江西直捣湘和鄂 【下联】三千志士竞登攀,功成世界全凭武与文
【上联】百万工农齐踊跃,席卷江西直捣湘和鄂 【下联】三千豪杰竞风流,舟行淮北犹歌宋与唐
【上联】百万工农齐踊跃,席卷江西直捣湘和鄂 【下联】七旬岁月竞风流,梦牵河北长扬楚与吴
【上联】此去泉台招旧部,旌旗十万斩阎罗 【下联】我来江畔谒崇陵,遗烈八年壮国威 (Note: Chongling 崇陵 is the joint tomb of the Guangxu Emperor, Aisin-Gioro Zaitian, and Empress Dowager Longyu, Yehe Nara Jingfen)
【上联】此去泉台招旧部,旌旗十万斩阎罗 【下联】回首江山换新颜,春花一样笑杨梅
【上联】此去泉台招旧部,旌旗十万斩阎罗 【下联】重归海岛护同盟,风雨千秋祭祖师

【上联】此去泉台招旧部,旌旗十万斩阎罗 【下联】且邀明月饮新醪,玉液千杯酹泰山 (Note: 醪 is newly brewed wine; 酹 means pouring wine on the ground as a libation)
【上联】此去泉台招旧部,旌旗十万斩阎罗 【下联】今观国粹谱新篇,联对千家颂党恩
【上联】此去泉台招旧部,旌旗十万斩阎罗 【下联】重归梓里吊忠魂,天地三千开后人
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】帝无前事佐明主,却图一战扫烟云
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】人生向上多自力,始终一战统山河
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】死犹未免埋白骨,亦如一抔葬忠魂
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】人心臣首怀赤胆,未能一死竟平生
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】人思此处访遗庙,未经千载是英雄
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】不由后死赴江汉,岂独一败论英雄
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】谁似臣心报国门?至今百战识英雄
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】自悲吾辈失青鬓,竟成千古痛元君
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】我来仙界横紫气,定披七彩绣河山
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】不为世界哭元老,且持三字表心胸
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】世风余韵遗青史,岂惟五代识君臣
【上联】天命吾身踏黄泉,定起万军夺阴曹 【下联】世间此地留忠骨,长教千载祀英雄

Error loading the model

I downloaded the released model and get the following error:
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file ./model.ckpt-100000: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
What is the problem? Thanks.
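This "not an sstable (bad magic number)" error typically appears when TF's Saver is given a shard file (e.g. model.ckpt-100000.data-00000-of-00001) or an incomplete/HTML-error-page download instead of the checkpoint prefix model.ckpt-100000. A small helper to derive the prefix, a sketch assuming standard TensorFlow checkpoint naming rather than a fix confirmed by the maintainers:

```python
def ckpt_prefix(path):
    """Strip TF checkpoint shard suffixes so Saver.restore() receives the
    checkpoint *prefix* (model.ckpt-100000), not a shard file.  Assumes
    standard TF naming: <prefix>.index, <prefix>.meta,
    <prefix>.data-XXXXX-of-XXXXX (or a bare <prefix>.data)."""
    for suffix in (".index", ".meta"):
        if path.endswith(suffix):
            return path[: -len(suffix)]
    i = path.find(".data-")
    if i == -1 and path.endswith(".data"):
        i = len(path) - len(".data")
    return path[:i] if i != -1 else path
```

If the prefix is already correct, a mismatched download (compare the file size against the published checkpoint size) is the other common cause.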

No module named 'train'

from train.modeling import GroverModel, GroverConfig, sample

ModuleNotFoundError: No module named 'train'

I'm using conda create -n ml2 python=3.7. Here is pip list:

Package Version

absl-py 0.9.0
astor 0.8.1
attrs 19.3.0
backcall 0.1.0
bleach 3.1.0
certifi 2019.11.28
chardet 3.0.4
colorama 0.4.3
decorator 4.4.1
defusedxml 0.6.0
entrypoints 0.3
gast 0.3.3
grpcio 1.27.2
h5py 2.10.0
idna 2.8
importlib-metadata 1.5.0
ipykernel 5.1.4
ipython 7.12.0
ipython-genutils 0.2.0
ipywidgets 7.5.1
jedi 0.16.0
Jinja2 2.11.1
joblib 0.14.1
jsonschema 3.2.0
jupyter 1.0.0
jupyter-client 5.3.4
jupyter-console 6.1.0
jupyter-core 4.6.1
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.0
Markdown 3.2.1
MarkupSafe 1.1.1
mistune 0.8.4
mkl-fft 1.0.15
mkl-random 1.1.0
mkl-service 2.3.0
mock 4.0.1
nbconvert 5.6.1
nbformat 5.0.4
notebook 6.0.3
numpy 1.18.1
pandas 0.24.2
pandocfilters 1.4.2
parso 0.6.1
pickleshare 0.7.5
pip 20.0.2
prometheus-client 0.7.1
prompt-toolkit 3.0.3
protobuf 3.11.4
Pygments 2.5.2
pyreadline 2.1
pyrsistent 0.15.7
python-dateutil 2.8.1
pytz 2019.3
pywin32 227
pywinpty 0.5.7
pyzmq 18.1.1
qtconsole 4.6.0
regex 2019.4.14
requests 2.22.0
scikit-learn 0.22.1
scipy 1.4.1
Send2Trash 1.5.0
setuptools 45.2.0.post20200210
six 1.14.0
tensorboard 1.13.1
tensorflow 1.13.1
tensorflow-estimator 1.13.0
termcolor 1.1.0
terminado 0.8.3
testpath 0.4.4
tornado 6.0.3
tqdm 4.31.1
traitlets 4.3.3
urllib3 1.25.8
wcwidth 0.1.8
webencodings 0.5.1
Werkzeug 1.0.0
wheel 0.34.2
widgetsnbextension 3.5.1
wincertstore 0.2
zipp 2.2.0

Usage

Thanks for sharing.
@imcaspar Could you write up how to use it?

  1. Steps for continuing pre-training from this model
  2. Steps for running inference with this model

Thanks

How to convert to a PyTorch model

I can't convert it with the gpt2 converter in transformers. The command:
transformers gpt2 ./gpt-ml/mega/model.ckpt-10000 ./output ./mega.json
It fails with GPT2Model has no attribute "_step", then "shape".

If anyone has converted it successfully, could you share how? Thanks!

[Discussion] models/mega/model.ckpt-100000.data的问题

Could not open models/mega/model.ckpt-100000.data: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
Has anyone seen a similar problem? The SHA is correct after downloading.
...
...
mlp_ln1
mlp_ln0
mlp_ln1
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
2019-12-04 11:03:11.980040: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open models/mega/model.ckpt-100000.data: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2019-12-04 11:03:11.980242: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open models/mega/model.ckpt-100000.data: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2019-12-04 11:03:11.980286: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at save_restore_tensor.cc:175 : Data loss: Unable to open table file models/mega/model.ckpt-100000.data: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file models/mega/model.ckpt-100000.data: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[{{node save/RestoreV2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "scripts/interactive_conditional_samples.py", line 171, in <module>
saver.restore(sess, args.model_ckpt)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1276, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file models/mega/model.ckpt-100000.data: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[node save/RestoreV2 (defined at scripts/interactive_conditional_samples.py:170) ]]

Caused by op 'save/RestoreV2', defined at:
File "scripts/interactive_conditional_samples.py", line 170, in <module>
saver = tf.train.Saver()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 832, in __init__
self.build()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 844, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 881, in _build
build_save=build_save, build_restore=build_restore)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
restore_sequentially, reshape)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
restore_sequentially)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()

DataLossError (see above for traceback): Unable to open table file models/mega/model.ckpt-100000.data: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[node save/RestoreV2 (defined at scripts/interactive_conditional_samples.py:170) ]]

[Bug] bug of sequence length

    for article in buffered_and_sliding_window_article_iterator(tokenizer,
                           final_desired_size=max(args.max_seq_length + 1, 1025)):
        writer2use = train_writer
        assert len(article['input_ids']) == (args.max_seq_length + 1)

The above is lines 188-191 of prepare.py. Aren't the sequence lengths assumed on lines 189 and 191 contradictory? On line 189, whenever max_seq_length < 1024 the sequence length is always taken to be 1025.
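The mismatch can be demonstrated by comparing the two expressions directly (a sketch of the quoted logic, not a maintainer-confirmed patch):

```python
def window_vs_assert(max_seq_length):
    """Reproduce the two lengths used in prepare.py: the iterator's
    window size (final_desired_size) and the length the assert expects."""
    window = max(max_seq_length + 1, 1025)   # final_desired_size argument
    expected = max_seq_length + 1            # length asserted on input_ids
    return window, expected, window == expected
```

For max_seq_length >= 1024 the two agree; below that, the assert must fail, unless final_desired_size is changed to max_seq_length + 1 (or the 1025 floor is what's intended and the assert should be relaxed).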

[Discussion] Vocabulary file

Hi all,

Thanks for the cool contribution :) I'm trying to use your repo to pre-train GPT-2 from scratch on other languages (not English nor Chinese). Could you say a bit more on how you generated your vocabulary (which library you used? sentencepiece?)? Also, what kind of vocabulary is it (BPE, etc...)?

If I understood well, when I have my vocab file for my language, I can then create the tfrecords for my data and run the train.py file?

Thanks a lot in advance!

The release time of 1.5B model trained on 50G corpus

First of all, GREAT THANKS for the release of the big Chinese GPT-2 model! I have tested with some primes and the results look very good.

So I am really looking forward to the release of the model trained with even more data and more epochs. Could I know the expected release time, since the author said it should be due in early December?

Thanks very much!

Smaller GPT

Is there a smaller model that can be fine-tuned on a GPU?
