Comments (11)
See the solution here:
#78
from fengshenbang-lm.
I changed the jieba temporary file path,
but the result did not change.
@ganzhiruyi could you take a look when you have time? Thanks~
from fengshenbang-lm.
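Hi @jiangliqin, could you paste your tokenizer code in more detail so that I can help locate the problem? Judging from your screenshot, jieba has already been imported successfully.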
from fengshenbang-lm.
Hello @dongxqm, it is exactly the example code:
from transformers import PegasusForConditionalGeneration
# tokenizers_pegasus.py is a local script (not part of transformers) and must be importable
from tokenizers_pegasus import PegasusTokenizer

model = PegasusForConditionalGeneration.from_pretrained("IDEA-CCNL/Randeng-Pegasus-523M-Summary-Chinese")
tokenizer = PegasusTokenizer.from_pretrained("IDEA-CCNL/Randeng-Pegasus-523M-Summary-Chinese")

text = "据微信公众号“界面”报道,4日上午10点左右,**发改委反垄断调查小组突击查访奔驰上海办事处,调取数据材料,并对多名奔驰高管进行了约谈。截止昨日晚9点,包括北京梅赛德斯-奔驰销售服务有限公司东区总经理在内的多名管理人员仍留在上海办公室内"
inputs = tokenizer(text, max_length=1024, return_tensors="pt")
# Generate the summary token ids, then decode them back to text
summary_ids = model.generate(inputs["input_ids"])
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])
from fengshenbang-lm.
Is the "老鼠老鼠……" ("mouse mouse…") above the model's output or its input? Are there any other errors? Is the word segmentation wrong?
from fengshenbang-lm.
It is the output; there are no other errors.
from fengshenbang-lm.
The Randeng-Pegasus-238M-Summary-Chinese model gives normal results, but Randeng-Pegasus-523M-Summary-Chinese still gives wrong results.
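When comparing the two checkpoints, it may help to flag degenerate output such as "老鼠老鼠……" automatically. The following is only a rough sketch: `repetition_ratio`, `looks_degenerate`, and the 0.3 threshold are hypothetical names and values chosen for illustration and are not part of Fengshenbang-LM or transformers.

```python
from collections import Counter


def repetition_ratio(text: str, n: int = 2) -> float:
    """Return the share of character n-grams taken up by the most frequent n-gram."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    most_common_count = Counter(grams).most_common(1)[0][1]
    return most_common_count / len(grams)


def looks_degenerate(text: str, threshold: float = 0.3) -> bool:
    """Heuristic: a summary dominated by one repeated n-gram is probably broken.

    The threshold is an assumed value and should be tuned on real outputs.
    """
    return repetition_ratio(text) >= threshold
```

The decoded summary from each checkpoint can be passed to `looks_degenerate` to see which one produces the repeated output.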
from fengshenbang-lm.
model = PegasusForConditionalGeneration.from_pretrained(
    "IDEA-CCNL/Randeng-Pegasus-523M-Summary-Chinese",
    output_hidden_states=True,
    cache_dir="./test_pegasus_dir",  # specify the Hugging Face cache directory
)
from fengshenbang-lm.
On Windows, I replaced tmp.dir in tokenizers_pegasus.py with a local folder; after that, running the code downloads everything automatically.
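The change described above can be sketched roughly as follows. This is an assumption-laden illustration, not the actual contents of tokenizers_pegasus.py: the folder name `./jieba_tmp` and the helper `ensure_local_tmp_dir` are hypothetical, and the script may import a jieba variant rather than `jieba` itself. The only library detail relied on is that jieba's default tokenizer object `jieba.dt` has a `tmp_dir` attribute that controls where its dictionary cache is written.

```python
import os
from pathlib import Path


def ensure_local_tmp_dir(path: str = "./jieba_tmp") -> str:
    """Create the folder if it does not exist and return its absolute path."""
    folder = Path(path).resolve()
    folder.mkdir(parents=True, exist_ok=True)
    return str(folder)


# Hypothetical local folder; any writable directory should do.
tmp_dir = ensure_local_tmp_dir()

try:
    import jieba

    # Point jieba's default tokenizer at the local folder for its cache file.
    jieba.dt.tmp_dir = tmp_dir
except ImportError:
    # jieba is not installed here; only the folder is prepared.
    pass
```

Using an absolute path avoids surprises when the script is launched from a different working directory.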
from fengshenbang-lm.
@jiangliqin Has the problem been solved?
from fengshenbang-lm.