Using the Randeng-Pegasus series reports file not found: /cognitive_comp/dongxiaoqun/software/jieba/tmp/tmpk8ungvhc (fengshenbang-lm, CLOSED)

idea-ccnl commented on June 8, 2024

When using the Randeng-Pegasus series, I get a file-not-found error for /cognitive_comp/dongxiaoqun/software/jieba/tmp/tmpk8ungvhc

from fengshenbang-lm.

Comments (11)

ganzhiruyi commented on June 8, 2024

See the solution here:
#78

jiangliqin commented on June 8, 2024

I changed the jieba temporary file path, but the result did not change.

@ganzhiruyi could you take a look when you have time? Thanks~

dongxqm commented on June 8, 2024

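Hi @jiangliqin, could you post your tokenizer usage code in more detail so I can help locate the problem? Judging from your screenshot, jieba has already been imported successfully.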

jiangliqin commented on June 8, 2024

Hi @dongxqm, it is exactly the example code:

```python
from transformers import PegasusForConditionalGeneration
# tokenizers_pegasus.py is the tokenizer file that ships with the example code
from tokenizers_pegasus import PegasusTokenizer

model = PegasusForConditionalGeneration.from_pretrained("IDEA-CCNL/Randeng-Pegasus-523M-Summary-Chinese")
tokenizer = PegasusTokenizer.from_pretrained("IDEA-CCNL/Randeng-Pegasus-523M-Summary-Chinese")

# Chinese news text to summarize (kept verbatim as model input)
text = "据微信公众号“界面”报道，4日上午10点左右，**发改委反垄断调查小组突击查访奔驰上海办事处，调取数据材料，并对多名奔驰高管进行了约谈。截止昨日晚9点，包括北京梅赛德斯-奔驰销售服务有限公司东区总经理在内的多名管理人员仍留在上海办公室内"
inputs = tokenizer(text, max_length=1024, return_tensors="pt")

# Generate the summary token ids and decode them back to text
summary_ids = model.generate(inputs["input_ids"])
tokenizer.batch_decode(summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
```

dongxqm commented on June 8, 2024

Is the "老鼠老鼠……" ("mouse mouse ...") above the model's output or its input? Are there any other errors? Is the word segmentation wrong?

jiangliqin commented on June 8, 2024

It is the output, and there are no other errors.

dongxqm commented on June 8, 2024

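It works fine in my tests. I suggest double-checking that the model loads correctly and that the tokenizer segments the text correctly; print out each step and take a look.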
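
The step-by-step check suggested here could be sketched roughly as follows. This is only an illustration under assumptions: `repetition_ratio` and `inspect_summary_pipeline` are hypothetical helper names, and `tokenizers_pegasus` is assumed to be the tokenizer file from the example code, importable from the working directory. Some entries in the reported missing keys may be expected for this architecture, so treat the printout as a hint rather than a verdict.

```python
def repetition_ratio(text: str, n: int = 2) -> float:
    """Fraction of character n-grams that repeat an earlier n-gram.

    0.0 means every n-gram is unique; values approaching 1.0 mean heavy
    repetition (e.g. "老鼠老鼠老鼠...").
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    return 1.0 - len(set(grams)) / len(grams)


def inspect_summary_pipeline(model_name: str, text: str) -> str:
    """Print each intermediate step so a bad load or bad segmentation is visible."""
    # Heavy imports stay local so repetition_ratio works without them.
    from transformers import PegasusForConditionalGeneration
    from tokenizers_pegasus import PegasusTokenizer  # assumed: local file from the examples

    # output_loading_info=True also returns which weights were missing or mismatched.
    model, info = PegasusForConditionalGeneration.from_pretrained(
        model_name, output_loading_info=True
    )
    print("missing keys:", info["missing_keys"])
    print("mismatched keys:", info["mismatched_keys"])

    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    inputs = tokenizer(text, max_length=1024, return_tensors="pt")
    ids = inputs["input_ids"][0].tolist()
    print("input ids:", ids)
    print("input tokens:", tokenizer.convert_ids_to_tokens(ids))

    summary_ids = model.generate(inputs["input_ids"])
    summary = tokenizer.batch_decode(
        summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
    )[0]
    print("summary:", summary)
    print("repetition ratio:", round(repetition_ratio(summary), 2))
    return summary
```

Calling `inspect_summary_pipeline("IDEA-CCNL/Randeng-Pegasus-523M-Summary-Chinese", text)` and comparing the printout against the same call for the 238M checkpoint may help narrow down which step differs.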

jiangliqin commented on June 8, 2024

The Randeng-Pegasus-238M-Summary-Chinese model gives normal results, but Randeng-Pegasus-523M-Summary-Chinese still gives wrong results.

dongxqm commented on June 8, 2024

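I can't reproduce the problem you describe; my guess is that the model was not loaded correctly. Try clearing the Hugging Face cache, or specify a new model cache directory and try again, as in the following code:

model = PegasusForConditionalGeneration.from_pretrained(
    "IDEA-CCNL/Randeng-Pegasus-523M-Summary-Chinese",
    output_hidden_states=True,
    cache_dir="./test_pegasus_dir",  # Hugging Face cache directory to use
)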

oblialbum commented on June 8, 2024

On Windows, I replaced tmp.dir in tokenizers_pegasus.py with a local folder; after that, running the code downloads everything automatically.
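
A minimal sketch of that workaround, assuming the setting in question is jieba's `jieba.dt.tmp_dir` attribute (the directory where the default tokenizer writes its cache). The helper names and the `jieba_tmp` folder name are hypothetical, and whether `tokenizers_pegasus.py` sets this attribute itself has not been verified here.

```python
import os
import tempfile


def make_jieba_tmp_dir(base=None) -> str:
    """Create (if needed) a writable directory for jieba's cache and return its path."""
    base = base or tempfile.gettempdir()
    path = os.path.join(base, "jieba_tmp")  # hypothetical folder name
    os.makedirs(path, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"{path} is not writable")
    return path


def point_jieba_at(path: str) -> None:
    """Tell jieba's default tokenizer to write its cache under `path`."""
    import jieba  # imported lazily; requires `pip install jieba`

    jieba.dt.tmp_dir = path
```

If `tokenizers_pegasus.py` assigns its own path at import time, `point_jieba_at(make_jieba_tmp_dir())` would need to run after that import and before the first tokenization.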

dongxqm commented on June 8, 2024

@jiangliqin Has the problem been solved?
