Comments (6)
transformers version 4.40.2. The problem is the same no matter whether I upgrade or downgrade.
from knowlm.
Hello, I will put together a single-file Gradio demo for you shortly, expected within the next day or two.
from knowlm.
Hello, below is the code with streaming output removed:
import gradio as gr
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# @torch.no_grad()
def generate_response(instruction, text="", temperature=1.0, top_p=0.9, top_k=50, num_beams=1, max_new_tokens=50, repetition_penalty=1.0):
    with torch.no_grad():
        # Build the Alpaca-style prompt, with or without the optional input text.
        if text != "":
            input_text = f"Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Input:\n{text}\n\n### Response:\n"
        else:
            input_text = f"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:\n"
        input_ids = tokenizer.encode(input_text, return_tensors='pt').to('cuda')
        output_ids = model.generate(
            input_ids,
            max_length=input_ids.shape[1] + max_new_tokens,
            temperature=temperature,
            top_k=top_k,
            top_p=top_p,
            num_beams=num_beams,
            repetition_penalty=repetition_penalty,
        )
        output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
        # Strip the prompt and return only the newly generated completion.
        return output_text[len(input_text):]

if __name__ == '__main__':
    model_name = "zjunlp/knowlm-13b-zhixi"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.bfloat16, device_map="auto",
        load_in_8bit=True
    )
    interface = gr.Interface(
        fn=generate_response,
        inputs=[
            gr.Textbox(label="Instruction", placeholder="Enter instruction here...", lines=2, value="""从给定的文本中提取出可能的实体和实体类型,可选的实体类型为['地点', '人名'],以(实体,实体类型)的格式回答。"""),
            gr.Textbox(label="Optional Text", placeholder="Enter optional text here...", lines=2, optional=True, value="""John昨天在纽约的咖啡馆见到了他的朋友Merry。他们一起喝咖啡聊天,计划着下周去加利福尼亚(California)旅行。他们决定一起租车并预订酒店。他们先计划在下周一去圣弗朗西斯科参观旧金山大桥,下周三去洛杉矶拜访Merry的父亲威廉。"""),
            gr.Slider(label="Temperature", minimum=0.1, maximum=2.0, value=1.0, step=0.1),
            gr.Slider(label="Top p", minimum=0.0, maximum=1.0, value=0.9, step=0.01),
            gr.Slider(label="Top k", minimum=0, maximum=100, value=50, step=1),
            gr.Slider(label="Number of Beams", minimum=1, maximum=10, value=1),
            gr.Slider(label="Max New Tokens", minimum=1, maximum=512, value=50),
            gr.Slider(label="Repetition Penalty", minimum=0.1, maximum=1.6, value=1.0, step=0.1)
        ],
        outputs="text",
        title="Zhixi",
        description="<center>https://github.com/zjunlp/knowlm</center>"
    )
    interface.launch()
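One note on the generate call in the script above: it passes temperature, top_p, and top_k but never sets do_sample=True, so under transformers' default greedy/beam decoding those parameters are silently ignored (the same mismatch the UserWarnings later in this thread complain about). A minimal sketch of the call with sampling enabled, everything else unchanged:

# Sketch only: do_sample=True makes temperature/top_p/top_k take effect;
# max_new_tokens replaces the manual max_length arithmetic, which recent
# transformers versions support directly.
output_ids = model.generate(
    input_ids,
    max_new_tokens=max_new_tokens,
    do_sample=True,
    temperature=temperature,
    top_k=top_k,
    top_p=top_p,
    num_beams=num_beams,
    repetition_penalty=repetition_penalty,
)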
from knowlm.
It still doesn't work, and I can't tell whether it is an environment problem or something else.
With the OneKE model, the error is as follows:
(factory) aeye@aeye-176:/raid/liulei/KnowLM/examples$ python test.py
The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:05<00:00, 1.98s/it]
Traceback (most recent call last):
File "/raid/liulei/KnowLM/examples/test.py", line 40, in <module>
gr.Textbox(label="Optional Text", placeholder="Enter optional text here...", lines=2, optional=True, value="""John昨天在纽约的咖啡馆见到了他的朋友Merry。他们一起喝咖啡聊天,计划着下周去加利福尼亚(California)旅行。他们决定一起租车并预订酒店。他们先计划在下周一去圣弗朗西斯科参观旧金山大桥,下周三去洛杉矶拜访Merry的父亲威廉。"""),
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/gradio/component_meta.py", line 163, in wrapper
return fn(self, **kwargs)
TypeError: Textbox.__init__() got an unexpected keyword argument 'optional'
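Two things in this run look fixable on the user side; the following is a sketch under assumptions, not something confirmed in the thread. The deprecation warning asks for a BitsAndBytesConfig, and the TypeError comes from the `optional` keyword, which the Gradio 4.x Textbox no longer accepts (the gradio/component_meta.py path in the traceback suggests Gradio 4):

# Fix sketch (assumes Gradio 4.x and a recent transformers version):
from transformers import BitsAndBytesConfig

# Replace the deprecated load_in_8bit kwarg with an explicit quantization config.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

# Drop the removed `optional` keyword; an empty default value already makes
# the field optional in practice.
gr.Textbox(label="Optional Text", placeholder="Enter optional text here...", lines=2)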
With the knowlm-13b-zhixi model, the error is:
(factory) aeye@aeye-176:/raid/liulei/KnowLM/examples$ python test.py
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Traceback (most recent call last):
File "/raid/liulei/KnowLM/examples/test.py", line 31, in <module>
tokenizer = AutoTokenizer.from_pretrained(model_name)
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 889, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2163, in from_pretrained
return cls._from_pretrained(
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2397, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 173, in __init__
self.update_post_processor()
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 186, in update_post_processor
bos_token_id = self.bos_token_id
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1187, in bos_token_id
return self.convert_tokens_to_ids(self.bos_token)
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 349, in convert_tokens_to_ids
return self._convert_token_to_id_with_added_voc(tokens)
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 356, in _convert_token_to_id_with_added_voc
return self.unk_token_id
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1206, in unk_token_id
return self.convert_tokens_to_ids(self.unk_token)
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 349, in convert_tokens_to_ids
return self._convert_token_to_id_with_added_voc(tokens)
File "/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 356, in _convert_token_to_id_with_added_voc
But with generate_lora.py, inference works fine, although there are some warnings:
/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:540: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.4` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(
/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:545: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.75` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(
/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:562: UserWarning: `do_sample` is set to `False`. However, `top_k` is set to `40` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_k`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(
GenerationConfig {
"num_beams": 2,
"repetition_penalty": 1.3,
"temperature": 0.4,
"top_k": 40,
"top_p": 0.75
}
/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:540: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.4` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
warnings.warn(
/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:545: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.75` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
warnings.warn(
/home/aeye/anaconda3/envs/factory/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:562: UserWarning: `do_sample` is set to `False`. However, `top_k` is set to `40` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_k`.
warnings.warn(
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token.As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
{"症状": ["咳嗽", "咳痰"], "总病程": "15年", "每次发病时长": "4-5个月", "本次发病时长": "4-5个月"}
from knowlm.
If you are using OneKE, please run the following command:
CUDA_VISIBLE_DEVICES=0 python examples/generate_lora_web.py --base_model zjunlp/oneke --model_tag oneke
If you are using ZhiXi, please run the following command:
CUDA_VISIBLE_DEVICES=0 python examples/generate_lora_web.py --base_model zjunlp/knowlm-13b-zhixi --model_tag zhixi
from knowlm.
Has your issue been resolved?
from knowlm.