modelscope / modelscope-agent

ModelScope-Agent: An agent framework connecting models in ModelScope with the world

Home Page: https://modelscope-agent.readthedocs.io/en/latest/

License: Apache License 2.0

Python 96.58% CSS 2.27% Shell 0.78% Dockerfile 0.30% JavaScript 0.04% Makefile 0.03%
agent gpts chatglm-4 llm qwen open-gpts multi-agents mobile-agent assistantapi chatbot

modelscope-agent's Introduction

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models



ModelScope Hub | Paper | Demo
中文  |  English


Introduction

Modelscope-Agent is a customizable and scalable Agent framework. A single agent has abilities such as role-playing, LLM calling, tool usage, planning, and memory. It mainly has the following characteristics:

  • Simple agent implementation process: Simply specify the role instruction, LLM name, and tool name list to build an agent application. The framework automatically arranges the workflows for tool usage, planning, and memory.
  • Rich models and tools: The framework is equipped with a wide range of LLM interfaces (Dashscope, ModelScope, OpenAI-compatible APIs, etc.) and built-in tools (code interpreter, weather query, text-to-image, web browsing, etc.), making it easy to customize dedicated agents.
  • Unified interface and high scalability: The framework has clear tool and LLM registration mechanisms, making it convenient for users to build more diverse agent applications.
  • Low coupling: Developers can easily use the built-in tools, LLMs, memory, and other components without needing to bind them to a higher-level agent.

🎉 News

  • 🔥July 17, 2024: Parallel tool calling is now supported on Modelscope-Agent-Server; please find details in the doc.
  • 🔥June 17, 2024: Upgraded the RAG flow based on LlamaIndex, allowing users to run hybrid knowledge search across different strategies and modalities; please find details in the doc.
  • 🔥June 6, 2024: With Modelscope-Agent-Server, Qwen2 can be used through the OpenAI SDK with tool calling ability; please find details in the doc.
  • 🔥June 4, 2024: Modelscope-Agent now supports Mobile-Agent-V2 (arXiv), based on an Android ADB environment; please check the application.
  • 🔥May 17, 2024: Modelscope-Agent now supports multi-role room chat in Gradio.
  • May 14, 2024: Modelscope-Agent supports image input in RolePlay agents with the latest OpenAI model GPT-4o. Developers can experience this feature by specifying the image_url parameter.
  • May 10, 2024: Modelscope-Agent launched a user-friendly Assistant API, and also provides a Tools API that executes utilities in isolated, secure containers; please find the document.
  • Apr 12, 2024: The Ray version of the multi-agent solution is available in modelscope-agent; please find the document.
  • Mar 15, 2024: Modelscope-Agent and AgentFabric (an open-source version of GPTs) are running in the production environment of ModelScope Studio.
  • Feb 10, 2024: Over Chinese New Year, we upgraded modelscope-agent to version v0.3 so that developers can customize various types of agents more conveniently through code and build multi-agent demos more easily. For more details, you can refer to #267 and #293.
  • Nov 26, 2023: AgentFabric now supports collaborative use in ModelScope's Creation Space, allowing custom applications to be shared there. The update also includes the latest GTE text embedding integration.
  • Nov 17, 2023: AgentFabric was released, an interactive framework that facilitates the creation of agents tailored to various real-world applications.
  • Oct 30, 2023: A local version of the Facechain Agent was released that can be run locally. For detailed usage instructions, please refer to Facechain Agent.
  • Oct 25, 2023: A local version of the Story Agent for generating storybook illustrations was released; it can also be run locally. For detailed usage instructions, please refer to Story Agent.
  • Sep 20, 2023: ModelScope GPT offers a local version through gradio that can be run locally. You can navigate to the demo/msgpt/ directory and execute bash run_msgpt.sh.
  • Sep 4, 2023: Three demos, demo_qwen, demo_retrieval_agent and demo_register_tool, have been added, along with detailed tutorials provided.
  • Sep 2, 2023: The preprint paper associated with this project was published.
  • Aug 22, 2023: Support accessing various AI model APIs using ModelScope tokens.
  • Aug 7, 2023: The initial version of the modelscope-agent repository was released.

Installation

Clone the repo and install the dependencies:

git clone https://github.com/modelscope/modelscope-agent.git
cd modelscope-agent && pip install -r requirements.txt

ModelScope Notebook [recommended]

The ModelScope Notebook offers a free tier that allows ModelScope users to run the demo notebook with minimal setup; refer to ModelScope Notebook.

# Step 1: My Notebook -> PAI-DSW -> GPU environment

# Step 2: Download the [demo file](https://github.com/modelscope/modelscope-agent/blob/master/demo/demo_qwen_agent.ipynb) and upload it to the GPU instance.

# Step 3: Execute the demo notebook in order.

Quickstart

The agent incorporates an LLM along with task-specific tools, and uses the LLM to determine which tool or tools to invoke in order to complete the user's tasks.

To get started, all you need to do is initialize a RolePlay object with the corresponding tasks:

  • This sample code uses the qwen-max model, a drawing tool, and a weather forecast tool.
    • Using the qwen-max model requires replacing YOUR_DASHSCOPE_API_KEY in the example with your API key for the code to run properly. YOUR_DASHSCOPE_API_KEY can be obtained here. The drawing tool also calls the DASHSCOPE API (wanx), so no additional configuration is required for it.
    • When using the weather forecast tool, you need to replace YOUR_AMAP_TOKEN in the example with your AMAP weather API key so that the code runs normally. YOUR_AMAP_TOKEN is available here.
# Set the environment variables; you can skip this step if the API keys are already configured in your runtime environment
import os
os.environ['DASHSCOPE_API_KEY'] = 'YOUR_DASHSCOPE_API_KEY'
os.environ['AMAP_TOKEN'] = 'YOUR_AMAP_TOKEN'

# Configure the agent with RolePlay
from modelscope_agent.agents.role_play import RolePlay  # NOQA

# Role instruction (in Chinese): "You play a weather forecast assistant; you need to query the weather for the given area and call the provided drawing tool to draw a picture of the city."
role_template = '你扮演一个天气预报助手,你需要查询相应地区的天气,并调用给你的画图工具绘制一张城市的图。'

llm_config = {'model': 'qwen-max', 'model_server': 'dashscope'}

# names of the tools to enable
function_list = ['amap_weather', 'image_gen']

bot = RolePlay(
    function_list=function_list, llm=llm_config, instruction=role_template)

response = bot.run('朝阳区天气怎样?')  # "How is the weather in Chaoyang District?"

text = ''
for chunk in response:
    text += chunk

Result

  • Terminal output
# Output of the first LLM call
Action: amap_weather
Action Input: {"location": "朝阳区"}

# Output of the second LLM call
目前,朝阳区的天气状况为阴天,气温为1度。

Action: image_gen
Action Input: {"text": "朝阳区城市风光", "resolution": "1024*1024"}

# Output of the third LLM call
目前,朝阳区的天气状况为阴天,气温为1度。同时,我已为你生成了一张朝阳区的城市风光图,如下所示:

![](https://dashscope-result-sh.oss-cn-shanghai.aliyuncs.com/1d/45/20240204/3ab595ad/96d55ca6-6550-4514-9013-afe0f917c7ac-1.jpg?Expires=1707123521&OSSAccessKeyId=LTAI5tQZd8AEcZX6KZV4G8qL&Signature=RsJRt7zsv2y4kg7D9QtQHuVkXZY%3D)

Modules

Agent

An Agent object consists of the following components:

  • LLM: A large language model responsible for processing your inputs and deciding whether to call tools.
  • function_list: A list of tools available to the agent.

Currently, the configuration of an Agent may contain the following arguments (see the sketch after this list):

  • llm: The LLM config of this agent
    • When Dict: set the config of the LLM as {'model': '', 'api_key': '', 'model_server': ''}
    • When BaseChatModel: the LLM is passed in by another agent
  • function_list: A list of tools
    • When str: a tool name
    • When Dict: a tool config
  • storage_path: Unless otherwise specified, all data is stored here as key-value pairs by the memory module
  • instruction: the system instruction of this agent
  • name: the name of the agent
  • description: the description of the agent, which is used for multi_agent
  • kwargs: other potential parameters
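
For orientation, here is a minimal sketch (not copied from the official docs) of how the arguments above map onto a RolePlay constructor call; treat the exact keyword names and defaults as assumptions and check the class signature before relying on them.

import os

from modelscope_agent.agents.role_play import RolePlay

# LLM config as a dict, following the {'model', 'api_key', 'model_server'} format above
llm_config = {
    'model': 'qwen-max',
    'model_server': 'dashscope',
    'api_key': os.getenv('DASHSCOPE_API_KEY'),  # optional if already exported in the environment
}

bot = RolePlay(
    function_list=['amap_weather'],                # tool names (str) and/or tool configs (dict)
    llm=llm_config,
    instruction='You are a weather assistant.',    # system instruction of this agent
    name='weather_bot',                            # agent name, used in multi-agent setups
    description='Answers weather questions via the amap_weather tool.',
    storage_path='./agent_storage',                # where memory keeps its key-value data
)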

Agent, as a base class, cannot be directly initialized and called; subclasses need to inherit it. They must implement the function _run, which mainly consists of three parts: generating the messages/prompt, calling the LLM(s), and calling tools based on the LLM's results. We provide an implementation of these components in RolePlay, and you can also customize your own components according to your requirements.

from modelscope_agent import Agent

class YourCustomAgent(Agent):
    def _run(self, user_request, **kwargs):
        # Customize your workflow here
        ...
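
As a rough, non-authoritative sketch of those three parts, a custom _run might look like the following; the attribute and helper names used for the LLM call and tool dispatch are illustrative placeholders, and RolePlay remains the reference implementation.

from modelscope_agent import Agent

class WeatherReporterAgent(Agent):

    def _run(self, user_request, **kwargs):
        # 1. build the messages/prompt from the system instruction and the user request
        #    (self.instruction as an attribute name is an assumption)
        messages = [
            {'role': 'system', 'content': self.instruction},
            {'role': 'user', 'content': user_request},
        ]

        # 2. call the LLM (placeholder call; the actual LLM interface may differ)
        llm_result = self.llm.chat(messages=messages, **kwargs)

        # 3. inspect llm_result and, if a tool call was requested, execute the tool
        #    and feed its result back to the LLM (omitted here for brevity)

        yield llm_result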

LLM

The LLM is the core module of an agent, and it determines the quality of the interaction results.

Currently, the configuration of the LLM may contain the following arguments:

  • model: The specific model name, passed directly to the model service provider.
  • model_server: The provider of the model service.

BaseChatModel, as the base class of LLMs, cannot be directly initialized and called; subclasses need to inherit it. They must implement the functions _chat_stream and _chat_no_stream, which correspond to streaming and non-streaming output respectively. Optionally, they can implement chat_with_functions and chat_with_raw_prompt for function calling and text completion.

Currently we provide implementations for three model service providers: dashscope (for the Qwen series), zhipu (for the GLM series) and openai (for all OpenAI-API-format models). You can directly use the models supported by these providers, or you can customize your own LLM.

For more information, please refer to docs/modules/llm.md.
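
As an illustration of the config format above (not copied from the docs), each provider is selected through model_server; the model names below are only examples.

# Example LLM configs for the three built-in providers; keys follow the
# {'model', 'api_key', 'model_server'} format described above.
dashscope_llm = {'model': 'qwen-max', 'model_server': 'dashscope'}
zhipu_llm = {'model': 'glm-4', 'model_server': 'zhipu'}
openai_llm = {
    'model': 'gpt-4o',
    'model_server': 'openai',
    'api_key': 'YOUR_OPENAI_API_KEY',
}

# Any of these dicts can be passed to an agent, e.g. RolePlay(llm=openai_llm, ...)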

Tool

We provide several multi-domain tools that can be configured and used in the agent.

You can also customize your own tools by setting the tool's name, description, and parameters based on a predefined pattern and inheriting the base tool. Depending on your needs, call() can be implemented. An example of a custom tool is provided in demo_register_new_tool; see also the sketch after the snippet below.

You can pass the names or configurations of the tools you want to use to the agent.

# by tool name
function_list = ['amap_weather', 'image_gen']
bot = RolePlay(function_list=function_list, ...)

# by tool configuration
from langchain.tools import ShellTool
function_list = [{'terminal':ShellTool()}]
bot = RolePlay(function_list=function_list, ...)

# by mixture
function_list = ['amap_weather', {'terminal':ShellTool()}]
bot = RolePlay(function_list=function_list, ...)
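
Below is a minimal sketch of a custom tool, assuming a BaseTool-style base class with name/description/parameters attributes and a call() entry point as described above; the import path and argument handling are assumptions, so treat demo_register_new_tool as the authoritative pattern.

import json

from modelscope_agent.tools.base import BaseTool  # assumed import path for the base tool


class StringLengthTool(BaseTool):
    description = 'Count the number of characters in a piece of text.'
    name = 'string_length'
    parameters = [{
        'name': 'text',
        'type': 'string',
        'description': 'the text to measure',
        'required': True,
    }]

    def call(self, params, **kwargs):
        # params carries the JSON arguments produced by the LLM; the exact type
        # (str or dict) depends on the framework version, so handle both here
        if isinstance(params, str):
            params = json.loads(params)
        return str(len(params['text']))


# Instantiated tools can be passed the same way as the ShellTool example above:
# bot = RolePlay(function_list=[{'string_length': StringLengthTool()}], ...)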

Built-in tools

Multi-agent

Please refer to the multi-agent README.

Related Tutorials

If you would like to learn more about the practical details of the agent framework, you can refer to our articles and video tutorials.

Share Your Agent

We appreciate your enthusiasm for participating in our open-source ModelScope-Agent project. If you encounter any issues, please feel free to report them to us. If you have built a new agent demo and are ready to share your work with us, please create a pull request at any time! If you need any further assistance, please contact us via email at [email protected] or through the communication group!

Facechain Agent

Facechain is an open-source project for generating personalized portraits in various styles using facial images uploaded by users. By integrating the capabilities of Facechain into the modelscope-agent framework, we have greatly simplified the usage process. The generation of personalized portraits can now be done through dialogue with the Facechain Agent.

FaceChainAgent Studio Application Link: https://modelscope.cn/studios/CVstudio/facechain_agent_studio/summary

You can run it directly in a notebook/Colab/local environment: https://www.modelscope.cn/my/mynotebook

!git clone -b feat/facechain_agent https://github.com/modelscope/modelscope-agent.git

!cd modelscope-agent && pip install -r requirements.txt
!cd modelscope-agent/demo/facechain_agent/demo/facechain_agent && pip install -r requirements.txt
!pip install http://dashscope-cn-beijing.oss-cn-beijing.aliyuncs.com/zhicheng/modelscope_agent-0.1.0-py3-none-any.whl
!export PYTHONPATH=/mnt/workspace/modelscope-agent/demo/facechain_agent && cd modelscope-agent/demo/facechain_agent/demo/facechain_agent && python app_v1.0.py

License

This project is licensed under the Apache License (Version 2.0).

Star History

Star History Chart

modelscope-agent's People

Contributors

chitius, cyx2000, dahaipeng, eric-doug, fangxia1997, feiqihang, jiaqianjing, justmywyw, kaifenghyw, laptype, lcl6679292, lijiayi96, lylalala, mushenl, omahs, pengwork, seanxuu, slin000111, suluyana, tangent-90c, tuhahaha, wangyijunlyy, wenmengzhou, xyliugo, yhxx511, yigeor, yingdachen, zhangsibo1129, zhikaiiii, zzhangpurdue


modelscope-agent's Issues

Error: model path not found after downloading the model

Hi, I got this error when running the demo:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[11], line 17
5 model_name = 'modelscope-agent-7b'
6 model_cfg = {
7 'modelscope-agent-7b':{
8 'type': 'modelscope',
(...)
13 }
14 }
---> 17 llm = LLMFactory.build_llm(model_name, model_cfg)

File ~/Downloads/modelscope-agent/demo/../modelscope_agent/llm/llm_factory.py:22, in LLMFactory.build_llm(model_name, cfg)
20 llm_cls = get_llm_cls(llm_type)
21 llm_cfg = cfg[model_name]
---> 22 return llm_cls(cfg=llm_cfg)

File ~/Downloads/modelscope-agent/demo/../modelscope_agent/llm/modelscope_llm.py:43, in ModelScopeLLM.init(self, cfg)
40 self.end_token = self.cfg.get('end_token', '<|endofthink|>')
41 self.include_end = self.cfg.get('include_end', True)
---> 43 self.setup()

File ~/Downloads/modelscope-agent/demo/../modelscope_agent/llm/modelscope_llm.py:49, in ModelScopeLLM.setup(self)
46 model_cls = self.model_cls
47 tokenizer_cls = self.tokenizer_cls
---> 49 self.model = model_cls.from_pretrained(
50 self.model_dir,
51 device_map=self.device_map,
52 # device='cuda:0',
53 torch_dtype=torch.float16,
54 trust_remote_code=True)
55 self.tokenizer = tokenizer_cls.from_pretrained(
56 self.model_dir, trust_remote_code=True)
57 self.model = self.model.eval()

File ~/anaconda3/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:558, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
556 else:
557 cls.register(config.class, model_class, exist_ok=True)
--> 558 return model_class.from_pretrained(
559 pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
560 )
561 elif type(config) in cls._model_mapping.keys():
562 model_class = _get_model_class(config, cls._model_mapping)

File ~/anaconda3/lib/python3.10/site-packages/modelscope/utils/hf_util.py:72, in patch_model_base..from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
70 else:
71 model_dir = pretrained_model_name_or_path
---> 72 return ori_from_pretrained(cls, model_dir, *model_args, **kwargs)

File ~/anaconda3/lib/python3.10/site-packages/transformers/modeling_utils.py:3187, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
3177 if dtype_orig is not None:
3178 torch.set_default_dtype(dtype_orig)
3180 (
3181 model,
3182 missing_keys,
3183 unexpected_keys,
3184 mismatched_keys,
3185 offload_index,
3186 error_msgs,
-> 3187 ) = cls._load_pretrained_model(
3188 model,
3189 state_dict,
3190 loaded_state_dict_keys, # XXX: rename?
3191 resolved_archive_file,
3192 pretrained_model_name_or_path,
3193 ignore_mismatched_sizes=ignore_mismatched_sizes,
3194 sharded_metadata=sharded_metadata,
3195 _fast_init=_fast_init,
3196 low_cpu_mem_usage=low_cpu_mem_usage,
3197 device_map=device_map,
3198 offload_folder=offload_folder,
3199 offload_state_dict=offload_state_dict,
3200 dtype=torch_dtype,
3201 is_quantized=(getattr(model, "quantization_method", None) == QuantizationMethod.BITS_AND_BYTES),
3202 keep_in_fp32_modules=keep_in_fp32_modules,
3203 )
3205 model.is_loaded_in_4bit = load_in_4bit
3206 model.is_loaded_in_8bit = load_in_8bit

File ~/anaconda3/lib/python3.10/site-packages/transformers/modeling_utils.py:3308, in PreTrainedModel._load_pretrained_model(cls, model, state_dict, loaded_keys, resolved_archive_file, pretrained_model_name_or_path, ignore_mismatched_sizes, sharded_metadata, _fast_init, low_cpu_mem_usage, device_map, offload_folder, offload_state_dict, dtype, is_quantized, keep_in_fp32_modules)
3306 is_safetensors = archive_file.endswith(".safetensors")
3307 if offload_folder is None and not is_safetensors:
-> 3308 raise ValueError(
3309 "The current device_map had weights offloaded to the disk. Please provide an offload_folder"
3310 " for them. Alternatively, make sure you have safetensors installed if the model you are using"
3311 " offers the weights in this format."
3312 )
3313 if offload_folder is not None:
3314 os.makedirs(offload_folder, exist_ok=True)

ValueError: The current device_map had weights offloaded to the disk. Please provide an offload_folder for them. Alternatively, make sure you have safetensors installed if the model you are using offers the weights in this format.

How should multi-level dictionary parameters be handled when customizing tools?

Question: when customizing tools, what should I do if a parameter is a nested dictionary?

When customizing a tool, params is a list of parameter descriptions. What if my parameter needs to look like this:
{ 'name':'', 'id':{'id':'1', 'position':'a'} }
How should such a parameter be added?

How to call tools in multiple steps within a single run?

Great work! I have a few questions I'd like to ask.

My understanding of the current logic is that each agent.run performs several interactions to obtain the final result. Usually the first interaction uses the LLM to extract the tool's JSON arguments (llm_result) and calls the plugin to get the execution result (exec_result); the second interaction hands all of this back to the LLM so that it can integrate the information and produce a natural-language reply.

However, these multiple interactions do not involve multi-step calls to multiple tools. For example, when the user asks "please compute (2 + 3) to the power of 5", even though both an addition tool and a power tool exist, only one tool is called, and I get 2 to the power of 5 as the answer.

My question is: can a single user input trigger multi-step tool calls to produce the answer? If not yet, is there a mature approach for splitting the user's input so that it is fed to the agent step by step?

Looking forward to your reply!

Where can I apply for MODELSCOPE_API_TOKEN?

Calling the plugin for image generation:

{"api_name": "modelscope_image-generation",
"parameters": {"text": "一只可爱的小狗"}}

The reply was: "I'm sorry, I cannot draw the image directly. However, I can help you generate an image description related to your instruction. You can try searching for this description in a search engine to find a corresponding image."

demo_modelscopegpt_agent.ipynb error

Following the official Bilibili video https://www.bilibili.com/video/BV1cu4y1k7Pg/,

running the demo raised an error, apparently from code in the dashscope Python package. Can you reproduce it?

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[6], line 2
      1 agent.reset()
----> 2 agent.run("调用插件,生成一段两只小猫一起跳舞的视频", remote=True)

File ~/Documents/Git-repoMy/AIResearchVault/repo/LLMApp/modelscope-agent/demo/../modelscope_agent/agent.py:143, in AgentExecutor.run(self, task, remote, print_info)
    141 # generate prompt and call llm
    142 prompt = self.prompt_generator.generate(llm_result, exec_result)
--> 143 llm_result = self.llm.generate(prompt)
    144 if print_info:
    145     print(f'|prompt{idx}: {prompt}')

File ~/Documents/Git-repoMy/AIResearchVault/repo/LLMApp/modelscope-agent/demo/../modelscope_agent/llm/ms_gpt.py:23, in ModelScopeGPT.generate(self, prompt)
     20 def generate(self, prompt):
     22     total_response = ''
---> 23     responses = Generation.call(
     24         model=self.model, prompt=prompt, stream=False, **self.generate_cfg)
     26     if responses.status_code == HTTPStatus.OK:
     27         total_response = responses.output['text']

TypeError: dashscope.aigc.generation.Generation.call() got multiple values for keyword argument 'model'



Also, when running the demo I was prompted to install faiss-gpu or faiss-cpu, which is not in the requirements either.

pip install faiss-gpu

Here is my local conda env:

name: modelscope
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - aiofiles=22.1.0=py310h06a4308_0
  - aiosqlite=0.18.0=py310h06a4308_0
  - anyio=3.5.0=py310h06a4308_0
  - argon2-cffi=21.3.0=pyhd3eb1b0_0
  - argon2-cffi-bindings=21.2.0=py310h7f8727e_0
  - babel=2.11.0=py310h06a4308_0
  - backcall=0.2.0=pyhd3eb1b0_0
  - beautifulsoup4=4.12.2=py310h06a4308_0
  - bleach=4.1.0=pyhd3eb1b0_0
  - brotlipy=0.7.0=py310h7f8727e_1002
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2023.05.30=h06a4308_0
  - cchardet=2.1.7=py310h6a678d5_0
  - certifi=2023.7.22=py310h06a4308_0
  - cffi=1.15.1=py310h5eee18b_3
  - chardet=4.0.0=py310h06a4308_1003
  - comm=0.1.2=py310h06a4308_0
  - debugpy=1.6.7=py310h6a678d5_0
  - defusedxml=0.7.1=pyhd3eb1b0_0
  - entrypoints=0.4=py310h06a4308_0
  - icu=73.1=h6a678d5_0
  - idna=3.4=py310h06a4308_0
  - ipykernel=6.25.0=py310h2f386ee_0
  - ipython_genutils=0.2.0=pyhd3eb1b0_1
  - ipywidgets=8.0.4=py310h06a4308_0
  - jinja2=3.1.2=py310h06a4308_0
  - json5=0.9.6=pyhd3eb1b0_0
  - jsonschema=4.17.3=py310h06a4308_0
  - jupyter_client=7.4.9=py310h06a4308_0
  - jupyter_core=5.3.0=py310h06a4308_0
  - jupyter_events=0.6.3=py310h06a4308_0
  - jupyter_server=1.23.4=py310h06a4308_0
  - jupyter_server_fileid=0.9.0=py310h06a4308_0
  - jupyter_server_ydoc=0.8.0=py310h06a4308_1
  - jupyter_ydoc=0.2.4=py310h06a4308_0
  - jupyterlab=3.6.3=py310h06a4308_0
  - jupyterlab_pygments=0.1.2=py_0
  - jupyterlab_server=2.22.0=py310h06a4308_0
  - jupyterlab_widgets=3.0.5=py310h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.4.4=h6a678d5_0
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libsodium=1.0.18=h7b6447c_0
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - libxml2=2.10.4=hf1b16e4_1
  - libxslt=1.1.37=h5eee18b_1
  - lxml=4.9.2=py310h5eee18b_0
  - matplotlib-inline=0.1.6=py310h06a4308_0
  - mistune=0.8.4=py310h7f8727e_1000
  - nbclassic=0.5.5=py310h06a4308_0
  - nbclient=0.5.13=py310h06a4308_0
  - nbconvert=6.5.4=py310h06a4308_0
  - nbformat=5.7.0=py310h06a4308_0
  - ncurses=6.4=h6a678d5_0
  - nest-asyncio=1.5.6=py310h06a4308_0
  - notebook=6.5.4=py310h06a4308_1
  - notebook-shim=0.2.2=py310h06a4308_0
  - openssl=3.0.10=h7f8727e_2
  - packaging=23.1=py310h06a4308_0
  - pandocfilters=1.5.0=pyhd3eb1b0_0
  - parso=0.8.3=pyhd3eb1b0_0
  - pexpect=4.8.0=pyhd3eb1b0_3
  - pickleshare=0.7.5=pyhd3eb1b0_1003
  - pip=23.2.1=py310h06a4308_0
  - platformdirs=3.10.0=py310h06a4308_0
  - prometheus_client=0.14.1=py310h06a4308_0
  - ptyprocess=0.7.0=pyhd3eb1b0_2
  - pure_eval=0.2.2=pyhd3eb1b0_0
  - pycparser=2.21=pyhd3eb1b0_0
  - pyopenssl=23.2.0=py310h06a4308_0
  - pyrsistent=0.18.0=py310h7f8727e_0
  - pysocks=1.7.1=py310h06a4308_0
  - python=3.10.12=h955ad1f_0
  - python-dateutil=2.8.2=pyhd3eb1b0_0
  - python-fastjsonschema=2.16.2=py310h06a4308_0
  - python-json-logger=2.0.7=py310h06a4308_0
  - pyzmq=23.2.0=py310h6a678d5_0
  - readline=8.2=h5eee18b_0
  - requests=2.31.0=py310h06a4308_0
  - rfc3339-validator=0.1.4=py310h06a4308_0
  - rfc3986-validator=0.1.1=py310h06a4308_0
  - send2trash=1.8.0=pyhd3eb1b0_1
  - setuptools=68.0.0=py310h06a4308_0
  - six=1.16.0=pyhd3eb1b0_1
  - sniffio=1.2.0=py310h06a4308_1
  - soupsieve=2.4=py310h06a4308_0
  - sqlite=3.41.2=h5eee18b_0
  - stack_data=0.2.0=pyhd3eb1b0_0
  - terminado=0.17.1=py310h06a4308_0
  - tinycss2=1.2.1=py310h06a4308_0
  - tk=8.6.12=h1ccaba5_0
  - tomli=2.0.1=py310h06a4308_0
  - tornado=6.3.2=py310h5eee18b_0
  - typing-extensions=4.7.1=py310h06a4308_0
  - typing_extensions=4.7.1=py310h06a4308_0
  - urllib3=1.26.16=py310h06a4308_0
  - webencodings=0.5.1=py310h06a4308_1
  - websocket-client=0.58.0=py310h06a4308_4
  - wheel=0.38.4=py310h06a4308_0
  - widgetsnbextension=4.0.5=py310h06a4308_0
  - xz=5.4.2=h5eee18b_0
  - y-py=0.5.9=py310h52d8a92_0
  - yaml=0.2.5=h7b6447c_0
  - ypy-websocket=0.8.2=py310h06a4308_0
  - zeromq=4.3.4=h2531618_0
  - zlib=1.2.13=h5eee18b_0
  - pip:
      - absl-py==1.4.0
      - accelerate==0.22.0
      - addict==2.4.0
      - aiohttp==3.8.5
      - aiosignal==1.3.1
      - aliyun-python-sdk-core==2.13.36
      - aliyun-python-sdk-kms==2.16.1
      - asttokens==2.4.0
      - async-timeout==4.0.3
      - attrs==23.1.0
      - cachetools==5.3.1
      - charset-normalizer==3.2.0
      - cmake==3.27.2
      - crcmod==1.7
      - cryptography==41.0.3
      - dashscope==1.8.1
      - dataclasses-json==0.5.14
      - datasets==2.13.0
      - decorator==4.4.2
      - diffusers==0.20.2
      - dill==0.3.6
      - einops==0.6.1
      - exceptiongroup==1.1.3
      - executing==1.2.0
      - faiss-gpu==1.7.2
      - filelock==3.12.3
      - frozenlist==1.4.0
      - fsspec==2023.9.0
      - gast==0.5.4
      - google-auth==2.22.0
      - google-auth-oauthlib==1.0.0
      - greenlet==2.0.2
      - grpcio==1.57.0
      - huggingface-hub==0.16.4
      - imageio==2.31.3
      - imageio-ffmpeg==0.4.8
      - importlib-metadata==6.8.0
      - iniconfig==2.0.0
      - ipython==8.15.0
      - jedi==0.19.0
      - jmespath==0.10.0
      - langchain==0.0.283
      - langsmith==0.0.33
      - lit==16.0.6
      - markdown==3.4.4
      - markupsafe==2.1.3
      - marshmallow==3.20.1
      - modelscope==1.9.0
      - moviepy==1.0.3
      - mpmath==1.3.0
      - ms-swift==1.0.0
      - multidict==6.0.4
      - multiprocess==0.70.14
      - mypy-extensions==1.0.0
      - networkx==3.1
      - numexpr==2.8.5
      - numpy==1.25.2
      - nvidia-cublas-cu11==11.10.3.66
      - nvidia-cuda-cupti-cu11==11.7.101
      - nvidia-cuda-nvrtc-cu11==11.7.99
      - nvidia-cuda-runtime-cu11==11.7.99
      - nvidia-cudnn-cu11==8.5.0.96
      - nvidia-cufft-cu11==10.9.0.58
      - nvidia-curand-cu11==10.2.10.91
      - nvidia-cusolver-cu11==11.4.0.1
      - nvidia-cusparse-cu11==11.7.4.91
      - nvidia-nccl-cu11==2.14.3
      - nvidia-nvtx-cu11==11.7.91
      - oauthlib==3.2.2
      - openai==0.28.0
      - opencv-python==4.8.0.76
      - oss2==2.18.1
      - pandas==2.1.0
      - peft==0.5.0
      - pillow==10.0.0
      - pluggy==1.3.0
      - proglog==0.1.10
      - prompt-toolkit==3.0.39
      - protobuf==4.24.2
      - psutil==5.9.5
      - pyarrow==13.0.0
      - pyasn1==0.5.0
      - pyasn1-modules==0.3.0
      - pycryptodome==3.18.0
      - pydantic==1.10.8
      - pygments==2.16.1
      - pytest==7.4.1
      - python-dotenv==1.0.0
      - pytz==2023.3.post1
      - pyyaml==6.0.1
      - regex==2023.8.8
      - requests-oauthlib==1.3.1
      - rsa==4.9
      - safetensors==0.3.3
      - scipy==1.11.2
      - simplejson==3.19.1
      - sortedcontainers==2.4.0
      - soundfile==0.12.1
      - sqlalchemy==2.0.20
      - stack-data==0.6.2
      - sympy==1.12
      - tenacity==8.2.3
      - tensorboard==2.14.0
      - tensorboard-data-server==0.7.1
      - tokenizers==0.13.3
      - torch==2.0.1
      - tqdm==4.66.1
      - traitlets==5.9.0
      - transformers==4.33.0
      - transformers-stream-generator==0.0.4
      - triton==2.0.0
      - typing-inspect==0.9.0
      - tzdata==2023.3
      - wcwidth==0.2.6
      - werkzeug==2.3.7
      - xxhash==3.3.0
      - yapf==0.40.1
      - yarl==1.9.2
      - zipp==3.16.2
prefix: /home/hangyu5/anaconda3/envs/modelscope

The same tool is called repeatedly

response = agent.run("从下面的地址中找到省市区等元素,地址:上海市静安区大宁音乐广场三期A3栋403室", print_info=True)

(The query asks the agent to extract the province/city/district elements from the given Shanghai address.)

The final output is:

[{'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}, {'result': {'city': '上海市', 'district': '静安区', 'poi': '大宁音乐广场', 'subpoi': '三期', 'houseno': 'A3栋'}}]

The model does not terminate correctly. Is this a model training issue?

The frontend receives no reply

After I ask the first question, the backend does not respond; only when I ask a second question does it reply with the answer to the first, and asking a third question returns the answer to the second, and so on. The frontend also never receives the reply.

Installing dashscope fails

ERROR: Could not find a version that satisfies the requirement dashscope (from versions: none)
ERROR: No matching distribution found for dashscope
Installing dashscope fails when installing the requirements file. How can this be resolved?

Generation performance issue

Running with CUDA on a 3090, I added code to the agent's run function to print the result and the elapsed time:

# generate prompt and call llm

prompt = self.prompt_generator.generate(llm_result, exec_result)
start = time.time()
llm_result = self.llm.generate(prompt)
end = time.time()
print(f"===llm_result: {llm_result}, time: {end-start}")

It turns out that generating just the few characters below takes about 20 seconds:
===llm_result: 明天杭州的天气预计是晴天,需要注意防晒哦。, time: 20.340392589569092

Performance this slow can't be normal; it's completely unusable. What could be causing it?
Also, flash_attn installed successfully, but the rms_norm and rotary extensions failed to build; compilation keeps hanging.

WebUI

Could you provide a web UI?

Quickstart startup error

AttributeError: partially initialized module 'cv2' has no attribute 'dnn' (most likely due to a circular import)

After fine-tuning based on Qwen-7B-Base, inference produces only garbled output

I used this framework to SFT fine-tune a Qwen-7B model, then loaded it as an agent for inference, and found that everything the model outputs is garbled.

I noticed that the Qwen base version and the Agent version differ considerably in modeling.py (and related code) as well as config.json. I also tried switching to the base version's config, but that raises AttributeError: 'QWenConfig' object has no attribute 'padded_vocab_size'. If I want to train my own agent model based on Qwen, how do I use it correctly?

CUDA error / GPU error

s/transformers_modules/ModelScope-Agent-7B/modeling.py", line 1208, in forward
output = self._norm(x.float()).type_as(x)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

What is the data format for training an agent model?

I'm trying to fine-tune an agent model on top of the Swift framework. After reading the source code, my understanding is that the model input is the full text (system, user, assistant) while the labels are all of the assistant content. Why is that?

Why does the input still contain the assistant content? Would removing that part have any impact? I'd appreciate an explanation, thanks!

Also, can the agent model be fine-tuned directly with other frameworks' SFT scripts (e.g. DeepSpeed), or is the optimization objective different?

Baichuan2 demo error: AttributeError: 'BaichuanModel' object has no attribute 'future_mask'

After changing the configuration, everything in demo_qwen_agent.ipynb worked fine until running:

# Reset the conversation and clear the dialogue history
agent.reset()
agent.run('调用插件,查询北京市明天的天气', remote=False)

which raises:
File ~/modelscope-agent/demo/../modelscope_agent/agent.py:143, in AgentExecutor.run(self, task, remote, print_info)
141 # generate prompt and call llm
142 prompt = self.prompt_generator.generate(llm_result, exec_result)
--> 143 llm_result = self.llm.generate(prompt)
144 if print_info:
145 print(f'|prompt{idx}: {prompt}')

File ~/modelscope-agent/demo/../modelscope_agent/llm/local_llm.py:72, in LocalLLM.generate(self, prompt)
69 response = self.model.chat(
70 self.tokenizer, prompt, history=[], system='')[0]
71 else:
---> 72 response = self.chat(prompt)
74 end_idx = response.find(self.end_token)
75 if end_idx != -1:

File ~/modelscope-agent/demo/../modelscope_agent/llm/local_llm.py:95, in LocalLLM.chat(self, prompt)
91 input_ids = self.tokenizer(
92 prompt, return_tensors='pt').input_ids.to(device)
93 input_len = input_ids.shape[1]
---> 95 result = self.model.generate(
96 input_ids=input_ids, generation_config=self.generation_cfg)
98 result = result[0].tolist()[input_len:]
99 response = self.tokenizer.decode(result)

File ~/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator..decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)

File ~/.local/lib/python3.10/site-packages/transformers/generation/utils.py:1648, in GenerationMixin.generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, negative_prompt_ids, negative_prompt_attention_mask, **kwargs)
1640 input_ids, model_kwargs = self._expand_inputs_for_generation(
1641 input_ids=input_ids,
1642 expand_size=generation_config.num_return_sequences,
1643 is_encoder_decoder=self.config.is_encoder_decoder,
1644 **model_kwargs,
1645 )
1647 # 13. run sample
-> 1648 return self.sample(
1649 input_ids,
1650 logits_processor=logits_processor,
1651 logits_warper=logits_warper,
1652 stopping_criteria=stopping_criteria,
1653 pad_token_id=generation_config.pad_token_id,
1654 eos_token_id=generation_config.eos_token_id,
1655 output_scores=generation_config.output_scores,
1656 return_dict_in_generate=generation_config.return_dict_in_generate,
1657 synced_gpus=synced_gpus,
1658 streamer=streamer,
1659 **model_kwargs,
1660 )
1662 elif generation_mode == GenerationMode.BEAM_SEARCH:
1663 # 11. prepare beam search scorer
1664 beam_scorer = BeamSearchScorer(
1665 batch_size=batch_size,
1666 num_beams=generation_config.num_beams,
(...)
1671 max_length=generation_config.max_length,
1672 )

File ~/.local/lib/python3.10/site-packages/transformers/generation/utils.py:2730, in GenerationMixin.sample(self, input_ids, logits_processor, stopping_criteria, logits_warper, max_length, pad_token_id, eos_token_id, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, synced_gpus, streamer, **model_kwargs)
2727 model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
2729 # forward pass to get next token
-> 2730 outputs = self(
2731 **model_inputs,
2732 return_dict=True,
2733 output_attentions=output_attentions,
2734 output_hidden_states=output_hidden_states,
2735 )
2737 if synced_gpus and this_peer_finished:
2738 continue # don't waste resources running the code we don't need

File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.conda/envs/LLM/lib/python3.10/site-packages/accelerate/hooks.py:165, in add_hook_to_module..new_forward(*args, **kwargs)
163 output = old_forward(*args, **kwargs)
164 else:
--> 165 output = old_forward(*args, **kwargs)
166 return module._hf_hook.post_forward(module, output)

File ~/.cache/huggingface/modules/transformers_modules/Baichuan2-13B-Base/modeling_baichuan.py:688, in BaichuanForCausalLM.forward(self, input_ids, attention_mask, past_key_values, inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict, **kwargs)
683 return_dict = (
684 return_dict if return_dict is not None else self.config.use_return_dict
685 )
687 # decoder outputs consists of (dec_features, layer_state, dec_hidden, dec_attn)
--> 688 outputs = self.model(
689 input_ids=input_ids,
690 attention_mask=attention_mask,
691 past_key_values=past_key_values,
692 inputs_embeds=inputs_embeds,
693 use_cache=use_cache,
694 output_attentions=output_attentions,
695 output_hidden_states=output_hidden_states,
696 return_dict=return_dict,
697 )
699 hidden_states = outputs[0]
700 logits = self.lm_head(hidden_states)

File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.cache/huggingface/modules/transformers_modules/Baichuan2-13B-Base/modeling_baichuan.py:401, in BaichuanModel.forward(self, input_ids, attention_mask, past_key_values, inputs_embeds, use_cache, output_attentions, output_hidden_states, return_dict)
399 alibi_mask = self.alibi_mask
400 else:
--> 401 alibi_mask = self.get_alibi_mask(inputs_embeds, seq_length_with_past)
403 if attention_mask is not None:
404 if len(attention_mask.shape) == 2:

File ~/.cache/huggingface/modules/transformers_modules/Baichuan2-13B-Base/modeling_baichuan.py:351, in BaichuanModel.get_alibi_mask(self, tensor, seq_length_with_past)
343 self.max_cache_pos = seq_length_with_past
344 self.register_buffer(
345 "future_mask",
346 _gen_alibi_mask(tensor, self.n_head, self.max_cache_pos).to(
(...)
349 persistent=False,
350 )
--> 351 mask = self.future_mask[
352 : self.n_head, :seq_length_with_past, :seq_length_with_past
353 ]
354 return mask

File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1614, in Module.getattr(self, name)
1612 if name in modules:
1613 return modules[name]
-> 1614 raise AttributeError("'{}' object has no attribute '{}'".format(
1615 type(self).name, name))

AttributeError: 'BaichuanModel' object has no attribute 'future_mask'

How can I call a remote LLM?

The docs and code seem to target local invocation only. How do I call an LLM deployed remotely (e.g. qwen and modelscope-agent deployed on a remote server)?

Does the default script fail to run?

{'error': 'Action call error: modelscope_text-address: {'input': '浙江杭州市江干区九堡镇三村村一区'}. \n Error message: Remote call failed with error code: 401, error message: {"Code":10990101002,"Message":"接口调用参数错误,信息:record not found","RequestId":"9d345990-f5e4-4fdb-b0d9-3dfb66a477fa","Success":false}'}

Without being given the parameters, the modelscope-agent model fabricated them and executed the tool

Hi, I defined a simple custom tool with the following code:

from .tool import Tool

class HighPrivilegeTool(Tool):
    description = '创建一个高权限借用工单'
    name = 'create_high_privilege_tool'
    parameters: list = [{
        'name': 'user_id',
        'description': '用户名',
        'required': True
    }, {
        'name': 'ip',
        'description': 'ip地址',
        'required': True
    }]
    def _local_call(self, *args, **kwargs):
        user_id = kwargs['user_id']
        ip = kwargs['ip']
        return {'result': f'已为{user_id}创建了一个ip为{ip}的高权限借用工单'}

I then tested agent.run. If I pass complete parameters, execution works fine, but when no parameters are given, the model asks me for them while at the same time fabricating two parameters out of thin air and executing the tool.

Changing the default download path for model files

Running demo_qwen_agent.ipynb triggers the download of model files; the default download location is the .cache folder under /home. Can this default download path be changed to an arbitrary location?

"addmm_impl_cpu_" not implemented for 'Half'

我在执行例子的时候 agent.run('pip install的时候有些包下载特别慢怎么办')
报如下错误:
File ~/modelscope-agent/modelscope_agent/agent.py:143, in AgentExecutor.run(self, task, remote, print_info)
141 # generate prompt and call llm
142 prompt = self.prompt_generator.generate(llm_result, exec_result)
--> 143 llm_result = self.llm.generate(prompt)
144 if print_info:
145 print(f'|prompt{idx}: {prompt}')

File ~/modelscope-agent/modelscope_agent/llm/local_llm.py:69, in LocalLLM.generate(self, prompt)
66 def generate(self, prompt):
68 if self.custom_chat and self.model.chat:
---> 69 response = self.model.chat(
70 self.tokenizer, prompt, history=[], system='')[0]
71 else:
72 response = self.chat(prompt)

File ~/.cache/huggingface/modules/transformers_modules/MSAgent-Qwen-7B/modeling_qwen.py:1010, in QWenLMHeadModel.chat(self, tokenizer, query, history, system, append_history, stream, stop_words_ids, **kwargs)
1006 stop_words_ids.extend(get_stop_words_ids(
1007 self.generation_config.chat_format, tokenizer
1008 ))
1009 input_ids = torch.tensor([context_tokens]).to(self.device)
-> 1010 outputs = self.generate(
1011 input_ids,
1012 stop_words_ids = stop_words_ids,
1013 return_dict_in_generate = False,
1014 **kwargs,
1015 )
1017 response = decode_tokens(
1018 outputs[0],
1019 tokenizer,
(...)
1024 errors='replace'
1025 )
1027 if append_history:

File ~/.cache/huggingface/modules/transformers_modules/MSAgent-Qwen-7B/modeling_qwen.py:1119, in QWenLMHeadModel.generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, **kwargs)
1116 else:
1117 logits_processor.append(stop_words_logits_processor)
-> 1119 return super().generate(
1120 inputs,
1121 generation_config=generation_config,
1122 logits_processor=logits_processor,
1123 stopping_criteria=stopping_criteria,
1124 prefix_allowed_tokens_fn=prefix_allowed_tokens_fn,
1125 synced_gpus=synced_gpus,
1126 assistant_model=assistant_model,
1127 streamer=streamer,
1128 **kwargs,
1129 )

File ~/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator..decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)

File ~/anaconda3/envs/modelscope/lib/python3.10/site-packages/transformers/generation/utils.py:1588, in GenerationMixin.generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, **kwargs)
1580 input_ids, model_kwargs = self._expand_inputs_for_generation(
1581 input_ids=input_ids,
1582 expand_size=generation_config.num_return_sequences,
1583 is_encoder_decoder=self.config.is_encoder_decoder,
1584 **model_kwargs,
1585 )
1587 # 13. run sample
-> 1588 return self.sample(
1589 input_ids,
1590 logits_processor=logits_processor,
1591 logits_warper=logits_warper,
1592 stopping_criteria=stopping_criteria,
1593 pad_token_id=generation_config.pad_token_id,
1594 eos_token_id=generation_config.eos_token_id,
1595 output_scores=generation_config.output_scores,
1596 return_dict_in_generate=generation_config.return_dict_in_generate,
1597 synced_gpus=synced_gpus,
1598 streamer=streamer,
1599 **model_kwargs,
1600 )
1602 elif is_beam_gen_mode:
1603 if generation_config.num_return_sequences > generation_config.num_beams:

File ~/anaconda3/envs/modelscope/lib/python3.10/site-packages/transformers/generation/utils.py:2642, in GenerationMixin.sample(self, input_ids, logits_processor, stopping_criteria, logits_warper, max_length, pad_token_id, eos_token_id, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, synced_gpus, streamer, **model_kwargs)
2639 model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
2641 # forward pass to get next token
-> 2642 outputs = self(
2643 **model_inputs,
2644 return_dict=True,
2645 output_attentions=output_attentions,
2646 output_hidden_states=output_hidden_states,
2647 )
2649 if synced_gpus and this_peer_finished:
2650 continue # don't waste resources running the code we don't need

File ~/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.cache/huggingface/modules/transformers_modules/MSAgent-Qwen-7B/modeling_qwen.py:925, in QWenLMHeadModel.forward(self, input_ids, past_key_values, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, labels, use_cache, output_attentions, output_hidden_states, return_dict)
903 def forward(
904 self,
905 input_ids: Optional[torch.LongTensor] = None,
(...)
918 return_dict: Optional[bool] = None,
919 ) -> Union[Tuple, CausalLMOutputWithPast]:
921 return_dict = (
922 return_dict if return_dict is not None else self.config.use_return_dict
923 )
--> 925 transformer_outputs = self.transformer(
926 input_ids,
927 past_key_values=past_key_values,
928 attention_mask=attention_mask,
929 token_type_ids=token_type_ids,
930 position_ids=position_ids,
931 head_mask=head_mask,
932 inputs_embeds=inputs_embeds,
933 encoder_hidden_states=encoder_hidden_states,
934 encoder_attention_mask=encoder_attention_mask,
935 use_cache=use_cache,
936 output_attentions=output_attentions,
937 output_hidden_states=output_hidden_states,
938 return_dict=return_dict,
939 )
940 hidden_states = transformer_outputs[0]
942 lm_logits = self.lm_head(hidden_states)

File ~/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.cache/huggingface/modules/transformers_modules/MSAgent-Qwen-7B/modeling_qwen.py:766, in QWenModel.forward(self, input_ids, past_key_values, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, use_cache, output_attentions, output_hidden_states, return_dict)
756 outputs = torch.utils.checkpoint.checkpoint(
757 create_custom_forward(block),
758 hidden_states,
(...)
763 encoder_attention_mask,
764 )
765 else:
--> 766 outputs = block(
767 hidden_states,
768 layer_past=layer_past,
769 attention_mask=attention_mask,
770 head_mask=head_mask[i],
771 encoder_hidden_states=encoder_hidden_states,
772 encoder_attention_mask=encoder_attention_mask,
773 use_cache=use_cache,
774 output_attentions=output_attentions,
775 )
777 hidden_states = outputs[0]
778 if use_cache is True:

File ~/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.cache/huggingface/modules/transformers_modules/MSAgent-Qwen-7B/modeling_qwen.py:523, in QWenBlock.forward(self, hidden_states, layer_past, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, use_cache, output_attentions)
510 def forward(
511 self,
512 hidden_states: Optional[Tuple[torch.FloatTensor]],
(...)
519 output_attentions: Optional[bool] = False,
520 ):
521 layernorm_output = self.ln_1(hidden_states)
--> 523 attn_outputs = self.attn(
524 layernorm_output,
525 layer_past=layer_past,
526 attention_mask=attention_mask,
527 head_mask=head_mask,
528 use_cache=use_cache,
529 output_attentions=output_attentions,
530 )
531 attn_output = attn_outputs[0]
533 outputs = attn_outputs[1:]

File ~/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.cache/huggingface/modules/transformers_modules/MSAgent-Qwen-7B/modeling_qwen.py:367, in QWenAttention.forward(self, hidden_states, layer_past, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, output_attentions, use_cache)
355 def forward(
356 self,
357 hidden_states: Optional[Tuple[torch.FloatTensor]],
(...)
364 use_cache: Optional[bool] = False,
365 ):
--> 367 mixed_x_layer = self.c_attn(hidden_states)
368 query, key, value = mixed_x_layer.split(self.split_size, dim=2)
370 query = self._split_heads(query, self.num_heads, self.head_dim)

File ~/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)

RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

The WebUI exits immediately on startup

Starting the WebUI exits immediately; has anyone else run into this?
D:\anaconda3\envs\modelscope-agent\python.exe D:\langchain\modelscope-agent-master\demo\msgpt\app.py
2023-10-16 14:23:04,813 - modelscope - INFO - PyTorch version 2.1.0 Found.
2023-10-16 14:23:04,814 - modelscope - INFO - Loading ast index from C:\Users\myisa.cache\modelscope\ast_indexer
2023-10-16 14:23:04,890 - modelscope - INFO - Loading done! Current index file version is 1.9.2, with md5 ca4e0c0b233f8f58f1eb5d0245d03938 and a total number of 941 components indexed
2023-10-16 14:23:06,884 - modelscope - INFO - Use user-specified model revision: v1.0.0
Flash attention will be disabled because it does NOT support fp32.
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention

Process finished with exit code -1073741819 (0xC0000005)

When adding configuration information, an error occurs: File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 647, in gen_wrapper yield from f(args, **kwargs) File "/content/modelscope-agent/apps/agentfabric/app.py", line 386, in preview_send_message user_agent = _state['user_agent'] KeyError: 'user_agent'

Error:weather api token must be acquired through https://lbs.amap.com/api/webservice/guide/create-project/get-key and set by AMAP_TOKEN, with detail: Traceback (most recent call last):
File "/content/modelscope-agent/apps/agentfabric/app.py", line 20, in init_user
user_agent = init_user_chatbot_agent(uuid_str)
File "/content/modelscope-agent/apps/agentfabric/user_core.py", line 57, in init_user_chatbot_agent
agent = AgentExecutor(
File "/content/modelscope-agent/modelscope_agent/agent.py", line 55, in init
self._init_tools(tool_cfg, additional_tool_list)
File "/content/modelscope-agent/modelscope_agent/agent.py", line 89, in _init_tools
self.tool_list[tool_name] = tool_class(tool_cfg)
File "/content/modelscope-agent/modelscope_agent/tools/amap_weather.py", line 27, in init
assert self.token != '', 'weather api token must be acquired through '
AssertionError: weather api token must be acquired through https://lbs.amap.com/api/webservice/guide/create-project/get-key and set by AMAP_TOKEN

|LLM inputs in round 1:
[{'role': 'system', 'content': 'You are a helpful assistant.'}, {'role': 'user', 'content': '你现在要扮演一个制造 AI 角色( AI-Agent )的 AI 助手( QwenBuilder )。\n 你需要和用户进行对话,明确用户对 AI-Agent 的要求。并根据已有信息和你的联想能力,尽可能填充完整的配置文件: \n\n 配置文件为 json 格式: \n{"name": "... # AI-Agent 的名字", "description": "... # 对 AI-Agent 的要求,简单描述", "instructions": "... # 分点描述对 AI-Agent 的具体功能要求,尽量详细一些,类型是一个字符串数组,起始为 []", "prompt_recommend": "... # 推荐的用户将对 AI-Agent 说的指令,用于指导用户使用 AI-Agent ,类型是一个字符串数组,请尽可能补充 4 句左右,起始为 ["你可以做什么?"]", "logo_prompt": "... # 画 AI-Agent 的 logo 的指令,不需要画 logo 或不需要更新 logo 时可以为空,类型是 string"}\n\n 在接下来的对话中,请在回答时严格使用如下格式,先作出回复,再生成配置文件,不要回复其他任何内容: \nAnswer: ... # 你希望对用户说的话,用于询问用户对 AI-Agent 的要求,不要重复确认用户已经提出的要求,而应该拓展出新的角度来询问用户,尽量细节和丰富,禁止为空 \nConfig: ... # 生成的配置文件,严格按照以上 json 格式 \nRichConfig: ... # 格式和核心内容和 Config 相同,但是保证 name 和 description 不为空; instructions 需要在 Config 的基础上扩充字数,使指令更加详尽,如果用户给出了详细指令,请完全保留;补充 prompt_recommend ,并保证 prompt_recommend 是推荐的用户将对 AI-Agent 说的指令。请注意从用户的视角来描述 prompt_recommend、description 和 instructions。\n\n 一个优秀的 RichConfig 样例如下: \n{"name": "小红书文案生成助手", "description": "一个专为小红书用户设计的文案生成助手。", "instructions": "1. 理解并回应用户的指令; 2. 根据用户的需求生成高质量的小红书风格文案; 3. 使用表情提升文本丰富度", "prompt_recommend": ["你可以帮我生成一段关于旅行的文案吗?", "你会写什么样的文案?", "可以推荐一个小红书文案模版吗?"], "logo_prompt": "一个写作助手 logo ,包含一只羽毛钢笔"}\n\n\n 明白了请说“好的。”, 不要说其他的。'}, {'role': 'assistant', 'content': '好的。'}, {'role': 'user', 'content': '请实现一个数据库数据校验功能的单元测试用例,如果数据库不存在该数据,则可以新增。\n 如果数据库已经存在该数据,则新增失败。\n 输入:\n 该功能的接口类 com.xwbank.test.analysis.service.task.IntegrationExecConfigService。\n 该功能的接口方法 saveOrUpdateConfig。\n 该方法接收 1 个参数,方法参数: IntegrationExecConfigEntity entity。\n 参数 entity 的属性都具有 get/set 方法。\nget/set 方法包含的属性有 Long taskId , Long userId , String caseType , Integer testlinkProjectId , testlinkProjectName testlinkProjectName,String testlinkProjectPrefix,Integer testlinkSuiteId , String testlinkSuitePath , Integer testlinkPlanId , String testlinkPlanName , Integer testlinkBuildId , String testlinkBuildName , Long id , Date createdTime , Date updatedTime。\n\n\nWorkflows:\n\n 输出:\n 该方法 saveOrUpdateConfig 的返回值类型为: void 。\n\n\n 该方法 saveOrUpdateConfig 包含如下逻辑:\n 参数 entity 不能为空。\n 参数 entity 的 taskId 和 userId 不能为空 \n 参数 entity 的 userId 和 taskId 在数据库中组合唯一,当参数 entity 的 userId 和 taskId 在数据库中存在相同记录时,取出该记录的 id 复制给 entity 的 id。\n 参数 entity 的 caseType 只能为 1 或者 2.\n 参数 entity 的 testlinkSuitePath ,字符串格式为/xxx/xxx/xxx , xxx 代表非空的字符串。\n 取出第一个 xxx 作为 testlinkProjectName , testlinkProjectName 赋值给参数 entity 的 testlinkProjectName。通过 TestProject testlinkProject = testlinkService.api()\n.getTestProjectByName(testlinkProjectName)查询 testProject 对象, testlinkProject.getPrefix()赋值给参数 entity 的 testlinkProjectPrefix , testlinkProject.getId()赋值给参数 entityt 的 testlinkProjectId , \n 完成赋值操作后执行保存或更新 entity。\n 参数 entity 的 id 属性为空时,执行保存 entity 操作, id 属性不为空时执行更新 entity 操作。\n 结合自己的代码经验和该场景特点, 撰写代码, 需注意如下要点:\n 理解用户输入的关键词对应的异常场景, 思考该场景的单元测试用例。\n 注意不用使用 java 语言实现该功能,只需使用 spring-boot 和 junit 写单元测试用例即可。'}]
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 406, in call_prediction
output = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/dist-packages/gradio/route_utils.py", line 226, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1554, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1206, in call_function
prediction = await utils.async_iteration(iterator)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 517, in async_iteration
return await iterator.anext()
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 510, in anext
return await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, args)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 493, in run_sync_iterator_async
return next(iterator)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 647, in gen_wrapper
yield from f(args, **kwargs)
File "/content/modelscope-agent/apps/agentfabric/app.py", line 386, in preview_send_message
user_agent = _state['user_agent']
KeyError: 'user_agent'

The example fails to run

import os
from modelscope.utils.config import Config
from modelscope_agent.llm import LLMFactory
from modelscope_agent.agent import AgentExecutor
from modelscope_agent.prompt import MSPromptGenerator

# get the cfg from file; refer to the example in the config folder

model_cfg_file = os.getenv('MODEL_CONFIG_FILE', 'config/cfg_model_template.json')
model_cfg = Config.from_file(model_cfg_file)
tool_cfg_file = os.getenv('TOOL_CONFIG_FILE', 'config/cfg_tool_template.json')
tool_cfg = Config.from_file(tool_cfg_file)

# instantiate the LLM

model_name = 'modelscope-agent-7b'
llm = LLMFactory.build_llm(model_name, model_cfg)

# prompt generator

prompt_generator = MSPromptGenerator()

# instantiate the agent

agent = AgentExecutor(llm, tool_cfg, prompt_generator=prompt_generator)

When I run this, the following error appears. How can I fix it?

Error when running the example

File "/home/xs/.local/lib/python3.8/site-packages/modelscope/utils/config.py", line 43, in missing
raise KeyError(name)
KeyError: 'modelscope-agent-qwen-7b'

How do I configure the offload_folder parameter?

Running demo_qwen_agent.ipynb raises the error below. Would configuring offload_folder fix it, and how do I configure it?

The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format.

Are ModelScopeGPT and modelscope-agent-7b the same model?

Hello. When I test an agent locally with modelscope-agent-7b as the brain, its performance differs noticeably from the online ModelScopeGPT demo in the ModelScope Creation Space.

For example, the local model cannot handle a single request that requires multi-step tool calls, such as "write a story of no more than 20 characters, then read it aloud"; it only succeeds across multiple requests. ModelScopeGPT, however, handles it well and is also much faster.

What LLM is ModelScopeGPT based on? Is it also fine-tuned from Qwen-7B like modelscope-agent-7b?

If specific details cannot be disclosed yet, I'd still like to know the base model and its parameter scale. Looking forward to your reply!

Running and invoking the code locally

Could there be a clearer, standalone document for running and invoking the code locally? The current documentation is mostly based on online invocation. Thanks.
