Comments (8)
Got it. Will look into it and get back to you.
Hi @yzhao666, "WinCOM is not supported" is only a warning, so it will not crash your program. I think in your case the error occurs because the input tokens are out of range, so the model does not generate a response. You may need to reduce the prompt size for Qwen.
I think Qwen is still weak for this task. We will try to optimize the prompt to make it doable, but GPT-4V is for sure the best choice.
Hi @vyokky , thanks for your reply.
OK, I see. Yes, indeed the warning does not crash my program.
I thought the warning meant my program would not use the correct API to take action. Good to hear that it is not a problem.
So I will try to use GPT-4V instead.
Thanks again to you and your team for the effort you put into this great project :)
Many thanks for your quick feedback!!!
Can you show us the configuration so we can reproduce it?
Sure. Thanks for your quick reply.
By the way, yesterday I ran into another error with the code you released. After adding 'api_type' and 'prices' in 'llm\qwen.py' as follows, that problem was solved.
class QwenService(BaseService):
    def __init__(self, config, agent_type: str):
        self.config_llm = config[agent_type]
        self.config = config
        self.api_type = self.config_llm["API_TYPE"].lower()  # added
        self.max_retry = self.config["MAX_RETRY"]
        self.prices = self.config["PRICES"]  # added
        self.timeout = self.config["TIMEOUT"]
        dashscope.api_key = self.config_llm["API_KEY"]
        self.tmp_dir = None
The API request error still exists:
My 'config.yaml':
I didn't change anything else in the config.yaml file.
[Update]:
- I've checked the description in Qwen-vl developer reference:
https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api?spm=a2c4g.11186623.0.i0
It says that "Important: VL model currently does not recommend customizing system role"
Is this why qwen-vl can not follow the response examples listed in the system role's prompt message?
- I've switched to the lite version config in the 'ufo/config/config_dev.yaml' file:
  I hit a key error for 'open_app_guideline' and 'open_app_comment', which I solved by copying the open_app_guideline and open_app_comment content from the 'base/host_agent.yaml' file to 'lite/host_agent.yaml'. But the "Error making API request" problem is not solved yet.
- Good news: I've switched to 'qwen-vl-max' and increased 'MAX_TOKENS' to 6000 in the 'config.yaml' file, and finally got some reasonable results.
  Bad news: because the format of the Qwen response is not consistent, I keep running into different problems during the 'parse_qwen_response' process.
--config
--result1: without '\n' after 'Observation', 'Thoughts', etc.:
--result2: the content of 'Plan' in the following figure is not on the same line as 'Plan', and thus cannot be split
Now I understand why you say "The lite version of the prompt is not fully optimized. To achieve better results, it is recommended that users adjust the prompt according to performance!!!" in the model_worker/readme file. I'll try it. Thanks!
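For the two parsing problems above (labels with no newline after them, and 'Plan' content starting on the following line), one option is to split on the labels themselves instead of on line breaks. The snippet below is only a rough sketch: the function name `parse_loose_response` and the field list are hypothetical and not part of UFO's actual 'parse_qwen_response', and the labels are just the ones mentioned in this comment.

```python
import re

# Assumed field labels, taken from the description above; adjust to the real prompt.
FIELDS = ("Observation", "Thoughts", "Plan")


def parse_loose_response(text, fields=FIELDS):
    """Split 'Label: value' text into a dict, regardless of where newlines fall."""
    # Match any known label followed by an ASCII or full-width colon.
    label_re = re.compile(
        r"\b(" + "|".join(re.escape(f) for f in fields) + r")\s*[::]\s*"
    )
    matches = list(label_re.finditer(text))
    result = {}
    for i, m in enumerate(matches):
        # A field's value runs until the next label, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        result[m.group(1)] = text[m.end():end].strip()
    return result


# Case 1: everything on one line, no '\n' after each field.
one_line = "Observation: The window is open. Thoughts: Click the button. Plan: (1) Click OK."
# Case 2: the 'Plan' content starts on the line after the label.
split_plan = "Observation: x\nPlan:\n(1) a\n(2) b"

parsed_one_line = parse_loose_response(one_line)
parsed_split_plan = parse_loose_response(split_plan)
```

A known limitation of this approach: if a value itself contains one of the labels followed by a colon, it will be split there, so it is not a full replacement for stricter prompting.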
Hi @Mac0q ,
Thanks for quickly fixing the previous bugs so that I have the chance to try out more of UFO's abilities.
Now I am stuck on the "WIN32COM is not supported" problem as follows:
In line 44 of 'ufo\automator\app_apis\factory.py', the 'win_com_client_mapping' mapping only contains the 'WINWORD' key, but no keys for other apps like EXCEL, POWERPNT, etc. It will not work when the task is not Word-only. It seems that you have only released the WIN32 API service for Word for now. Would you release API support for other apps?
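To illustrate the behaviour I am describing, here is a minimal sketch of a lookup over a mapping that only has a 'WINWORD' entry. All names here (`WIN_COM_CLIENT_MAPPING`, `get_com_client`, the placeholder value) are hypothetical stand-ins, not the actual code in 'factory.py'; the sketch only assumes that an unmapped app produces a warning instead of an exception.

```python
import warnings

# Hypothetical stand-in for the mapping: only Word is registered.
WIN_COM_CLIENT_MAPPING = {"WINWORD": "word_client_placeholder"}


def get_com_client(app_root_name):
    """Return the registered client for an app, or None (with a warning) if unmapped."""
    client = WIN_COM_CLIENT_MAPPING.get(app_root_name.upper())
    if client is None:
        # Warn rather than raise, so the caller can continue without the COM API.
        warnings.warn(f"WinCOM is not supported for {app_root_name}.")
    return client
```

With a mapping like this, `get_com_client("EXCEL")` returns `None` after emitting the warning, while `get_com_client("WINWORD")` returns the registered entry.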
As for the "Error making API request: Range of input length should be [1, 6000]" error, qwen-vl only supports a 6k context, so even if the WIN32 API issue were fixed, it seems that I have to use GPT-4V to test the complete functionality, right?
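A quick way to check whether a prompt would trip this limit before sending it is a small length guard. This is only a sketch: `fit_prompt` is a hypothetical helper, and it measures length in characters, which is an assumption, since the service may count tokens instead.

```python
MAX_INPUT_LEN = 6000  # upper bound quoted in the error message


def fit_prompt(system_prompt, user_prompt, limit=MAX_INPUT_LEN):
    """Trim the user prompt so the combined length stays within `limit`.

    Length is counted in characters here, which is only a proxy for
    whatever unit the service actually uses.
    """
    budget = limit - len(system_prompt)
    if budget < 1:
        raise ValueError("system prompt alone reaches the input limit")
    # Keep the start of the user prompt and drop whatever does not fit.
    return user_prompt[:budget]
```

Truncating the tail is the simplest choice, but it may cut off control information, so shortening the system prompt itself is probably the more useful fix.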
@yzhao666 I am also currently working on deploying UFO with Qwen. Would it be convenient to talk over WeChat? (wx:hungtien)