
Comments (8)

vyokky commented on June 12, 2024

Got it. Will look into it and get back to you.

from ufo.

yzhao666 commented on June 12, 2024

> Hi @yzhao666, "WinCOM is not supported" is only a warning, so it will not crash your program. I think in your case the error occurs because the input tokens are out of range, so no response is generated. You may need to reduce the prompt size for Qwen.
>
> I think Qwen is still weak for this task. We will try to optimize the prompt to make it workable, but GPT-4V is for sure the best choice.

Hi @vyokky, thanks for your reply.
OK, I see. Yes, indeed the warning does not crash my program.
I thought the warning meant my program would not use the correct API to take action. Good to hear that it is not a problem.

So I will try to use GPT-4V instead.

Thanks again to you and your team for the effort in making this great project :)

from ufo.

yzhao666 commented on June 12, 2024

> Got it. Will look into it and get back to you.

Many thanks for your quick feedback!!!

from ufo.

Mac0q commented on June 12, 2024

Can you show us the configuration so we can reproduce it?

from ufo.

yzhao666 commented on June 12, 2024

> Can you show us the configuration so we can reproduce it?

Sure. Thanks for your quick reply.
By the way, yesterday I hit another error with the code you released. After adding 'api_type' and 'prices' in 'llm\qwen.py' as follows, that problem was solved.

class QwenService(BaseService):
    def __init__(self, config, agent_type: str):
        self.config_llm = config[agent_type]
        self.config = config
        # Added: read the API type from the agent-specific config section.
        self.api_type = self.config_llm["API_TYPE"].lower()
        self.max_retry = self.config["MAX_RETRY"]
        # Added: read the price table from the top-level config.
        self.prices = self.config["PRICES"]
        self.timeout = self.config["TIMEOUT"]
        dashscope.api_key = self.config_llm["API_KEY"]
        self.tmp_dir = None

The API request error still exists:

My 'config.yaml':
image
I didn't change anything else in the config.yaml file.

[Update]:

  1. I've checked the description in the Qwen-VL developer reference:
    https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api?spm=a2c4g.11186623.0.i0
    It says: "Important: VL model currently does not recommend customizing system role"
    image

    Is this why qwen-vl cannot follow the response examples listed in the system role's prompt message?

  2. I've switched to the lite version config in the 'ufo/config/config_dev.yaml' file:
    image
    I hit key errors for 'open_app_guideline' and 'open_app_comment', which I solved by copying the open_app_guideline and open_app_comment content from the 'base/host_agent.yaml' file to 'lite/host_agent.yaml'. But the "Error making API request" problem is not solved yet.

  3. Good news: I've switched to 'qwen-vl-max' and increased 'MAX_TOKENS' to 6000 in the 'config.yaml' file, and finally got some reasonable results.
    Bad news: because the format of the Qwen response is not consistent, I keep running into different problems in the 'parse_qwen_response' process.
    --config
    image
    --result1: no '\n' after 'Observation', 'Thoughts', etc.:
    image
    --result2: the content of 'Plan' in the following figure is not on the same line as 'Plan', so it cannot be split:
    image
    image
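
One way to tolerate both cases above is to split on the field labels themselves rather than on line breaks. This is only a minimal sketch: the function name `parse_fields` and the label list are hypothetical and are not taken from UFO's 'parse_qwen_response', and it assumes each label is followed by a colon.

```python
import re

# Hypothetical label list; adjust it to the fields your prompt asks for.
FIELDS = ["Observation", "Thoughts", "Plan", "Comment"]


def parse_fields(text, fields=FIELDS):
    """Split a response into {label: content}, ignoring where newlines fall."""
    # Match any known label followed by an ASCII or full-width colon.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, fields)) + r")\s*[:：]")
    matches = list(pattern.finditer(text))
    result = {}
    for i, m in enumerate(matches):
        # Content runs from the end of this label to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        # Collapse all whitespace (including newlines) into single spaces.
        result[m.group(1)] = " ".join(text[m.end():end].split())
    return result
```

A label word followed by a colon inside the content itself (for example "Plan:" appearing in the middle of 'Thoughts') would be treated as a new field, so this is a starting point rather than a robust parser.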

Now I understand why you say "The lite version of the prompt is not fully optimized. To achieve better results, it is recommended that users adjust the prompt according to performance!!!" in the model_worker/readme file. I'll try it. Thanks!

from ufo.

yzhao666 commented on June 12, 2024

Hi @Mac0q
Thanks for quickly fixing the previous bugs so that I have the chance to try more of UFO's abilities.

Now I am stuck on the "WIN32COM is not supported" problem, as follows:
image

In line 44 of 'ufo\automator\app_apis\factory.py', the 'win_com_client_mapping' mapping contains only the 'WINWORD' key, with no keys for other apps such as EXCEL or POWERPNT. It will not work when the task is not Word-only. It seems that you have released the WIN32 API service only for Word so far. Will you release API support for other apps?
image

As for the "Error making API request: Range of input length should be [1, 6000]" error: qwen-vl only supports a 6k context, so even if the WIN32 API issue were fixed, it seems I would have to use GPT-4V to test the complete functionality, right?
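
A rough sketch of checking a prompt against the 6000 limit from the error message before sending it. The unit the API counts in and its tokenizer are not stated in this thread, so the counter below (whitespace-separated words) is only a stand-in, and `fit_prompt`, `count_units`, and the drop-oldest-examples strategy are assumptions made for illustration.

```python
MAX_INPUT = 6000  # upper bound taken from the error message "[1, 6000]"


def count_units(text):
    """Stand-in length measure; swap in the model's real tokenizer if available."""
    return len(text.split())


def fit_prompt(system, examples, user, limit=MAX_INPUT, count=count_units):
    """Drop examples from the front until the whole prompt fits within the limit."""
    kept = list(examples)

    def total():
        return count(system) + sum(count(e) for e in kept) + count(user)

    while kept and total() > limit:
        kept.pop(0)  # discard the oldest example first
    if total() > limit:
        raise ValueError("prompt exceeds the limit even without examples")
    return system, kept, user
```

Dropping few-shot examples first is just one possible choice; shortening the system prompt or the control list may suit a given task better.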

from ufo.

vyokky commented on June 12, 2024

Hi @yzhao666, "WinCOM is not supported" is only a warning, so it will not crash your program. I think in your case the error occurs because the input tokens are out of range, so no response is generated. You may need to reduce the prompt size for Qwen.

I think Qwen is still weak for this task. We will try to optimize the prompt to make it workable, but GPT-4V is for sure the best choice.
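
To picture the warning-only behaviour mentioned above, here is a minimal sketch of a mapping lookup that warns instead of raising when an app has no entry. It is not the code in 'ufo\automator\app_apis\factory.py': the class `WordClient`, the function `get_com_client`, and returning `None` for unsupported apps are all assumptions for illustration.

```python
import warnings


class WordClient:
    """Hypothetical placeholder for a Word-specific COM wrapper."""

    def __init__(self, app_root_name):
        self.app_root_name = app_root_name


# Mirrors the situation described in this thread: only WINWORD has an entry.
win_com_client_mapping = {"WINWORD": WordClient}


def get_com_client(app_root_name):
    """Return a client for a known app; warn and return None otherwise."""
    key = app_root_name.upper().split(".")[0]  # "WINWORD.EXE" -> "WINWORD"
    client_cls = win_com_client_mapping.get(key)
    if client_cls is None:
        # Non-fatal: the caller can continue without app-specific COM APIs.
        warnings.warn(f"Win32 COM is not supported for {key}.")
        return None
    return client_cls(key)
```

Under this sketch, supporting another app would mean adding an entry such as "EXCEL" to the mapping together with a matching client class.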

from ufo.

NielHung commented on June 12, 2024

@yzhao666 I'm also looking into deploying UFO with Qwen. Would it be convenient to talk over WeChat? (wx: hungtien)

from ufo.
