baidubce / app-builder

appbuilder-sdk: the Qianfan AppBuilder SDK helps developers build AI-native applications flexibly and quickly.

Home Page: https://appbuilder.cloud.baidu.com/

License: Apache License 2.0

Languages: Python 77.45%, Jupyter Notebook 12.10%, Java 6.52%, Go 3.64%, Shell 0.29%

Topics: ai-native, erniebot, large-language-models, llm, llms, qianfan, agent, appbuilder, assistant-api, rag

app-builder's Issues

Query rewriting behaves abnormally

Why does the appbuilder SDK's query rewriting answer the question directly instead of rewriting the final query?

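Document parsing returns 401

According to the user manual, document parsing is free to use, but when I run the sample code for document parsing it returns a permission error. What is the cause? With the same API key, TTS works and returns results normally.
The full error message is: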
Traceback (most recent call last):
File "/home/yxliu56/code/work/app-builder/docprase.py", line 42, in
parse_result = parser(msg, return_raw=True)
File "/home/yxliu56/code/work/app-builder/appbuilder/core/component.py", line 94, in __call__
return self.run(*inputs, **kwargs)
File "/home/yxliu56/code/work/app-builder/appbuilder/core/components/doc_parser/doc_parser.py", line 126, in run
self.check_response_header(response)
File "/home/yxliu56/code/work/app-builder/appbuilder/core/component.py", line 149, in check_response_header
raise BaseRPCException(message)
appbuilder.core._exception.BaseRPCException: request_id=30d05399-ea69-4b10-93be-0dcebb70e255 , http status code is 401, body is {"requestId":"30d05399-ea69-4b10-93be-0dcebb70e255","code":216003,"message":"Authentication error: ( Unsupported authorization header type )"}

Here is my calling code:
`import os
os.environ["APPBUILDER_TOKEN"] = "my Baidu API key, copied from the API KEY field of the SDK secret key"
from appbuilder.core.components.doc_parser.doc_parser import DocParser
from appbuilder.core.components.doc_splitter.doc_splitter import DocSplitter
from appbuilder.core.message import Message

# Parse the document content

file_path = "text.pptx"  # path of the file to parse
msg = Message(file_path)
parser = DocParser()
parse_result = parser(msg)
print(parse_result.content)`

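Typo in the README: "Message几类" should be "Message基类" (Message base class)

The SDK currently offers developers open data structures, including Message and Component, so that they can integrate them into their existing LLM applications. This part is still under construction.

Message
A unified data structure for building LLM applications, built on Pydantic, that flows between different Components. The default field of the Message 几类 (sic; should read 基类, "base class") is content, and its type is Any.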
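
As a rough illustration of the README passage quoted above, here is a minimal stand-in for a message type with a single default field, content, of type Any, passed between component-like callables. It is only a sketch: it uses the standard library's dataclasses instead of Pydantic so that it runs without extra dependencies, and the names `SketchMessage`, `run_pipeline`, `upper` and `wrap` are hypothetical and are not part of the SDK.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass
class SketchMessage:
    """Hypothetical stand-in: one default field, `content`, typed as Any."""
    content: Any = None


Step = Callable[[SketchMessage], SketchMessage]


def run_pipeline(msg: SketchMessage, steps: Iterable[Step]) -> SketchMessage:
    """Pass a message through a sequence of component-like callables."""
    for step in steps:
        msg = step(msg)
    return msg


def upper(msg: SketchMessage) -> SketchMessage:
    # A toy "component" that upper-cases the text content.
    return SketchMessage(content=str(msg.content).upper())


def wrap(msg: SketchMessage) -> SketchMessage:
    # A toy "component" that changes the content type from str to dict.
    return SketchMessage(content={"text": msg.content})


result = run_pipeline(SketchMessage("hello"), [upper, wrap])
print(result.content)
```

Because content is typed as Any, each step is free to change the payload's type (here from a string to a dict), which is the flexibility the README description implies.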

Add an SDK function for getting the model list

import os
import appbuilder

os.environ["APPBUILDER_TOKEN"] = "bce-YOURTOKEN"

print(appbuilder.show_model_list())

Expected result

The currently available models are:
Erniebot4-8k
Llama2-13b
...

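[DocSplitter] Setting max_segment_length has no effect

  1. The input is a plain-text (.txt) document

  2. The parameters are set as follows

splitter = DocSplitter(splitter_type="split_by_chunk", max_segment_length=50, overlap=0,
                       separators=["。", "!", "?", ".", "!", "?", "……", "|\n", "\\n", "\n", ";"])

The result always contains a single segment, and that segment is longer than the configured 50 characters.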
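
To make the expected behaviour concrete, the sketch below shows one way a chunk splitter could honour a maximum segment length: cut the text after each separator, hard-split any piece that is still too long, then greedily pack pieces into chunks. This is an assumption about what the reporter expects, not the SDK's implementation; the function name `split_by_chunk_sketch` is hypothetical, and overlap is left out because the report sets it to 0.

```python
import re
from typing import List


def split_by_chunk_sketch(text: str, max_segment_length: int,
                          separators: List[str]) -> List[str]:
    """Hypothetical illustration, not the SDK's DocSplitter logic."""
    if max_segment_length <= 0:
        raise ValueError("max_segment_length must be positive")

    # 1. Cut after each separator, keeping the separator with its sentence.
    seps = sorted((s for s in separators if s), key=len, reverse=True)
    if seps:
        pattern = "(" + "|".join(re.escape(s) for s in seps) + ")"
        parts = re.split(pattern, text)  # [text, sep, text, sep, ..., text]
        pieces = []
        for i in range(0, len(parts), 2):
            sep = parts[i + 1] if i + 1 < len(parts) else ""
            if parts[i] + sep:
                pieces.append(parts[i] + sep)
    else:
        pieces = [text] if text else []

    # 2. Hard-split any piece that alone exceeds the limit.
    units = []
    for piece in pieces:
        for start in range(0, len(piece), max_segment_length):
            units.append(piece[start:start + max_segment_length])

    # 3. Greedily pack units into chunks no longer than the limit.
    chunks, current = [], ""
    for unit in units:
        if current and len(current) + len(unit) > max_segment_length:
            chunks.append(current)
            current = ""
        current += unit
    if current:
        chunks.append(current)
    return chunks


sample = "a" * 30 + "。" + "b" * 30 + "。" + "c" * 120
chunks = split_by_chunk_sketch(sample, 50, ["。", "!", "?", ".", "\n", ";"])
print([len(c) for c in chunks])
```

Under this reading, a text with no separators at all would still be cut into pieces of at most 50 characters by step 2, so a single oversized segment would not be returned.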

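Bugs in the cookbook code

cookbooks/text_generation.ipynb
1) The model initialization in the cookbook uses a wrong parameter name.
code:
image
error:

1. The model_name parameter name is wrong
2. The model name is wrong

2) The question-answer pair result is incorrect.
code:
image

output:
image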

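The MRC Examples in the MRC component are incorrect

class MRC(CompletionBaseComponent):
    """
    Reading-comprehension question-answering component. It performs reading-comprehension QA with a large language model and supports features such as answer refusal, clarification, key-point emphasis, friendliness enhancement, and source tracing. It can be used to answer questions raised by users.

    Examples:

        .. code-block:: python

            import appbuilder
            os.environ["APPBUILDER_TOKEN"] = '...'

            mrc_component = appbuilder.MRC()

            # Get the feature descriptions
            instructions = mrc_component.get_instruction_set()

            # Print the feature descriptions
            for key, value in instructions.items():
                print(f"{key}: {value}")

            # Run the MRC component with clarification and friendliness enhancement enabled
            result = mrc_component.run(appbuilder.Message("What is artificial intelligence?"), clarify=True, friendly=True)

            # Print the result
            print(result)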

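Problem: flowchart images embedded in documents

Dear developers, I did not expect the parser to extract text from text-based flowcharts in a document, but this has created a new problem. The parser labels flowchart text as para_type=text as well, so I cannot tell which text comes from a flowchart (see the screenshots below). Have you considered distinguishing text extracted from images from other text? The text inside flowcharts is very fragmented, and because it cannot be told apart from regular paragraphs, the parsed document text becomes unreadable.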
image
image
