duxiaoman-di / xuanyuan
XuanYuan: Du Xiaoman's Chinese Financial Dialogue Large Language Model
Will quantized versions of the 13B, 7B, or 6B models be released later?
The sample code currently provided in the README is rather minimal.
How can streaming output be implemented when running inference with cli_demo.py?
For the staged incremental pretraining, how many tokens in total were used for the 70B model's incremental pretraining?
As described in the title: running the official demo Python script produces the following:
输入: 介绍下你自己
输出: 猖玺巢玺毡毡磅猖晷帖apis帖湍殄晷毡磅pec刺磅惚毡刺帖殄殄毡磅磅夙窠湍殄惚湍盛玺蒂锥帖磅锥楔湍磅毡磅毡蒂鞣踪锥楔毡疯毡毡帖锥磅窠磅毡蒂磅鞣蒂晷绝毡磅刺锥uga窠dn磅绝玺锥雉毡蒂蒂刺毡雉磅雉窠毡鞣窠窠孤刺晷刺毡蒂蒂绝刺孤猖帖毡磅绝磅磅毡蒂雉晷泪踪毡毡刺骧磅窠蒂盛湍绝疯毡毡盛磅磅刺磅毡夙磅磅GO磅盛uga蒂绝磅磅盛磅毡毡锥ten雉蒂锥锥锥湍雉鞣鞣猖鞣刺玺刺绝蒂惚雉磅窠鞣猖雉磅毡磅绝鹊鞣磅鞣鞣磅绝蒂磅磅帖晷盛毡磅雉磅盛刺晷骧盛磅磅盛帖磅妥毡鞣盛晷绝雉鞣帖夙雉骧磅泪磅猖猖磅磅绝盛磅毡雉晷绝磅磅夙鞣磅晷鞣蒂磅鞣鞣窠酣uga盛毡磅鞣盛妥鞣刺鹊
Has anyone run into a similar problem? Is there a fix?
As titled.
When will Ollama be supported? Many developers run Ollama on macOS.
I'm curious what the reason for this is, and if XuanYuan-13B scores this high, what is the point of XuanYuan2-70B?
I used transformers' TextIteratorStreamer, but unlike other LLaMA models it doesn't stream properly; XuanYuan's output appears to be empty.
Is XuanYuan's streaming output different from that of other LLaMA models?
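For context on how this kind of streaming works: TextIteratorStreamer is a producer/consumer queue — `model.generate` pushes decoded text pieces from a worker thread while the caller iterates. The sketch below illustrates only that pattern; `SimpleIteratorStreamer` and `fake_generate` are stand-ins written for this example, not the transformers classes.

```python
import queue
import threading

class SimpleIteratorStreamer:
    """Stand-in illustrating the idea behind transformers' TextIteratorStreamer:
    the generator thread pushes text pieces, the main thread iterates them."""
    _SENTINEL = object()

    def __init__(self):
        self._q = queue.Queue()

    def put(self, text):
        self._q.put(text)

    def end(self):
        self._q.put(self._SENTINEL)

    def __iter__(self):
        while True:
            piece = self._q.get()
            if piece is self._SENTINEL:
                return
            yield piece

def fake_generate(streamer, pieces):
    # Stands in for model.generate(..., streamer=streamer).
    for p in pieces:
        streamer.put(p)
    streamer.end()

streamer = SimpleIteratorStreamer()
worker = threading.Thread(target=fake_generate, args=(streamer, ["你", "好"]))
worker.start()
text = "".join(streamer)  # the main thread consumes pieces as they arrive
worker.join()
print(text)
```

With the real API you would construct `TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)`, launch `model.generate(..., streamer=streamer)` in a `threading.Thread`, and iterate the streamer. One plausible cause of the empty output reported above: if everything the model emits is treated as a special token, `skip_special_tokens=True` yields an empty stream.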
Hello. In the financial evaluation section of this work, many representative open-source and accessible models were evaluated, but results for LLaMA-2-70B are missing (while LLaMA-2-7B and 13B are included), which is surprising. Since XuanYuan-70B is obtained by continued training from LLaMA-2-70B, an evaluation of LLaMA-2-70B is essential from an ablation standpoint. Is there a particular reason this evaluation is missing?
Could you package this up? And how about adding a feature to interpret PDF or TXT files?
Thanks!
I hit a problem when deploying on two different machines: the one with a CUDA 11.7 environment raises an error.
输入: 介绍下你自己
/opt/conda/conda-bld/pytorch_1695392020201/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [19,0,0], thread: [96,0
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[5], line 9
7 print(f"输入: {content}")
8 inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
----> 9 outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
10 outputs = tokenizer.decode(outputs.cpu()[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
11 print(f"输出: {outputs}")
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/transformers/generation/utils.py:1652, in GenerationMixin.generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, negative_prompt_ids, negative_prompt_attention_mask, **kwargs)
1644 input_ids, model_kwargs = self._expand_inputs_for_generation(
1645 input_ids=input_ids,
1646 expand_size=generation_config.num_return_sequences,
1647 is_encoder_decoder=self.config.is_encoder_decoder,
1648 **model_kwargs,
1649 )
1651 # 13. run sample
-> 1652 return self.sample(
1653 input_ids,
1654 logits_processor=logits_processor,
1655 logits_warper=logits_warper,
1656 stopping_criteria=stopping_criteria,
1657 pad_token_id=generation_config.pad_token_id,
1658 eos_token_id=generation_config.eos_token_id,
1659 output_scores=generation_config.output_scores,
1660 return_dict_in_generate=generation_config.return_dict_in_generate,
1661 synced_gpus=synced_gpus,
1662 streamer=streamer,
1663 **model_kwargs,
1664 )
1666 elif generation_mode == GenerationMode.BEAM_SEARCH:
1667 # 11. prepare beam search scorer
1668 beam_scorer = BeamSearchScorer(
1669 batch_size=batch_size,
1670 num_beams=generation_config.num_beams,
(...)
1675 max_length=generation_config.max_length,
1676 )
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/transformers/generation/utils.py:2734, in GenerationMixin.sample(self, input_ids, logits_processor, stopping_criteria, logits_warper, max_length, pad_token_id, eos_token_id, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, synced_gpus, streamer, **model_kwargs)
2731 model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
2733 # forward pass to get next token
-> 2734 outputs = self(
2735 **model_inputs,
2736 return_dict=True,
2737 output_attentions=output_attentions,
2738 output_hidden_states=output_hidden_states,
2739 )
2741 if synced_gpus and this_peer_finished:
2742 continue # don't waste resources running the code we don't need
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/accelerate/hooks.py:164, in add_hook_to_module.<locals>.new_forward(module, *args, **kwargs)
162 output = module._old_forward(*args, **kwargs)
163 else:
--> 164 output = module._old_forward(*args, **kwargs)
165 return module._hf_hook.post_forward(module, output)
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py:1038, in LlamaForCausalLM.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict)
1035 return_dict = return_dict if return_dict is not None else self.config.use_return_dict
1037 # decoder outputs consists of (dec_features, layer_state, dec_hidden, dec_attn)
-> 1038 outputs = self.model(
1039 input_ids=input_ids,
1040 attention_mask=attention_mask,
1041 position_ids=position_ids,
1042 past_key_values=past_key_values,
1043 inputs_embeds=inputs_embeds,
1044 use_cache=use_cache,
1045 output_attentions=output_attentions,
1046 output_hidden_states=output_hidden_states,
1047 return_dict=return_dict,
1048 )
1050 hidden_states = outputs[0]
1051 if self.config.pretraining_tp > 1:
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py:925, in LlamaModel.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, use_cache, output_attentions, output_hidden_states, return_dict)
921 layer_outputs = torch.utils.checkpoint.checkpoint(
922 create_custom_forward(decoder_layer), hidden_states, attention_mask, position_ids
923 )
924 else:
--> 925 layer_outputs = decoder_layer(
926 hidden_states,
927 attention_mask=attention_mask,
928 position_ids=position_ids,
929 past_key_value=past_key_value,
930 output_attentions=output_attentions,
931 use_cache=use_cache,
932 padding_mask=padding_mask,
933 )
935 hidden_states = layer_outputs[0]
937 if use_cache:
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/accelerate/hooks.py:159, in add_hook_to_module.<locals>.new_forward(module, *args, **kwargs)
158 def new_forward(module, *args, **kwargs):
--> 159 args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
160 if module._hf_hook.no_grad:
161 with torch.no_grad():
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/accelerate/hooks.py:290, in AlignDevicesHook.pre_forward(self, module, *args, **kwargs)
285 fp16_statistics = self.weights_map[name.replace("weight", "SCB")]
286 set_module_tensor_to_device(
287 module, name, self.execution_device, value=self.weights_map[name], fp16_statistics=fp16_statistics
288 )
--> 290 return send_to_device(args, self.execution_device), send_to_device(
291 kwargs, self.execution_device, skip_keys=self.skip_keys
292 )
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/accelerate/utils/operations.py:160, in send_to_device(tensor, device, non_blocking, skip_keys)
157 elif skip_keys is None:
158 skip_keys = []
159 return type(tensor)(
--> 160 {
161 k: t if k in skip_keys else send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys)
162 for k, t in tensor.items()
163 }
164 )
165 elif hasattr(tensor, "to"):
166 try:
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/accelerate/utils/operations.py:161, in <dictcomp>(.0)
157 elif skip_keys is None:
158 skip_keys = []
159 return type(tensor)(
160 {
--> 161 k: t if k in skip_keys else send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys)
162 for k, t in tensor.items()
163 }
164 )
165 elif hasattr(tensor, "to"):
166 try:
File ~/miniconda3/envs/llm/lib/python3.10/site-packages/accelerate/utils/operations.py:167, in send_to_device(tensor, device, non_blocking, skip_keys)
165 elif hasattr(tensor, "to"):
166 try:
--> 167 return tensor.to(device, non_blocking=non_blocking)
168 except TypeError: # .to() doesn't accept non_blocking as kwarg
169 return tensor.to(device)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
It runs fine in a CUDA 11.8 environment.
Perhaps this project requires at least CUDA 11.8?
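A device-side assert is usually raised by an earlier kernel (often an out-of-range index), so before concluding that CUDA 11.8 is a hard requirement it helps to rerun synchronously, as the traceback itself suggests. A minimal debugging sketch; the `cli_demo.py` entry point is an assumption taken from the repo's demo, adjust to your own script:

```shell
# Show which CUDA toolkit is actually on PATH (harmless if nvcc is absent).
nvcc --version || true
# Force synchronous kernel launches so the Python traceback points at the
# operation that really failed, instead of a later send_to_device call.
export CUDA_LAUNCH_BLOCKING=1
# Then rerun the failing script under this setting, e.g.:
#   python cli_demo.py
```

If the synchronous traceback lands in an embedding or indexing kernel, the cause is more likely a token id outside the vocabulary (e.g. a tokenizer/weights mismatch) than the CUDA version itself.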
As titled.
As titled.
After entering "我叫克拉拉,我是" ("My name is Clara, I am"), the generated result is:
“我叫克拉拉,我是小微企业主,想申请一笔贷款,请问银行会考虑哪些方面?
Assistant:作为银行审核贷款申请的一个主要因素,您的企业信用记录以及财务状况通常是考虑的重点。您需要提供一些关键的财务信息,如财务报表、现金流状况、营业额和利润等等,以证明您的企业的稳健经营状况。此外,您的信用历史记录也很重要,包括您曾经向其他银行申请过贷款、信用卡或者其他的财务服务,以及您的信用评分和还款记录等。此外,银行会考虑您的还款能力和还款意愿,以确保您可以按时还款并避免不良的还款记录。最后,您的经营计划、贷款用途和担保措施也会对银行审核贷款申请产生影响。”
It looks like random text continuation rather than actual question answering?
Hello, would you mind sharing the prompt used with GPT-4 to reformat the exam questions? Many thanks.
Hello, could you share the training configuration used for incremental pretraining, e.g., learning rate, batch size, warmup, and weight decay?
When will accelerated inference for the 4-bit quantized models be supported? The current quantized model's output is extremely slow, making it hard to use. Thanks.
Hello, regarding the hybrid fine-tuning stage where pretraining data and instruction fine-tuning data are trained together: how is the data format unified? My understanding is that pretraining data is a plain text passage, while instruction fine-tuning data has an instruction and an output.
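The authors have not documented their scheme here, but one common way to mix the two sources is to render instruction pairs into the same dialogue template the chat demo uses (the `Human`/`Assistant` convention with `" "` and `"</s>"` separators), so that every training example ends up as a single text string, while plain pretraining text passes through unchanged. A hypothetical sketch, not the authors' confirmed method:

```python
SEPS = [" ", "</s>"]           # separators taken from the repo's demo code
ROLES = ["Human", "Assistant"]

def to_training_text(example: dict) -> str:
    """Render either a pretraining or an instruction example as one string.

    Assumed unification scheme: instruction data is wrapped in the chat
    template and closed with </s>; pretraining text is used as-is.
    """
    if "instruction" in example:
        return (SEPS[0] + ROLES[0] + ": " + example["instruction"]
                + SEPS[0] + ROLES[1] + ": " + example["output"] + SEPS[1])
    return example["text"]

print(to_training_text({"instruction": "介绍下你自己", "output": "我是轩辕。"}))
print(to_training_text({"text": "金融监管政策文本……"}))
```

After this step, both kinds of example can share one tokenization and packing pipeline, which is presumably the point of unifying them.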
How can streaming output be implemented when doing inference with vLLM?
1. As titled: the paper only mentions A100 GPUs and distributed training, without disclosing the actual amount of compute used. Could you share it?
2. Also, could you explain the practical differences and trade-offs between full-parameter training and resource-saving approaches such as P-Tuning and LoRA?
Many thanks for your work. For the newly open-sourced 13B model, what are the data volumes of the incremental pretraining dataset and the instruction fine-tuning dataset?
I deployed Duxiaoman-DI/XuanYuan-6B-Chat on a server; it uses 22 GB of a 24 GB GPU.
The output when loading the model is as follows:
use transformers.generate to infer...
loading weight with transformers ...
WARNING:root:Some parameters are on the meta device device because they were offloaded to the cpu.
Inference is very slow. What could be the cause?
Solved: switching to fp16 when loading the model fixed it.
1. With an input of 843 Chinese characters, GPU memory usage is 25.4 GB and output averages 1.35 characters/s.
2. With an input of 1,770 Chinese characters, GPU memory runs out.
Is the above normal?
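The numbers reported above are plausible: in fp16 a 6B-parameter model needs roughly 12 GB for weights alone, and the KV cache grows linearly with sequence length, which is why a much longer prompt can exhaust a 24 GB card (and why loading in fp32 forces CPU offload and very slow inference). A back-of-envelope sketch; the layer/head counts below are illustrative LLaMA-like assumptions, not the published XuanYuan-6B configuration:

```python
def weight_mem_gb(n_params: float, bytes_per_param: int) -> float:
    # Memory for the weights alone, ignoring activations and buffers.
    return n_params * bytes_per_param / 1024**3

def kv_cache_gb(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # Two cached tensors (K and V) per layer, one vector per token per head.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem / 1024**3

print(f"6B weights in fp16: {weight_mem_gb(6e9, 2):.1f} GB")  # ~11.2 GB
print(f"6B weights in fp32: {weight_mem_gb(6e9, 4):.1f} GB")  # ~22.4 GB
# Hypothetical LLaMA-like shape: 32 layers, 32 KV heads, head_dim 128.
print(f"KV cache at 2000 tokens: {kv_cache_gb(2000, 32, 32, 128):.2f} GB")  # ~0.98 GB
```

On top of the weights and KV cache, intermediate activations during the forward pass add further overhead, so 25.4 GB for a long prompt is within the expected range.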
Does the model support long text input? Roughly how many tokens of input are supported? And how is over-length input handled — simply truncated?
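Whether XuanYuan does anything smarter than truncation isn't documented here, but with Hugging Face tokenizers the usual chat-friendly choice is left-side truncation, so the most recent tokens survive. A minimal sketch of that behavior on plain token-id lists:

```python
def truncate_left(token_ids, max_len):
    """Keep at most max_len tokens, dropping the oldest first.

    Mirrors setting tokenizer.truncation_side = "left" and then calling
    the tokenizer with truncation=True, max_length=max_len -- the usual
    choice for chat so the latest turn is preserved.
    """
    if len(token_ids) <= max_len:
        return list(token_ids)
    return list(token_ids[-max_len:])

print(truncate_left([1, 2, 3, 4, 5], 3))  # [3, 4, 5]
```

Note that the default `truncation_side` is `"right"`, which would instead drop the end of the prompt — for a chat model that can cut off the final `Assistant:` marker entirely.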
Error message:
OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory model/XuanYuan-6B-Chat-4bit.
Sample code executed:
import torch
from transformers import LlamaForCausalLM, AutoTokenizer
model_name_or_path = "model/XuanYuan-6B-Chat-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = LlamaForCausalLM.from_pretrained(model_name_or_path, device_map="auto")
model.eval()
seps = [" ", "</s>"]
roles = ["Human", "Assistant"]
content = "介绍下你自己"
prompt = seps[0] + roles[0] + ": " + content + seps[0] + roles[1] + ":"
print(f"输入: {content}")
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(
**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95
)
outputs = tokenizer.decode(
outputs.cpu()[0][len(inputs.input_ids[0]) :], skip_special_tokens=True
)
print(f"输出: {outputs}")
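The demo above builds a single-turn prompt. Extending the same `seps`/`roles` template to multi-turn chat might look like the sketch below; closing each assistant reply with `</s>` is an assumption extrapolated from the demo's separators, not something the README here confirms:

```python
def build_prompt(history, content, seps=(" ", "</s>"), roles=("Human", "Assistant")):
    """history: list of (user, assistant) turns; content: the new user message."""
    prompt = ""
    for user, assistant in history:
        # Each completed turn ends with the </s> separator.
        prompt += seps[0] + roles[0] + ": " + user
        prompt += seps[0] + roles[1] + ": " + assistant + seps[1]
    # The new turn is left open at "Assistant:" for the model to complete.
    prompt += seps[0] + roles[0] + ": " + content + seps[0] + roles[1] + ":"
    return prompt

print(build_prompt([("你好", "你好,我是轩辕。")], "介绍下你自己"))
```

Getting this template exactly right matters: as another issue above observes, a mis-formatted prompt makes a chat model fall back to free-form continuation instead of answering.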
Is the 13B chat model produced by SFT on the XuanYuan-13B base model, or trained on top of LLaMA2-13B-chat?
Can't download the models from Hugging Face.
In the reinforcement learning stage, roughly what order of magnitude of data was used for PPO?
Will the instruction dataset be open-sourced later?
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory ./Duxiaoman-DI/XuanYuan-6B-Chat-4bit.