syno8 Goto Github PK
Type: User
Location: Beijing, China
Type: User
Location: Beijing, China
AgentTuning: Enabling Generalized Agent Abilities for LLMs
Code for ALBEF: a new vision-language pre-training method
alpaca中文指令微调数据集
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tuning) together for easy use. Meanwhile, we created a new branch to build a Tabular LLM.(我们分别统一了丰富的IFT数据(如CoT数据,目前仍不断扩充)、多种训练效率方法(如lora,p-tuning)以及多种LLMs,三个层面上的接口,打造方便研究人员上手的LLM-IFT研究平台。同时tabular_llm分支构建了面向表格智能任务的LLM。
This repo includes ChatGPT prompt curation to use ChatGPT better.
BELLE: Bloom-Enhanced Large Language model Engine(开源中文对话大模型-70亿参数)
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
ChatGLM-6B:开源双语对话语言模型
chatglm 6b 大模型微调
骆驼:A Chinese finetuned instruction LLaMA. Developed by 陈启源 @ 华中师范大学 & 李鲁鲁 @ 商汤科技 & 冷子昂 @ 商汤科技
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Making big AI models cheaper, easier, and more scalable
This repository contains the code to reproduce the experiments of the poster "Supervised Contrastive Learning for Product Matching"
Python package for performing Entity and Text Matching using Deep Learning.
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
Firefly(流萤): 中文对话式大语言模型(全量微调+QLoRA),支持微调Llma2、Llama、Qwen、Baichuan、ChatGLM2、InternLM、Ziya、Bloom等大模型
Apply Iprompt on GLM with innovative new methods. Currently support Chinese QA, English QA and Chinese poem generation.
This repository contains the code and data download links to reproduce the experiments of the PVLDB paper "Dual-Objective Fine-Tuning of BERT for Entity Matching" by Ralph Peeters and Christian Bizer.
LAVIS - A One-stop Library for Language-Vision Intelligence
Inference code for LLaMA models
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
MNBVC(Massive Never-ending BT Vast Chinese corpus)超大规模中文语料集。对标chatGPT训练的40T数据。MNBVC数据集不但包括主流文化,也包括各个小众文化甚至火星文的数据。MNBVC数据集包括新闻、作文、小说、书籍、杂志、论文、台词、帖子、wiki、古诗、歌词、商品介绍、笑话、糗事、聊天记录等一切形式的纯文本中文数据。
Question and Answer based on Anything.
Your GenAI Second Brain 🧠 A personal productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT 3.5 / 4 turbo, Private, Anthropic, VertexAI, Ollama, LLMs, Groq that you can share with users ! Local & Private alternative to OpenAI GPTs & ChatGPT powered by retrieval-augmented generation.
Transformers with Arbitrarily Large Context
SC-Block is a supervised contrastive blocking method which combines supervised contrastive learning for positioning records in an embedding space and nearest neighbour search for candidate set building.
Aligning pretrained language models with instruction data generated by themselves.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.