nlpforua / ua-llm

The entry point for adapting, training, evaluating, and leveraging various Large Language Models (LLMs) for a wide range of Ukrainian NLP tasks.

License: Apache License 2.0

Python 100.00%
benchmark data-annotation evaluation gpt large-language-models llama llm mistral natural-language-processing natural-language-understanding

UA-LLM: LLMs for Ukrainian NLP

This repository contains complete pipelines, as well as scripts and individual components, that help to adapt, train, evaluate, and leverage various Large Language Models (LLMs) for a wide range of Ukrainian NLP tasks.

Installation

To use the code in this repository, you need to have Conda installed on your system. You can then create a new Conda environment with the required dependencies by running the following command:

conda env create -f environment.yml

This creates a new Conda environment called ua-llm with all the required dependencies installed. Activate it with the first command below; the second command sets up the pre-commit hooks and is needed only if you plan to contribute:

conda activate ua-llm
pre-commit install  # required only for contributors

Usage

Get your access credentials from your LLM provider.

Go to the source code directory:

cd src

Fill in the model configs with the credentials you obtained. The example command below is for the OpenAI provider:

python tools/set_auth.py --openai_org_id <your_org> --openai_api_key <your_api_key>

Alternatively, use a similar command for any other supported LLM provider (run the script with --help to see the full list of arguments):

python tools/set_auth.py --cohere_api_key <your_api_key>
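
To illustrate how a credential-setting script of this kind can be structured, here is a minimal sketch. It is not the repository's actual tools/set_auth.py: only the three flag names are taken from the commands above, while the function names and the returned dictionary are assumptions made for this example. How the real script writes credentials into the model configs is not shown here.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the three credential flags shown in this README."""
    parser = argparse.ArgumentParser(
        description="Collect LLM provider credentials (illustrative sketch)."
    )
    parser.add_argument("--openai_org_id", help="OpenAI organization ID")
    parser.add_argument("--openai_api_key", help="OpenAI API key")
    parser.add_argument("--cohere_api_key", help="Cohere API key")
    return parser


def collect_credentials(argv: list[str]) -> dict[str, str]:
    """Return only the credentials that were actually passed.

    Hypothetical helper: the real script presumably stores these values in
    the model configs, which this sketch does not attempt to reproduce.
    """
    args = build_parser().parse_args(argv)
    return {name: value for name, value in vars(args).items() if value is not None}
```

Calling `collect_credentials(["--cohere_api_key", "<your_api_key>"])` returns a dictionary with a single `cohere_api_key` entry, so flags that were left out never overwrite existing values with empty ones.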

Run the example task to get predictions and evaluate them. Note that the LLM provider will charge you for each request, so it is better to run the following command with a trial subscription or API key:

python main.py +task=qa_eval_task

As a result, you will get output like the following:

[Image: eval_task output]

Feel free to explore the evaluation task config and the other files in the configs/ directory to learn more about the task and how it works. You may also want to read about the Hydra framework and OmegaConf, which drive the configuration.

Now you are ready to use your own datasets for evaluation or adapt any supported task for your specific needs.
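
When evaluating question answering on your own dataset, it helps to know what a typical scoring function looks like. The sketch below shows exact match and token-level F1, two metrics commonly used for context-based QA. This README does not state which metrics the evaluation task actually computes, so treat the function names and the whitespace-based normalization as assumptions.

```python
from __future__ import annotations

from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace (real pipelines often also strip punctuation)."""
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> bool:
    """True when both answers consist of the same tokens in the same order."""
    return normalize(prediction) == normalize(reference)


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between two answers."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Two empty answers agree; an empty answer never matches a non-empty one.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, scoring the prediction "Київ столиця України" against the reference "Київ" gives a precision of 1/3 and a recall of 1, so the F1 score is 0.5, while exact match is false.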

Supported LLMs

  • OpenAI GPT models
  • Cohere Command
  • Replicate.com models (LLaMA, Mistral, etc.)

Supported tasks

  • Context-based Question Answering (CBQA):
    python main.py +task=qa_predict_task
  • Context-based Question Answering data generation/annotation:
    python main.py +task=qa_annotate_task
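
Each command above selects a task with a `+task=<name>` argument. In Hydra, the leading `+` adds a config group that is not already in the defaults list. The sketch below only illustrates how such an argument breaks down into a group and an option. It is not Hydra's own parser, and the `configs/task/<name>.yaml` location follows Hydra's usual config-group layout, which is an assumption about this repository.

```python
from __future__ import annotations

from pathlib import PurePosixPath
from typing import NamedTuple


class Override(NamedTuple):
    group: str       # config group, e.g. "task"
    option: str      # option inside the group, e.g. "qa_predict_task"
    is_append: bool  # True when the argument starts with "+"


def parse_override(arg: str) -> Override:
    """Split a '+group=option' or 'group=option' argument into its parts."""
    is_append = arg.startswith("+")
    body = arg[1:] if is_append else arg
    group, sep, option = body.partition("=")
    if not sep or not group or not option:
        raise ValueError(f"expected '[+]group=option', got {arg!r}")
    return Override(group, option, is_append)


def config_file_for(override: Override, config_root: str = "configs") -> PurePosixPath:
    """Hypothetical helper: where a group/option pair would conventionally live."""
    return PurePosixPath(config_root) / override.group / f"{override.option}.yaml"
```

Under these assumptions, `config_file_for(parse_override("+task=qa_annotate_task"))` points at `configs/task/qa_annotate_task.yaml`, which is a reasonable first place to look when adapting a task.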

Contributing

If you would like to contribute to this repository, please create a new branch and submit a pull request with your changes. Any contributions are welcome!

License

This repository is licensed under the Apache 2.0 License. See the LICENSE file for more details.

ua-llm's People

Contributors

niksyromyatnikov
