modelscope's Introduction




Introduction

ModelScope is a “Model-as-a-Service” (MaaS) platform that seeks to bring together the most advanced machine learning models from the AI community and to streamline the process of using AI models in real applications. The core ModelScope library enables developers to perform inference, training, and evaluation through layered APIs that provide a unified experience across state-of-the-art models from different AI domains.

The Python library offers the layered APIs necessary for model contributors to integrate models from CV, NLP, Speech, Multi-Modality, and Scientific Computation into the ModelScope ecosystem. Implementations of all these models are encapsulated within the library in a way that allows easy, unified access. With this integration, model inference, finetuning, and evaluation can be done with only a few lines of code. At the same time, the library is flexible enough that individual components of a model application can be customized where necessary.

Apart from hosting implementations of various models, the ModelScope library also handles the necessary interactions with ModelScope backend services, particularly the Model-Hub and Dataset-Hub. These interactions allow management of entities (models and datasets) to be performed seamlessly under the hood, including entity lookup, version control, cache management, and more.
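
As a rough illustration of the Model-Hub interaction described above, the sketch below downloads a model snapshot into the local cache and returns its directory. It assumes the `snapshot_download` helper in `modelscope.hub.snapshot_download` and its `revision` and `cache_dir` keyword arguments, which may differ between library versions. The wrapper name `fetch_model` is hypothetical, and the model ID is the one used in the QuickTour section.

```python
from typing import Optional

# Model ID reused from the QuickTour example in this README.
DEFAULT_MODEL_ID = 'damo/nlp_structbert_word-segmentation_chinese-base'


def fetch_model(model_id: str = DEFAULT_MODEL_ID,
                revision: Optional[str] = None,
                cache_dir: Optional[str] = None) -> str:
    """Download (or reuse from cache) a model snapshot and return its local path.

    Hypothetical wrapper: assumes modelscope is installed and that
    snapshot_download accepts `revision` and `cache_dir` keyword arguments.
    """
    # Imported lazily so this file can be loaded without modelscope installed.
    from modelscope.hub.snapshot_download import snapshot_download

    kwargs = {}
    if revision is not None:
        kwargs['revision'] = revision    # pin a specific model version
    if cache_dir is not None:
        kwargs['cache_dir'] = cache_dir  # override the default cache location
    return snapshot_download(model_id, **kwargs)
```

Calling `fetch_model()` needs network access to the Model-Hub on first use; later calls would be expected to reuse the cached files.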

Models and Online Demos

Hundreds of models are publicly available on ModelScope (600+ and counting), covering the latest developments in areas such as NLP, CV, Audio, Multi-Modality, and AI for Science. Many of these models represent the SOTA in their fields and made their open-source debut on ModelScope. Users can visit ModelScope (modelscope.cn) and try these models first-hand through online demos with just a few clicks. A hands-on developer experience is also available through the ModelScope Notebook, which is backed by a ready-to-use cloud CPU/GPU development environment and is only a click away on the ModelScope website.

Some of the representative examples include:

NLP:

Audio:

CV:

Multi-Modal:

AI for Science:

QuickTour

We provide a unified interface for inference using pipeline, and for finetuning and evaluation using Trainer, across different tasks.

For any given task with any type of input (image, text, audio, video...), an inference pipeline can be implemented with only a few lines of code. The pipeline automatically loads the associated model and returns the inference result, as shown below:

>>> from modelscope.pipelines import pipeline
>>> word_segmentation = pipeline('word-segmentation', model='damo/nlp_structbert_word-segmentation_chinese-base')
>>> word_segmentation('今天天气不错,适合出去游玩')
{'output': '今天 天气 不错 , 适合 出去 游玩'}
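
The result above is a dict whose 'output' value is a single space-separated string. A minimal sketch of turning it into a token list follows; the helper name `tokens_from_result` is hypothetical, and the dict is copied from the example output rather than produced by a live pipeline call.

```python
from typing import Dict, List


def tokens_from_result(result: Dict[str, str]) -> List[str]:
    """Split the space-separated 'output' string into a list of tokens."""
    return result['output'].split()


# Copied from the example output above, not from a live model call.
result = {'output': '今天 天气 不错 , 适合 出去 游玩'}
tokens = tokens_from_result(result)
print(len(tokens), tokens[0], tokens[-1])
```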

Given an image, you can use the following code to cut out the person:

>>> import cv2
>>> from modelscope.pipelines import pipeline

>>> portrait_matting = pipeline('portrait-matting')
>>> result = portrait_matting('https://modelscope.oss-cn-beijing.aliyuncs.com/test/images/image_matting.png')
>>> cv2.imwrite('result.png', result['output_img'])

The output image is saved as result.png.

For finetuning and evaluation, you need ten more lines of code to construct the dataset and trainer. Calling trainer.train() and trainer.evaluate() then finetunes and evaluates the model.

For example, we use the GPT-3 1.3B model, load the Chinese poetry dataset, and finetune the model; the resulting model can be used for poetry generation.

>>> from modelscope.metainfo import Trainers
>>> from modelscope.msdatasets import MsDataset
>>> from modelscope.trainers import build_trainer

>>> train_dataset = MsDataset.load('chinese-poetry-collection', split='train').remap_columns({'text1': 'src_txt'})
>>> eval_dataset = MsDataset.load('chinese-poetry-collection', split='test').remap_columns({'text1': 'src_txt'})
>>> max_epochs = 10
>>> tmp_dir = './gpt3_poetry'

>>> kwargs = dict(
...     model='damo/nlp_gpt3_text-generation_1.3B',
...     train_dataset=train_dataset,
...     eval_dataset=eval_dataset,
...     max_epochs=max_epochs,
...     work_dir=tmp_dir)

>>> trainer = build_trainer(name=Trainers.gpt3_trainer, default_args=kwargs)
>>> trainer.train()
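
The example above stops at training. A minimal sketch of wrapping the same steps so that evaluation follows training is shown below. The helper names `build_trainer_args` and `finetune_and_evaluate` are hypothetical, and the return value of `trainer.evaluate()` is passed through unchanged because its exact shape is not described here.

```python
from typing import Any, Dict


def build_trainer_args(model_id: str, train_dataset: Any, eval_dataset: Any,
                       max_epochs: int, work_dir: str) -> Dict[str, Any]:
    """Collect the keyword arguments passed to build_trainer as default_args."""
    return dict(
        model=model_id,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        max_epochs=max_epochs,
        work_dir=work_dir)


def finetune_and_evaluate(trainer_name: str, args: Dict[str, Any]) -> Any:
    """Train, then evaluate; returns whatever trainer.evaluate() returns."""
    # Lazy import: modelscope is only needed when this function is called.
    from modelscope.trainers import build_trainer

    trainer = build_trainer(name=trainer_name, default_args=args)
    trainer.train()
    return trainer.evaluate()
```

With the datasets from the example above, this could be called as `finetune_and_evaluate(Trainers.gpt3_trainer, build_trainer_args('damo/nlp_gpt3_text-generation_1.3B', train_dataset, eval_dataset, 10, './gpt3_poetry'))`.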

Why should I use the ModelScope library

  1. A unified and concise user interface is abstracted across different tasks and models. Inference takes three lines of code and model training takes ten. This makes it convenient to use models from multiple fields in the ModelScope community. The library is ready to use, which makes it easy to get started with AI, including in teaching.

  2. It builds a model-centric development and application experience, supporting model training, inference, export, and deployment, and helps users build their own MLOps on top of the ModelScope library.

  3. The model inference and training processes follow a modular design with a rich set of functional module implementations, so users can customize their own inference, training, and other processes.

  4. For distributed model training, especially of large models, it supports a range of training strategies, including data parallel, model parallel, and hybrid parallel.

Installation

Docker

The ModelScope library currently supports the TensorFlow and PyTorch deep learning frameworks for model training and inference. It is tested on Python 3.7+, PyTorch 1.8+, and TensorFlow 1.15 or TensorFlow 2.0+.

To let everyone use all the models on the ModelScope platform without configuring an environment, ModelScope provides official Docker images. With an official image, you can skip all environment installation and configuration and use the library directly. The latest CPU and GPU images are available at the following addresses:
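
The version requirements above can be checked with a short script. This is only a sketch: the helper names are hypothetical, version strings are compared on their numeric components alone, and the installed frameworks are detected by importing them and reading `__version__`.

```python
import importlib
import re
import sys
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.11.0+cu113' into (1, 11, 0); pads to at least two numbers."""
    numbers = []
    for piece in text.split('+')[0].split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers)


def python_ok() -> bool:
    """Python 3.7 or newer."""
    return sys.version_info >= (3, 7)


def torch_ok(version: str) -> bool:
    """PyTorch 1.8 or newer."""
    return parse_version(version) >= (1, 8)


def tensorflow_ok(version: str) -> bool:
    """TensorFlow 1.15.x, or 2.0 and newer."""
    parsed = parse_version(version)
    return parsed[:2] == (1, 15) or parsed >= (2, 0)


def installed_version(module_name: str) -> Optional[str]:
    """Return module.__version__, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, '__version__', None)


def report() -> None:
    """Print one line per requirement; frameworks that are absent are reported as such."""
    print('python', 'ok' if python_ok() else 'too old')
    for name, check in (('torch', torch_ok), ('tensorflow', tensorflow_ok)):
        version = installed_version(name)
        if version is None:
            print(name, 'not installed')
        else:
            print(name, version, 'ok' if check(version) else 'unsupported')
```

Calling `report()` inside the target environment prints the status of each requirement.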

CPU docker image

registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-py37-torch1.11.0-tf1.15.5-1.3.0

GPU docker image

registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.3.0-py37-torch1.11.0-tf1.15.5-1.3.0

Setup Local Python Environment

You can also set up a local Python environment using pip and conda. We suggest using Anaconda to create your Python environment:

conda create -n modelscope python=3.7
conda activate modelscope

Then you can install PyTorch or TensorFlow according to your model requirements.

  • Install PyTorch (doc)
  • Install TensorFlow (doc)

After installing the necessary framework, you can install the modelscope library as follows.

If you only want to download models and datasets, install the modelscope framework:

pip install modelscope

If you want to use multi-modal models:

pip install modelscope[multi-modal]

If you want to use nlp models:

pip install modelscope[nlp] -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html

If you want to use cv models:

pip install modelscope[cv] -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html

If you want to use audio models:

pip install modelscope[audio] -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html

If you want to use science models:

pip install modelscope[science] -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
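
The commands above differ only in the extra name and in whether the `-f` release page is passed. The sketch below builds the matching pip argument list for a given domain; the helper name `pip_install_args` is hypothetical. Passing the arguments as a list (for example to `subprocess.run`) also avoids the need to quote the square brackets, which some shells such as zsh would otherwise try to expand.

```python
from typing import List

# Release page passed with -f in the install commands above.
RELEASE_INDEX = 'https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html'

# Extras that the commands above install together with the -f option.
EXTRAS_WITH_INDEX = ('nlp', 'cv', 'audio', 'science')
# Extras that the commands above install without it.
EXTRAS_WITHOUT_INDEX = ('multi-modal',)


def pip_install_args(domain: str = '') -> List[str]:
    """Return the pip argument list for a domain, or for the core library if empty."""
    if not domain:
        return ['pip', 'install', 'modelscope']
    if domain in EXTRAS_WITHOUT_INDEX:
        return ['pip', 'install', f'modelscope[{domain}]']
    if domain in EXTRAS_WITH_INDEX:
        return ['pip', 'install', f'modelscope[{domain}]', '-f', RELEASE_INDEX]
    raise ValueError(f'unknown domain: {domain!r}')


print(' '.join(pip_install_args('nlp')))
```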

Notes:

  1. Currently, some audio-task models only support Python 3.7 and TensorFlow 1.15.4 on Linux. Most other models can be installed and used on Windows and Mac (x86).

  2. Some models in the audio field use the third-party library SoundFile for wav file processing. On Linux, users need to manually install libsndfile, which SoundFile depends on (doc link). On Windows and macOS, it is installed automatically without user action. For example, on Ubuntu, you can use the following commands:

    sudo apt-get update
    sudo apt-get install libsndfile1

  3. Some computer vision models need mmcv-full. You can refer to the mmcv installation guide; a minimal installation is as follows:

    pip uninstall mmcv # if you have installed mmcv, uninstall it
    pip install -U openmim
    mim install mmcv-full
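
Whether the optional dependencies from the notes above are present can be checked with the standard library alone. This is a sketch with hypothetical helper names; it assumes that SoundFile is importable as `soundfile` and mmcv-full as `mmcv`. The libsndfile check is most meaningful on Linux, since on other platforms the library may ship inside the SoundFile package rather than as a system library.

```python
import ctypes.util
import importlib.util
from typing import Dict


def has_libsndfile() -> bool:
    """True if a system-wide libsndfile can be located."""
    return ctypes.util.find_library('sndfile') is not None


def has_module(name: str) -> bool:
    """True if the named top-level Python module can be found for import."""
    return importlib.util.find_spec(name) is not None


def optional_dependency_report() -> Dict[str, bool]:
    """Map each optional dependency from the notes to whether it was found."""
    return {
        'libsndfile': has_libsndfile(),
        'soundfile': has_module('soundfile'),
        'mmcv': has_module('mmcv'),
    }


print(optional_dependency_report())
```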

Learn More

We provide additional documentation, including:

License

This project is licensed under the Apache License (Version 2.0).
