finetune-anything

Introduction

The Segment Anything Model (SAM) has revolutionized computer vision. Fine-tuning SAM can address a large number of basic computer vision tasks. We are designing a class-aware, one-stage tool for training fine-tuned models based on SAM.

You supply the dataset for your task and the name of a supported task, and this tool helps you obtain a fine-tuned model for that task. You can also design your own extend-SAM model, and FA supplies the training, testing, and deployment process for you.

Design

Finetune-Anything further encapsulates the three parts of the original SAM as adapters: the Image Encoder Adapter, Prompt Encoder Adapter, and Mask Decoder Adapter. We will support a base extend-SAM model for each task. Users can also design their own customized modules in each adapter, use FA to design different adapters, and set whether the parameters of any module are fixed. For modules with unfixed parameters, hyperparameters such as lr and weight decay can be set to coordinate with the fine-tuning of the model. Check the details in How_to_use. For example, MaskDecoder is encapsulated as MaskDecoderAdapter. The current MaskDecoderAdapter contains two parts, DecoderNeck and DecoderHead.
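
As a rough illustration of the idea of fixing some modules and giving the others their own lr and weight decay, the sketch below builds optimizer parameter groups from per-module settings. It is not FA's actual code: `ModuleSetting`, `build_param_groups`, and the parameter-name prefixes are hypothetical, and the model is only assumed to expose `named_parameters()` the way `torch.nn.Module` does. The returned list uses the `params` / `lr` / `weight_decay` dictionary layout that PyTorch optimizers accept as parameter groups.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Tuple


@dataclass
class ModuleSetting:
    """Hypothetical per-module training setting (not FA's real config schema)."""
    prefix: str            # parameter-name prefix, e.g. "mask_adapter."
    trainable: bool        # False means the module's parameters stay fixed
    lr: float = 1e-4
    weight_decay: float = 0.0


def build_param_groups(
    named_params: Iterable[Tuple[str, Any]],
    settings: List[ModuleSetting],
) -> List[Dict[str, Any]]:
    """Freeze fixed modules and group trainable parameters by module."""
    groups = {
        s.prefix: {"params": [], "lr": s.lr, "weight_decay": s.weight_decay}
        for s in settings if s.trainable
    }
    for name, param in named_params:
        # The first setting whose prefix matches the parameter name wins.
        match = next((s for s in settings if name.startswith(s.prefix)), None)
        if match is None or not match.trainable:
            param.requires_grad = False   # unmatched parameters are frozen too
            continue
        param.requires_grad = True
        groups[match.prefix]["params"].append(param)
    # Skip groups that ended up with no parameters.
    return [g for g in groups.values() if g["params"]]
```

With a real model, the result could be passed straight to an optimizer, for example `torch.optim.AdamW(build_param_groups(model.named_parameters(), settings))`.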

Supported Tasks

  • Semantic Segmentation
    • train
    • eval
    • test
  • Matting
  • Instance Segmentation
  • Detection

Supported Datasets

  • TorchVOCSegmentation
  • BaseSemantic
  • BaseInstance
  • BaseMatting

Deploy

  • ONNX export

Support Plan

FA will be updated in the following order:

  • Matting (task)
  • Prompt Part (structure)
  • MobileSAM (model)
  • Instance Segmentation (task)

Usage

finetune-anything (FA) supports the entire SAM fine-tuning process, including modification of the model structure as well as model training, validation, and testing. For details, check How_to_use. The Quick Start below gives an example of quickly using FA to train a custom semantic segmentation model.

Quick Start

Install

  • Step1
git clone https://github.com/ziqi-jin/finetune-anything.git
cd finetune-anything
pip install -r requirements.txt
  • Step2 Download the SAM weights from the SAM repository

  • Step3 Modify the contents of the YAML file for the specific task in /config, e.g., ckpt_path, model_type ...
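
Before launching training, it can help to sanity-check the values edited in Step3. The sketch below is only an illustration: the flat dictionary layout and the `check_sam_config` helper are hypothetical and may not match the actual structure of FA's YAML files. The listed `model_type` values are the keys of `sam_model_registry` in the original SAM repository.

```python
import os
from typing import Any, Dict, List

# Model types registered in the original SAM repository's sam_model_registry.
SAM_MODEL_TYPES = {"default", "vit_h", "vit_l", "vit_b"}


def check_sam_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    model_type = cfg.get("model_type")
    if model_type not in SAM_MODEL_TYPES:
        problems.append(f"unknown model_type: {model_type!r}")
    ckpt_path = cfg.get("ckpt_path")
    if not ckpt_path:
        problems.append("ckpt_path is not set")
    elif not os.path.isfile(ckpt_path):
        problems.append(f"checkpoint not found: {ckpt_path}")
    return problems


if __name__ == "__main__":
    # Hypothetical values; in practice they would be read from the task's YAML file.
    example = {"model_type": "vit_b", "ckpt_path": "sam_vit_b_01ec64.pth"}
    for problem in check_sam_config(example):
        print("config problem:", problem)
```

Catching a wrong checkpoint path or a mistyped model type this way is cheaper than discovering it after the dataset has been loaded.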

Train

# replace 0 with the index of the GPU you want to use
CUDA_VISIBLE_DEVICES=0 python train.py --task_name semantic_seg
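
`CUDA_VISIBLE_DEVICES` takes a comma-separated list of GPU indices and limits which devices the training process can see. As a small sketch, the same launch can be prepared from Python. `build_train_launch` is a hypothetical helper, and only the `--task_name` flag from the command above is assumed to exist.

```python
import os
import sys
from typing import Dict, List, Sequence, Tuple


def build_train_launch(
    task_name: str, gpu_ids: Sequence[int]
) -> Tuple[List[str], Dict[str, str]]:
    """Build the argv and environment for `python train.py --task_name ...`."""
    env = dict(os.environ)  # copy, so the current process is left untouched
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    argv = [sys.executable, "train.py", "--task_name", task_name]
    return argv, env
```

Passing the result to `subprocess.run(argv, env=env, check=True)` from the repository root would then start training on the selected GPUs.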

One more thing

If you need a loss, dataset, or other function that FA does not support, please submit an issue, and I will help you implement it. Developers are also welcome to contribute new losses, datasets, or other functions to FA; please submit a PR (pull request).

Related Resources

How to use with Docker

# build
docker build -t whuzfb/fa .
# --ipc=host is for the dataloader: it lets the container use the host's shared memory (reported to work only on Windows Subsystem for Linux 2)
# --shm-size=20gb serves a similar purpose to --ipc=host, but it is safer; use one of the two commands below
docker run --rm -it --gpus all -u $(id -u):$(id -g) --ipc=host -v $(pwd):/workspace --entrypoint /bin/bash whuzfb/fa
docker run --rm -it --gpus all -u $(id -u):$(id -g) --shm-size=20gb -v $(pwd):/workspace --entrypoint /bin/bash whuzfb/fa
# run in docker
python3 train.py --task_name semantic_seg

Contributors

zfb132, zhaoxiaodong789, ziqi-jin
