recognize-any-regions's Introduction

Recognize Any Regions

Haosen Yang, Chuofan Ma, Bin Wen, Yi Jiang, Zehuan Yuan, Xiatian Zhu

Updates

  • 2023/11/7: Checkpoints are available on both Google Drive and OneDrive.
  • 2023/11/6: Code is now available!

Models

Method                 Box AP_rare   Box AP_all   Mask AP_rare   Mask AP_all   Download
RegionSpot-BB          19.1          20.9         17.5           17.8          model
RegionSpot-BL          26.0          23.7         22.8           20.2          model
RegionSpot-BL@336px    26.3          25.0         23.4           21.3          model

Getting Started

Installation instructions and usage are in Getting Started with Recognize Any Regions.

Demo

First download a model checkpoint. Then the model can be used in just a few lines to get masks from a given prompt:

from regionspot.modeling.regionspot import build_regionspot_model
from regionspot import RegionSpot_Predictor

# Placeholders in angle brackets must be replaced with your own values.
custom_vocabulary = ['<custom>']  # class names to recognize
clip_type = <clip_type>
regionspot = build_regionspot_model(
    checkpoint="<path/to/checkpoint>",
    custom_vocabulary=custom_vocabulary,
    clip_type=clip_type,
)
predictor = RegionSpot_Predictor(regionspot)
predictor.set_image(<your_image>)
masks, mask_iou_score, class_score, class_index = predictor.predict(<input_prompts>)

See demo.py for more details on using RegionSpot with box prompts.
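
The four values returned by predictor.predict still need to be turned into readable labels. The sketch below is one way to do that, assuming (the README does not confirm this) that class_score and class_index hold one entry per region and that class_index indexes into custom_vocabulary. The helper name label_regions and the min_score threshold are hypothetical; convert tensors to plain lists (for example with .tolist()) before calling it.

```python
from typing import List, Sequence, Tuple


def label_regions(
    class_score: Sequence[float],
    class_index: Sequence[int],
    vocabulary: Sequence[str],
    min_score: float = 0.0,
) -> List[Tuple[int, str, float]]:
    """Pair each region with its vocabulary label.

    Assumption: class_score[i] and class_index[i] describe region i, and
    class_index[i] is a position in `vocabulary`.
    Returns (region_id, label, score) for regions scoring at least min_score.
    """
    if len(class_score) != len(class_index):
        raise ValueError("class_score and class_index must have equal length")
    results = []
    for region_id, (score, index) in enumerate(zip(class_score, class_index)):
        if score >= min_score:
            results.append((region_id, vocabulary[int(index)], float(score)))
    return results


if __name__ == "__main__":
    vocab = ["cat", "dog", "bicycle"]  # stand-in for custom_vocabulary
    print(label_regions([0.91, 0.12, 0.55], [1, 0, 2], vocab, min_score=0.5))
```

If the real outputs carry an extra batch or top-k dimension, flatten or select along it first; the pairing logic itself does not change.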

Citing Recognize Any Regions

If you use Recognize Any Regions in your research or wish to refer to the baseline results published here, please use the following BibTeX entry.

@article{RegionSpot,
  title={Recognize Any Regions},
  author={Yang, Haosen and Ma, Chuofan and Wen, Bin and Jiang, Yi and Yuan, Zehuan and Zhu, Xiatian},
  journal={arXiv preprint arXiv:2311.01373},
  year={2023}
}

recognize-any-regions's Issues

About the SAM prompt

Hello, in your experiments, is the prompt input to SAM a point or a bounding box?

Open-source plan

Amazing work! When could the training code be open-sourced? I'm eager to follow your work! Thanks!

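Question about offline_token

When training on my own dataset, I found that an offline_token is required:
# read pth
pth_file = os.path.join(self.mask_tokens_dir, os.path.join(dataset_name, str(image_id)+'.pth'))
offline_token = torch.load(pth_file)
Does this mean I need to extract the segmentation embeddings for each object from SAM in advance?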
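
The loader snippet in this issue implies that token files live at <mask_tokens_dir>/<dataset_name>/<image_id>.pth. As a minimal sketch under that assumption only, the helper below lists the image ids that have no token file yet, which may help when checking a custom dataset before training. The function names token_path and missing_tokens are hypothetical and not part of the repository, and how the tokens themselves are produced is not covered here.

```python
import os
from typing import Iterable, List


def token_path(mask_tokens_dir: str, dataset_name: str, image_id: int) -> str:
    """Build the token path with the same layout as the loader snippet."""
    return os.path.join(mask_tokens_dir, dataset_name, str(image_id) + ".pth")


def missing_tokens(
    mask_tokens_dir: str, dataset_name: str, image_ids: Iterable[int]
) -> List[int]:
    """Return the image ids whose .pth token file does not exist on disk."""
    return [
        image_id
        for image_id in image_ids
        if not os.path.isfile(token_path(mask_tokens_dir, dataset_name, image_id))
    ]
```

An empty result means every listed image has a file at the expected location; it says nothing about whether the file contents are valid tokens.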

How to get the annotation file of openimages

_PREDEFINED_SPLITS_OPENIMAGES = {
    "openimages_train": ("openimages/detection/", "re_openimages_v6_train_bbox_splitdir_int_ids.json"),
    "openimages_val": ("openimages/detection/", "re_openimages_v6_train_bbox_splitdir_int_ids.json"),
}

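About the training script

Hello, will you be releasing the training script soon?
While reading the train_net code, I could not find where the model is invoked through RegionSpot().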

using different sam models

Thanks for the great work!

Have you run any experiments with a different SAM model, such as sam_vit_h_4b8939.pth? What would the result be? Any improvement?

Thanks
George
