Comments (8)
Hi, @zhangxgu @drahmad89 , I have included the script to extract concept embeddings for a customized concept pool, please take a look. Also, @zhangxgu , your code for extracting concept embeddings looks correct, except that you may want to use all prompt templates and normalize the output text features.
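The template averaging and normalization mentioned above can be sketched as follows. This is a minimal sketch with random stand-in features; in practice the per-template features come from the CLIP text encoder, and the shapes here are illustrative only.

```python
import torch

# Minimal sketch: encode every prompt template for one class, L2-normalize
# each template's feature, average over templates, then re-normalize.
# Random features stand in for real CLIP text-encoder output.
num_templates, emb_size = 3, 8
per_template = torch.randn(num_templates, emb_size)  # one class, all templates

normed = per_template / per_template.norm(dim=-1, keepdim=True)  # unit-norm each
class_emb = normed.mean(0)                    # average over templates
class_emb = class_emb / class_emb.norm()      # re-normalize the averaged vector
```

Normalizing before and after the average keeps every class embedding on the unit sphere, which is what cosine-similarity matching against region features assumes.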
from regionclip.
Hi, @drahmad89
You can refer to our demo code here:
https://huggingface.co/spaces/CVPR/regionclip-demo/blob/main/detectron2/modeling/meta_arch/clip_rcnn.py#L755
And you can build a text encoder here:
https://huggingface.co/spaces/CVPR/regionclip-demo/blob/main/detectron2/modeling/meta_arch/clip_rcnn.py#L593
@jwyang
I wanted to get bounding boxes for specific classes only.
The documentation says: "put it in the folder ./datasets/lvis/lvis_v1_val.json. The file is used to specify object class names."
I replaced this lvis_v1_val.json with a custom annotation JSON file that contains only 6 different classes. I ran zero-shot detection and still always get classes from the LVIS dataset. (Section: Visualization on custom images)
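For reference, a minimal replacement file restricted to 6 classes might be built like this. The field names follow the LVIS v1 annotation format; the exact fields RegionCLIP actually reads may differ, so check the repo's dataset-loading code before relying on this, and the class names below are placeholders.

```python
import json

# Hypothetical minimal LVIS-style annotation file with 6 custom categories.
# "frequency" is set to "f" (frequent) for all classes as a placeholder.
classes = ["car", "truck", "bus", "person", "bicycle", "dog"]
custom = {
    "categories": [
        {"id": i + 1, "name": n, "synonyms": [n], "frequency": "f"}
        for i, n in enumerate(classes)
    ],
    "images": [],
    "annotations": [],
}
blob = json.dumps(custom)  # write this out as datasets/lvis/lvis_v1_val.json
```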
@drahmad89 , currently, visualization on custom images does not support user-specified queries. We have built a Hugging Face demo, shared above, which takes one category as the query. Per your request, we will add a customized demo to this repo that supports user-specified queries on a given image. Please stay tuned!
Nice work! @jwyang
I also need the code to generate text embeddings for my own dataset. For now I have written the following, modeled on CLIP:
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("RN50", device=device)
categories = ['XXX', 'XXX', 'XXX']  # my dataset's class names
text = clip.tokenize(["a photo of a %s" % c for c in categories]).to(device)
with torch.no_grad():
    text_features = model.encode_text(text).cpu()
torch.save(text_features, 'xxx.pth')
Hope you can give me some advice on this code.
The script output at the end has a dim of (class_len, 1, emb_size), and it is also still on the GPU. I suggest adding
torch.squeeze(concept_feats).cpu()
after line 101 of RegionCLIP/tools/extract_concept_features.py (commit 0e5e958).
I also wasn't sure of the format expected for the classes in concepts.txt. If they are listed line by line, then shouldn't
for line in f:
    concept = line.strip()
replace the existing parsing?
Just some suggestions :)
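The squeeze-and-move-to-CPU suggestion above can be sketched as follows. The shape (num_classes, 1, emb_size) matches the script output described in the comment; 6 and 512 are placeholder sizes, and random values stand in for the real concept features.

```python
import torch

# Stand-in for the script's stacked output: (num_classes, 1, emb_size).
concept_feats = torch.randn(6, 1, 512)

# Suggested post-processing: drop the singleton dim and move to CPU so the
# saved tensor has shape (num_classes, emb_size) and loads without a GPU.
concept_feats = torch.squeeze(concept_feats).cpu()
```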
Hi, @Jawing, good suggestion!
There seems to be a bug when generating concept embeddings in extract_concept_features.py. The generated embeddings correspond to the first letter of each concept, which can produce duplicates and isn't what we want. I propose the correction below.
concept_feats = []
with open(concept_file, 'r') as f:
    concepts = []
    for line in f:
        concept = line.strip()
        concepts.append(concept)
with torch.no_grad():
    token_embeddings_concepts = pre_tokenize(concepts).to(model.device)
    for token_embeddings in token_embeddings_concepts:
        text_features = model.lang_encoder.encode_text(token_embeddings)
        # average over all templates
        text_features = text_features.mean(0, keepdim=True)
        concept_feats.append(text_features)
Related Issues (20)
- How to zero-shot inference my own label class instead of COCO or LVIS HOT 3
- The result of RPN is close to 0 for zero-shot inference of own dataset, however, the result of GT is very good. What should I do? HOT 5
- How to train the RPN? HOT 1
- How to apply my own dataset in zero-shot inference HOT 1
- About custom data set RPN training. HOT 3
- Version 'RegionCLIP' is not valid according to PEP 440. HOT 1
- 'Non-existent config key: MODEL.CLIP'. HOT 3
- Reproduction of Region classification in Fig.1 HOT 1
- Pretraining dataset HOT 1
- Demo on Hugging Face not working HOT 4
- Transfer learning: training on novel classes gives very low results HOT 1
- Transfer learning training novel classes results are very low HOT 4
- could you share the scripts spliting coco datasets into base and novel class datasets? and the contents of 'concepts.txt' file? Thanks advance! HOT 1
- How much GPU memory do we need to run RegionCLIP HOT 1
- I found that no 'TEXT_EMB_PATH' and no 'OPENSET_TEST_TEXT_EMB_PATH' are not effective for transferring learning HOT 3
- [Testing Transfer Learning] I cannot reproduce results on Novel classes only HOT 15
- Question on zero-shot inference with ViT based model
- runtime_error, when MODEL.ROI_HEADS.SOFT_NMS_ENABLED is True. HOT 1
- how to train Fully Supervised Object Detection using my own dataset? What is the specific process of training? HOT 1
- May you provide pretrained checkpoints for more backbones? HOT 1