
grounding-dino-batch-inference's People

Contributors

yuwenmichael

grounding-dino-batch-inference's Issues

non-deterministic behavior with batch size > 1

I noticed this strange behavior: the model's output logits vary slightly between runs, so the results are non-deterministic. This does not happen with batch size = 1. Do you know what the reason could be? The difference between the logits (comparing two separate script executions) grows as the batch size increases.
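
One possible contributor, offered as an assumption rather than a diagnosis of this repository: floating-point addition is not associative, and batched GPU kernels may accumulate values in a different order from one run to the next. The pure-Python sketch below only illustrates that effect; it does not touch the model.

```python
# Floating-point addition is not associative: the grouping of the
# additions changes the rounded result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)      # False
print(abs(left_to_right - right_to_left))  # tiny but non-zero
```

If this is the cause, fixing the seed and calling PyTorch's `torch.use_deterministic_algorithms(True)` before inference is one way to test it, usually at some cost in speed.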

Examples in the batch are not processed independently?

Hi everyone, I discovered this strange behavior.

I have 4 images (img_1, img_2, img_3, img_4).

If I run GroundingDINO with the following prompts:
["distance . mountains . valley . view .", "man . snow board . trick", "lunch . pizza . they .", "kitchen . refrigerator . "]

then I obtain the following probabilities per class:

{0: {'distance': 0.19821932911872864, 'mountains': 0.7135314345359802, 'valley': 0.42435237765312195, 'view': 0.38242971897125244}, 1: {'man': 0.3701115548610687, 'snow board': 0.31612950563430786, 'trick': 0.21027937531471252}, 2: {'lunch': 0.38231441378593445, 'pizza': 0.7270074486732483, 'they': 0.19436779618263245}, 3: {'kitchen': 0.6813028454780579, 'refrigerator': 0.5736187100410461}}

but if I add the class "pineapple" to the third prompt:

["distance . mountains . valley . view .", "man . snow board . trick", "lunch . pizza . they . pineapple . ", "kitchen . refrigerator . "]

then the probabilities associated with other elements in the batch also change.

{0: {'distance': 0.22729776799678802, 'mountains': 0.7141298651695251, 'valley': 0.43764322996139526, 'view': 0.367383748292923}, 1: {'man': 0.3758210241794586, 'snow board': 0.3222990036010742, 'trick': 0.21733753383159637}, 2: {'lunch': 0.3865318298339844, 'pineapple': 0.040617868304252625, 'pizza': 0.6494675278663635, 'they': 0.21683959662914276}, 3: {'kitchen': 0.6852126717567444, 'refrigerator': 0.5792219042778015}}

It seems the samples in the batch are not processed independently.
Has anyone encountered the same problem, or does anyone have suggestions for fixing it?
Thanks in advance.
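
To put a number on the effect, the two result dictionaries can be compared class by class. This is a minimal sketch: `max_abs_change` is a hypothetical helper, not part of the repository, and the values are copied from this post for the three images whose prompts did not change.

```python
def max_abs_change(before, after):
    """Largest absolute probability change per image, over shared classes."""
    changes = {}
    for idx in before:
        shared = before[idx].keys() & after[idx].keys()
        changes[idx] = max(abs(before[idx][c] - after[idx][c]) for c in shared)
    return changes

# Probabilities reported above for images 0, 1 and 3 (prompts unchanged).
before = {
    0: {"distance": 0.19821932911872864, "mountains": 0.7135314345359802,
        "valley": 0.42435237765312195, "view": 0.38242971897125244},
    1: {"man": 0.3701115548610687, "snow board": 0.31612950563430786,
        "trick": 0.21027937531471252},
    3: {"kitchen": 0.6813028454780579, "refrigerator": 0.5736187100410461},
}
after = {
    0: {"distance": 0.22729776799678802, "mountains": 0.7141298651695251,
        "valley": 0.43764322996139526, "view": 0.367383748292923},
    1: {"man": 0.3758210241794586, "snow board": 0.3222990036010742,
        "trick": 0.21733753383159637},
    3: {"kitchen": 0.6852126717567444, "refrigerator": 0.5792219042778015},
}

changes = max_abs_change(before, after)
for idx, delta in changes.items():
    print(f"image {idx}: max change {delta:.4f}")
```

Running each image alone (batch size 1) and applying the same comparison would show whether the shift comes from batching. One untested guess is that prompts in a batch are padded to the longest one, so lengthening a single prompt changes the padded input for every sample.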

Issue with Batch Processing in predict_batch Function

The predict_batch function seems to only process the first image in a batch when generating predictions. Specifically, the lines:

prediction_logits = outputs["pred_logits"].cpu().sigmoid()[0]
prediction_boxes = outputs["pred_boxes"].cpu()[0]

These lines appear to only handle the logits and boxes for the first image in the batch, ignoring the rest.
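
A possible fix is to loop over the batch dimension instead of taking index `[0]`. The sketch below is not the repository's code: `split_batch_outputs` and the `box_threshold` default are assumptions made for illustration, and random tensors stand in for real model outputs.

```python
import torch

def split_batch_outputs(outputs, box_threshold=0.35):
    """Return one (logits, boxes) pair per image instead of only index 0."""
    # Assumed shapes: pred_logits (batch, queries, tokens),
    # pred_boxes (batch, queries, 4).
    all_logits = outputs["pred_logits"].cpu().sigmoid()
    all_boxes = outputs["pred_boxes"].cpu()
    per_image = []
    for logits, boxes in zip(all_logits, all_boxes):
        # Keep queries whose best token score clears the threshold.
        keep = logits.max(dim=1)[0] > box_threshold
        per_image.append((logits[keep], boxes[keep]))
    return per_image

# Toy tensors: 2 images, 3 queries, 5 text tokens.
fake_outputs = {
    "pred_logits": torch.randn(2, 3, 5),
    "pred_boxes": torch.rand(2, 3, 4),
}
results = split_batch_outputs(fake_outputs)
print(len(results))  # one entry per image in the batch
```

Each entry in `results` can then be matched against its own prompt, since the number of kept queries may differ from image to image.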

issues using cuda

Hi, does this work with CUDA? I'm using:

https://github.com/HumanSignal/label-studio-ml-backend/tree/master/label_studio_ml/examples/grounding_dino

and getting the following:

label-studio-ml-backend | /app/Grounding-DINO-Batch-Inference/GroundingDINO/groundingdino/models/GroundingDINO/ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
label-studio-ml-backend | warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")

and

output = _C.ms_deform_attn_forward(
NameError: name '_C' is not defined
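
The warning says the compiled ops failed to load, which is consistent with `_C` being undefined later. A frequent reason, assumed here rather than confirmed for this setup, is that the package was built in an environment where no CUDA toolkit was visible. The sketch below uses only the standard library to report two things worth checking inside the container; `cuda_build_hints` is a hypothetical helper.

```python
import os
import shutil

def cuda_build_hints():
    """Collect environment details relevant to compiling CUDA extensions."""
    return {
        "CUDA_HOME": os.environ.get("CUDA_HOME"),
        "nvcc_on_path": shutil.which("nvcc"),
    }

hints = cuda_build_hints()
for name, value in hints.items():
    print(f"{name}: {value or 'not found'}")
```

If both come back as not found, reinstalling GroundingDINO after the toolkit is available would be the next thing to try.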

None of the images were detected?

I ran:

!python3 inference_gdino.py

and got this message:

/usr/local/lib/python3.10/dist-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
final text_encoder_type: bert-base-uncased
loading image list file from:  image_paths.txt
total images:13, need detect: 0, skip images: 13
detect: : : 0it [00:00, ?it/s]

image_paths.txt:

/workspace/data/dog.jpeg
/workspace/data/dog-2.jpeg
/workspace/data/dog-3.jpeg
/workspace/data/dog-4.jpeg
/workspace/data/dog-5.jpeg
/workspace/data/dog-6.jpeg
/workspace/data/dog-7.jpeg
/workspace/data/dog-8.jpeg
/workspace/data/dogs.jpg
/workspace/data/fox.jpg
/workspace/data/frog.jpg
/workspace/data/panda.jpg
/workspace/data/seal.jpg

I did not use multiple directories as in the example you shared.

Let me know if you need any further information from my side.
Best,
Andy
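
The log does not say why all 13 images were skipped. One thing that can be ruled out quickly is whether every listed path exists where the script runs. The sketch below is a hypothetical helper, `check_image_list`, demonstrated on a throwaway list file; in practice it would be pointed at the real list.

```python
import os
import tempfile

def check_image_list(list_file):
    """Split the paths in list_file into those that exist and those that do not."""
    existing, missing = [], []
    with open(list_file) as f:
        for line in f:
            path = line.strip()
            if not path:
                continue  # ignore blank lines
            (existing if os.path.isfile(path) else missing).append(path)
    return existing, missing

# Demo: one real file and one absent file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "dog.jpeg")
    open(real, "w").close()
    list_file = os.path.join(tmp, "image_paths.txt")
    with open(list_file, "w") as f:
        f.write(real + "\n" + os.path.join(tmp, "absent.jpg") + "\n")
    existing, missing = check_image_list(list_file)
    print(len(existing), "found,", len(missing), "missing")
```

If all paths exist, the skip is more likely decided inside the script itself, for example by a check for results that were already written; that is a guess, not something the log confirms.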
